Abstract
Albumentations is a Python library written specifically for image data augmentation. It provides a large collection of augmentation transforms.
1. Albumentations supports all common computer vision tasks, such as classification, semantic segmentation, instance segmentation, object detection, and pose estimation.
2. The library provides a simple, unified API for handling all data types: images (RGB images, grayscale images, multispectral images), segmentation masks, bounding boxes, and keypoints.
3. The library contains more than 70 different augmentations for generating new training samples from existing data.
4. Albumentations is fast. Each new release is benchmarked to ensure that the augmentations run at maximum speed.
5. It works with popular deep learning frameworks such as PyTorch and TensorFlow. Albumentations is also part of the PyTorch ecosystem.
6. Written by experts. The authors have experience both in building production computer vision systems and in competitive machine learning. Many of the core team members are Kaggle Masters and Grandmasters.
7. The library is widely used in industry, deep learning research, machine learning competitions, and open source projects.
PIP installation for Albumentations
pip install albumentations
The problem
There is a small problem at this step. We already had opencv-python installed on our machine, and when we tried to install albumentations the installation failed with the following error: Could not install packages due to an EnvironmentError: [WinError 5] Access is denied. The cause is that albumentations requires opencv-python-headless, and this package conflicts with the opencv-python package that was already installed.
The solution
The solution is to install albumentations in a new virtual environment. Installing albumentations also installs opencv-python-headless, which can replace opencv-python, so we do not need to install OpenCV separately.
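Before installing, it can help to check which OpenCV distributions are already present in the active environment. The sketch below uses only the standard library. The helper name installed_opencv_variants is made up for this illustration, and the list of package names is our assumption about which OpenCV wheels are worth checking.

```python
from importlib import metadata

# These wheels all provide the same `cv2` module, so having more than one
# installed at the same time is a common source of conflicts.
OPENCV_VARIANTS = (
    "opencv-python",
    "opencv-python-headless",
    "opencv-contrib-python",
    "opencv-contrib-python-headless",
)


def installed_opencv_variants():
    """Return {distribution name: version} for every OpenCV wheel that is installed."""
    found = {}
    for name in OPENCV_VARIANTS:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this variant is not installed
    return found


variants = installed_opencv_variants()
print(variants)
if len(variants) > 1:
    print("More than one OpenCV wheel is installed; a clean virtual environment may be safer.")
```

If the dictionary already contains opencv-python, installing albumentations into a fresh virtual environment avoids touching that installation.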
Benchmark results
The benchmark was run on the first 2000 images of the ImageNet validation set using an Intel Xeon Gold 6140 CPU. All outputs are converted to a contiguous NumPy array with the np.uint8 data type. The table shows the number of images that can be processed per second on a single core; higher is better.
Python and library versions: Python 3.8.6 (default, Oct 13 2020, 20:37:26) [GCC 8.3.0], numpy 1.19.2, Pillow 7.0.0.post3, opencv-python 4.4.0.44, scikit-image 0.17.2, scipy 1.5.2.
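To make the "images per second" metric concrete, here is a minimal timing sketch. It is not the benchmark script used by the library. The function name images_per_second, the synthetic 224x224 images, and the NumPy flip used as the transform under test are all assumptions chosen for illustration.

```python
import time

import numpy as np


def images_per_second(transform, images):
    """Run `transform` over `images` once and return the measured throughput."""
    start = time.perf_counter()
    for img in images:
        out = transform(img)
        # Follow the benchmark convention: contiguous uint8 output.
        out = np.ascontiguousarray(out, dtype=np.uint8)
    elapsed = time.perf_counter() - start
    return len(images) / elapsed


# Synthetic stand-ins for real photos: 20 random RGB images.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8) for _ in range(20)]

# A horizontal flip written with NumPy slicing is the transform being timed.
rate = images_per_second(lambda img: img[:, ::-1], images)
print(f"{rate:.0f} images/s")
```

The absolute number depends on the machine, so it is only meaningful when different transforms or libraries are compared on the same hardware.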
Spatial-level transforms
Spatial-level transforms change both the input image and the additional targets, such as masks, bounding boxes, and keypoints. The following table shows which additional targets each transform supports.
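What "changing the image and its targets together" means can be sketched with plain NumPy. The function hflip_with_targets below is a hypothetical helper, not part of Albumentations, and it assumes the bounding box is given in pixel coordinates as (x_min, y_min, x_max, y_max).

```python
import numpy as np


def hflip_with_targets(image, mask, bbox):
    """Flip an image, its mask, and one pixel-coordinate box from left to right."""
    width = image.shape[1]
    x_min, y_min, x_max, y_max = bbox
    # The box is mirrored around the vertical centre line of the image.
    flipped_bbox = (width - x_max, y_min, width - x_min, y_max)
    return image[:, ::-1], mask[:, ::-1], flipped_bbox


image = np.zeros((4, 6, 3), dtype=np.uint8)
mask = np.zeros((4, 6), dtype=np.uint8)
mask[1:3, 0:2] = 1        # the object occupies columns 0-1
bbox = (0, 1, 2, 3)       # the box around that object

new_image, new_mask, new_bbox = hflip_with_targets(image, mask, bbox)
print(new_bbox)           # (4, 1, 6, 3)
```

After the flip the object sits in columns 4-5 of the mask and the box has moved with it, so the annotations still describe the same pixels.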
List of supported pixel-level transforms (these change only the image and leave the other targets untouched)
- Blur
- CLAHE
- ChannelDropout
- ChannelShuffle
- ColorJitter
- Downscale
- Emboss
- Equalize
- FDA
- FancyPCA
- FromFloat
- GaussNoise
- GaussianBlur
- GlassBlur
- HistogramMatching
- HueSaturationValue
- ISONoise
- ImageCompression
- InvertImg
- MedianBlur
- MotionBlur
- MultiplicativeNoise
- Normalize
- Posterize
- RGBShift
- RandomBrightnessContrast
- RandomFog
- RandomGamma
- RandomRain
- RandomShadow
- RandomSnow
- RandomSunFlare
- RandomToneCurve
- Sharpen
- Solarize
- Superpixels
- ToFloat
- ToGray
- ToSepia
Simple use cases
import albumentations as A
import cv2
import matplotlib.pyplot as plt
# Declare an augmentation pipeline
transform = A.Compose([
    A.RandomCrop(width=512, height=512),
    A.HorizontalFlip(p=0.8),
    A.RandomBrightnessContrast(p=0.5),
])
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = transform(image=image)
transformed_image = transformed["image"]
plt.imshow(transformed_image)
plt.show()
Original image:
Running results:
Detailed Use Cases
VerticalFlip flips the input vertically around the X-axis
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.VerticalFlip(always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after VerticalFlip')
plt.imshow(transformed_image)
plt.show()
Blur blurs the input image using a random-sized kernel
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.Blur(blur_limit=15, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after Blur')
plt.imshow(transformed_image)
plt.show()
HorizontalFlip flips the input horizontally around the Y-axis
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.HorizontalFlip(always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after HorizontalFlip')
plt.imshow(transformed_image)
plt.show()
Running results:
Flip flips the input horizontally, vertically, or both horizontally and vertically
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.Flip(always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after Flip')
plt.imshow(transformed_image)
plt.show()
The result varies from run to run, as shown in the following figure:
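The three possible outcomes of a random flip can be written with NumPy slicing. This is a rough sketch rather than the library's implementation; the function name random_flip and the uniform choice between the three modes are assumptions made for illustration.

```python
import random

import numpy as np


def random_flip(image, rng=random):
    """Flip vertically, horizontally, or both, chosen at random."""
    mode = rng.choice(["vertical", "horizontal", "both"])
    if mode == "vertical":
        return image[::-1], mode          # reverse the row order
    if mode == "horizontal":
        return image[:, ::-1], mode       # reverse the column order
    return image[::-1, ::-1], mode        # reverse rows and columns


image = np.arange(6).reshape(2, 3)
flipped, mode = random_flip(image, random.Random(0))
print(mode)
print(flipped)
```

Passing a seeded random.Random instance makes a run reproducible, which is handy when comparing outputs.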
Transpose transposes the input by swapping rows and columns
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.Transpose(always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after Transpose')
plt.imshow(transformed_image)
plt.show()
Running results:
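For an image stored as a height x width x channels array, swapping rows and columns means swapping the first two axes while leaving the channel axis alone. A tiny NumPy sketch, using a made-up 2x3 image:

```python
import numpy as np

image = np.zeros((2, 3, 3), dtype=np.uint8)   # height 2, width 3, RGB
image[0, 2] = (255, 0, 0)                     # a red pixel at row 0, column 2

# Swap the height and width axes; the channel axis stays last.
transposed = image.transpose(1, 0, 2)

print(transposed.shape)    # (3, 2, 3)
print(transposed[2, 0])    # the red pixel is now at row 2, column 0
```

A non-square image therefore changes shape: its height and width trade places.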
RandomCrop
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.RandomCrop(512, 512, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after RandomCrop')
plt.imshow(transformed_image)
plt.show()
Running results:
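The idea behind a random crop is to pick a random top-left corner such that the crop window still fits inside the image. The sketch below illustrates this with NumPy; random_crop is a hypothetical helper, and raising ValueError for an oversized crop is a choice made for this example.

```python
import random

import numpy as np


def random_crop(image, height, width, rng=random):
    """Return a randomly placed height x width window of the image."""
    img_h, img_w = image.shape[:2]
    if height > img_h or width > img_w:
        raise ValueError("crop size is larger than the image")
    top = rng.randint(0, img_h - height)     # randint includes both ends
    left = rng.randint(0, img_w - width)
    return image[top:top + height, left:left + width]


image = np.zeros((600, 800, 3), dtype=np.uint8)
crop = random_crop(image, 512, 512)
print(crop.shape)    # (512, 512, 3)
```

This also explains why a 512x512 crop needs an input of at least 512 pixels on each side.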
RandomGamma applies random gamma correction
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.RandomGamma(gamma_limit=(20, 20), eps=None, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after RandomGamma')
plt.imshow(transformed_image)
plt.show()
Running results:
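Gamma correction maps each pixel value v to 255 * (v / 255) ** gamma. The sketch below applies that formula with a lookup table. The function apply_gamma is ours, and the reading that gamma_limit is expressed in percent (so that (20, 20) corresponds to gamma = 0.2) is our understanding of the library and should be checked against the installed version.

```python
import numpy as np


def apply_gamma(image, gamma):
    """Gamma-correct a uint8 image: out = 255 * (in / 255) ** gamma."""
    # Precompute the result for all 256 possible input values.
    lut = ((np.arange(256) / 255.0) ** gamma * 255.0).astype(np.uint8)
    return lut[image]


image = np.array([[0, 64, 128, 255]], dtype=np.uint8)
brighter = apply_gamma(image, 0.2)    # gamma < 1 brightens the mid-tones
print(brighter)
```

Black and white stay fixed, while values in between are pushed up for gamma below 1 and down for gamma above 1.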
RandomRotate90 randomly rotates the input by 90 degrees zero or more times
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.RandomRotate90(always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after RandomRotate90')
plt.imshow(transformed_image)
plt.show()
Running results:
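Rotating by a random multiple of 90 degrees can be sketched with np.rot90. The helper random_rotate90 is written for this illustration only; it also returns the number of quarter turns so the result is easy to inspect.

```python
import random

import numpy as np


def random_rotate90(image, rng=random):
    """Rotate the image by k quarter turns, with k drawn from 0..3."""
    k = rng.randint(0, 3)
    return np.rot90(image, k), k


image = np.arange(6).reshape(2, 3)
rotated, k = random_rotate90(image, random.Random(0))
print(k)
print(rotated.shape)    # (2, 3) for an even k, (3, 2) for an odd k
```

Because k can be 0, the output is sometimes identical to the input even when the transform is applied with p=1.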
ShiftScaleRotate randomly translates, scales, and rotates the input
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.1, rotate_limit=45, interpolation=1, border_mode=4, value=None, mask_value=None, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after ShiftScaleRotate')
plt.imshow(transformed_image)
plt.show()
Running results:
CenterCrop crops the center portion of the image
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.CenterCrop(256, 256, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after CenterCrop")
plt.imshow(transformed_image)
plt.show()
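A center crop differs from a random crop only in how the window is placed: the offsets are fixed so that the same margin is left on both sides. A minimal NumPy sketch, with center_crop as a hypothetical helper:

```python
import numpy as np


def center_crop(image, height, width):
    """Return the height x width window in the middle of the image."""
    img_h, img_w = image.shape[:2]
    if height > img_h or width > img_w:
        raise ValueError("crop size is larger than the image")
    top = (img_h - height) // 2
    left = (img_w - width) // 2
    return image[top:top + height, left:left + width]


grid = np.arange(36).reshape(6, 6)
print(center_crop(grid, 2, 2))    # rows 2-3, columns 2-3 of the grid
```

Unlike RandomCrop, the result is the same on every run.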
GridDistortion
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.GridDistortion(num_steps=10, distort_limit=0.3, border_mode=4, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after GridDistortion")
plt.imshow(transformed_image)
plt.show()
Running results:
ElasticTransform applies an elastic deformation to the image
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.ElasticTransform(alpha=5, sigma=50, alpha_affine=50, interpolation=1, border_mode=4, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after ElasticTransform")
plt.imshow(transformed_image)
plt.show()
Running results:
RandomGridShuffle cuts images into grid cells and arranges them randomly
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.RandomGridShuffle(grid=(3, 3), always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after RandomGridShuffle")
plt.imshow(transformed_image)
plt.show()
Running results:
HueSaturationValue randomly changes the hue, saturation, and value of the image
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.HueSaturationValue(hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after HueSaturationValue")
plt.imshow(transformed_image)
plt.show()
Running results:
PadIfNeeded pads the image up to a minimum size
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.PadIfNeeded(min_height=2048, min_width=2048, border_mode=4, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after PadIfNeeded")
plt.imshow(transformed_image)
plt.show()
Running results:
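The padding logic can be sketched with np.pad. This is an approximation written for illustration: pad_if_needed is our own helper, the way the padding is split between the two sides is an assumption, and mode="reflect" is used as the NumPy counterpart of the mirrored border that border_mode=4 (cv2.BORDER_REFLECT_101) produces.

```python
import numpy as np


def pad_if_needed(image, min_height, min_width):
    """Mirror-pad the image so that it is at least min_height x min_width."""
    h, w = image.shape[:2]
    pad_h = max(min_height - h, 0)     # no padding if already large enough
    pad_w = max(min_width - w, 0)
    top, left = pad_h // 2, pad_w // 2
    pad = [(top, pad_h - top), (left, pad_w - left)]
    pad += [(0, 0)] * (image.ndim - 2)  # never pad the channel axis
    return np.pad(image, pad, mode="reflect")


small = np.arange(5 * 7 * 3, dtype=np.uint8).reshape(5, 7, 3)
padded = pad_if_needed(small, 8, 8)
print(padded.shape)                          # (8, 8, 3)
print(pad_if_needed(padded, 4, 4).shape)     # unchanged: (8, 8, 3)
```

An image that already meets the minimum size passes through unchanged, which is where the "if needed" in the name comes from.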
RGBShift randomly shifts the values of each channel of the RGB image
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.RGBShift(r_shift_limit=10, g_shift_limit=20, b_shift_limit=20, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after RGBShift")
plt.imshow(transformed_image)
plt.show()
GaussianBlur blurs the image using a Gaussian filter with a random kernel size
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.GaussianBlur(blur_limit=11, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after GaussianBlur")
plt.imshow(transformed_image)
plt.show()
Running results:
CLAHE applies Contrast Limited Adaptive Histogram Equalization
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.CLAHE(clip_limit=4.0, tile_grid_size=(8, 8), always_apply=False, p=0.5)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after CLAHE")
plt.imshow(transformed_image)
plt.show()
Running results:
ChannelShuffle randomly rearranges the channels of the input RGB image
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.ChannelShuffle(always_apply=False, p=0.5)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after ChannelShuffle")
plt.imshow(transformed_image)
plt.show()
Running results:
InvertImg inverts the colors of the image
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.InvertImg(always_apply=False, p=0.5)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after InvertImg")
plt.imshow(transformed_image)
plt.show()
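For an 8-bit image, inverting the colors amounts to subtracting every pixel value from 255. A short NumPy sketch with a made-up three-pixel image:

```python
import numpy as np

image = np.array([[0, 100, 255]], dtype=np.uint8)

# Black becomes white, white becomes black, and mid-tones are mirrored.
inverted = 255 - image
print(inverted)

# Inverting twice gives back the original image.
print(np.array_equal(255 - inverted, image))    # True
```

Note that the example above uses p=0.5, so in roughly half of the runs the image is returned unchanged.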
Cutout randomly erases square regions of the image
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.Cutout(num_holes=20, max_h_size=20, max_w_size=20, fill_value=0, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after Cutout")
plt.imshow(transformed_image)
plt.show()
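The erasing step can be sketched as filling a number of randomly placed rectangles with a constant value. The function below is a simplified illustration and not the library's code: it always uses the maximum hole size, keeps every hole fully inside the image, and assumes the image is at least as large as one hole.

```python
import random

import numpy as np


def cutout(image, num_holes, max_h_size, max_w_size, fill_value=0, rng=random):
    """Return a copy of the image with num_holes rectangles overwritten."""
    out = image.copy()                 # leave the input untouched
    h, w = out.shape[:2]
    for _ in range(num_holes):
        top = rng.randint(0, h - max_h_size)
        left = rng.randint(0, w - max_w_size)
        out[top:top + max_h_size, left:left + max_w_size] = fill_value
    return out


white = np.full((100, 100, 3), 255, dtype=np.uint8)
erased = cutout(white, num_holes=20, max_h_size=20, max_w_size=20)
print((erased[..., 0] == 0).sum(), "pixels erased")
```

Holes may overlap, so the number of erased pixels is at most num_holes times the hole area.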
RandomFog
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.RandomFog(fog_coef_lower=0.3, fog_coef_upper=1, alpha_coef=0.08, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after RandomFog")
plt.imshow(transformed_image)
plt.show()
Running results:
GridDropout erases regions of the image in a grid pattern
import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.GridDropout(ratio=0.5, unit_size_min=None, unit_size_max=None, holes_number_x=None, holes_number_y=None,
shift_x=0, shift_y=0, always_apply=False, p=0.5)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original') # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after GridDropout")
plt.imshow(transformed_image)
plt.show()
Running results:
Combining transforms (Compose)
Transforms can be used not only on their own but also in combination, through the Compose class, which inherits from BaseCompose. The Compose class takes the following parameters:
- transforms: the transforms to combine, given as a list
- bbox_params: parameters for transforming bounding boxes, of type BboxParams
- keypoint_params: parameters for transforming keypoints, of type KeypointParams
- additional_targets: a dict whose keys are new target names and whose values are the existing target types they should be treated as, for example {'image2': 'image'}
- p: the probability of applying the whole pipeline. The default value is 1.0
from albumentations import (
    Compose, CLAHE, RandomRotate90, Transpose, ShiftScaleRotate,
    Blur, OpticalDistortion, GridDistortion, HueSaturationValue,
)

# `image` is an RGB NumPy array, loaded as in the examples above
image3 = Compose([
    # Contrast Limited Adaptive Histogram Equalization
    CLAHE(),
    # Random rotation by a multiple of 90 degrees
    RandomRotate90(),
    # Transpose
    Transpose(),
    # Random affine transformation
    ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.50, rotate_limit=45, p=0.75),
    # Blur
    Blur(blur_limit=3),
    # Optical distortion
    OpticalDistortion(),
    # Grid distortion
    GridDistortion(),
    # Randomly change the hue, saturation, and value of the image
    HueSaturationValue()
], p=1.0)(image=image)['image']
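The control flow of such a pipeline can be sketched in a few lines of plain Python. This is a toy model and not the Albumentations source: the names Maybe and compose are hypothetical, and strings stand in for images so that the order of application is easy to see.

```python
import random


class Maybe:
    """Wrap a function so that it runs with probability p (hypothetical helper)."""

    def __init__(self, fn, p=0.5):
        self.fn = fn
        self.p = p

    def __call__(self, x, rng=random):
        return self.fn(x) if rng.random() < self.p else x


def compose(transforms, p=1.0):
    """Build a pipeline that applies the transforms one after another."""
    def run(x, rng=random):
        if rng.random() >= p:
            return x                 # the whole pipeline is skipped
        for t in transforms:         # otherwise each step runs in list order
            x = t(x, rng)
        return x
    return run


pipeline = compose([
    Maybe(lambda s: s + "A", p=1.0),
    Maybe(lambda s: s + "B", p=0.0),   # never fires
    Maybe(lambda s: s + "C", p=1.0),
])
print(pipeline(""))    # AC
```

Each step decides independently whether it fires, so with intermediate probabilities a single input can come out in many different combinations.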
Random selection (OneOf)
OneOf is a combination of transforms, like Compose, and it also has a probability parameter. The difference is that Compose applies its transforms one after another in list order, whereas OneOf picks exactly one of its transforms and applies only that one. The parameter p of OneOf is the probability that the block is applied at all; the p values of the transforms inside it act as relative weights for choosing which one is applied. For example:
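# Note: the IAA* transforms rely on imgaug and are only available in older Albumentations releases
from albumentations import (
    Compose, OneOf, RandomRotate90, Flip, Transpose, GaussNoise,
    MotionBlur, MedianBlur, Blur, ShiftScaleRotate, OpticalDistortion,
    GridDistortion, CLAHE, RandomBrightnessContrast, HueSaturationValue,
    IAAAdditiveGaussianNoise, IAAPiecewiseAffine, IAASharpen, IAAEmboss,
)

image4 = Compose([
    RandomRotate90(),
    # Flip
    Flip(),
    Transpose(),
    OneOf([
        # Gaussian noise
        IAAAdditiveGaussianNoise(),
        GaussNoise(),
    ], p=0.2),
    OneOf([
        # Blur-related operations
        MotionBlur(p=0.2),
        MedianBlur(blur_limit=3, p=0.1),
        Blur(blur_limit=3, p=0.1),
    ], p=0.2),
    ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.2, rotate_limit=45, p=0.2),
    OneOf([
        # Distortion-related operations
        OpticalDistortion(p=0.3),
        GridDistortion(p=0.1),
        IAAPiecewiseAffine(p=0.3),
    ], p=0.2),
    OneOf([
        # Sharpening, embossing, and other contrast operations
        CLAHE(clip_limit=2),
        IAASharpen(),
        IAAEmboss(),
        RandomBrightnessContrast(),
    ], p=0.3),
    HueSaturationValue(p=0.3),
], p=1.0)(image=image)['image']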
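How one transform is selected inside a OneOf block can be modelled as a weighted draw. The sketch below is a toy model in plain Python. The function one_of is hypothetical, and the normalisation of the inner p values into selection weights reflects our understanding of the library rather than a quote from its source.

```python
import random


def one_of(transforms, weights, p=0.5):
    """With probability p, run exactly one transform, picked in proportion to its weight."""
    total = sum(weights)
    probs = [w / total for w in weights]    # normalise so the weights sum to 1

    def run(x, rng=random):
        if rng.random() >= p:
            return x                        # the block is skipped entirely
        chosen = rng.choices(transforms, weights=probs, k=1)[0]
        return chosen(x)

    return run


# Mirrors the blur block above: inner p values 0.2, 0.1 and 0.1.
blur_block = one_of(
    [lambda s: s + "[motion]", lambda s: s + "[median]", lambda s: s + "[box]"],
    weights=[0.2, 0.1, 0.1],
    p=1.0,
)
print(blur_block("img"))
```

Under this model the weights 0.2, 0.1 and 0.1 become selection probabilities of 50%, 25% and 25%, so motion blur is chosen about half of the time the block fires.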
Use in programs
import albumentations as A
from albumentations.pytorch import ToTensorV2

# CFG is the project's own configuration object, defined elsewhere; CFG.img_size is the target image size
def get_transform(phase: str):
    if phase == 'train':
        return A.Compose([
            A.RandomResizedCrop(height=CFG.img_size, width=CFG.img_size),
            A.Flip(p=0.5),
            A.RandomRotate90(p=0.5),
            A.ShiftScaleRotate(p=0.5),
            A.HueSaturationValue(p=0.5),
            A.OneOf([
                A.RandomBrightnessContrast(p=0.5),
                A.RandomGamma(p=0.5),
            ], p=0.5),
            A.OneOf([
                A.Blur(p=0.1),
                A.GaussianBlur(p=0.1),
                A.MotionBlur(p=0.1),
            ], p=0.1),
            A.OneOf([
                A.GaussNoise(p=0.1),
                A.ISONoise(p=0.1),
                A.GridDropout(ratio=0.5, p=0.2),
                A.CoarseDropout(max_holes=16, min_holes=8, max_height=16, max_width=16, min_height=8, min_width=8, p=0.2)
            ], p=0.2),
            # Normalize with the ImageNet mean and standard deviation
            A.Normalize(
                mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225],
            ),
            ToTensorV2(),
        ])
    else:
        # Validation and test: only resize and normalize, no random augmentation
        return A.Compose([
            A.Resize(height=CFG.img_size, width=CFG.img_size),
            A.Normalize(
                mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225],
            ),
            ToTensorV2(),
        ])