A summary of using the image augmentation library Albumentations

Abstract

Albumentations is a Python package written specifically for data augmentation. It provides a large collection of augmentation tools.

1. Albumentations supports all common computer vision tasks such as classification, semantic segmentation, instance segmentation, object detection, and pose estimation.

2. The library provides a simple unified API for handling all data types: images (RGB images, grayscale images, multispectral images), segmentation masks, bounding boxes, and keypoints.

3. The library contains over 70 different augmentations to generate new training samples from existing data.

4. Albumentations is fast. Each new release is benchmarked to ensure that the augmentations provide maximum speed.

5. It works with popular deep learning frameworks such as PyTorch and TensorFlow. Albumentations is also part of the PyTorch ecosystem.

6. Written by experts. The authors have experience both in the production of computer vision systems and in competitive machine learning. Many of the core team members are Kaggle Masters and Grandmasters.

7. The library is widely used in industry, deep learning research, machine learning competitions and open source projects.

PIP installation for Albumentations

pip install albumentations

The problem

There is a small problem at this step. I already had opencv-python installed on my machine, and when I tried to install albumentations the installation failed with the following error:

Could not install packages due to an EnvironmentError: [WinError 5] Access is denied

The cause is that installing albumentations requires opencv-python-headless, and that package conflicts with the opencv-python package that was already installed.

The solution

The solution is to install albumentations in a new virtual environment. Installing albumentations also installs opencv-python-headless, which can replace opencv-python, so there is no need to install opencv-python separately.
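Before reinstalling, it can help to see which OpenCV packages an environment already contains. The following is a minimal sketch that uses only the standard library; the helper name `installed_opencv_distributions` is made up for this illustration.

```python
from importlib import metadata


def installed_opencv_distributions():
    """Return the sorted names of installed distributions starting with 'opencv'."""
    names = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name and name.lower().startswith("opencv"):
            names.add(name.lower())
    return sorted(names)


if __name__ == "__main__":
    found = installed_opencv_distributions()
    print("OpenCV distributions found:", found or "none")
    if "opencv-python" in found and "opencv-python-headless" in found:
        print("Both variants are installed; they ship the same cv2 module.")
```

Running this inside the new virtual environment after installation should list only the headless variant if the setup described above worked.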

Benchmark results

The benchmark was run on the first 2000 images of the ImageNet validation set, using an Intel Xeon Gold 6140 CPU. All outputs are converted to a contiguous NumPy array with the np.uint8 data type. The table shows the number of images that can be processed per second on a single core; higher is better.
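The "images per second" metric can be reproduced in spirit with a few lines of NumPy. This is only a sketch of the measurement idea, not the library's benchmark script: the function `images_per_second`, the random stand-in images, and the simple flip used as the transform are all assumptions made for illustration.

```python
import time

import numpy as np


def images_per_second(transform, images):
    """Apply `transform` to every image once and return the throughput."""
    start = time.perf_counter()
    for img in images:
        out = transform(img)
        # Follow the benchmark convention: contiguous uint8 output.
        out = np.ascontiguousarray(out, dtype=np.uint8)
    elapsed = time.perf_counter() - start
    return len(images) / elapsed


rng = np.random.default_rng(0)
# Stand-in data: 50 random 256x256 RGB images instead of ImageNet files.
images = [rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8) for _ in range(50)]
rate = images_per_second(lambda img: img[:, ::-1], images)
print(f"{rate:.0f} images/s")
```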

Python and library versions: Python 3.8.6 (default, Oct 13 2020, 20:37:26) [GCC 8.3.0], numpy 1.19.2, pillow 7.0.0.post3, opencv-python 4.4.0.44, scikit-image 0.17.2, scipy 1.5.2.

Spatial level transforms

Spatial level transformations will change both the input image and additional targets, such as masks, bounding boxes, and key points. The following table shows what additional targets each transformation supports.
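To see why a spatial transform has to touch the additional targets as well, here is a small NumPy sketch of a horizontal flip that keeps an image, its mask, and one bounding box consistent. The helper `hflip_with_targets` and the `(x_min, y_min, x_max, y_max)` pixel box format are assumptions for this example, not library code.

```python
import numpy as np


def hflip_with_targets(image, mask, bbox):
    """Flip image and mask left-right and mirror an (x_min, y_min, x_max, y_max) box."""
    width = image.shape[1]
    x_min, y_min, x_max, y_max = bbox
    flipped_bbox = (width - x_max, y_min, width - x_min, y_max)
    return image[:, ::-1], mask[:, ::-1], flipped_bbox


image = np.zeros((4, 6, 3), dtype=np.uint8)
mask = np.zeros((4, 6), dtype=np.uint8)
mask[1:3, 0:2] = 1          # the object occupies columns 0-1
bbox = (0, 1, 2, 3)         # pixel coordinates of the same object

new_image, new_mask, new_bbox = hflip_with_targets(image, mask, bbox)
print(new_bbox)             # (4, 1, 6, 3)
```

If only the image were flipped, the mask and box would still point at the left side of the picture while the object has moved to the right.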


List of supported pixel-level transforms:

Blur
CLAHE
ChannelDropout
ChannelShuffle
ColorJitter
Downscale
Emboss
Equalize
FDA
FancyPCA
FromFloat
GaussNoise
GaussianBlur
GlassBlur
HistogramMatching
HueSaturationValue
ISONoise
ImageCompression
InvertImg
MedianBlur
MotionBlur
MultiplicativeNoise
Normalize
Posterize
RGBShift
RandomBrightnessContrast
RandomFog
RandomGamma
RandomRain
RandomShadow
RandomSnow
RandomSunFlare
RandomToneCurve
Sharpen
Solarize
Superpixels
ToFloat
ToGray
ToSepia

Simple use cases

import albumentations as A
import cv2
 
import matplotlib.pyplot as plt
 
# Declare an augmentation pipeline
transform = A.Compose([
    A.RandomCrop(width=512, height=512),
    A.HorizontalFlip(p=0.8),
    A.RandomBrightnessContrast(p=0.5),
])

# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
 
# Augment an image
transformed = transform(image=image)
transformed_image = transformed["image"]
plt.imshow(transformed_image)
plt.show()

Original image:

Running results:

Detailed Use Cases

VerticalFlip flips the input vertically around the X-axis

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.VerticalFlip(always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')   # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after VerticalFlip')
plt.imshow(transformed_image)
plt.show()

Blur blurs the input image

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.Blur(blur_limit=15,always_apply=False, p=1)(image=image) 
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')   # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after Blur')
plt.imshow(transformed_image)
plt.show()

HorizontalFlip flips the input horizontally around the Y-axis

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.HorizontalFlip(always_apply=False, p=1)(image=image) 
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')   # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after HorizontalFlip')
plt.imshow(transformed_image)
plt.show()

Running results:

Flip flips the input horizontally, vertically, or both

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.Flip(always_apply=False, p=1)(image=image) 
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')   # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after Flip')
plt.imshow(transformed_image)
plt.show()

The result is random because the flip direction is chosen at random, as shown in the following figure:

Transpose transposes the input by swapping rows and columns

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.Transpose(always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')   # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after Transpose')
plt.imshow(transformed_image)
plt.show()

Running results:
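The flips and the transpose shown so far correspond to simple array operations. The following NumPy sketch illustrates the index manipulation on a tiny array; it is meant as an explanation of the geometry, not as the library's implementation.

```python
import numpy as np

image = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)  # H=2, W=3, C=3

vertical = image[::-1, :, :]            # rows reversed (flip around the X-axis)
horizontal = image[:, ::-1, :]          # columns reversed (flip around the Y-axis)
transposed = image.transpose(1, 0, 2)   # rows and columns swapped, channels kept

print(image.shape, transposed.shape)    # (2, 3, 3) (3, 2, 3)
```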

RandomCrop crops a random part of the input

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.RandomCrop(512, 512, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')   # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after RandomCrop')
plt.imshow(transformed_image)
plt.show()

Running results:
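A random crop amounts to picking a random top-left corner and slicing. The sketch below assumes a hypothetical helper `random_crop` and a NumPy random generator; it also makes explicit that the crop size must not exceed the image size, which is why a 512x512 crop needs an input of at least 512x512 pixels.

```python
import numpy as np


def random_crop(image, crop_height, crop_width, rng):
    """Return a crop of the requested size at a random position."""
    height, width = image.shape[:2]
    if crop_height > height or crop_width > width:
        raise ValueError("crop size must not exceed the image size")
    top = int(rng.integers(0, height - crop_height + 1))
    left = int(rng.integers(0, width - crop_width + 1))
    return image[top:top + crop_height, left:left + crop_width]


rng = np.random.default_rng(42)
image = np.zeros((600, 800, 3), dtype=np.uint8)
crop = random_crop(image, 512, 512, rng)
print(crop.shape)  # (512, 512, 3)
```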

RandomGamma applies a random gamma (grayscale coefficient) correction

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.RandomGamma(gamma_limit=(20, 20), eps=None, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')   # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after RandomGamma')
plt.imshow(transformed_image)
plt.show()

Running results:
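Gamma correction maps each pixel through a power curve. The sketch below shows the formula on a few sample values; `apply_gamma` is a hypothetical helper. As far as I know, the library reads `gamma_limit` as a percentage, so `(20, 20)` would correspond to a gamma of 0.2, but treat that mapping as an assumption and check it against the version you use.

```python
import numpy as np


def apply_gamma(image, gamma):
    """Gamma-correct a uint8 image: out = 255 * (in / 255) ** gamma."""
    normalized = image.astype(np.float32) / 255.0
    corrected = np.power(normalized, gamma) * 255.0
    return np.clip(np.rint(corrected), 0, 255).astype(np.uint8)


image = np.array([[0, 64, 128, 255]], dtype=np.uint8)
brighter = apply_gamma(image, 0.2)   # gamma < 1 brightens mid-tones
darker = apply_gamma(image, 2.0)     # gamma > 1 darkens mid-tones
print(brighter, darker)
```

Pure black and pure white stay fixed under any gamma; only the tones in between move.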

RandomRotate90 rotates the input by 90 degrees a random number of times

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.RandomRotate90(always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')   # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after RandomRotate90')
plt.imshow(transformed_image)
plt.show()

Running results:
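Rotating by a random multiple of 90 degrees can be sketched with `np.rot90`. This is an illustration of the idea with a NumPy random generator, not the library's code.

```python
import numpy as np

rng = np.random.default_rng(0)
image = np.zeros((2, 3, 3), dtype=np.uint8)   # H=2, W=3

k = int(rng.integers(0, 4))                    # number of quarter turns: 0, 1, 2 or 3
rotated = np.rot90(image, k, axes=(0, 1))      # rotate in the height/width plane

# An odd number of quarter turns swaps height and width.
expected_shape = (3, 2, 3) if k % 2 else (2, 3, 3)
print(k, rotated.shape == expected_shape)
```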

ShiftScaleRotate randomly translates, scales, and rotates the input

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.1, rotate_limit=45, interpolation=1, border_mode=4, value=None, mask_value=None, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')   # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title('Image after ShiftScaleRotate')
plt.imshow(transformed_image)
plt.show()

Running results:

CenterCrop crops the center portion of the image

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.CenterCrop(256, 256, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')  # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after CenterCrop")
plt.imshow(transformed_image)
plt.show()

GridDistortion

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.GridDistortion(num_steps=10, distort_limit=0.3,border_mode=4, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')  # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after GridDistortion")
plt.imshow(transformed_image)
plt.show()

Running results:

ElasticTransform applies an elastic deformation

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.ElasticTransform(alpha=5, sigma=50, alpha_affine=50, interpolation=1, border_mode=4,always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')  # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after ElasticTransform")
plt.imshow(transformed_image)
plt.show()

Running results:

RandomGridShuffle cuts images into grid cells and arranges them randomly

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.RandomGridShuffle(grid=(3, 3), always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')  # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after RandomGridShuffle")
plt.imshow(transformed_image)
plt.show()

Running results:
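The grid shuffle idea can be written out in a few lines: cut the image into tiles, permute them, and paste them back. The helper `grid_shuffle` below is a simplified sketch that assumes the image height and width are divisible by the grid size; the library handles the general case.

```python
import numpy as np


def grid_shuffle(image, grid, rng):
    """Split `image` into grid[0] x grid[1] tiles and place them in random order.

    Assumes height and width are divisible by the grid size.
    """
    rows, cols = grid
    height, width = image.shape[:2]
    tile_h, tile_w = height // rows, width // cols
    tiles = [
        image[r * tile_h:(r + 1) * tile_h, c * tile_w:(c + 1) * tile_w]
        for r in range(rows) for c in range(cols)
    ]
    order = rng.permutation(len(tiles))
    out = np.empty_like(image)
    for dst, src in enumerate(order):
        r, c = divmod(dst, cols)
        out[r * tile_h:(r + 1) * tile_h, c * tile_w:(c + 1) * tile_w] = tiles[src]
    return out


rng = np.random.default_rng(1)
image = np.arange(6 * 6, dtype=np.uint8).reshape(6, 6, 1)
shuffled = grid_shuffle(image, (3, 3), rng)
print(shuffled[:, :, 0])
```

Every pixel value survives the shuffle; only the tile positions change.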

HueSaturationValue randomly changes the hue, saturation, and value of the image

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.HueSaturationValue(hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')  # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after HueSaturationValue")
plt.imshow(transformed_image)
plt.show()

Running results:

PadIfNeeded pads the image up to a minimum size

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.PadIfNeeded(min_height=2048, min_width=2048, border_mode=4, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')  # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after PadIfNeeded")
plt.imshow(transformed_image)
plt.show()

Running results:
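The padding logic can be sketched with `np.pad`. In OpenCV, border mode 4 is `cv2.BORDER_REFLECT_101`, which mirrors the image without repeating the edge pixel; NumPy's `"reflect"` mode behaves the same way. The helper `pad_if_needed` and its symmetric split of the padding are assumptions for this illustration.

```python
import numpy as np


def pad_if_needed(image, min_height, min_width):
    """Pad symmetrically with reflection until the image reaches the minimum size."""
    height, width = image.shape[:2]
    pad_h = max(min_height - height, 0)
    pad_w = max(min_width - width, 0)
    top, left = pad_h // 2, pad_w // 2
    pad_spec = [(top, pad_h - top), (left, pad_w - left)] + [(0, 0)] * (image.ndim - 2)
    return np.pad(image, pad_spec, mode="reflect")


image = np.arange(12, dtype=np.uint8).reshape(3, 4, 1)
padded = pad_if_needed(image, 5, 8)
print(padded.shape)  # (5, 8, 1)
```

An image that is already large enough passes through unchanged, which is the "if needed" part of the name.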

RGBShift randomly shifts the values of each channel of the RGB image

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.RGBShift(r_shift_limit=10, g_shift_limit=20, b_shift_limit=20, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')  # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after RGBShift")
plt.imshow(transformed_image)
plt.show()

GaussianBlur blurs the image using a Gaussian filter with a random kernel size

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.GaussianBlur(blur_limit=11, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')  # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after GaussianBlur")
plt.imshow(transformed_image)
plt.show()

Running results:

CLAHE applies contrast limited adaptive histogram equalization

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.CLAHE(clip_limit=4.0, tile_grid_size=(8, 8), always_apply=False, p=0.5)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')  # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after CLAHE")
plt.imshow(transformed_image)
plt.show()

Running results:

ChannelShuffle randomly rearranges the channels of the input RGB image

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.ChannelShuffle(always_apply=False, p=0.5)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')  # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after ChannelShuffle")
plt.imshow(transformed_image)
plt.show()

Running results:

InvertImg inverts the colors of the input image

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.InvertImg(always_apply=False, p=0.5)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')  # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after InvertImg")
plt.imshow(transformed_image)
plt.show()

Cutout randomly erases rectangular regions of the image

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.Cutout(num_holes=20, max_h_size=20, max_w_size=20, fill_value=0, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')  # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after Cutout")
plt.imshow(transformed_image)
plt.show()
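Random erasing of this kind can be illustrated with plain array assignment. The `cutout` helper below is a hypothetical sketch that reuses the parameter names from the call above; how the library actually places and sizes the holes may differ.

```python
import numpy as np


def cutout(image, num_holes, max_h_size, max_w_size, fill_value, rng):
    """Fill `num_holes` randomly placed rectangles with `fill_value`."""
    out = image.copy()
    height, width = image.shape[:2]
    for _ in range(num_holes):
        # Choose a top-left corner so that the hole fits inside the image.
        y1 = int(rng.integers(0, max(height - max_h_size, 0) + 1))
        x1 = int(rng.integers(0, max(width - max_w_size, 0) + 1))
        out[y1:y1 + max_h_size, x1:x1 + max_w_size] = fill_value
    return out


rng = np.random.default_rng(0)
image = np.full((100, 100, 3), 255, dtype=np.uint8)
erased = cutout(image, num_holes=20, max_h_size=20, max_w_size=20, fill_value=0, rng=rng)
print((erased[..., 0] == 0).mean())  # fraction of erased pixels
```

Holes may overlap, so the erased area is at most `num_holes * max_h_size * max_w_size` pixels.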

RandomFog

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.RandomFog(fog_coef_lower=0.3, fog_coef_upper=1, alpha_coef=0.08, always_apply=False, p=1)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')  # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after RandomFog")
plt.imshow(transformed_image)
plt.show()

Running results:

GridDropout erases regions of the image in a grid pattern

import albumentations as A
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
# Solve the Chinese display problem
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("aa.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Augment an image
transformed = A.GridDropout(ratio=0.5, unit_size_min=None, unit_size_max=None, holes_number_x=None, holes_number_y=None,
                            shift_x=0, shift_y=0, always_apply=False, p=0.5)(image=image)
transformed_image = transformed["image"]
plt.subplot(1, 2, 1)
plt.title('original')  # Title of the first subplot
plt.imshow(image)
plt.subplot(1, 2, 2)
plt.title("Image after GridDropout")
plt.imshow(transformed_image)
plt.show()

Running results:

Composition transformation (Compose)

Transformations can be used not only alone, but in combination, using the Compose class, which inherits from BaseCompose. The Compose class contains the following parameters:

transforms: the list of transforms to apply, of type list
bbox_params: parameters for transforming bounding boxes, of type BboxParams
keypoint_params: parameters for transforming keypoints, of type KeypointParams
additional_targets: a dict that maps each new target name to an existing target type, for example {'image2': 'image'}
p: the probability of applying the whole pipeline; the default value is 1.0

from albumentations import (
    Compose, CLAHE, RandomRotate90, Transpose, ShiftScaleRotate,
    Blur, OpticalDistortion, GridDistortion, HueSaturationValue,
)

# `image` is the RGB array loaded in the earlier examples
image3 = Compose([
        # Contrast Limited Adaptive Histogram Equalization
        CLAHE(),
        # Random rotation by 90°
        RandomRotate90(),
        # Transpose
        Transpose(),
        # Random affine transformation
        ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.50, rotate_limit=45, p=.75),
        # Blur
        Blur(blur_limit=3),
        # Optical distortion
        OpticalDistortion(),
        # Grid distortion
        GridDistortion(),
        # Randomly change the hue, saturation, and value of the image
        HueSaturationValue()
    ], p=1.0)(image=image)['image']
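The sequential behavior of a composed pipeline can be modeled in a few lines of plain Python. `MiniCompose`, `add_one`, and `double` are toy names invented for this sketch; it is not how the Albumentations class is implemented, only a model of "apply each transform in list order, with an overall probability p".

```python
import random


class MiniCompose:
    """Toy stand-in for a Compose-style pipeline (not the Albumentations class)."""

    def __init__(self, transforms, p=1.0):
        self.transforms = transforms
        self.p = p

    def __call__(self, **data):
        # With probability 1 - p the whole pipeline is skipped.
        if random.random() >= self.p:
            return data
        for transform in self.transforms:   # applied strictly in list order
            data = transform(**data)
        return data


def add_one(image, **rest):
    return {"image": image + 1, **rest}


def double(image, **rest):
    return {"image": image * 2, **rest}


pipeline = MiniCompose([add_one, double], p=1.0)
print(pipeline(image=3)["image"])   # (3 + 1) * 2 = 8
```

Swapping the two transforms gives 3 * 2 + 1 = 7, which shows that the order of the list matters.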

Random selection (OneOf)

OneOf is a composition like Compose, and it also has a probability. The difference is that Compose applies its transforms one after another in list order, whereas OneOf selects exactly one of its transforms and applies only that one. The parameter p of OneOf is the probability that the block is applied at all; the p values of the child transforms act as relative weights when one of them is selected. For example:

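from albumentations import (
    Compose, OneOf, RandomRotate90, Flip, Transpose, GaussNoise, MotionBlur,
    MedianBlur, Blur, ShiftScaleRotate, OpticalDistortion, GridDistortion,
    CLAHE, RandomBrightnessContrast, HueSaturationValue,
    # The IAA* transforms wrap imgaug and are deprecated in newer releases
    IAAAdditiveGaussianNoise, IAAPiecewiseAffine, IAASharpen, IAAEmboss,
)

image4 = Compose([
        RandomRotate90(),
        # Flip
        Flip(),
        Transpose(),
        OneOf([
            # Gaussian noise
            IAAAdditiveGaussianNoise(),
            GaussNoise(),
        ], p=0.2),
        OneOf([
            # Blur-related operations
            MotionBlur(p=.2),
            MedianBlur(blur_limit=3, p=0.1),
            Blur(blur_limit=3, p=0.1),
        ], p=0.2),
        ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.2, rotate_limit=45, p=0.2),
        OneOf([
            # Distortion-related operations
            OpticalDistortion(p=0.3),
            GridDistortion(p=.1),
            IAAPiecewiseAffine(p=0.3),
        ], p=0.2),
        OneOf([
            # Sharpening, embossing, and contrast operations
            CLAHE(clip_limit=2),
            IAASharpen(),
            IAAEmboss(),
            RandomBrightnessContrast(),
        ], p=0.3),
        HueSaturationValue(p=0.3),
    ], p=1.0)(image=image)['image']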
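The selection step of a OneOf-style block can be modeled with `random.choices`. `MiniOneOf` is a toy class written for this explanation, with the child weights passed explicitly; it sketches the "apply the block with probability p, then pick exactly one child by weight" idea and is not the library's code.

```python
import random


class MiniOneOf:
    """Toy stand-in for a OneOf-style block (not the Albumentations class)."""

    def __init__(self, transforms, weights, p=0.5):
        self.transforms = transforms
        total = sum(weights)
        self.weights = [w / total for w in weights]   # normalized selection weights
        self.p = p

    def __call__(self, value):
        # With probability 1 - p nothing in the block runs.
        if random.random() >= self.p:
            return value
        chosen = random.choices(self.transforms, weights=self.weights, k=1)[0]
        return chosen(value)


block = MiniOneOf([lambda v: v + 1, lambda v: v * 10], weights=[0.2, 0.1], p=1.0)
result = block(5)
print(result)   # either 6 or 50, never both transforms applied
```

With weights 0.2 and 0.1, the first child is picked about twice as often as the second.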

Use in programs

import albumentations as A
from albumentations.pytorch import ToTensorV2

# CFG is the project's own configuration object; only CFG.img_size is used here.


def get_transform(phase: str):
    if phase == 'train':
        return A.Compose([
            A.RandomResizedCrop(height=CFG.img_size, width=CFG.img_size),
            A.Flip(p=0.5),
            A.RandomRotate90(p=0.5),
            A.ShiftScaleRotate(p=0.5),
            A.HueSaturationValue(p=0.5),
            A.OneOf([
                A.RandomBrightnessContrast(p=0.5),
                A.RandomGamma(p=0.5),
            ], p=0.5),
            A.OneOf([
                A.Blur(p=0.1),
                A.GaussianBlur(p=0.1),
                A.MotionBlur(p=0.1),
            ], p=0.1),
            A.OneOf([
                A.GaussNoise(p=0.1),
                A.ISONoise(p=0.1),
                A.GridDropout(ratio=0.5, p=0.2),
                A.CoarseDropout(max_holes=16, min_holes=8, max_height=16, max_width=16, min_height=8, min_width=8, p=0.2)
            ], p=0.2),
            # ImageNet channel means and standard deviations
            A.Normalize(
                mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225],
            ),
            ToTensorV2(),
        ])
    else:
        # Validation/test: deterministic resize and normalization only
        return A.Compose([
            A.Resize(height=CFG.img_size, width=CFG.img_size),
            A.Normalize(
                mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225],
            ),
            ToTensorV2(),
        ])
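Both branches end with normalization followed by a conversion to a channels-first tensor. What these two final steps compute can be sketched in NumPy; `normalize_to_chw` is a hypothetical helper, and it stops at a NumPy array instead of producing a PyTorch tensor.

```python
import numpy as np

mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalize_to_chw(image_uint8):
    """Scale to [0, 1], standardize per channel, then move channels first."""
    scaled = image_uint8.astype(np.float32) / 255.0
    standardized = (scaled - mean) / std          # broadcasts over the last axis
    return standardized.transpose(2, 0, 1)        # HWC -> CHW


image = np.full((4, 5, 3), 255, dtype=np.uint8)
chw = normalize_to_chw(image)
print(chw.shape)  # (3, 4, 5)
```

The same mean and std must be used for training and evaluation, which is why both branches of the function repeat them.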
