Background

Recently, I was working on a project involving object recognition. Since our team's tech stack leans toward JavaScript, once the recognition server was built with Python and TensorFlow, we wanted to keep the team's maintenance cost down by delegating as much work as possible, other than training and recognition, to Node.js. Image preprocessing, the topic of this article, is one of those tasks.

Here is a brief explanation of a few concepts for those of you who are not familiar with deep learning.

  1. Object recognition: Object recognition can be understood as a computer finding one or more specified objects in a picture, such as finding all the dogs in it.
  2. Training: Learning to recognize objects is like learning to speak: it takes practice, a process called “training” in deep learning.
  3. Training set: To learn to speak, humans need to watch what others say and listen to their voices; the information that enables this learning is called the training set in deep learning. For object recognition, the training set consists simply of pictures.

The purpose of image preprocessing is to solve the problem of an insufficient training set, which arises when object recognition is applied to a specialized domain. If you are recognizing dogs, there are plenty of pictures available, and models that others have trained are ready to serve. If you are recognizing the T-shirts worn by your team, the data is scarce: even photographing 100 of them yields a very small dataset. Mature AI services on the Internet use training sets measured in the tens of thousands of images, or even billions. Demand in a specialized domain is usually simpler, with few categories to recognize and obvious features, but a larger training set is still better. In that case you can process the existing pictures to generate new ones and expand the current training set; this process is called image preprocessing.

Common image preprocessing methods are as follows:

  1. Rotation. Since the rotation angle can be any value, some random angles are generated for rotation, which is also called random rotation.
  2. Flip. This is equivalent to putting a mirror next to the picture: the new picture is the one in the mirror. There are generally two kinds, horizontal flip and vertical flip.
  3. Adjust brightness. Adjusting the brightness of your phone screen shows what this means.
  4. Adjust saturation. Older TVs let you adjust this: the higher the saturation, the more vivid the colors; the lower, the cooler and more muted the image looks.
  5. Adjust hue. This is equivalent to changing the color of the whole picture; just think of an old TV with a green tint.
  6. Adjust contrast. This makes the bright parts of the picture brighter and the dark parts darker. You can also think of the contrast adjustment on a television; it is fair to say television inspired these terms.

Each of the above operations should be chosen according to the situation; these are the main methods our team currently uses. There are also whitening, gamma correction, and other less intuitive operations that interested readers can look into themselves.

Installing gm

gm is a Node.js image-processing library published on npm that uses GraphicsMagick by default, so you need to install GraphicsMagick itself first. On a Mac:

brew install graphicsmagick

For installation on other systems, see the official website.

If you need to add text to images, you will also need to install Ghostscript (on a Mac: brew install ghostscript). Since this feature is not covered in this article, you can skip it.

Also, you need to install gm in your project:

npm i gm -S

Preprocessing

For the sake of intuition, I chose an image as the preprocessing object:

In addition, the function names in this article's sample code follow the same-named methods in TensorFlow's image module. For more image-processing methods, you can browse the TensorFlow documentation and then look for gm methods with the same function in the official gm documentation.

Flip

A vertical flip (top to bottom) uses gm's .flip method:

import gm from 'gm';

/**
 * Flip vertically
 * @param inputPath input image file path
 * @param outputPath output image file path
 * @param callback callback invoked after processing
 */
function flip(inputPath, outputPath, callback) {
    gm(inputPath)
        .flip()
        .write(outputPath, callback);
}

The flip effect is as follows:

A horizontal flip (left to right) uses gm's .flop method:

import gm from 'gm';

/**
 * Flip horizontally
 * @param inputPath input image file path
 * @param outputPath output image file path
 * @param callback callback invoked after processing
 */
function flop(inputPath, outputPath, callback) {
    gm(inputPath)
        .flop()
        .write(outputPath, callback);
}

The flop effect is as follows:

You can also combine .flip and .flop to flip along both axes at once:
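As a minimal sketch, the combined flip can be chained on the handle returned by gm(); the function name diagonalFlip is my own:

```javascript
/**
 * Flip an image both vertically and horizontally, which amounts to a
 * 180-degree rotation. `img` is the chainable handle returned by gm(inputPath).
 */
function diagonalFlip(img) {
    return img.flip().flop();
}

// Hypothetical usage, following the pattern of the examples above:
// diagonalFlip(gm(inputPath)).write(outputPath, callback);
```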

If the original picture is regarded as a front-end component, say a shopping button group in which each button's background can be customized and each button consists of text, a dividing line, and more text, then the flipped pictures above can be regarded as the same component, which means they can be used in the training set.

Sometimes, a flip is not the desired effect: after flipping, the image should no longer be regarded as the same thing. In such cases this method has its limitations.

Adjust the brightness

Compared with flipping, adjusting the brightness is more universal: no matter what the image is, after adjusting the brightness the object in it is still the same object.

Adjust brightness using gm's .modulate method:

/**
 * Adjust brightness
 * @param inputPath input image file path
 * @param outputPath output image file path
 * @param brightness brightness value; the reference value is 100. Anything above 100 increases brightness and anything below 100 reduces it
 * @param callback callback invoked after processing
 */
function adjustBrightness(inputPath, outputPath, brightness, callback) {
    gm(inputPath)
        .modulate(brightness, 100, 100)
        .write(outputPath, callback);
}

The .modulate method is a multi-purpose method that can adjust the brightness, saturation, and hue of an image at the same time; the three values correspond to .modulate's three parameters in that order. Passing the reference value 100 leaves that property unchanged.

I generated images with brightness ranging from 0 to 200, compared them, and selected a suitable range for brightness processing. Look at the difference between adjacent images with a brightness difference of 10 (hint: the brightness of each image is indicated in its upper left corner):

You can see that any image below 60 is too dark and loses detail, and any image above 150 is too bright and loses detail. After comparing multiple images, I think the quality in [60, 140] is relatively good and does not lose too much detail compared with the original picture.

Then look at the two pictures with brightness 50 and 60: they practically look like one picture, which does not satisfy the diversity principle for a training set, to say nothing of adjacent pictures with a brightness difference of 1. Therefore the brightness difference between two adjacent training-set pictures was set to 20, which makes the difference obvious, as between the pictures with brightness 80 and 100.

In the end, adjusting the brightness yields four new images: starting from brightness 60, every increase of 20 is selected and added to the training set, up to brightness 140, with brightness 100 (the original) not counted.
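The selection rule can be sketched as a small helper; wiring it to the adjustBrightness function above, and the output file names, are my own assumptions:

```javascript
// Brightness values chosen for the training set: 60 to 140 in steps of 20,
// skipping 100 because 100 is the unmodified original.
function brightnessValues() {
    const values = [];
    for (let b = 60; b <= 140; b += 20) {
        if (b !== 100) values.push(b);
    }
    return values; // → [60, 80, 120, 140]
}

// Hypothetical usage with adjustBrightness from earlier:
// brightnessValues().forEach(b =>
//     adjustBrightness('input.png', `brightness-${b}.png`, b, callback));
```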

Adjusting saturation

The .modulate method is also used to adjust saturation, this time through the second parameter:

/**
 * Adjust saturation
 * @param inputPath input image file path
 * @param outputPath output image file path
 * @param saturation saturation value; the reference value is 100. Anything above 100 increases saturation and anything below 100 reduces it
 * @param callback callback invoked after processing
 */
function adjustSaturation(inputPath, outputPath, saturation, callback) {
    gm(inputPath)
        .modulate(100, saturation, 100)
        .write(outputPath, callback);
}

The saturation range and the saturation difference between two adjacent training-set images were determined the same way as for brightness. Look at the difference between adjacent images with a saturation difference of 10 (note: the saturation of each image is indicated in its upper left corner):

The pictures generated by adjusting saturation lose no detail, and most can be used in the training set. As with brightness, the difference between two pictures 20 apart in saturation is obvious; in addition, the image barely changes once saturation exceeds 140. So adjusting saturation yields seven new images: starting from saturation 0, every increase of 20 is selected and added to the training set, up to saturation 140, with saturation 100 (the original) not counted.

Adjust the hue

Hue adjustment is the most useful method for this scene and produces the largest share of the training set. Let's look at the difference between adjacent pictures with a hue difference of 10 (hint: the hue of each picture is marked in its upper left corner):

Almost every picture can be used in the training set. Since the hue parameter only ranges from 0 to 200, starting from hue 0, every increase of 10 is selected and added to the training set, up to hue 190, with hue 100 (the original) not counted. This produces 19 new images.

The code for adjusting hue is the same as for brightness and saturation, except the third parameter changes:

/**
 * Adjust hue
 * @param inputPath input image file path
 * @param outputPath output image file path
 * @param hue hue value; the reference value is 100 (no change), and values from 0 to 200 rotate the hue
 * @param callback callback invoked after processing
 */
function adjustHue(inputPath, outputPath, hue, callback) {
    gm(inputPath)
        .modulate(100, 100, hue)
        .write(outputPath, callback);
}

Adjusting the hue is not a panacea; it just happens to suit this scene, and our team's needs resemble it. If you are training an AI to recognize pears, telling it that there is such a thing as a blue pear is obviously inappropriate.

Adjusting contrast

Contrast is adjusted using gm's .contrast method:

/**
 * Adjust contrast
 * @param inputPath input image file path
 * @param outputPath output image file path
 * @param multiplier contrast multiplier; the default value is 0. n increases contrast n times and -n reduces it n times
 * @param callback callback invoked after processing
 */
function adjustContrast(inputPath, outputPath, multiplier, callback) {
    gm(inputPath)
        .contrast(multiplier)
        .write(outputPath, callback);
}

Below are images with contrast multipliers ranging from -10 to 10. You can see that the range with good image quality is [-5, 2]; outside it, images lose detail. The difference between images with adjacent multipliers is also obvious, so each can be used in the training set, giving 7 more pictures.

Conclusion

Using the five methods above, a single image yields 40 additional images, meaning the training set becomes 40 times larger. And that is without combining methods; combined, the multiplier could easily reach the hundreds.
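As a sanity check on that figure, the per-method counts can be tallied directly from the ranges described above; this is my own tally, with the three flipped images (.flip, .flop, and both combined) counted as a fixed 3:

```javascript
// Count how many values a sweep contributes, skipping the value that
// reproduces the original image (100 for .modulate, 0 for .contrast).
function countValues(start, end, step, skip) {
    let n = 0;
    for (let v = start; v <= end; v += step) {
        if (v !== skip) n += 1;
    }
    return n;
}

const total =
    3 +                             // flip, flop, and both combined
    countValues(60, 140, 20, 100) + // brightness
    countValues(0, 140, 20, 100) +  // saturation
    countValues(0, 190, 10, 100) +  // hue
    countValues(-5, 2, 1, 0);       // contrast

console.log(total); // 40
```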

gm also supports other ways to manipulate images, which you can explore on your own. Each has its own limitations in a given scene and must be chosen accordingly. I hope you all end up with a satisfying training set.

Promotion

You are welcome to star the author's GitHub, and to follow the author's official account to get the latest articles: