A priori and a posteriori

Everyday experiential knowledge can be divided into two kinds: a priori and a posteriori.

A posteriori knowledge usually requires direct experience to acquire: for example, how difficult the first-year exam of a course is, or whether this new cup of milk tea tastes good.

A priori knowledge is established before the experience itself, from accumulated experience or logical argument. For example, we know boiling water scalds, so we do not drink or touch it directly. To recast the earlier a posteriori example as an a priori one: if we have a course's exam papers from recent years, then we already have a rough idea of the exam's difficulty in advance.

Note that neither a priori nor a posteriori knowledge is necessarily correct: even with prior or posterior evidence as a basis, wrong results or conclusions are still possible. Both have their own weaknesses. This article will not dwell on that point, but interested readers can look into it.

How the demand for image style transfer arose

Different images have a “style”: for a painting, this means the painting style; a photograph likewise has its own photographic style. With the proliferation of mobile shooting devices (phones), the demands placed on image processing keep growing, such as the common filters, beautification, and texture effects, all of which are essentially computations performed on images.

In recent years, the emergence of all kinds of video and camera apps has produced a variety of new demands, and image style transfer is one of the most interesting.

For example, in the figure below, (c) retains the outline of (a), and the content at corresponding positions is almost identical, but the global attributes of the image (such as color and texture) have been changed to those of (b): with the Mona Lisa providing the style, the real panda in (a) is transformed into the painted panda in (c). This kind of processing, where the image style (including texture and color tone) changes while the basic shapes and contours are preserved, is known as image style transfer.

Of course, we can also reverse the process, using image (b) above as the image to be converted and (a) as the style reference. The result is just as interesting, as shown in the figure below.

There are many ways to implement image style transfer. The current mainstream approaches are based on deep neural networks; generative adversarial networks (GANs), for example, have many applications in style transfer. In essence, the datasets and trained parameters used in deep learning are also an application of priors, just not an especially explicit or elegant one.

The algorithm introduced in this article performs a fixed style transformation; it is a traditional algorithm that does not rely on deep learning.

Problems and analysis of converting images to a hand-drawn style

As for hand-drawn style transfer, I believe some readers have already used it in camera apps; the hand-drawn look is a common style in image conversion.

But behind the seemingly simple effect lie quite a few troublesome problems. To illustrate them better, I found a hand-drawn image recently posted by a classmate in my major (the blogger happens to be studying computer science + art design).

Observation makes it easy to see that the line draft and the local tonal texture are the prominent features of a hand-drawn sketch. A hand-drawn image can be roughly decomposed into texture and line draft, which correspond to color (texture) and shape (lines) in visual cognition, as shown in the figure below.

(a) is an actual hand-drawn sketch, and (b) is a quickly scribbled line draft, drawn by me; it should give you a sense of what a line draft is.

So “how to convert an image into a hand-drawn style” can be made more concrete: how to transform the colors of the image into hand-drawn texture, and how to transform the edges of the image into hand-drawn lines. Here we have used problem transformation: through analysis of the actual problem, an abstract, general question (how to transfer a style) is converted into more implementable ones (how to convert A into B, and C into D); solving the transformed problems solves the original one. This is a very common method in research and problem solving.

Line art

When drawing by hand, artists often render one curve as multiple short, repeated strokes. As a result, the lines not only cross each other; a curve that looks like a smooth arc in a hand-drawn image is actually the result of several short, broken strokes drawn over one another, as shown in the figure below.

Thus the first problem is how to extract hand-drawn-style lines. An easy solution is to use the gradient information of the image, taking the image edges as the lines of the sketch. This naturally works, but it also has problems.

In a sketch, strokes have attributes such as thickness, wiggliness, and brightness. Lines usually end at points of high curvature or at intersections, and continuous long curves are almost absent.

However, if the gradient information of the image is directly used as the line draft, we obviously cannot obtain the crossing and broken strokes of hand drawing. The first question is therefore refined: is there a method that converts the image's gradient information into hand-drawn-style lines?

The answer is yes. In fact, the method is very similar to the non-maximum suppression used in the Canny operator; if you understand non-maximum suppression in Canny, the following operations are easy to follow.

First, the gradient map of the image is extracted, as described by the formula below:

$$G = \sqrt{(\partial_x I)^2 + (\partial_y I)^2}$$

where $I$ is the grayscale image and $\partial_x I$, $\partial_y I$ are the gradients of the image in the two axis directions. The gradient map obtained this way is usually noisy and does not contain the continuous edges needed for line generation.
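As a minimal sketch of this step (using MATLAB's built-in peppers.png as a stand-in test image): note that the implementation later in this article approximates the magnitude with $|\partial_x I| + |\partial_y I|$ rather than the square root of squares, and the snippet below does the same.

% Minimal sketch: gradient map of a grayscale image in [0, 1].
% Uses forward differences padded with zeros, like GenStroke below.
im = im2double(rgb2gray(imread('peppers.png')));   % any test image will do
[H, W] = size(im);
imX = [abs(im(:, 1:end-1) - im(:, 2:end)), zeros(H, 1)];   % horizontal differences
imY = [abs(im(1:end-1, :) - im(2:end, :)); zeros(1, W)];   % vertical differences
G = imX + imY;    % L1 approximation of the gradient magnitude
figure, imshow(G)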

The second step, line direction estimation, is very similar to non-maximum suppression. Eight reference directions, spaced $180°/8 = 22.5°$ apart (as in the code below), are selected as line segments, and a response map is computed for each direction:

$$G_i = \mathcal{L}_i \ast G, \qquad i = 1, \dots, 8$$

where $\mathcal{L}_i$ is the line segment along the $i$-th direction, expressed as a convolution kernel; the length of the segment is 1/30 of the width (or height) of the input image. The $\ast$ symbol denotes convolution, which computes the gradient projection along each direction and builds the filter response maps.
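Continuing the sketch above, the eight directional kernels can be built by rotating a horizontal line segment, mirroring what GenStroke does below (ks here is an assumed kernel half-size, not a value fixed by this step of the paper):

% Sketch: build 8 rotated line kernels and a response map per direction.
ks = 8; dirNum = 8;
kerRef = zeros(2*ks + 1);
kerRef(ks + 1, :) = 1;                        % horizontal line segment
response = zeros(H, W, dirNum);
for n = 1 : dirNum
    ker = imrotate(kerRef, (n-1)*180/dirNum, 'bilinear', 'crop');
    response(:,:,n) = conv2(G, ker, 'same');  % projection along direction n
end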

The third step is classification: each gradient pixel is assigned to the direction of its maximum projection. It should be noted that the mathematical expression in the original paper (see Eq. (3); the link is placed at the end of this article) reads

$$C_i(p) = \begin{cases} G(p), & \text{if } \arg\min_j \{G_j(p)\} = i \\ 0, & \text{otherwise} \end{cases}$$

Since we actually want the maximum projection, the $\arg\min$ here should be $\arg\max$; this appears to be an oversight by the authors. The correct statement is:

$$C_i(p) = \begin{cases} G(p), & \text{if } \arg\max_j \{G_j(p)\} = i \\ 0, & \text{otherwise} \end{cases}$$

where $p$ is the pixel index and $G_i$ is the response (amplitude) map along direction $i$, shown in the figure below; the maps satisfy $\sum_i C_i(p) = G(p)$. The classification step improves noise resistance and gives the method a degree of robustness.
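In code, the corrected argmax classification is one max over the third dimension (again mirroring GenStroke below):

% Sketch: assign each pixel's gradient to its maximum-response direction.
[~, index] = max(response, [], 3);   % argmax over the 8 directions
C = zeros(H, W, dirNum);
for n = 1 : dirNum
    C(:,:,n) = G .* (index == n);    % summing over n recovers G
end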

The last step of line draft generation, line generation, is expressed as:

$$S' = \sum_{i=1}^{8} \mathcal{L}_i \ast C_i$$

Convolution smoothing along each given direction links the disconnected edge pixels of the original gradient map. After inverting $S'$ and normalizing it to $[0, 1]$, we obtain the output line map $S$.
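A sketch of this final step, completing the chain started above (the same logic closes GenStroke below):

% Sketch: smooth each direction map along its own direction, sum,
% then invert and normalize to get the line map S.
Spn = zeros(H, W, dirNum);
for n = 1 : dirNum
    ker = imrotate(kerRef, (n-1)*180/dirNum, 'bilinear', 'crop');
    Spn(:,:,n) = conv2(C(:,:,n), ker, 'same');       % L_i * C_i
end
Sp = sum(Spn, 3);
Sp = (Sp - min(Sp(:))) / (max(Sp(:)) - min(Sp(:)));  % normalize to [0, 1]
S = 1 - Sp;                                          % invert: dark lines on white
figure, imshow(S)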

We can compare the results of line drawings obtained by this method (see figure (c) below).

On the other hand, the second and third steps, which closely resemble non-maximum suppression, also suppress the noise of the image texture well, as shown in figure (c) below.

Tone transfer with local texture rendering

In the problems-and-analysis section we divided hand-drawn style conversion into two problems. The line problem has been solved above, so the other problem remains: how do we reproduce the hand-drawn texture? In solving it, we finally get to use the priors in the problem. I almost forgot: this article is, after all, about the wisdom of priors.

Shading, that is, dense strokes, is used to depict shadows and darker objects.

In fact, it’s this pencil texture.

By collecting histogram statistics of natural images and hand-drawn images, the statistical regularity shown in the figure below can be obtained: (c) shows the pixel histogram of a natural scene, and (d) that of a hand-drawn sketch.

For real images, the histogram usually shows no obvious regularity, while the histogram of a sketch seems to follow a clear pattern.

The author describes it with the following text: the hue of natural images usually varies significantly, as in histogram (c), while the hue histogram of sketches follows certain patterns, as in histogram (d). This is because sketches have two basic tones: (1) highlight areas and (2) shadow areas. Between the two basic tones lies a transition region, used to enrich the layering of the picture.

Based on this, a parametric model of the tonal distribution is proposed to realize tone transfer.

Model-based Tone Transfer

  • Hypothesis: all hand-drawn sketches essentially conform to the histogram distribution shown in (d).
  • Basic idea: force the histogram of a given image to match a histogram of type (d).
  • The model is a three-segment function fit:
    • In the image, light and dark levels become clearly separated, consistent with hand-drawn sketches, which meets the needs of the style transformation.
    • Because the model is a fitted function curve, pixel-value transitions are smooth.
    • The idea is easy to understand and produces good results.

The mathematical expression of the fitted tone-distribution parametric model is as follows (a concrete numeric sketch follows the three component models below):

$$p(v) = \frac{1}{Z} \sum_{i=1}^{3} \omega_i \, p_i(v)$$

where $v$ denotes the tone value; $p(v)$ is the probability that a pixel has tone value $v$; $Z$ is the factor that normalizes $\int p(v)\,dv = 1$; and the weight $\omega_i$ is related to the number of pixels in each layer. In addition, tone values are normalized so that their dynamic range lies within $[0, 1]$.

As shown in the figure below, (a) is a hand-drawn sketch, and in (b) the author divides (a) into three layers according to pixel value, marked green, orange, and blue respectively (green marking the darkest regions). Based on observation, the author gives the tonal distribution of the three layers.

A guess: the author probably also tried two-layer and four-layer distribution models, but the three-layer one is the easiest to explain and works better.

You may have noticed that (c) in the figure above is actually the layer-separated prior histogram statistics of a hand-drawn sketch, as shown in the figure below.

Specifically, the dark layer (3) and the bright layer (1) have distinct peaks, while the intermediate layer (2) has no single peak, owing to the varied ways graphite interacts with the white paper.

Now look at the shapes of curves (1), (2), and (3): if you had to fit these three curves with the most suitable functions you know, what would you choose?

For (2) and (3) the answer is clear: curve (3) can be fitted with a Gaussian function (a normal distribution), and curve (2), which lies essentially on a horizontal line if we ignore the fluctuations at both ends, can be fitted with the density of a uniform distribution.

So the only question is what function fits curve (1). Readers who majored in mathematics, physics, statistics, or communications probably know the answer: half of a Laplace distribution, that is, a one-sided Laplacian density.

The fitting models for the three curves are given below:

1. First, the one-sided Laplacian model describing the highlight (bright) region:

$$p_1(v) = \begin{cases} \dfrac{1}{\sigma_b}\, e^{-\frac{1-v}{\sigma_b}}, & v \le 1 \\ 0, & \text{otherwise} \end{cases}$$

where $\sigma_b$ is the scale parameter of the Laplace distribution and $v$ is the pixel (tone) value.

2. The uniform distribution model for the texture region between highlights and shadows:

$$p_2(v) = \begin{cases} \dfrac{1}{u_b - u_a}, & u_a \le v \le u_b \\ 0, & \text{otherwise} \end{cases}$$

where $u_a$ and $u_b$ are the lower and upper bounds of the uniform distribution.

3. The Gaussian model describing the shadow (dark) region:

$$p_3(v) = \frac{1}{\sqrt{2\pi}\,\sigma_d}\, e^{-\frac{(v-\mu_d)^2}{2\sigma_d^2}}$$

The formula above has been corrected; the original Eq. (7) of the paper is wrong, placing the scale $\sigma_d$ inside the square root:

$$p_3(v) = \frac{1}{\sqrt{2\pi\sigma_d}}\, e^{-\frac{(v-\mu_d)^2}{2\sigma_d^2}}$$

where $\mu_d$ is the mean pixel value of the shadow region and $\sigma_d$ is the scale parameter.
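To make the three-component model concrete, here is a small sketch that evaluates the target distribution over the 0–255 tone range, using the paper's third parameter group (the same values appear in GenToneMap below). Note it uses the corrected Gaussian normalization, whereas the GenToneMap code keeps the paper's original form:

% Sketch: evaluate p(v) = (1/Z) * sum_i omega_i * p_i(v) on v = 0..255.
Omega  = [76, 22, 2];            % layer weights: bright, middle, dark
sigmaB = 9;                      % Laplacian scale (bright layer)
ua = 105; ub = 225;              % uniform bounds (middle layer)
muD = 90; sigmaD = 11;           % Gaussian mean and scale (dark layer)

v  = 0:255;
p1 = (1/sigmaB) * exp(-(255 - v)/sigmaB);                      % one-sided Laplacian
p2 = double(v >= ua & v <= ub) / (ub - ua);                    % uniform
p3 = exp(-(v - muD).^2 / (2*sigmaD^2)) / (sqrt(2*pi)*sigmaD);  % corrected Gaussian

p = Omega(1)*p1 + Omega(2)*p2 + Omega(3)*p3;
p = p / sum(p);                  % the normalization factor Z
figure, plot(v, p)               % reproduces the shape of histogram (d)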

Parameter Learning

After establishing the three function models, what remains is to estimate their parameters, so that the model fits the histogram prior of hand-drawn images. The concrete steps are as follows:

Apply slight Gaussian smoothing to the input grayscale map $I$. Set the tone thresholds of the layers; the weight $\omega_i$ of each layer depends on its number of pixels. The layer parameters are then computed by maximum likelihood estimation. Denoting the mean and standard deviation of layer $i$ by $m_i$ and $s_i$, the analytical solutions of the parameters of each layer are:

$$\sigma_b = \frac{1}{N} \sum_{v_i \in \Omega_1} (1 - v_i)$$

$$u_a = m_2 - \sqrt{3}\, s_2, \qquad u_b = m_2 + \sqrt{3}\, s_2$$

$$\mu_d = m_3, \qquad \sigma_d = s_3$$

where $v_i$ is a pixel (tone) value, $N$ is the number of pixels in the layer in question, and $\Omega_j$ denotes the pixel set of layer $j$.

Maximum likelihood estimation process

The maximum likelihood estimation process is as follows:

      (1) $\Omega_1$ represents the collection of all pixels in the highlight layer, with likelihood

$$L(\sigma_b) = \prod_{v_i \in \Omega_1} \frac{1}{\sigma_b}\, e^{-\frac{1 - v_i}{\sigma_b}}$$

Let $\ell(\sigma_b) = \ln L(\sigma_b)$ and set

$$\frac{\partial \ell(\sigma_b)}{\partial \sigma_b} = 0$$

which yields

$$\sigma_b = \frac{1}{N} \sum_{v_i \in \Omega_1} (1 - v_i)$$

      (2) $\Omega_2$ represents the set of all pixels in the middle layer; $[u_a, u_b]$ is obtained by interval (moment) estimation:

$$u_a = m_2 - \sqrt{3}\, s_2, \qquad u_b = m_2 + \sqrt{3}\, s_2$$

      (3) $\Omega_3$ represents the collection of all pixels in the shadow layer:

$$\mu_d = \frac{1}{N} \sum_{v_i \in \Omega_3} v_i, \qquad \sigma_d = \sqrt{\frac{1}{N} \sum_{v_i \in \Omega_3} (v_i - \mu_d)^2}$$
Through maximum likelihood estimation, the author gives three different parameter groups and their corresponding results; the differences between them are small, as shown in the figure below.

It should be noted that the learned parameters shown in the figure above also contain an error in (a): the values of $u_a$ and $u_b$ are reversed, so they need to be swapped (every $u_a$ and $u_b$ in the figure above must be exchanged), as shown below.
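Before moving on to texture, here is a small sketch of the estimation itself, assuming the three layers have been separated by fixed tone thresholds; the thresholds (85 and 170 below) are illustrative assumptions of mine, not values from the paper:

% Sketch: estimate the tone-model parameters from a grayscale image im in [0, 1].
v = 255 * im(:);                 % work on the 0..255 tone scale
bright = v(v > 170);             % Omega_1: highlight layer (threshold assumed)
middle = v(v > 85 & v <= 170);   % Omega_2: middle layer (threshold assumed)
dark   = v(v <= 85);             % Omega_3: shadow layer

sigmaB = mean(255 - bright);                  % MLE of the Laplacian scale
ua = mean(middle) - sqrt(3)*std(middle, 1);   % uniform bounds by moment matching
ub = mean(middle) + sqrt(3)*std(middle, 1);
muD    = mean(dark);                          % Gaussian mean
sigmaD = std(dark, 1);                        % Gaussian scale
omega  = [numel(bright), numel(middle), numel(dark)] / numel(v);  % layer weights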

Texture mapping

The rendering of the texture is the final step.

Tonal texture is a pattern of strokes with no obvious direction that represents only tone information. The author learned hand-drawn tone patterns from tone maps, as shown in the picture.

A hand-drawn tone texture is built up by stroking over the same position repeatedly. The author simulates this process with multiple layers of strokes, expressing the output as the exponential combination $H(x)^{\beta(x)}$ (written $P^{\beta}$ in the code): to fit the local tone of the tone map $J$ of the image to be converted, the reference tone texture $H$ is “drawn over” $\beta$ times.

In addition, $\beta$ must be locally smooth, which is enforced by minimizing the following objective:

$$\beta^* = \arg\min_{\beta}\; \left\| \beta \ln H - \ln J \right\|_2^2 + \lambda \left\| \nabla \beta \right\|_2^2$$

where $\lambda$ is 0.2. The equation above can be transformed into a standard linear system and solved with the conjugate gradient method.
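Setting the gradient of the objective to zero gives the normal equations $(\operatorname{diag}(\ln H)^2 + \lambda (D_x D_x^\top + D_y D_y^\top))\,\beta = \operatorname{diag}(\ln H)\,\ln J$. A compact sketch of that system, with the texture map named Hmap to avoid clashing with the image height H (the same construction appears in GenPencil below):

% Sketch: solve for beta with conjugate gradients.
% Hmap: tone texture, J: target tone map, both H-by-W with values in (0, 1].
lambda = 0.2;
n = H * W;
logH = spdiags(log(max(Hmap(:), eps)), 0, n, n);   % diagonal matrix of ln H
logJ = log(max(J(:), eps));
e  = ones(n, 1);
Dx = spdiags([-e, e], [0, H], n, n);               % forward difference in x
Dy = spdiags([-e, e], [0, 1], n, n);               % forward difference in y
A  = lambda * (Dx*Dx' + Dy*Dy') + logH' * logH;
b  = logH' * logJ;
beta = pcg(A, b, 1e-6, 60);                        % conjugate gradient solve
beta = reshape(beta, H, W);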

Finally, the resulting texture map is rendered as:

$$T = H^{\beta^*}$$
Final hand-drawn drawing generation

In the last step, we combine the line map and the texture map: the result is obtained by multiplying the texture map and the line map at corresponding positions, as follows:

$$R = S \cdot T$$
With this, the theoretical part has been covered in full. The flow chart below gathers all the steps to help you organize the ideas:

Actual code

The code for line draft generation is as follows:

% matlab2017a
% GenStroke.m

function S = GenStroke(im, ks, width, dirNum)
% ======================================================
% Compute the line (stroke) structure 'S' of an image
%
% @im:     input image, pixel values in [0, 1]
% @ks:     convolution kernel half-size
% @width:  thickness of the generated strokes
% @dirNum: number of stroke directions (8 in the paper)

    %% Initialization: image size
    [H, W, ~] = size(im);

    %% Smoothing: median filter
    im = medfilt2(im, [3 3]);

    %% Edge detection: image gradient (forward differences)
    imX = [abs(im(:, 1:(end-1)) - im(:, 2:end)), zeros(H, 1)];
    imY = [abs(im(1:(end-1), :) - im(2:end, :)); zeros(1, W)];
    imEdge = imX + imY;

    %% Direction estimation: response maps along dirNum directions
    kerRef = zeros(ks*2 + 1);
    kerRef(ks+1, :) = 1;               % horizontal line-segment kernel
    response = zeros(H, W, dirNum);
    for n = 1 : dirNum
        ker = imrotate(kerRef, (n-1)*180/dirNum, 'bilinear', 'crop');
        response(:,:,n) = conv2(imEdge, ker, 'same');
    end

    %% Classification: the third step of line draft generation
    [~, index] = max(response, [], 3);
    C = zeros(H, W, dirNum);
    for n = 1 : dirNum
        C(:,:,n) = imEdge .* (index == n);
    end

    %% Thicken the kernel to the requested stroke width
    kerRef = zeros(ks*2 + 1);
    kerRef(ks+1, :) = 1;
    for n = 1 : width
        if (ks+1-n) > 0
            kerRef(ks+1-n, :) = 1;
        end
        if (ks+1+n) < (ks*2+1)
            kerRef(ks+1+n, :) = 1;
        end
    end

    %% Line generation: smooth each direction map, then sum
    Spn = zeros(H, W, dirNum);
    for n = 1 : dirNum
        ker = imrotate(kerRef, (n-1)*180/dirNum, 'bilinear', 'crop');
        Spn(:,:,n) = conv2(C(:,:,n), ker, 'same');
    end

    %% Invert and normalize to [0, 1]
    Sp = sum(Spn, 3);
    Sp = (Sp - min(Sp(:))) / (max(Sp(:)) - min(Sp(:)));
    S = 1 - Sp;
end

The code for tone map generation is shown below:

% GenToneMap.m

function J = GenToneMap(im)
% ======================================================
% Compute the tone map 'J'
%
% @im: input image, pixel values in [0, 1]

    %% Model parameters (values from the paper)
    Ub = 225;  Ua = 105;   % bounds of the uniform (middle) layer
    Mud = 90;              % mean of the dark (Gaussian) layer
    DeltaB = 9;            % scale of the bright (Laplacian) layer
    DeltaD = 11;           % scale of the dark layer

    % First group:  Omega1 = 42; Omega2 = 29; Omega3 = 29;
    % Second group: Omega1 = 52; Omega2 = 37; Omega3 = 11;
    % Third group:
    Omega1 = 76; Omega2 = 22; Omega3 = 2;

    %% Build the target histogram of the tone model
    histgramTarget = zeros(256, 1);
    total = 0;
    for ii = 0 : 255
        if ii < Ua || ii > Ub
            p = 0;
        else
            p = 1 / (Ub - Ua);
        end
        % Note: the Gaussian term keeps the paper's original normalization
        histgramTarget(ii+1, 1) = (...
            Omega1 * 1/DeltaB * exp(-(255-ii)/DeltaB) + ...
            Omega2 * p + ...
            Omega3 * 1/sqrt(2*pi*DeltaD) * exp(-(ii-Mud)^2/(2*DeltaD^2))) * 0.01;
        total = total + histgramTarget(ii+1, 1);
    end
    histgramTarget(:, 1) = histgramTarget(:, 1) / total;

    %% Optional median smoothing
    % im = medfilt2(im, [5 5]);

    %% Histogram matching toward the target distribution
    J = histeq(im, histgramTarget);

    %% Filter smoothing
    G = fspecial('average', 10);
    J = imfilter(J, G, 'same');
end

The code for local texture rendering is as follows

% GenPencil.m

function T = GenPencil(im, P, J)
% ======================================================
% Compute the pencil-texture map 'T'
%
% @im: input image, pixel values in [0, 1]
% @P:  pencil texture reference image
% @J:  tone map produced by GenToneMap

    %% Parameters
    theta = 0.2;           % smoothness weight given in the paper
    [H, W, ~] = size(im);

    %% Vectorize the (log) texture and tone maps
    P = imresize(P, [H, W]);
    P = reshape(P, H*W, 1);
    logP = log(P);
    logP = spdiags(logP, 0, H*W, H*W);

    J = imresize(J, [H, W]);
    J = reshape(J, H*W, 1);
    logJ = log(J);

    %% Forward-difference operators
    e = ones(H*W, 1);
    Dx = spdiags([-e, e], [0, H], H*W, H*W);
    Dy = spdiags([-e, e], [0, 1], H*W, H*W);

    %% Assemble A and b of the linear system A * beta = b
    A = theta * (Dx * Dx' + Dy * Dy') + (logP)' * logP;
    b = (logP)' * logJ;

    %% Solve by the conjugate gradient method
    beta = pcg(A, b, 1e-6, 60);
    beta = reshape(beta, H, W);

    %% Local texture rendering: draw the texture beta times
    P = reshape(P, H, W);
    T = P .^ beta;
end

Running example

As the example, we take the Hiddleston image used in the section on problems and analysis above.

% ======================================================
% Parameters
% @im:     input image
% @ks:     length (half-size) of the line convolution kernel
% @width:  width of the strokes in the hand-drawn result
% @dirNum: number of stroke directions

clc; clear; close all;
im = imread('C:\Users\as\Desktop\juejin\IMG_4511.png');
ks = 8;
width = 1;
dirNum = 8;
gammaS = 1.0;
gammaI = 1.0;
figure, imshow(im)

%% Preprocessing
im = im2double(im);
[H, W, sc] = size(im);

%% For color images, convert RGB to the YUV (YCbCr) color space.
% The paper briefly mentions that all operations are performed on the
% luminance channel Y alone; the processed result is converted back to
% RGB at the end, yielding a color pencil drawing. Only Y is processed
% here; the chroma components are left unchanged.
if (sc == 3)
    yuvIm = rgb2ycbcr(im);
    lumIm = yuvIm(:,:,1);
else
    lumIm = im;
end

%% Generate the line map
S = GenStroke(lumIm, ks, width, dirNum) .^ gammaS;  % gamma correction darkens the strokes
figure, imshow(S)

%% Generate the tone map
J = GenToneMap(lumIm) .^ gammaI;  % gamma correction darkens the tones
figure, imshow(J)

P = im2double(imread('C:\Users\as\Desktop\juejin\pencil.jpg'));
P = rgb2gray(P);
figure, imshow(P)

%% Generate the texture map
T = GenPencil(lumIm, P, J);
figure, imshow(T)

%% Combine the line map and the texture map
lumIm = S .* T;
if (sc == 3)
    yuvIm(:,:,1) = lumIm;
    I = ycbcr2rgb(yuvIm);
else
    I = lumIm;
end

figure, imshow(I)

The image above is the final result of running the code. This is what we get if we set the parameter gammaI to 5.0.

The relevant code is given above, so parameter tuning and its results will not be discussed further in this article; the initial parameters are those from the experimental section of the paper.

We can compare real hand-drawn images with the images produced by the algorithm's style transfer.

The code's result restores the shape information of the image rather faithfully. That said, portraits are not the most suitable test images for hand-drawn style transfer; the algorithm performs better on buildings and landscapes, as shown in the figure below.

Analysis of existing problems

The method depends on its parameters; improper parameters cause problems such as decolorization. The figure below uses the experimental parameters from the paper, yet histogram matching still causes decolorization.

In the figure above, the clothes at the upper right of “Xia”, the hair color, and the arm color show very noticeable tearing and decolorization in the result, whereas the original image varies smoothly. The direct cause of these problems is the forced stretching of the histogram during histogram matching; the root cause is an inherent limitation of prior-based methods: when an image's histogram differs greatly from that of a typical natural image (refer to histogram (d) in the section “Tone transfer with local texture rendering”), such problems appear.

Another problem visible in the test image comes from the method's convolution of image edges. Because hand drawing requires crossing and broken strokes, lines are generated in most edge areas, producing the problem shown in the figure below.

The highlighted reflective areas on the lips are painted with black lines, while dense edges such as the eyes and hair produce varying levels of noise.

Reference

The link to the paper is attached here; interested readers can study it on their own. Combining Sketch and Tone for Pencil Drawing Production: www.cse.cuhk.edu.hk/~leojia/pro…

If you like this post, please give it a thumbs-up! If you like my style and want to see more, please follow me: I occasionally post interesting graphics and computer-vision material, with my own understanding added.

Your support is the biggest affirmation for me.