This article is reprinted from my blog
While learning machine learning and image processing, we often run into the concept of convolution. Every time I came across it, I only sort of got it: sometimes I knew the intuitive explanation but not how it played out in the formula, and in the end I never fully understood the concept.
Wikipedia has a GIF illustrating the idea, but it still felt a bit complicated to me. So I read a lot of articles online and finally formed a more intuitive impression; here I write up my understanding as a summary.
One-dimensional convolution
1.1 Mathematical Definition
Wikipedia gives the formal definition of convolution as follows:
$$ f(x)*g(x) = \int_{-\infty}^{\infty} f(\tau)\,g(x-\tau)\, d\tau \tag{1}\label{1} $$
1.2 Intuitive Explanation
Let’s analyze this formula:
- $f(x)*g(x)$ denotes $f(x)$ convolved with $g(x)$;
- it is an integral over $\tau$ on $(-\infty, \infty)$;
- the integrand is the product of two functions, $f(\tau)$ and $g(x-\tau)$;
- only $g(x-\tau)$ involves $x$; the rest of the expression is all about $\tau$.
Such a formula is probably still hard to grasp on its own, so let me explain it with an example.
1.3 Example
Imagine that Xiao Ming has to get an infusion every day for a period of time. The medicine stays in the body until it wears off, and its efficacy keeps declining over time. For simplicity, assume the medicine becomes ineffective after 4 days and that the efficacy function is discrete, as shown below:

In the figure, the horizontal axis is days and the vertical axis is efficacy. On the day of the infusion (day 0) the efficacy is 100%, 80% on the second day, 40% on the third day, and 0 on the fourth day.

Now let's define some symbols: $\operatorname{m}(t)$ is the dose Xiao Ming receives on day $t$, $\operatorname{eff}(t)$ is the fraction of a dose's efficacy remaining $t$ days after it is given, and $\operatorname{rest}(t)$ is the total efficacy left in Xiao Ming's body on day $t$. From the figure,

$$\operatorname{eff}(t) = \begin{cases} 100\% & t=0 \\ 80\% & t=1 \\ 40\% & t=2 \\ 0\% & t>2 \end{cases}$$

Now let's track the efficacy in Xiao Ming's body over three consecutive days of infusions, starting from the first day (assuming a fixed dose of 10 units per day). A small code sketch after the list below recomputes the same numbers.
- On the first day, Xiao Ming gets his infusion. The efficacy in his body is 10 × 100% = 10 ($\operatorname{rest}(t) = \operatorname{m}(t)\cdot \operatorname{eff}(0)$).
- The cumulative effect of the first day
- On the second day, Xiao Ming goes to the hospital for another infusion.
- Before the infusion, the residual efficacy of the first day's dose is 10 × 80% = 8 ($\operatorname{m}(t-1)\cdot \operatorname{eff}(1)$).
- After the infusion, the efficacy in his body is 10 + 8 = 18 ($\operatorname{rest}(t) = \operatorname{m}(t-1)\cdot \operatorname{eff}(1) + \operatorname{m}(t)\cdot \operatorname{eff}(0)$).
- The cumulative effect of the second day
- On the third day, Xiao Ming goes to the hospital for yet another infusion.
- Before the infusion, the residual efficacy is 10 × 40% = 4 from the first day's dose ($\operatorname{m}(t-2)\cdot \operatorname{eff}(2)$) and 10 × 80% = 8 from the second day's dose ($\operatorname{m}(t-1)\cdot \operatorname{eff}(1)$).
- After the infusion, the efficacy in his body is 10 + 8 + 4 = 22 ($\operatorname{rest}(t) = \operatorname{m}(t-2)\cdot \operatorname{eff}(2) + \operatorname{m}(t-1)\cdot \operatorname{eff}(1) + \operatorname{m}(t)\cdot \operatorname{eff}(0)$).
- The cumulative effect of the third day
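To make the bookkeeping above concrete, here is a minimal Python sketch (my own illustration, not code from the original article): it computes $\operatorname{rest}(t)$ by summing, over all past doses, dose × remaining efficacy. The names `doses` and `eff` are just illustrative.

```python
# Remaining efficacy of a single dose, t days after it was given (from the figure).
eff = [1.0, 0.8, 0.4]      # 100%, 80%, 40%; zero afterwards
doses = [10, 10, 10]       # 10 units infused on days 0, 1, 2

def rest(t):
    """Total efficacy in the body on day t: sum over past doses of dose * eff(age)."""
    total = 0.0
    for day, dose in enumerate(doses):
        age = t - day                # how many days ago this dose was given
        if 0 <= age < len(eff):      # future doses and worn-off doses contribute nothing
            total += dose * eff[age]
    return total

print([rest(t) for t in range(3)])   # [10.0, 18.0, 22.0], matching the three days above
```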
1.4 Analysis
Following the pattern above, the efficacy in Xiao Ming's body on day $t$ is $\operatorname{rest}(t) = \sum_{i=0}^{n} \operatorname{m}(t-i)\, \operatorname{eff}(i)$, where $n$ is the maximum number of days the efficacy lasts. If the drug lasted indefinitely, $n$ would be $\infty$; and for an efficacy function that lasts indefinitely in continuous time, $\operatorname{rest}(t) = \int_{-\infty}^\infty \operatorname{m}(t-\tau)\, \operatorname{eff}(\tau) \,d\tau$ (strictly speaking it should be $\int_0^\infty$ here, which is then generalized to $(-\infty, \infty)$). This is essentially the Wikipedia definition of convolution, formula $\ref{1}$.
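As a quick check (not part of the original article), NumPy's `np.convolve` evaluates exactly this discrete sum, so feeding it the doses and the efficacy curve from the example reproduces the day-by-day totals:

```python
import numpy as np

doses = np.array([10, 10, 10])     # m: 10 units on days 0, 1, 2
eff = np.array([1.0, 0.8, 0.4])    # eff: remaining efficacy after 0, 1, 2 days

rest = np.convolve(doses, eff)     # discrete convolution: sum_i m(t - i) * eff(i)
print(rest)                        # [10. 18. 22. 12.  4.] -> days 0-2 match the example;
                                   # the tail is the efficacy decaying after the infusions stop
```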
1.5 Summary
My earlier confusion about convolution came mainly from misunderstanding the meaning of $\tau$ in formula $\ref{1}$: I always took $\tau$ to be the variable that moves along the coordinate axis. In fact, in the example above, $\tau$ is the variable that traverses the history: at the current point $t$, we sum the contributions of the function $\operatorname{eff}$ over past days to obtain the residual $\operatorname{rest}(t)$. The integral is over $\tau$, not over $t$.
What moves along the horizontal axis is always $t$, and the variation of $t$ does not appear inside the convolution integral itself.
To recap the entire process above:
Comparing the figures for the three days, you can see that if we take "today", rather than day $t$, as the reference point, $\operatorname{eff}(t)$ keeps shifting to the left over time (the dark blue line represents today's dose; the lines of earlier days lie to its left).
$\operatorname{eff}(t)$ is originally defined forward in time, for $t = 0, 1, \cdots$; when its values are lined up against the doses of previous days, it appears reversed, which is exactly the $g(x-\tau)$ in the formula. I think this "reversal" (flipping) is a natural consequence of the bookkeeping, not the core of convolution. Besides, in the computer world, at least in the image-processing and machine-learning applications I have encountered, the convolution kernel (i.e., the shifting function $\operatorname{eff}(t)$ here) is usually symmetric, so the notion of reversal is not that essential.
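To see the "reversal" concretely, here is a small illustrative check of my own (not from the original article): `np.convolve` slides a flipped copy of the second sequence, while `np.correlate` slides it as-is, and the two only coincide when the kernel is symmetric. Applying both to a unit impulse makes the difference visible:

```python
import numpy as np

signal = np.array([0.0, 1.0, 0.0, 0.0])   # a single unit impulse
kernel = np.array([1.0, 0.8, 0.4])        # an asymmetric kernel, like eff(t)

# Convolution flips the kernel before sliding, so the impulse response
# reproduces the kernel in its original order.
print(np.convolve(signal, kernel, mode="full"))    # [0.  1.  0.8 0.4 0.  0. ]

# Correlation applies no flip while sliding; its impulse response is the
# time-reversed kernel.
print(np.correlate(signal, kernel, mode="full"))   # [0.  0.4 0.8 1.  0.  0. ]
```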
Two-dimensional convolution
2.1 Mathematical Definition
$$ f(x, y)*g(x, y) = \int_{-\infty}^\infty \int_{-\infty}^{\infty} f(\tau_1, \tau_2) \cdot g(x-\tau_1, y-\tau_2)\,d\tau_1\, d\tau_2 \tag{2} $$
Two-dimensional convolution is often encountered in image processing, and the discrete form of two-dimensional convolution is mostly used in image processing:
$$ f[x,y] * g[x,y] = \sum_{n_1=-\infty}^\infty \sum_{n_2=-\infty}^\infty f[n_1, n_2] \cdot g[x-n_1, y-n_2] \tag{3} $$
2.2 Two-dimensional convolution in image processing
Two-dimensional convolution is an extension of one-dimensional convolution and works in much the same way. The core steps are again (flipping), shifting, multiplying, and summing. The two-dimensional "flip" here rotates the convolution kernel by 180° (equivalently, it reverses it along both axes); for example, the matrix
```
a b c
d e f
g h i
```

is flipped to

```
i h g
f e d
c b a
```
After that, the flipped kernel is slid across the two-dimensional plane; at each position, every element of the kernel is multiplied by the image pixel it covers, and the products are summed. By moving the kernel over all positions, we obtain a new image, each pixel of which is one such product-and-sum of the kernel with the original image. A minimal sketch of this procedure follows.
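Here is a minimal NumPy sketch of that procedure (my own illustration, not code from the original article): it rotates the kernel by 180°, slides it over the image, and accumulates the product-and-sum at every valid position.

```python
import numpy as np

def conv2d(image, kernel):
    """2-D discrete convolution over the 'valid' region (no padding)."""
    kernel = np.flip(kernel)                     # 180-degree rotation: reverse both axes
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            # multiply the kernel with the image patch it covers, then sum
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out
```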
Here’s the simplest example of mean filtering:
A 3×3 mean filter kernel (convolution kernel):

```
1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9
```
And the image to be convolved (simplified here as a two-dimensional 5×5 matrix):

(the 5×5 example matrix appears as a figure in the original post and is not reproduced here)
When the convolution kernel moves to the lower-right corner of the image (with the kernel's center aligned with the pixel in the 4th row and 4th column of the image), the result of convolving it with the image is as shown below:
It can be seen that the effect of two-dimensional convolution on an image is that each output pixel is a weighted sum over the neighborhood of the corresponding input pixel (the neighborhood being the size of the kernel); the filter kernel acts as the "weight table".
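As a usage example of the `conv2d` sketch above (the 5×5 image below is a made-up placeholder, since the original post's matrix is not reproduced here): a kernel whose weights are all 1/9 replaces each pixel with the average of its 3×3 neighborhood.

```python
# Hypothetical 5x5 image standing in for the one in the original figure.
image = np.array([
    [1., 1., 1., 2., 2.],
    [1., 1., 2., 2., 3.],
    [1., 2., 2., 3., 3.],
    [2., 2., 3., 3., 4.],
    [2., 3., 3., 4., 4.],
])
mean_kernel = np.full((3, 3), 1 / 9)   # 3x3 mean filter: every weight is 1/9

# Reuses the conv2d function defined in the sketch above.
print(conv2d(image, mean_kernel))      # each output pixel = average of its 3x3 neighborhood
```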
Welcome to my blog for more articles