Do you ever get the feeling that learning to program is like digging a well, and that you can dig the well in the shape of “Web”, “app”, or “game”? Then one day someone says: the water is sweetest in a well dug in the shape of artificial intelligence. So you start digging sideways from the well you have already dug, hoping to reach that sweetest water, and you end up taking a lot of detours…

Machine learning algorithms are built on mathematical theory, and if you have not learned the mathematics well, even the existing machine learning frameworks are hard to pick up. If you once had a good command of mathematics but struggle to bring mathematical thinking into your programming, then most likely you have been writing if-else for so long that your programming thinking and your mathematical thinking have never been connected.

Huh? Programming thinking is not aligned with mathematical thinking? Were my years of data structures and algorithms wasted?

Connecting programming and mathematics does not mean that what you learned before is useless. It means that machine learning depends on mathematics more than on that earlier knowledge. For example, when handling large amounts of “linear” and “nonlinear” data, the former calls for linear algebra, while the latter mostly draws on probability theory and mathematical statistics. Most of today’s supervised and unsupervised learning algorithms are built on linear algebra, probability theory, and mathematical statistics.

All this talk of linearity and supervision is a detour, though. What I really want to explain is what it means to connect programming and mathematics.

If you want to learn artificial intelligence systematically, I recommend the Chuang Zhang (literally “bed length”) artificial intelligence tutorial. It is a great piece of work: the course is easy to follow and funny as well.

Er… let me start with the title. Why is it that the longer you have been programming, the harder it is to get into artificial intelligence? The answer: you use too much if-else!

Take C++ as an example. For the task “read two integers, compare them, and output the larger one”, the traditional program looks like this:

    #include <iostream>
    using namespace std;

    // Print the larger of two integers using if-else
    void max(int a, int b)
    {
        if (a > b) {
            cout << a << endl;
        }
        else {
            cout << b << endl;
        }
    }

    int main()
    {
        int a, b;
        cin >> a >> b;
        max(a, b);
        return 0;
    }



Every beginner has written code like the above. Looking back at it now is enough to make you queasy, which is why I really admire programming teachers, who have to explain it all over again to every new class… Back to the point: set the code above aside and look at the next program. It is also small, but it holds a surprise:

    #include <iostream>
    using namespace std;

    // Return the absolute value of a number
    // (named absValue so that it cannot clash with the standard abs)
    int absValue(int num)
    {
        return num > 0 ? num : -num;
    }

    // Return the larger of the two numbers without if-else
    // (a + b must fit in an int)
    int max(int a, int b)
    {
        return (a + b + absValue(a - b)) / 2;
    }

    int main()
    {
        int a, b;
        cin >> a >> b;
        cout << max(a, b) << endl;
        return 0;
    }

[Note]

1. Look at the max function in the code above: it finds the larger number without any if-else, using the formula $\frac{a + b + |a - b|}{2}$ to compute the maximum of a and b.

2. In the same way, the formula $\frac{a + b - |a - b|}{2}$ gives the minimum.
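
The two formulas in the notes can be tried side by side. The following is only a minimal sketch: the names `maxByFormula` and `minByFormula` are made up for illustration, and widening to `long long` is an added precaution (not part of the original program) so that a + b cannot overflow.

```cpp
#include <cstdlib>  // std::llabs

// Maximum of two ints via (a + b + |a - b|) / 2.
// Widening to long long is an assumption added here to avoid overflow.
int maxByFormula(int a, int b) {
    long long x = a, y = b;
    return static_cast<int>((x + y + std::llabs(x - y)) / 2);
}

// Minimum of two ints via (a + b - |a - b|) / 2.
int minByFormula(int a, int b) {
    long long x = a, y = b;
    return static_cast<int>((x + y - std::llabs(x - y)) / 2);
}
```

For a = 3 and b = 7, the first formula gives (10 + 4) / 2 = 7 and the second gives (10 - 4) / 2 = 3. The numerator is always twice the maximum (or twice the minimum), so the division by 2 is exact.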


Oh, that’s interesting. But isn’t the formula version slower? The if-else version needs only one comparison in the ALU, while the formula takes several operations. Isn’t that superfluous? And what does it have to do with connecting programming and math? That’s a clickbait headline.

Well, who doesn’t want more readers? Since that example did not get to the point, let’s try another. Write a program that computes 1+2+3+…+100. As usual, first recall the traditional for-loop version that you could write with your eyes closed:

    #include <stdio.h>

    // Return 1 + 2 + ... + n using a loop
    int sum(int n)
    {
        int num = 0;
        for (int i = 1; i <= n; i++) {
            num += i;
        }
        return num;
    }

    int main()
    {
        printf("Sum %d\n\n", sum(100));
        return 0;
    }


Now look at the version that uses the formula:

    #include <stdio.h>

    // Return 1 + 2 + ... + n
    int sum(int n)
    {
        return n * (n + 1) / 2;
    }

    // Add 1 to 100
    int main()
    {
        printf("Sum %d\n\n", sum(100));
        return 0;
    }


The sum function uses the formula $\frac{n \times (n+1)}{2}$.
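
To see that the loop and the closed form agree, the two can be put next to each other. This is just a sketch. The names `sumByLoop` and `sumByFormula` are hypothetical, and `long long` is used here (an assumption, the original uses `int`) so that larger n does not overflow.

```cpp
// Sum 1 + 2 + ... + n with a loop: n additions.
long long sumByLoop(int n) {
    long long total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

// The same sum with the closed form n * (n + 1) / 2: a fixed number of operations.
long long sumByFormula(int n) {
    long long m = n;
    return m * (m + 1) / 2;
}
```

For n = 100 both return 5050. The loop does work proportional to n, while the formula does one multiplication, one addition, and one division regardless of n.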

Well, I admit the sum formula really is faster than the for loop. With these two programs in mind, is the point of this blog post that we should write fewer for loops and memorize more mathematical formulas instead?

Not quite. The point is that, on the way from beginner to expert, programmers get better and better at code logic but gradually forget that the mathematics they once spent so much time mastering can also be applied at the code level. And when it comes to artificial intelligence, math is fundamental.

Honestly, I know math is important too, but I am very busy right now, there is a lot of math to learn, and I do not know where to start or how to study it…

If you don’t like math, learning it is painful, but progress always comes with pain. The simplest algorithm in machine learning is probably linear regression, including multiple linear regression. I consider it the foundation of the foundations of machine learning, and the knowledge it requires is linear algebra.

Next, I’ll briefly walk through a multiple linear regression algorithm that you can understand without knowing linear algebra.

One day I was strolling past the door of the school’s guidance office and happened to pick up a sheet of paper. Could it be the final exam paper? It was not, which was a little disappointing, but I still took a curious look. It was a table of comprehensive ratings for several students, covering virtue, intelligence, body (physical education), and beauty:

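| Student   | Virtue | Intelligence | Body | Beauty | Total rating |
|-----------|--------|--------------|------|--------|--------------|
| potassium | 9      | 7            | 5    | 10     | 80           |
| calcium   | 8      | 8            | 8    | 8      | 80           |
| sodium    | 7      | 10           | 4    | 5      | 65           |
| magnesium | 7      | 6            | 9    | 5      | 68           |
| aluminum  | 8      | 7            | 8    | 6      | 74           |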

[My C language teacher taught me that the first thing to think of when you see a table is a two-dimensional array]

So I had this image in my head:

    #define ROW 5
    #define COL 5

    // Table data: virtue, intelligence, body, beauty, total rating
    float data[ROW][COL] = {
        {9, 7, 5, 10, 80},
        {8, 8, 8, 8, 80},
        {7, 10, 4, 5, 65},
        {7, 6, 9, 5, 68},
        {8, 7, 8, 6, 74}
    };


The data in this table is called training data.

Then something in the table interrupted my thoughts: “Hey, that can’t be right. Potassium scored only 5 in physical education and still got a total rating of 80. How is that possible? Just because of good grades? Hmph!”

I threw the sheet away, but the more I thought about it, the angrier I got. So good grades make you special! Then I recalled that our class teacher once said the total rating is calculated from the scores for virtue, intelligence, body, and beauty. But how much physical education counts toward the total compared with intelligence (academic grades), I really could not remember. The formula for the total rating is roughly:

$x_1 \times \text{virtue} + x_2 \times \text{intelligence} + x_3 \times \text{body} + x_4 \times \text{beauty} = \text{total rating}$

Here $x_1$ to $x_4$ are the proportions that virtue, intelligence, body, and beauty contribute to the total rating. Such a proportion is also called a weight. Now I am going to work out these four weights and compare them properly, with facts and reasoning. The four weights ought to be equal, so why should the weight of intelligence be larger than the others?

Without even thinking about it, I defined a function to calculate the total rating:

    // Multiple linear equation: weighted sum of the four scores
    float fun(float d, float z, float t, float m,
              float d_w, float z_w, float t_w, float m_w)
    {
        return d*d_w + z*z_w + t*t_w + m*m_w;
    }


At this point I thought of multiple linear regression, also known as multivariable linear regression. It comes with a useful formula: $J(\theta) = \frac{\sum_{i=1}^{m} (h_\theta(x_i) - y_i)^2}{2 \times m}$.

Some people take one look at this formula and give up on math on the spot. So what does it mean? Put simply: $J(\text{weight combination}) = \frac{\sum_{i=1}^{\text{rows}} (\text{weight combination} \times \text{scores}_i - \text{total rating}_i)^2}{2 \times \text{rows}}$, where $\text{scores}_i$ are the virtue, intelligence, body, and beauty scores in row $i$ of the table and rows is the number of rows in the table.

The other thing is that the formula is not evaluated just once. It is evaluated for almost every combination of weights, and the combination that gives the smallest value of $J(\theta)$ is the answer.


Is it too complicated? Here’s an example:

1. Assume a set of weights: 1, 2, 1, 2, for virtue, intelligence, body, and beauty respectively.

2. Look up the first row of the table: potassium | 9 | 7 | 5 | 10 | 80. Calculate the guessed total rating for these weights: 9×1+7×2+5×1+10×2 = 48. (This is only the guess produced by the assumed weights, not the real value. The next step squares the difference between this guess and the real total rating.)

3. Take the guess from the previous step, subtract the actual value in the table, and square the result: (48-80)^2 = 1024. (Oh, what a nice number.)

4. That is not enough. Move on to the second row and use the same weights: 8×1+8×2+8×1+8×2 = 48, and then (48-80)^2 = 1024.

5. Third row: 7×1+10×2+4×1+5×2 = 41, and then (41-65)^2 = 576.

6. Continue in the same way for the remaining rows: calculate the guessed total rating for the weights, subtract the real total rating, and square the result.

7. After all rows are done, add up all the squared differences between the guesses and the real total ratings, then divide the sum by 2 times the number of rows in the table.

8. Take a new combination of weights and repeat the seven steps above. Once every combination has been evaluated, the combination with the smallest result is taken as the outcome of training.
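
The steps above can be sketched as a single function that evaluates one weight combination. This is an illustrative sketch rather than the article's own code: the names `kTable` and `cost` are assumptions, and the table values are copied from the table earlier in the article.

```cpp
#include <array>

constexpr int kRows = 5;
constexpr int kCols = 5;  // four scores plus the total rating

// Table from the article: virtue, intelligence, body, beauty, total rating.
constexpr float kTable[kRows][kCols] = {
    {9, 7, 5, 10, 80}, {8, 8, 8, 8, 80}, {7, 10, 4, 5, 65},
    {7, 6, 9, 5, 68},  {8, 7, 8, 6, 74}};

// J(weights): sum over all rows of (guess - truth)^2, divided by 2 * rows.
float cost(const std::array<float, 4>& w) {
    float sum = 0.0f;
    for (int i = 0; i < kRows; ++i) {
        float guess = 0.0f;
        for (int j = 0; j < kCols - 1; ++j) {
            guess += w[j] * kTable[i][j];  // weighted sum of the four scores
        }
        float diff = guess - kTable[i][kCols - 1];
        sum += diff * diff;
    }
    return sum / (2 * kRows);
}
```

For the assumed weights 1, 2, 1, 2, the five squared differences are 1024, 1024, 576, 900, and 1024, so `cost` returns 4548 / 10 = 454.8. Step 8 then amounts to calling `cost` for each candidate combination and keeping the one with the smallest value.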


According to the multiple linear regression formula, the algorithm is as follows:

    // Weights of virtue, intelligence, body and beauty after training
    float d_w = 0;
    float z_w = 0;
    float t_w = 0;
    float m_w = 0;

    // Estimated value range of each weight
    float w_min = 0;
    float w_max = 4;

    // Multiple linear regression weight estimation algorithm
    void train(float data[ROW][COL])
    {
        // Smallest value of the regression function so far (-1 means none yet)
        float Linear_value_min = -1;
        // Value of the regression function for the current weights
        float Linear_value = 0;
        // Step between candidate weights (the smaller the step, the longer the calculation)
        float w_precise = 0.5;

        // Traverse every combination of weights
        for (float d_g = w_min; d_g <= w_max; d_g += w_precise) {
            for (float z_g = w_min; z_g <= w_max; z_g += w_precise) {
                for (float t_g = w_min; t_g <= w_max; t_g += w_precise) {
                    for (float m_g = w_min; m_g <= w_max; m_g += w_precise) {
                        // Core algorithm: sum the squared errors over all rows
                        for (int i = 0; i < ROW; i++) {
                            // Virtue, intelligence, body and beauty scores of row i
                            float d = data[i][0];
                            float z = data[i][1];
                            float t = data[i][2];
                            float m = data[i][3];
                            float truth_score = data[i][COL-1];                       // True total rating
                            float guess_score = fun(d, z, t, m, d_g, z_g, t_g, m_g);  // Guessed total rating
                            // Sum
                            Linear_value += (guess_score - truth_score) * (guess_score - truth_score);
                            /* Check the traversal */
                            cout << "In training: " << d_g << " " << z_g << " " << t_g << " " << m_g << " "
                                 << truth_score << " " << guess_score << " "
                                 << Linear_value << " " << Linear_value_min << endl;
                        }
                        // Value of the regression function: divide by 2 times the number of rows
                        Linear_value = Linear_value / (2 * ROW);
                        // If the result is the smallest so far
                        if (Linear_value_min == -1 || Linear_value < Linear_value_min) {
                            // Replace the previous minimum
                            Linear_value_min = Linear_value;
                            // Save the weights obtained
                            d_w = d_g;
                            z_w = z_g;
                            t_w = t_g;
                            m_w = m_g;
                        }
                        // A perfect fit has been found, return directly
                        if (Linear_value_min == 0) {
                            return;
                        }
                        // Reset for the next combination
                        Linear_value = 0;
                    }
                }
            }
        }
    }



Note 1: There are too many for loops. Since I am still learning linear algebra, I cannot yet simplify the code with matrix calculations. Interested readers can look at “Linear Regression — Vectorization”.

Note 2: The idea of calculating weights is the basis of machine learning algorithms. Readers who are still new to machine learning can read the article “What has machine Learning Learned?”.

Note 3: The Markdown math notation in this article was looked up in “Markdown Grammar: How to Write Mathematical Formulas using LaTeX Grammar”.

Note 4: The table example in this article is based on “Machine Learning Fun 01: The World’s Easiest Introduction to Machine Learning”.
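
As a small first step toward the vectorized form mentioned in Note 1, the guessed total rating can be written as a dot product of a score vector and a weight vector. This is a minimal sketch, not a full matrix implementation: the alias `Vec4` and the function `predict` are hypothetical names, and `std::inner_product` from the standard `<numeric>` header does the multiply-and-add.

```cpp
#include <array>
#include <numeric>  // std::inner_product

using Vec4 = std::array<float, 4>;  // virtue, intelligence, body, beauty

// Guessed total rating as the dot product of scores and weights.
float predict(const Vec4& scores, const Vec4& weights) {
    return std::inner_product(scores.begin(), scores.end(),
                              weights.begin(), 0.0f);
}
```

With scores {9, 7, 5, 10} and weights {1, 2, 1, 2}, `predict` gives 48, the same value as the hand calculation for the first row. Treating each row as a vector is what later allows the whole table to be handled as one matrix.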









The complete code of multiple linear regression is as follows:

    #include <iostream>
    #define ROW 5
    #define COL 5
    using namespace std;

    // Training data: virtue, intelligence, body and beauty scores + total rating
    float data[ROW][COL] = {
        {9, 7, 5, 10, 80},
        {8, 8, 8, 8, 80},
        {7, 10, 4, 5, 65},
        {7, 6, 9, 5, 68},
        {8, 7, 8, 6, 74}
    };

    // Weights of virtue, intelligence, body and beauty after training
    float d_w = 0;
    float z_w = 0;
    float t_w = 0;
    float m_w = 0;

    // Estimated value range of each weight
    float w_min = 0;
    float w_max = 4;

    // Multiple linear equation: weighted sum of the four scores
    float fun(float d, float z, float t, float m,
              float d_w, float z_w, float t_w, float m_w)
    {
        return d*d_w + z*z_w + t*t_w + m*m_w;
    }

    // Multiple linear regression weight estimation algorithm
    void train(float data[ROW][COL])
    {
        // Smallest value of the regression function so far (-1 means none yet)
        float Linear_value_min = -1;
        // Value of the regression function for the current weights
        float Linear_value = 0;
        // Step between candidate weights (the smaller the step, the longer the calculation)
        float w_precise = 0.5;

        // Traverse every combination of weights
        for (float d_g = w_min; d_g <= w_max; d_g += w_precise) {
            for (float z_g = w_min; z_g <= w_max; z_g += w_precise) {
                for (float t_g = w_min; t_g <= w_max; t_g += w_precise) {
                    for (float m_g = w_min; m_g <= w_max; m_g += w_precise) {
                        // Core algorithm: sum the squared errors over all rows
                        for (int i = 0; i < ROW; i++) {
                            // Virtue, intelligence, body and beauty scores of row i
                            float d = data[i][0];
                            float z = data[i][1];
                            float t = data[i][2];
                            float m = data[i][3];
                            float truth_score = data[i][COL-1];                       // True total rating
                            float guess_score = fun(d, z, t, m, d_g, z_g, t_g, m_g);  // Guessed total rating
                            // Sum
                            Linear_value += (guess_score - truth_score) * (guess_score - truth_score);
                            /* Check the traversal */
                            cout << "In training: " << d_g << " " << z_g << " " << t_g << " " << m_g << " "
                                 << truth_score << " " << guess_score << " "
                                 << Linear_value << " " << Linear_value_min << endl;
                        }
                        // Value of the regression function: divide by 2 times the number of rows
                        Linear_value = Linear_value / (2 * ROW);
                        // If the result is the smallest so far
                        if (Linear_value_min == -1 || Linear_value < Linear_value_min) {
                            // Replace the previous minimum
                            Linear_value_min = Linear_value;
                            // Save the weights obtained
                            d_w = d_g;
                            z_w = z_g;
                            t_w = t_g;
                            m_w = m_g;
                        }
                        // A perfect fit has been found, return directly
                        if (Linear_value_min == 0) {
                            return;
                        }
                        // Reset for the next combination
                        Linear_value = 0;
                    }
                }
            }
        }
    }

    // Start training
    int main()
    {
        // Train on the table (::data picks our global array, because
        // std::data is also visible in C++17 with "using namespace std")
        train(::data);

        // End of training
        cout << endl;
        cout << "Training over!" << endl << endl;
        cout << "The weight of virtue is: " << d_w << endl;
        cout << "The weight of intelligence is: " << z_w << endl;
        cout << "The weight of body is: " << t_w << endl;
        cout << "The weight of beauty is: " << m_w << endl;
        return 0;
    }


The result is as follows:

After running the program, the calculated weights come out as 4, 2, 2, 2. Plug in a row from the table, for example the second row: 8×4+8×2+8×2+8×2 = 80. (Careful readers will notice that I had guessed intelligence carried the most weight, but the calculation shows that intelligence counts exactly as much as physical education, which proves I was overly biased against those who do well in their studies.)
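
To check the trained weights against every row rather than just one, the predictions can be computed for the whole table. The sketch below is an assumption-laden illustration: `kScores`, `kTotals`, `predictAll`, and `residuals` are made-up names, and the numbers are copied from the table in this article.

```cpp
#include <array>

constexpr int kRows = 5;

// Scores from the article's table: virtue, intelligence, body, beauty.
constexpr float kScores[kRows][4] = {
    {9, 7, 5, 10}, {8, 8, 8, 8}, {7, 10, 4, 5}, {7, 6, 9, 5}, {8, 7, 8, 6}};

// Total ratings from the same table, row by row.
constexpr float kTotals[kRows] = {80, 80, 65, 68, 74};

// Predicted total rating of every row for one weight combination.
std::array<float, kRows> predictAll(float dW, float zW, float tW, float mW) {
    std::array<float, kRows> out{};
    for (int i = 0; i < kRows; ++i) {
        out[i] = kScores[i][0] * dW + kScores[i][1] * zW +
                 kScores[i][2] * tW + kScores[i][3] * mW;
    }
    return out;
}

// Predicted minus actual total rating for every row.
std::array<float, kRows> residuals(float dW, float zW, float tW, float mW) {
    std::array<float, kRows> r = predictAll(dW, zW, tW, mW);
    for (int i = 0; i < kRows; ++i) {
        r[i] -= kTotals[i];
    }
    return r;
}
```

With the weights 4, 2, 2, 2, the predictions are 80, 80, 66, 68, and 74. Four rows match the table exactly, while the sodium row comes out as 66 against an actual 65, so the fit is the best one on the 0.5-step grid rather than a perfect one.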







Conclusion

If you want to learn and master machine learning algorithms well, math is essential, and so is Python (otherwise you end up writing everything in C, which is a pain). With solid math skills and fluency in Python, it is easy to pick up a machine learning framework and use it flexibly.

If your math foundation is solid and you are fluent in a language but still find machine learning algorithms hard, try changing how you think, and combine mathematical thinking with programming thinking (learning MATLAB may help). Programming thinking emphasizes certainty and step-by-step procedure, while mathematical thinking is relational and abstract. The combination of the two is artificial intelligence.

(The above is only the blogger’s own impressions and understanding. Please point out any mistakes, and thank you for reading.)