Author: Sun snow
Source: Python Technology [public ID: pythonall]

Machine learning and data analysis are becoming more and more important, but while learning and practicing them I often get stuck because I don't know how to turn mathematical formulas into code. Today we will look at how to use Python, and NumPy in particular, starting from the formulas themselves.

A word of caution: don't be intimidated by formulas; they are paper tigers.

About NumPy

NumPy is the basic software package for scientific computation using Python. It includes, among other things:

  • Powerful N-dimensional array objects

  • Sophisticated broadcasting functions

  • Tools for integrating C/C++ and Fortran code

  • Useful linear algebra, Fourier transform, and random number capabilities

In machine learning and data analysis, NumPy is the most commonly used scientific computing library. It lets you write code concisely, in a way that matches how you think about the math, which makes learning and practice much easier.

Preparing the environment

Create a virtual environment (optional), then install the numpy package:


pip install numpy


Test the installation:


>>> import numpy

>>>


In the examples that follow, numpy is imported as np by default:


import numpy as np

...


Basic operations

Most operations in programming languages work on simple numerical values, and complex operations are built by combining data structures with program logic. Although NumPy is built for complex data structures such as matrices, it makes operating on them as easy as simple numerical calculations.

Power operation

The exponentiation operator is **, that is, two asterisks (one asterisk is multiplication). For example, x squared is x**2, x cubed is x**3, and so on.

Taking the square root is equivalent to raising to the 1/2 power, that is, x**(1/2) or x**0.5. NumPy also provides a convenient function, sqrt: the square root of a number x is np.sqrt(x).

There is also a convenient function for squaring: np.square.
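A short sketch of the power operations described above. The sample values in x are made up for illustration; the same operators work on a single number and on a whole array.

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0])  # hypothetical sample values

squared = x ** 2            # element-wise square
cubed = x ** 3              # element-wise cube
root_pow = x ** 0.5         # square root written as the 1/2 power
root_fn = np.sqrt(x)        # same result with np.sqrt
squared_fn = np.square(x)   # same result as x ** 2

print(squared)
print(root_fn)
```

Both spellings of the square root give the same values, so the choice is mostly about readability.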

Absolute value

The absolute value is the distance of a value from the origin on the number line, written |x|. NumPy provides the convenient function abs: np.abs(x) returns the absolute value of x.
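As a small illustration (the values are hypothetical), np.abs works on a single number and on every element of an array:

```python
import numpy as np

x = np.array([-3, 0, 2.5])   # hypothetical sample values
abs_x = np.abs(x)            # distance of each value from 0
abs_scalar = np.abs(-7)      # also works on a plain number

print(abs_x)
print(abs_scalar)
```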

Understanding vectors and matrices

Linear algebra is part of the mathematical foundation of machine learning and data analysis, and vectors and matrices are its basic concepts, so it is important to understand them.

Vector

Data is generally divided into scalars and vectors. A scalar is the easier one to understand: it is a single number on the number line.

Intuitively, a vector is a group of values and can be understood as a one-dimensional array. So why is it usually defined as a quantity with direction, and what does direction mean? This puzzled me for years (wry smile). The reason is that linear algebra courses usually start straight from formulas and theorems, without explaining how they work or where they come from.

The direction of a vector is the direction from the origin of its coordinate system to the point that the vector represents. For example, in a plane rectangular coordinate system, the vector [1,2] represents the point with X coordinate 1 and Y coordinate 2. The direction from the origin, [0,0], to that point is the direction of the vector. This extends to three-dimensional and on to n-dimensional coordinate systems (of course, more than three dimensions is hard for humans to picture). The number of elements in a vector tells you how many dimensions its coordinate system has, but however many dimensions there are, you can always draw the direction from the origin to the vector's point.

Because linear algebra studies the purely mathematical computation of vectors and groups of vectors (matrices), the coordinate system is dropped and only the outward form of the vector remains, which is why vectors can seem hard to understand.

Simply put, a vector is an array of values.
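To make the idea of direction concrete, here is a minimal sketch using the vector [1,2] from the text. Computing a unit vector and an angle is my own illustration, not something the article does; np.linalg.norm gives the distance from the origin to the point.

```python
import numpy as np

v = np.array([1, 2])              # the vector [1,2] from the text

dimensions = v.size               # number of elements = dimensions of the space
length = np.linalg.norm(v)        # distance from the origin [0,0] to the point
direction = v / length            # unit vector pointing from the origin to the point
angle = np.degrees(np.arctan2(v[1], v[0]))  # angle measured from the X-axis

print(dimensions, length, direction, angle)
```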

Matrix

Once you understand vectors, matrices are easy: a matrix is a set of vectors, that is, a set of points in a coordinate system, and matrix operations are operations on, or transformations of, those vectors.

This may be a little convoluted or loose, but I will leave it here and explain what vectors and matrices actually mean in a future article.

Initialization

NumPy provides a variety of ways to create vectors and matrices. For example, a nested Python sequence can be turned into a NumPy matrix:


m = np.array([(1, 2, 3), (2, 3, 4), (3, 4, 5)])


This creates a matrix of three vectors (rows), each with three values.
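A brief sketch of a few ways to create vectors and matrices. np.array with nested tuples is the form used in the article; np.zeros, np.ones, and np.arange are other standard NumPy constructors that I have added for illustration.

```python
import numpy as np

m = np.array([(1, 2, 3), (2, 3, 4), (3, 4, 5)])  # the matrix from the text
v = np.array([1, 2, 3])        # a vector is a one-dimensional array

zeros = np.zeros((2, 3))       # 2x3 matrix filled with 0.0
ones = np.ones(3)              # vector of three 1.0 values
steps = np.arange(5)           # vector [0, 1, 2, 3, 4]

print(m.shape)  # (3, 3)
print(v.shape)  # (3,)
```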

Basic operations

NumPy is particularly good at vector and matrix operations. Take multiplication by a scalar, which multiplies every value in a vector by a factor: in plain Python you would have to loop over the vector and multiply each value in turn.

With NumPy it is much simpler: x * 2. It reads just like a scalar operation, as if the vector were a single value.

  • Addition: x + 2

  • Subtraction: x - 2

  • Division: x / 2
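The operations listed above can be sketched as follows, next to the plain-Python loop they replace. The vector x holds made-up sample values.

```python
import numpy as np

x = np.array([1, 2, 3])  # hypothetical sample vector

# Plain Python: visit the values one by one.
doubled_loop = [value * 2 for value in x]

# NumPy: the scalar is applied to every element at once.
doubled = x * 2
added = x + 2
subtracted = x - 2
divided = x / 2          # true division, so the result holds floats

print(doubled, added, subtracted, divided)
```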

Matrix power operation

Since a vector or matrix can be treated like a single number, power operations are easy to understand. For example, for a matrix m, m squared can be written as m**2, which squares every element of m (it is not the matrix product of m with itself).
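A minimal sketch of the element-wise square. The matrix here is an assumed example that reuses the 3×3 values from the initialization step; the comparison with the matrix product is added only to show that the two results differ.

```python
import numpy as np

m = np.array([(1, 2, 3), (2, 3, 4), (3, 4, 5)])  # assumed example matrix

elementwise_square = m ** 2   # squares each element separately
matrix_product = m.dot(m)     # matrix multiplied by itself: a different result

print(elementwise_square)
print(matrix_product)
```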

Matrix dot product

Matrices of different shapes can also be multiplied, but not with ordinary element-wise multiplication. The operation is the matrix product, which NumPy calls the dot product, and it requires the number of columns of the first matrix to equal the number of rows of the second. In NumPy you use the dot function: for two matrices m and n, the product is m.dot(n).
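As an illustration of m.dot(n), here is a sketch with two matrices of different shapes. Both matrices are hypothetical: m reuses the 3×3 values from the initialization step and n is a 3×2 matrix chosen so the shapes are compatible.

```python
import numpy as np

m = np.array([(1, 2, 3), (2, 3, 4), (3, 4, 5)])  # 3x3, assumed example
n = np.array([(1, 0), (0, 1), (1, 1)])           # 3x2, hypothetical

product = m.dot(n)     # rows of m combined with columns of n, shape (3, 2)
same = np.dot(m, n)    # function form, same result

print(product)
```

Each entry of the result is the sum of products of one row of m with one column of n, so a 3×3 matrix times a 3×2 matrix gives a 3×2 matrix.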

Summation and product

Sums are common in statistical formulas, for example over a matrix:

$$\sum_{i}\sum_{j} m_{ij}$$

This denotes the sum of all elements of the matrix m. NumPy computes it with sum:

m.sum()

Similar to summation, the product of all elements of a matrix is:

$$\prod_{i}\prod_{j} m_{ij}$$

NumPy computes it with prod; for the matrix m this is m.prod().
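A short sketch of sum and prod on an assumed example matrix (the 3×3 values from the initialization step). The axis argument is an extra that the article does not cover; it restricts the operation to rows or columns.

```python
import numpy as np

m = np.array([(1, 2, 3), (2, 3, 4), (3, 4, 5)])  # assumed example matrix

total = m.sum()            # sum of all nine elements
product = m.prod()         # product of all nine elements
row_sums = m.sum(axis=1)   # one sum per row (extra, not in the article)

print(total)    # 27
print(product)  # 8640
```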

Practice

Now that you know the basics, let's do some practice.

Calculate the mean

The formula for the mean of a vector is:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

Analyzing the formula: n is the number of elements of the vector x, which NumPy exposes as size, and the sum of the vector is computed with sum. The final code is:


(1/x.size)*x.sum()

or


x.sum()/x.size
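Both spellings can be checked on a small vector. The values below are hypothetical, and np.mean is used only as a reference result.

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])  # hypothetical sample vector

mean_formula = (1 / x.size) * x.sum()  # written exactly like the formula
mean_short = x.sum() / x.size          # the shorter equivalent
mean_builtin = np.mean(x)              # NumPy's own function, for comparison

print(mean_formula, mean_short, mean_builtin)
```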


Implement the Frobenius norm

Now for something more complicated, the Frobenius norm. The formula is:

$$\|M\|_F = \sqrt{\sum_{i}\sum_{j} m_{ij}^2}$$

Instead of worrying about what the Frobenius norm means, let's just implement it in Python. Analyzing the formula: first square each element of the matrix, then sum the squares, then take the square root of the result. So we write the code from the inside out.

Squaring each element, written m**2, gives a new matrix; summing it and taking the square root can then be written directly as:


np.sqrt((m**2).sum())
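The one-liner can be checked against NumPy's built-in norm. This is a sketch on an assumed example matrix; np.linalg.norm with the 'fro' option is used only as a reference.

```python
import numpy as np

m = np.array([(1, 2, 3), (2, 3, 4), (3, 4, 5)])  # assumed example matrix

squares = m ** 2                        # step 1: square every element
total = squares.sum()                   # step 2: add them up (93 here)
frobenius = np.sqrt(total)              # step 3: take the square root

reference = np.linalg.norm(m, 'fro')    # NumPy's Frobenius norm, for comparison
print(frobenius, reference)
```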


Implementing the formula with NumPy is that simple.

Sample variance

Let's look at another formula:

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

Here \bar{x} denotes the mean of the vector x, which we computed above, so the formula can be written as:


np.sqrt(((x-(x.sum()/x.size))**2).sum()/(x.size-1))


NumPy also provides convenient functions for common statistics. For example, the mean of the vector x can be written as np.mean(x), so the code above simplifies to:


np.sqrt(((x-np.mean(x))**2).sum()/(x.size-1))
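To see how the one-liner maps onto the formula, it can be unpacked from the inside out. The intermediate variable names and the sample values are my own, chosen for illustration.

```python
import numpy as np

x = np.array([2, 4, 4, 4, 5, 5, 7, 9])  # hypothetical sample vector

mean = x.sum() / x.size                  # the mean of x (5.0 here)
deviations = x - mean                    # distance of each value from the mean
sum_of_squares = (deviations ** 2).sum() # sum of squared deviations (32.0 here)
step_by_step = np.sqrt(sum_of_squares / (x.size - 1))

# The same computation as a single expression.
one_liner = np.sqrt(((x - np.mean(x)) ** 2).sum() / (x.size - 1))

print(step_by_step, one_liner)
```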


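The formula above is actually the sample standard deviation. For standard deviation, NumPy provides the convenient function std, which can be used directly.

np.std(x) computes the standard deviation. Note that by default it divides by n (the population standard deviation); passing ddof=1, as in np.std(x, ddof=1), divides by n-1 and matches the sample formula above. Of course, now that we know the standard deviation formula:

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

it is easy to write a NumPy implementation, so give it a try.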
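One possible solution, as a sketch on hypothetical values. It writes the standard deviation with a divisor of n and compares it with np.std, whose ddof parameter switches between the divisors n (default, ddof=0) and n-1 (ddof=1).

```python
import numpy as np

x = np.array([2, 4, 4, 4, 5, 5, 7, 9])  # hypothetical sample vector

# Standard deviation with divisor n, written from the formula.
population_std = np.sqrt(((x - np.mean(x)) ** 2).sum() / x.size)

builtin_default = np.std(x)           # divisor n, matches population_std
builtin_sample = np.std(x, ddof=1)    # divisor n - 1, the sample version

print(population_std, builtin_default, builtin_sample)
```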

Euclidean distance

In a previous article, Euclidean distance was used to simulate the spread of an epidemic. At that time I did not fully appreciate how well NumPy expresses formulas, so the calculation took three steps. Now, computing the Euclidean distance between two vectors takes a single line of code.

For two vectors a and b with n elements each, the distance is $$d(a, b) = \sqrt{\sum_{i=1}^{n}(a_i - b_i)^2}$$ and in NumPy it is implemented as:


np.sqrt(((a-b)**2).sum())


Because Euclidean distance is so widely used, NumPy implements it in its linear algebra module. Once you understand how to express the formula in NumPy, the code can be simplified to:


np.linalg.norm(a-b)
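Both forms can be compared on two small vectors. The points below are hypothetical, chosen so that the distance comes out to a round number.

```python
import numpy as np

a = np.array([1, 2])   # hypothetical point
b = np.array([4, 6])   # hypothetical point

by_formula = np.sqrt(((a - b) ** 2).sum())  # written from the formula
by_norm = np.linalg.norm(a - b)             # linear algebra module

print(by_formula, by_norm)  # 5.0 5.0
```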


Conclusion

NumPy is a broad and deep numerical library and the foundation of scientific computing in Python. Today we learned, starting from mathematical formulas, how to turn them into NumPy code. In data analysis, machine learning, or paper writing, even if you do not know NumPy's convenience functions, you can write code directly from the formula, and learning NumPy through this kind of practice is much easier.

