Vectorization is a necessary acceleration technique for machine learning

From traditional machine learning to deep learning, vectorization is the most basic and essential acceleration skill.

Machine learning models are trained on large amounts of data, so speeding up training is a problem that must be considered. Vectorization, put simply, means replacing the accumulation done in for loops with matrix multiplication.

A case in point

Question: suppose you are given 1 million values a1 to a1000000 and 1 million values b1 to b1000000, and asked to compute c, the sum of the products ai * bi over all pairs. What would you do?

The code for the for loop is shown below
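A minimal sketch of such a loop, assuming the inputs are uniform random values in [0, 1) (the names `a`, `b`, and `loop_dot` are illustrative, and timings will differ from machine to machine):

```python
import random
import time


def loop_dot(a, b):
    """Sum of a[i] * b[i], accumulated one pair at a time."""
    c = 0.0
    for i in range(len(a)):
        c += a[i] * b[i]
    return c


# Assumption: 1 million uniform random values in [0, 1) per input.
n = 1_000_000
a = [random.random() for _ in range(n)]
b = [random.random() for _ in range(n)]

start = time.perf_counter()
c = loop_dot(a, b)
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"Result: {c}, for loop time: {elapsed_ms} ms")
```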

Result: 249879.05298545936, for loop time: 519.999980927 ms

If NumPy is used for the same operation (vectorization):
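A sketch of the vectorized version under the same assumption about the inputs (uniform random values in [0, 1)); here the whole accumulation becomes a single call to `np.dot`:

```python
import time

import numpy as np

# Assumption: 1 million uniform random values in [0, 1) per input.
rng = np.random.default_rng()
n = 1_000_000
a = rng.random(n)
b = rng.random(n)

start = time.perf_counter()
c = np.dot(a, b)  # inner product: sum of a[i] * b[i] over all i
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"Result: {c}, matrix calculation time: {elapsed_ms} ms")
```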

Result: 249879.05298545936, matrix calculation time: 0.999927520752 ms

The gap is deeper than the Mariana Trench: roughly 520 ms against 1 ms. The reason is that matrix computing tools such as NumPy and MATLAB run the loop in compiled code and take full advantage of the SIMD instructions of modern CPUs, which greatly improves computational efficiency.

Vectorization in LR

Now let's take the simplest case, logistic regression (LR), and see how to use vectorization to train the model by hand.

If we choose BGD (batch gradient descent) or MBGD (mini-batch gradient descent), then the parameter update formula of LR is
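One way to write this update, consistent with the NumPy implementation in this section. The notation is an assumption: θ is the weight vector, α the learning rate, x^(i) the i-th sample as a column vector, y^(i) its label, h_θ(x) = sigmoid(θ^T x), and X the m × n data matrix whose rows are the samples.

```latex
\theta := \theta + \alpha \sum_{i=1}^{m} \left( y^{(i)} - h_\theta(x^{(i)}) \right) x^{(i)}
       = \theta + \alpha
         \begin{bmatrix} x^{(1)} & x^{(2)} & \cdots & x^{(m)} \end{bmatrix}
         \begin{bmatrix}
           y^{(1)} - h_\theta(x^{(1)}) \\
           y^{(2)} - h_\theta(x^{(2)}) \\
           \vdots \\
           y^{(m)} - h_\theta(x^{(m)})
         \end{bmatrix}
       = \theta + \alpha X^{T} (y - h)
```

The matrix in the middle has the samples as its columns, which is exactly X^T.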

As the formula derived above shows, the large matrix in the middle is simply the transpose of the input data matrix, so the implementation is very simple. LR implemented with NumPy is:

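    import numpy as np

    def sigmoid(z):
        # Not defined in the original listing; the standard logistic function is assumed
        return 1.0 / (1.0 + np.exp(-z))

    def grand_ascent(data_train, data_label):
        # np.asmatrix replaces np.mat, which was removed in NumPy 2.0
        dataMatrix = np.asmatrix(data_train)            # m x n input matrix
        labelMat = np.asmatrix(data_label).transpose()  # m x 1 label column
        m, n = np.shape(dataMatrix)
        weights = np.ones((n, 1))
        alpha = 0.001                                   # learning rate
        for i in range(0, 500):                         # 500 full-batch updates
            h = sigmoid(dataMatrix * weights)           # m x 1 predictions
            weights = weights + alpha * dataMatrix.transpose() * (labelMat - h)
        return weights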
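To make the role of the transpose concrete, the sketch below computes one update twice: once with explicit loops over samples and features, and once as a single matrix expression. It uses plain NumPy arrays with the `@` operator instead of matrix objects; the toy data and the names `step_loop` and `step_vectorized` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Standard logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def step_loop(X, y, w, alpha):
    """One update accumulated sample by sample and feature by feature."""
    m, n = X.shape
    grad = np.zeros(n)
    for i in range(m):
        h_i = sigmoid(sum(X[i, j] * w[j] for j in range(n)))
        for j in range(n):
            grad[j] += (y[i] - h_i) * X[i, j]
    return w + alpha * grad


def step_vectorized(X, y, w, alpha):
    """The same update written as one matrix expression."""
    return w + alpha * X.T @ (y - sigmoid(X @ w))


# Hypothetical toy data: 200 samples, 3 features, labels from a linear rule.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
w0 = np.ones(3)

w_loop = step_loop(X, y, w0, alpha=0.001)
w_vec = step_vectorized(X, y, w0, alpha=0.001)
print(np.allclose(w_loop, w_vec))
```

Both functions should agree up to floating-point rounding, so the double loop can be dropped in favor of `X.T @ (y - h)`.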

Recommendation systems and machine learning

ID: RecomAI
