WeChat public account: WeaponZhi. Follow it for AI + Python materials and an introductory machine learning video.

The first kind of machine learning we’re going to look at is supervised learning. So what is supervised learning? To understand it, we need to go back to the way we usually write code, which is hard coding. In short, when facing a problem, we handle every aspect of it and all of its logic by brute force in code, so that the program runs step by step according to our design and finally solves the problem we set out to solve.

For example, suppose we want to write code that determines a person’s gender. We might first extract the characteristics of men and women: men may have a beard, have an Adam’s apple, and generally fall within a certain height range. We can then write conditions based on these attributes and end up with an algorithm. Feed a person’s attributes into the algorithm, and it returns a result telling you what gender that person is likely to be.

This does give us an algorithm, but there are two problems. The first is choosing the attributes. Many attributes may help determine a person’s gender, and if there are too many, writing the rules by hand becomes impractical. The second is that each attribute has only limited influence on the final output, and there will always be exceptions: some men have no beard, and some women are very tall. Faced with such special cases, the algorithm is very likely to get it wrong. Clearly, hard coding alone can never be perfect.
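As a rough sketch of the hard-coded approach just described: the function name `guess_gender`, the three attributes, and the 175 cm threshold are all made up for illustration and do not come from any library.

```python
def guess_gender(has_beard, has_adams_apple, height_cm):
    """Hand-written rules: every condition and threshold is chosen by the programmer."""
    if has_beard or has_adams_apple:
        return "male"
    if height_cm >= 175:  # arbitrary, hypothetical cut-off
        return "male"
    return "female"


# A typical case is handled as intended.
print(guess_gender(has_beard=True, has_adams_apple=True, height_cm=180))

# An exception: a tall woman is misclassified by the height rule.
print(guess_gender(has_beard=False, has_adams_apple=False, height_cm=182))
```

Every new exception means another hand-written condition, which is exactly the maintenance problem described above.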

A supervised learning algorithm works differently. We feed it existing samples, each of which contains specific feature values and a specific output. For example, “1.8 m tall, has a beard, has an Adam’s apple” is a sample’s input, and “this person is a man” is its output. Given such samples with clear inputs and results, the algorithm learns and generalizes on its own. As the number of samples increases, the rules it learns generally become more accurate and mature. Then, when presented with a sample it has never seen before, such as “six feet tall, no beard, has an Adam’s apple”, the algorithm gives you a verdict based on the rules it has learned, telling you what gender the person is likely to be.

Attributes such as beard and Adam’s apple are called features, and outputs such as male or female are called labels. To implement this, we need a classifier. The whole supervised learning process comes down to three steps: collect training data, train a classifier, and make predictions.

This is easy to implement with scikit-learn. If you have Anaconda installed, scikit-learn is already included. Other installation methods are not covered here.

If you’re using Python to learn machine learning, scikit-learn is definitely a library you should use. The scikit-learn library is very conservative: it is highly specialized, it only does machine learning, and it does not do anything outside of machine learning. Moreover, scikit-learn uses widely proven machine learning algorithms, which are often the most efficient and the simplest to implement. Reading scikit-learn’s source code is therefore also a very good way to learn.

scikit-learn website

Let’s use concrete code to demonstrate the three steps described above.

Collect real training data

In a real application, there are many ways to get data. For example, you can read it from existing files, or you can listen for data dynamically and feed it in as it arrives. Let’s not make things that complicated. Continuing the gender example above, we simply make up a few data points:

>>> features = [[160, 30, 2.1], [170, 15, 2.3], [178, 8, 2.5], [188, 10, 2.8], [167, 22, 2.2]]
>>> labels = [0, 0, 1, 1, 0]

Take the first data point, [160, 30, 2.1], as an example: 160 is the height and 30 is the hair length, both in centimeters, and 2.1 is the waist circumference in feet. Forgive me if my imagination is a bit weak and I can’t think of a better example. For now we use height, hair length, and waist circumference as the features for determining gender. You could of course extract more; three just keeps things convenient. The labels list holds the labels: 0 means the person is female and 1 means the person is male.

With that, we have completed the first step: collecting real data.
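Reading data from an existing file was mentioned as one option. A minimal sketch of that, using only the standard library, might look like the following. The CSV column names are hypothetical, and an in-memory string stands in for a real file so the example runs on its own.

```python
import csv
import io

# Stand-in for a file on disk; the column names are hypothetical.
csv_text = """height,hair_length,waist,label
160,30,2.1,0
170,15,2.3,0
178,8,2.5,1
188,10,2.8,1
167,22,2.2,0
"""

features = []
labels = []
for row in csv.DictReader(io.StringIO(csv_text)):
    # Each row becomes one feature list plus one integer label.
    features.append([float(row["height"]), float(row["hair_length"]), float(row["waist"])])
    labels.append(int(row["label"]))

print(features)
print(labels)
```

With a real file, the `io.StringIO(csv_text)` part would be replaced by an opened file object.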

Train the classifier

Now that we have the data, we need to pick a classifier to train on it. We’ll pick a decision tree. Let’s set aside what a decision tree actually is for now; you can think of it as one way of implementing a classifier.

>>> from sklearn import tree
>>> clf = tree.DecisionTreeClassifier()
>>> clf.fit(features, labels)

sklearn is the import name of the scikit-learn library. We import the tree module from the library, create a DecisionTreeClassifier, and use the fit() function to fit it to the features and labels. With that, our classifier clf has completed its training.
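To get a feel for what fit() actually learned, one option is to print the tree's rules with scikit-learn's `export_text` helper. This is only a sketch: the feature names passed in are labels chosen for this example, and `random_state=0` is added here for repeatability; neither is part of the original code.

```python
from sklearn import tree

# Same toy data as above: [height in cm, hair length in cm, waist in feet]
features = [[160, 30, 2.1], [170, 15, 2.3], [178, 8, 2.5], [188, 10, 2.8], [167, 22, 2.2]]
labels = [0, 0, 1, 1, 0]  # 0 = female, 1 = male

# random_state is an addition for repeatability, not part of the original example.
clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(features, labels)

# export_text renders the learned decision rules as plain text.
rules = tree.export_text(clf, feature_names=["height", "hair_length", "waist"])
print(rules)
```

On this tiny data set a single threshold on any one of the three features separates the two classes, so the printed tree contains just one split.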

Make predictions about the data

You can actually treat the classifier as a black box. After it has been trained on some data, the box is able to make judgments, and when you feed it new data, it returns a corresponding prediction. How the box derives its rules from the training data is the job of the algorithm you select. Here we chose a decision tree as the rule-making algorithm for the classifier, but there are many other algorithms you can choose from, and you can even implement your own; both are beyond the scope of this article. Below, we use the trained classifier to predict an unseen data point:

>>> print(clf.predict([[180, 15, 2.3]]))
[0]

We use the predict() function to predict a sample that the classifier has never seen before. The classifier thinks that a person 180 cm tall, with 15 cm of hair and a 2.3-foot waist, is a girl. Well, that’s what it thinks.
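Whether you also get 0 can depend on the run. In this toy data set, height, hair length, and waist each separate the two classes perfectly on their own, and scikit-learn's documentation notes that when several splits are equally good, the one chosen can vary unless `random_state` is fixed. The sketch below collects the predictions across several seeds; the loop over 20 seeds is an arbitrary choice for illustration.

```python
from sklearn import tree

features = [[160, 30, 2.1], [170, 15, 2.3], [178, 8, 2.5], [188, 10, 2.8], [167, 22, 2.2]]
labels = [0, 0, 1, 1, 0]  # 0 = female, 1 = male
sample = [[180, 15, 2.3]]

predictions = set()
for seed in range(20):  # 20 seeds is an arbitrary choice for this sketch
    clf = tree.DecisionTreeClassifier(random_state=seed)
    clf.fit(features, labels)
    # Record which label this particular tree assigns to the unseen sample.
    predictions.add(int(clf.predict(sample)[0]))

print(sorted(predictions))
```

If both 0 and 1 show up, the model is telling you that five samples are not enough to pin down which feature matters; passing a fixed `random_state` makes a single run reproducible, not more correct.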

So that’s your first machine learning program. Simple and fun, isn’t it?


Reference: Machine Learning Recipes with Josh Gordon
