In the previous article, we saw that machine learning is a broad field that covers many kinds of algorithms. These algorithms can be classified, and there is more than one way to classify them, so let’s take a look.
First, let’s look at one way of classifying algorithms:
The first classification divides machine learning algorithms into supervised learning, unsupervised learning, and semi-supervised learning. I outlined these in my previous articles, so I will only say a few words here.
Whatever algorithm we use, machine learning essentially means training on sample data. Whether it is a classification model or a prediction model, we build a model and use it to capture the relationship between X and Y.
In supervised learning, the Y values are given explicitly in the training data. For a classification algorithm, that means each training example already states which category it belongs to; in other words, the training data has been labeled in advance. Take spam detection: we first give the system a batch of emails, each already labeled as spam or not spam. When we analyze this data we already know Y, and we train the parameters of the model against that given Y. Training methods of this kind are collectively called supervised learning.
Typical supervised learning algorithms include classification and regression algorithms, because in both cases Y is given explicitly in the training set. For classification, Y is a category; for regression, Y is a number. The model’s output should be as close to the given Y as possible.
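To make "training the parameters against a given Y" concrete, here is a minimal sketch for the regression case, where Y is a number. The data values and the function name `fit_line` are made up for illustration; it fits a straight line by ordinary least squares, which is only one of many supervised methods.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept from labeled pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Training data: every X comes with its known Y (made-up values, y = 2x + 1).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predict Y for an unseen X
```

On this toy data the fit recovers a slope of 2 and an intercept of 1, so the prediction for X = 5 is 11. A classification model is trained in the same spirit, except that Y is a category instead of a number.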
The opposite of supervised learning is unsupervised learning. Put simply, in unsupervised learning there is no Y; in other words, we do not know what Y looks like. The most common kind of algorithm is clustering, such as the telecom user segmentation mentioned in the previous article. Since we do not know in advance how many kinds of users there are, we can use unsupervised learning to let the machine group them.
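As a rough illustration of grouping data without any Y, here is a small k-means sketch on one-dimensional data. The usage numbers, the function name `kmeans_1d`, and the starting centroids are all assumptions for the example. Note that k-means itself needs the number of groups up front (two here); choosing that number is a separate problem.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on 1-D data: assign to nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            # Index of the centroid closest to this value.
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Made-up monthly call minutes for six users; no labels are provided.
monthly_minutes = [10, 12, 11, 300, 310, 295]
centroids, clusters = kmeans_1d(monthly_minutes, centroids=[0.0, 100.0])
```

With these values the users split into a light-usage group and a heavy-usage group, even though nobody told the algorithm which user belongs where.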
The last category is semi-supervised learning. It is sometimes mentioned together with reinforcement learning, but strictly speaking the two are different things: semi-supervised learning trains on data where only some of the samples have a Y, while reinforcement learning learns by trial and error. A baby learning to walk at home is an easy way to picture learning by trial and error: the baby stumbles at first, but with more practice walks smoothly and steadily, and eventually even breaks into a trot. Semi-supervised learning shares the part of this picture where the model improves as it sees more data: only some Y values are available, and the model may not do well at first, but as you train on more samples, it can get better.
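One simple way to use data where only some samples have a Y is self-training: fit on the few labeled samples, let the model label the rest, then fit again on everything. The sketch below assumes that approach with a nearest-mean classifier on made-up one-dimensional data; the helper names are hypothetical, and this is only one of several semi-supervised techniques.

```python
def class_means(pairs):
    """Compute the mean x of each class from (x, label) pairs."""
    means = {}
    for label in {y for _, y in pairs}:
        points = [x for x, y in pairs if y == label]
        means[label] = sum(points) / len(points)
    return means

def nearest_label(means, x):
    """Pick the class whose mean is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))

labeled = [(1.0, "low"), (9.0, "high")]   # the few samples that have a Y
unlabeled = [1.5, 2.0, 2.5, 8.0, 8.5]     # samples without a Y

initial = class_means(labeled)
# Let the first model guess labels for the unlabeled samples (pseudo-labels).
pseudo = [(x, nearest_label(initial, x)) for x in unlabeled]
# Retrain on the labeled and pseudo-labeled samples together.
refined = class_means(labeled + pseudo)
```

After the second pass the class means move from 1.0 and 9.0 to 1.75 and 8.5, closer to where the bulk of the data actually sits.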
OK, that is the first way of classifying algorithms: three categories, based on whether or not the data has a Y. Now let’s look at the second way:
The second classification is based on the actual problem we want to solve: classification; regression, where we want to predict a Y value; clustering; and finally labeling (also called annotation). Labeling is somewhat similar to classification, but there is a small difference. So what is labeling?
Let me write a sentence.
This sentence can be labeled word by word:
— — — — — — — — — — — — — — — –
I’m working hard on machine learning
— — — — — — — — — — — — — — — –
Suppose I want to know which word in this sentence is a noun, which is an adjective, and which is a verb. That is essentially labeling: assigning a label to every word in the sentence, rather than a single category to the sentence as a whole.
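The shape of a labeling result can be sketched with a toy lookup table. The lexicon below is hand-written and hypothetical, and the tag names are just one possible convention; a real tagger would learn its labels from annotated text and use the surrounding words as context.

```python
# Hypothetical hand-written lexicon, only large enough for the example sentence.
LEXICON = {
    "i'm": "PRON+VERB",
    "working": "VERB",
    "hard": "ADV",
    "on": "PREP",
    "machine": "NOUN",
    "learning": "NOUN",
}

def tag_words(sentence):
    """Return one (word, label) pair per word; unknown words get 'UNK'."""
    return [(word, LEXICON.get(word.lower(), "UNK")) for word in sentence.split()]

tagged = tag_words("I'm working hard on machine learning")
```

The output has one label per word, which is what separates labeling from ordinary classification, where the whole input would get a single label.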
That is the second way of classifying algorithms, which is based on the actual problem to be solved.
Now let’s look at the third way:
Generative models and discriminative models
The third classification is very important, because it goes straight to the essence of an algorithm, so it is worth studying no matter what. It is also likely to trip you up in an interview. The interviewer probably will not ask for the concepts directly, such as what a generative model is or what a discriminative model is. Instead, they might ask what distinguishes an algorithm that is a generative model from other algorithms. For example, what is the essential difference between logistic regression and naive Bayes? Answering that really comes down to the essential difference between generative and discriminative models.
That is why I say the third classification is the more important one. The first two matter as well, but they are simpler, and in an interview the interviewer will assume you already know them. The third is more advanced knowledge. For an introductory article, though, I think it is a little early to go into the difference between generative and discriminative models. Many of the concepts involved are still fairly hard to understand, so I will mention them only briefly and not go into depth.
So let’s look at the two kinds of model. Both generative and discriminative models are mainly used for classification problems, that is, dividing things into categories such as A, B, and C.
A discriminative model is like a function: you pass your data to the function, and it tells you which category the data belongs to. It learns the boundary between the categories directly, so it points straight at your goal.
A generative model is a bit like the jury in a courtroom, while the discriminative model is more like the judge. Instead of going straight to a verdict, it first learns what the data of each category looks like, and then asks how likely your data is under each category. For example, with three categories A, B, and C, a generative model tells you how probable A is, how probable B is, and how probable C is. The category with the highest probability is the one your data most likely belongs to; the others are less likely.
From the above we can see that the two kinds of model reach their results in different ways. Be careful not to judge them by the output alone, though: a discriminative model such as logistic regression can also report probabilities, and a generative model such as naive Bayes still ends up picking a single category. The more important difference is the training idea: a discriminative model learns only how to separate the categories, while a generative model learns what each category looks like.
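The difference in training ideas can be sketched on a toy problem with one feature and two categories. The data, function names, learning rate, and step count below are all assumptions for illustration. The generative side is a one-feature Gaussian naive Bayes: it learns how each class produces its data, P(x | y) and P(y), and only then applies Bayes' rule. The discriminative side is logistic regression: it learns P(y | x) directly by gradient descent and never models what the data of each class looks like.

```python
import math

# Made-up training data: one feature x, label y in {0, 1}.
xs = [1.0, 1.5, 2.0, 4.0, 4.5, 5.0]
ys = [0, 0, 0, 1, 1, 1]

def fit_generative(xs, ys):
    """Learn a Gaussian P(x | y) per class plus the class prior P(y)."""
    params = {}
    for label in set(ys):
        points = [x for x, y in zip(xs, ys) if y == label]
        mean = sum(points) / len(points)
        var = sum((p - mean) ** 2 for p in points) / len(points)
        params[label] = (mean, var, len(points) / len(xs))
    return params

def predict_generative(params, x):
    """Bayes' rule: P(y | x) is proportional to P(x | y) * P(y)."""
    joint = {}
    for label, (mean, var, prior) in params.items():
        density = math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
        joint[label] = density * prior
    total = sum(joint.values())
    return {label: value / total for label, value in joint.items()}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_discriminative(xs, ys, lr=0.1, steps=5000):
    """Logistic regression: fit P(y=1 | x) = sigmoid(w*x + b) by gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / len(xs)
        b -= lr * grad_b / len(xs)
    return w, b

def predict_discriminative(model, x):
    w, b = model
    p1 = sigmoid(w * x + b)
    return {0: 1.0 - p1, 1: p1}

gen_model = fit_generative(xs, ys)
disc_model = fit_discriminative(xs, ys)
gen_probs = predict_generative(gen_model, 4.2)
disc_probs = predict_discriminative(disc_model, 4.2)
```

Both models assign the new point 4.2 to category 1 here. What differs is what each one stored during training: the generative model keeps a mean, a variance, and a prior for every class, while the discriminative model keeps only the two numbers `w` and `b` that define the boundary.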
That covers the ways of classifying algorithms discussed in this article. In the next article, I will continue with machine learning algorithms. I hope you can give me more advice; I will keep working hard and learning.