Introduction to Machine Learning -02. Basic Concepts

This is the second day of my participation in the Gwen Challenge.

We have already set up the basic environment, so today we look at some basic concepts: the types of machine learning and the terms used to describe data.

Types of machine learning

First, let’s look at the mainstream types of machine learning: supervised learning, unsupervised learning, reinforcement learning, and deep learning.

Supervised learning

Supervised learning means providing labeled data, that is, input data together with the expected output for each input. The algorithm trains the model repeatedly against the labeled outputs until the model's predictions are close to the expected ones.
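As a minimal sketch of this idea, the snippet below fits a straight line to a handful of labeled pairs and then predicts the output for an unseen input. The numbers and the name fit_line are made up for illustration, and the closed-form least-squares fit stands in for the repeated training described above; it is one simple way to do it, not the only one.

```python
# Labeled data: each input comes with its expected output (hypothetical values).
inputs = [1.0, 2.0, 3.0, 4.0]
expected = [3.0, 5.0, 7.0, 9.0]  # these happen to follow y = 2x + 1


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": learn the parameters from the labeled examples.
slope, intercept = fit_line(inputs, expected)

# "Prediction": apply the learned model to an input it has not seen.
prediction = slope * 5.0 + intercept
```

Because the labels here lie exactly on a line, the fitted model reproduces them; with real data the predictions would only be close to the expected outputs.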

Unsupervised learning

Unsupervised learning means the data provided is unlabeled, so the machine has to explore the unlabeled data and derive potential connections from it on its own.
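One common way to find structure in unlabeled data is clustering. The sketch below groups a few one-dimensional values with k-means; the values, the starting centers, and the name kmeans_1d are assumptions chosen for illustration, and k-means is only one of many unsupervised techniques.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group 1-D values around the given starting centers (k-means)."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# No labels are given; the algorithm discovers the two groups by itself.
ages = [22, 24, 25, 41, 44, 46]  # hypothetical unlabeled values
centers, clusters = kmeans_1d(ages, centers=[20.0, 50.0])
```

Nothing tells the algorithm which group a value belongs to; the grouping comes only from how close the values are to each other.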

Reinforcement learning

Reinforcement learning is a style of learning built on an incentive mechanism: if the machine acts correctly it receives a positive incentive (a reward), and if it acts incorrectly it receives a negative incentive (a penalty). The machine's goal in such a scenario is to maximize the total incentive it obtains.
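The incentive mechanism can be sketched with a toy example in which the machine chooses between two actions and learns from the rewards it receives. The environment (REWARDS), the parameter values, and the epsilon-greedy strategy below are all assumptions made for illustration; real reinforcement learning problems also involve states and delayed rewards.

```python
import random

# Hypothetical environment: "right" is the correct action, "left" is not.
REWARDS = {"left": -1.0, "right": 1.0}


def train(steps=50, epsilon=0.2, learning_rate=0.5, seed=0):
    """Learn action values from rewards with an epsilon-greedy policy."""
    rng = random.Random(seed)
    values = {action: 0.0 for action in REWARDS}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(list(values))  # explore: try a random action
        else:
            action = max(values, key=values.get)  # exploit: best action so far
        reward = REWARDS[action]
        # Nudge the estimate for this action toward the observed reward.
        values[action] += learning_rate * (reward - values[action])
    return values


values = train()
best_action = max(values, key=values.get)
```

After training, the action with the highest estimated value is the one that earned positive incentives, so the machine ends up preferring it.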

Deep learning

Deep learning is a family of algorithms derived from neural networks. It uses artificial neural networks as the framework for representation learning on data.

Data and data sets

Machine learning requires data sets, so let’s take a look at the following table:

No.	Country	Gender	Age	Income
1	China	male	24	3500
2	China	female	44	12500
3	The United States	male	28	25000
4	Japan	male	34	18000
5	China	male	–	17500

In the data above, we call the entire table a dataset, each row a sample, each column a feature, and a specific value in a column an attribute value. The table may also contain blank cells. For example, the age in row 5 is blank; this is called missing data.
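These terms map naturally onto a simple data structure. The sketch below stores the table as a list of dictionaries and looks for missing data; the key names and the helper find_missing are illustrative choices, and None is used here to stand for a blank cell.

```python
# The dataset: one dictionary per sample (row), one key per column.
# None marks the blank age in row 5.
dataset = [
    {"no": 1, "country": "China", "gender": "male", "age": 24, "income": 3500},
    {"no": 2, "country": "China", "gender": "female", "age": 44, "income": 12500},
    {"no": 3, "country": "The United States", "gender": "male", "age": 28, "income": 25000},
    {"no": 4, "country": "Japan", "gender": "male", "age": 34, "income": 18000},
    {"no": 5, "country": "China", "gender": "male", "age": None, "income": 17500},
]


def find_missing(rows):
    """Return (row number, column name) pairs whose value is missing."""
    return [
        (row["no"], name)
        for row in rows
        for name, value in row.items()
        if value is None
    ]


sample = dataset[0]              # one row is a sample
attribute_value = sample["age"]  # one value in a column is an attribute value
missing = find_missing(dataset)  # [(5, "age")]
```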

Given the table above, we often want to infer a person's income from their country, gender, and age, so we can divide the table into two tables:

No.	Country	Gender	Age
1	China	male	24
2	China	female	44
3	The United States	male	28
4	Japan	male	34
5	China	male	–

No.	Income
1	3500
2	12500
3	25000
4	18000
5	17500

We expect to be able to infer the second table from the first. We call the data in the first table the independent variables and the data in the second table the dependent variable.
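In code, this division is just a matter of picking columns. The sketch below uses plain tuples for the rows; the names X and y follow a common convention for independent and dependent variables, but they are a choice made here, not something the article prescribes.

```python
# Each row: (no, country, gender, age, income); None marks the missing age.
rows = [
    (1, "China", "male", 24, 3500),
    (2, "China", "female", 44, 12500),
    (3, "The United States", "male", 28, 25000),
    (4, "Japan", "male", 34, 18000),
    (5, "China", "male", None, 17500),
]

# Independent variables: country, gender, and age for every sample.
X = [row[1:4] for row in rows]

# Dependent variable: the income we want to infer.
y = [row[4] for row in rows]
```

Row i of X and entry i of y describe the same person, so the two lists must always be kept in the same order.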

In practice, we also need to divide the data into two parts: one for training the model and the other for testing whether the model we generate is accurate. So we can divide the data as follows:

No.	Country	Gender	Age
1	China	male	24
2	China	female	44
3	The United States	male	28

No.	Country	Gender	Age
4	Japan	male	34
5	China	male	–

The first table, which we use to train the model, is called the training set, and the second table is called the test set.
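The same split can be sketched in a few lines. The helper name split_in_order and the choice of three training rows simply mirror the tables above; they are illustrative, not a recommended ratio.

```python
# Each row: (no, country, gender, age, income); None marks the missing age.
rows = [
    (1, "China", "male", 24, 3500),
    (2, "China", "female", 44, 12500),
    (3, "The United States", "male", 28, 25000),
    (4, "Japan", "male", 34, 18000),
    (5, "China", "male", None, 17500),
]


def split_in_order(samples, train_size):
    """Return the first train_size samples and the remaining samples."""
    return samples[:train_size], samples[train_size:]


# Rows 1-3 train the model; rows 4-5 are held back to test it.
train_set, test_set = split_in_order(rows, train_size=3)
```

Splitting in the original order keeps the example identical to the tables. In practice the rows are usually shuffled first so that both sets are representative of the whole dataset.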

Next we’ll talk about data preprocessing, another necessary step before machine learning.
