
Machine learning

Machine learning, also known as statistical learning, is an important branch of artificial intelligence. It discovers regularities in data through analysis and applies those regularities to predict or make judgments about unknown data.

For example, suppose you want to implement a program that recognizes cats. Traditionally, we would have to give the computer a series of explicit rules, such as "a cat has furry hair" and "a cat has a pair of triangular ears," and the computer would then match images against those rules. But how should such a program react if we show it a picture of a tiger? What's more, writing every rule by hand inevitably runs into concepts that are hard to pin down, such as what exactly "furry" means. It is better, therefore, to let the machine learn by itself.

We can provide the computer with many pictures of cats, and the system will examine them in its own way. As training continues, the system keeps learning and updating, eventually becoming able to determine accurately which pictures show cats and which do not.

Machine learning is commonly divided into supervised learning, unsupervised learning, and semi-supervised learning.

Supervised learning

Supervised learning learns a function (i.e., model parameters) from a given labeled training data set and uses it to predict results when new data arrives. It depends heavily on the label information provided by a pre-defined classification scheme.

Supervised learning includes:

  • KNN (k-nearest neighbors) — Hello, KNN! and the follow-up that unlocks the kd tree
  • Naive Bayes — Here we go, Naive Bayes taps you
  • Decision trees — Decision trees aren't that complicated
  • Support vector machines
  • Perceptron
  • Ensemble learning
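
As a concrete illustration of the fit/predict workflow described above, here is a minimal sketch using k-nearest neighbors, one of the methods listed. The choice of scikit-learn and the iris data set is illustrative only.

```python
# A minimal supervised-learning sketch, assuming scikit-learn is installed.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)            # features and human-provided labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = KNeighborsClassifier(n_neighbors=5)  # KNN, from the list above
model.fit(X_train, y_train)                  # learn a function from labeled examples
print("accuracy:", model.score(X_test, y_test))  # predict on new, unseen data
```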

Unsupervised learning

The most important difference between supervised and unsupervised models is whether manually provided labels are required during training.

The input data for unsupervised learning carries no labels, and there is no known outcome. Because the categories of the samples are unknown, the samples must be grouped by their similarity (clustering), aiming to minimize the differences within each cluster and maximize the differences between clusters. In many practical applications the samples' labels simply cannot be known in advance; that is, there are no labeled training samples, so the classifier has to be learned from a data set that was never labeled in the first place.

The goal of unsupervised learning is not to tell the computer what to do, but to let it learn how to do things.
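
To make the clustering idea concrete, here is a minimal sketch that groups unlabeled points by similarity. The use of scikit-learn, synthetic data, and k-means in particular is an illustrative assumption, not something fixed by unsupervised learning itself.

```python
# A minimal clustering sketch, assuming scikit-learn; labels are never used for training.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)  # ignore the true labels

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)   # group samples purely by similarity
print(cluster_ids[:10])               # cluster indices, not human-assigned categories
```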

You may also have heard of self-supervised learning 👇

Self-supervised learning mainly uses auxiliary (pretext) tasks to mine supervision signals from large-scale unlabeled data, and trains the network with this constructed supervision so that it learns representations that are valuable for downstream tasks. (In other words, the supervision in self-supervised learning is not manually annotated; the algorithm automatically constructs it from large-scale unlabeled data and then performs supervised training on it. Much of the time this is loosely called unsupervised pretraining or unsupervised learning, but strictly speaking it should be called self-supervised learning.)
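
As a toy illustration of how supervision can be constructed automatically, the sketch below uses a rotation-prediction pretext task: the "label" is simply how much each image was rotated, so no manual annotation is involved. The digits data set and the logistic-regression model are illustrative assumptions.

```python
# A toy self-supervised sketch: pretext labels are generated from unlabeled images.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

images = load_digits().images              # treat these as unlabeled images
X, y = [], []
for img in images:
    for k in range(4):                     # 0, 90, 180, 270 degree rotations
        X.append(np.rot90(img, k).ravel()) # input: the rotated image
        y.append(k)                        # constructed label: the rotation applied

pretext_model = LogisticRegression(max_iter=2000)
pretext_model.fit(np.array(X), np.array(y))   # "supervised" training on constructed labels
print("pretext accuracy:", pretext_model.score(np.array(X), np.array(y)))
```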

Semi-supervised learning

Semi-supervised learning is a technique that sits between supervised and unsupervised learning: it learns from labeled and unlabeled samples at the same time. Its data set can be split into two parts, a labeled set and an unlabeled set whose class labels are unknown. The usual assumption is that there is far more unlabeled data than labeled data.

Semi-supervised learning mainly considers how to use a small number of labeled samples and a large number of unlabeled samples for training and classification.
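
A minimal sketch of this setting follows, assuming scikit-learn: most labels are hidden (marked with -1), and label spreading propagates the few known labels to the unlabeled points. The iris data set and the 90% masking rate are illustrative choices.

```python
# A minimal semi-supervised sketch: -1 marks samples whose label is unknown.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelSpreading

X, y = load_iris(return_X_y=True)
rng = np.random.RandomState(0)
y_partial = y.copy()
hidden = rng.rand(len(y)) < 0.9      # hide roughly 90% of the labels
y_partial[hidden] = -1               # -1 means "no label available"

model = LabelSpreading()
model.fit(X, y_partial)              # learns from labeled and unlabeled points together
print("accuracy on hidden labels:",
      (model.transduction_[hidden] == y[hidden]).mean())
```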

Deep learning

Deep learning is an important branch of machine learning and an important extension of traditional neural networks.

Deep learning can be understood as a neural network with multiple hidden layers. The "deep" in deep learning generally refers to the number of layers in the network: a shallow network has a single hidden layer, while a deep network has more than one. Simple data features are passed from one layer to the next and represented by the mapping each layer applies, and the layered structure of a deep neural network can express more complex mappings and therefore represent more complex features. The problems faced by artificial intelligence often involve exactly this kind of complex, multi-level structure in the data, which is why deep learning has effectively solved a number of difficult problems across many fields.
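
To make "depth equals the number of hidden layers" concrete, here is a minimal sketch comparing a shallow and a deeper network. The scikit-learn MLPClassifier, the digits data set, and the layer widths are illustrative assumptions only.

```python
# A minimal sketch of shallow vs. deep networks, assuming scikit-learn.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

shallow = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)
deep = MLPClassifier(hidden_layer_sizes=(128, 64, 32), max_iter=1000, random_state=0)

for name, net in [("shallow (1 hidden layer)", shallow), ("deep (3 hidden layers)", deep)]:
    net.fit(X_train, y_train)        # each layer maps the previous layer's output
    print(name, "accuracy:", net.score(X_test, y_test))
```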

To improve the training of deep neural networks, the way neurons are connected and the activation functions they use have been adjusted. The aim is to build a neural network that simulates the human brain for analysis and learning, interpreting data such as text, images, and sound by imitating the brain's mechanisms.
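
As a small illustration of what "connections plus an activation function" means for a single layer, here is a NumPy-only sketch; the shapes, random weights, and the choice of ReLU are illustrative assumptions.

```python
# One neural-network layer: a weighted connection followed by an activation (ReLU).
import numpy as np

def relu(z):
    return np.maximum(0, z)          # a widely used activation function

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))          # one input sample with 4 features
W = rng.normal(size=(4, 3))          # connection weights: 4 inputs -> 3 neurons
b = np.zeros(3)                      # biases

hidden = relu(x @ W + b)             # this layer's output, fed to the next layer
print(hidden)
```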

References

Self-supervised Learning | (1) An introduction to self-supervised learning

A review of semi-supervised learning