Small knowledge, big challenge! This article is participating in the “Essential Tips for Programmers” creation activity.

A case in point

For example, when looking for a job and sizing up a company, the first thing we care about is salary. If the offer meets our bottom line, or exceeds what we currently earn, we go on to ask whether there are two-day weekends, and whether the five social insurances and one housing fund are paid in full. Each of these questions narrows things down toward a final decision. What we are doing, in effect, is building a decision tree.
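The job-hunting decision above can be sketched as nested conditions. The thresholds and field names here (`salary`, `two_day_weekend`, `full_benefits`) are made up purely for illustration:

```python
def accept_offer(salary: int, two_day_weekend: bool, full_benefits: bool) -> bool:
    """Walk the 'decision tree': each question is one internal node."""
    if salary < 10000:          # below our bottom line -> reject immediately
        return False
    if not two_day_weekend:     # next node: the weekly schedule
        return False
    return full_benefits        # leaf node: the final decision

print(accept_offer(12000, True, True))   # -> True
print(accept_offer(9000, True, True))    # -> False
```

Each `if` is an internal node, each outcome of the test is a branch, and each `return` is a leaf.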

  • Decision tree definition
  • The objective function
  • Overfitting and pruning
  • Advantages and disadvantages of decision trees
  • Ensemble learning

I have wanted to write about decision trees for a while. Because there is a lot of material to cover, I will take my time and keep updating and improving this article until it is finished. I hope you will support it.

The decision tree

We classify data by using features as internal nodes, and the possible values of each feature label the splits. Each value corresponds to a branch, and under each branch there is either another feature node or a leaf node. A leaf node cannot be split any further.

  • Internal nodes correspond to features
  • Edges correspond to feature values
  • Leaf nodes correspond to class labels
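The node/edge/leaf structure above can be sketched as a tiny data structure. The names (`feature`, `children`, `label`) are illustrative, not from any particular library:

```python
class Node:
    """One node of a decision tree."""
    def __init__(self, feature=None, label=None):
        self.feature = feature   # feature tested at this node (None for a leaf)
        self.children = {}       # feature value -> child Node (the edges)
        self.label = label       # class label if this node is a leaf

    def is_leaf(self):
        return self.feature is None

# Build a tiny tree: split on "salary", with a leaf under each branch.
root = Node(feature="salary")
root.children["high"] = Node(label="consider")
root.children["low"] = Node(label="reject")

print(root.children["low"].label)  # -> reject
```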

Decision trees can be used for regression or classification, though mainly classification. Decision trees are a supervised learning method.
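As a quick supervised-classification sketch, assuming scikit-learn is available (its `DecisionTreeClassifier` is the standard decision-tree estimator; `DecisionTreeRegressor` is the regression counterpart). The toy data here is invented for illustration:

```python
from sklearn.tree import DecisionTreeClassifier

# Toy labeled data: feature = [salary in thousands], label = 1 (accept) / 0 (reject).
X = [[8], [9], [15], [20]]
y = [0, 0, 1, 1]

clf = DecisionTreeClassifier()
clf.fit(X, y)               # supervised: the tree learns splits from labeled examples
print(clf.predict([[18]]))  # -> [1]
```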

The objective function of the decision tree

The objective function determines which features are selected for splitting, and the order in which they are chosen.

The information entropy

Information entropy is one of the most common ways to measure the impurity of a sample set: the higher the entropy, the more mixed the classes.


H(X) = -\sum_{i=1}^{n} p_i \log p_i
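The formula can be computed directly from a list of class labels. Using log base 2 (entropy in bits) is an assumption here; the natural log works the same way up to a constant factor:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Information entropy H(X) of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

print(entropy(["yes", "yes", "no", "no"]))  # -> 1.0 (maximally mixed)
print(entropy(["yes"] * 4))                 # entropy 0 for a perfectly pure set
```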

Information gain

For a set containing only two classes of samples, the entropy can be written as follows. The entropy reaches its maximum, i.e., the set is most disordered, when the two classes occur in equal numbers.


H(X) = -p \log p - (1-p) \log (1-p)
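A quick numerical check of the claim, again using log base 2 (with the usual convention that 0 log 0 = 0):

```python
from math import log2

def binary_entropy(p):
    """H(p) = -p log p - (1-p) log(1-p), with 0*log(0) taken as 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

# Entropy peaks at p = 0.5, where the two classes are equally mixed,
# and falls symmetrically toward 0 as the set becomes pure.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p={p}: H={binary_entropy(p):.3f}")
```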