Artificial intelligence is not a new term; it has been around for decades, starting around the early ’80s, when computer scientists began designing algorithms that could learn and mimic human behavior.

On the algorithm side, the most important was the neural network, which was not very successful at the time because of over-fitting (the model was too powerful for the limited data available). Still, the idea of using data to adapt functionality to more specific tasks has been remarkably successful, and it forms the basis of today’s machine learning.

On the imitation side, AI focused on image recognition, speech recognition and natural language processing. AI experts spent a lot of time hand-crafting things like edge detectors, color profiles, N-grams, syntax trees, and so on. However, these advances were not enough to meet our needs.

Traditional machine learning:

Machine learning (ML) technology plays an important role in prediction. ML has developed over multiple generations and now offers a rich set of model structures, such as:

1. Linear regression.

2. Logistic regression.

3. Decision trees.

4. Support vector machines.

5. Bayesian models.

6. Regularized models.

7. Ensemble models.

8. Neural networks.

Each of these prediction models is based on a specific algorithmic structure, with adjustable parameters. Training a prediction model involves the following steps (a minimal code sketch follows the list):

1. Select a model structure (e.g. logistic regression, random forest, etc.).

2. Feed the model with training data (both inputs and outputs).

3. The learning algorithm outputs the optimal model (i.e. the model with the specific parameters that minimize the training error).
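As a minimal sketch of these three steps (my own illustration, not from the original article, assuming scikit-learn and synthetic data):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Step 1: select a model structure (here, logistic regression).
model = LogisticRegression(max_iter=1000)

# Step 2: feed the model with training data (inputs X and outputs y).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step 3: the learning algorithm finds the parameters that minimize the
# training error; we then check how well the fitted model predicts.
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```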

Each model has its own idiosyncrasies, performing well on some tasks and poorly on others. In general, though, we can divide them into low-power (simple) models and high-power (complex) models, and choosing between them is a tricky problem.

Traditionally, using a low-power/simple model was preferable to using a high-power/complex model, for the following reasons:

  • Until we have massive processing power, it takes a long time to train a high-power model.

  • Until we have a massive amount of data, training a high-power model causes over-fitting (because a high-power model has abundant parameters and can adapt to a wide variety of data shapes, we may end up with a model that fits the particular training data very well rather than one that generalizes well enough to predict future data). A minimal sketch of this effect follows the list.
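To make the over-fitting point concrete, here is a small illustrative sketch (my own, not the author's), assuming scikit-learn and synthetic data: a high-power degree-15 polynomial trained on only 15 noisy points typically fits the training data almost perfectly but predicts held-out points poorly, while a simpler degree-3 model generalizes better.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(0)
X_train = np.sort(rng.uniform(-1, 1, 15)).reshape(-1, 1)       # only 15 noisy samples
y_train = np.sin(3 * X_train).ravel() + rng.normal(scale=0.1, size=15)
X_test = np.linspace(-1, 1, 200).reshape(-1, 1)                # dense "future" data
y_test = np.sin(3 * X_test).ravel()

for degree in (3, 15):   # low-power vs. high-power model
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(f"degree {degree}:",
          "train R^2 =", round(model.score(X_train, y_train), 3),
          "test R^2 =", round(model.score(X_test, y_test), 3))
```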

However, choosing a low-power model runs into the so-called “under-fitting” problem, where the model structure is too simple to fit the training data when the data itself is complex. (Imagine the underlying data has a quadratic relationship y = 5 * x^2; no matter what a and b we choose, the linear regression y = a * x + b cannot fit it.)

To mitigate the under-fitting problem, data scientists often use their domain knowledge to come up with “input features” that are more directly related to the output. (Returning to the quadratic relationship y = 5 * x^2: if you create a feature z = x^2, you can fit a linear regression y = a * z + b by choosing a = 5 and b = 0.) A short sketch of this trick follows.
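A quick sketch of this feature-engineering trick (again my own illustration, assuming scikit-learn): plain linear regression cannot fit y = 5 * x^2 from the raw input x, but once we hand-craft the feature z = x^2 it recovers a ≈ 5 and b ≈ 0.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.linspace(-3, 3, 50).reshape(-1, 1)
y = 5 * x.ravel() ** 2                      # the true quadratic relationship

raw = LinearRegression().fit(x, y)
print("R^2 on raw x:", round(raw.score(x, y), 3))            # under-fits (near 0)

z = x ** 2                                  # the hand-crafted feature
engineered = LinearRegression().fit(z, y)
print("R^2 on feature z:", round(engineered.score(z, y), 3))  # fits perfectly
print("learned a, b:", engineered.coef_[0], engineered.intercept_)  # ~5, ~0
```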

The main obstacle in traditional machine learning is this feature engineering step, which requires domain experts to identify the important features before the training process starts. Feature engineering is manual and demands a lot of domain expertise, making it the major bottleneck of most machine learning tasks today.

In other words, if we don’t have enough processing power and enough data, then we have to use a lower power/simpler model, which requires us to spend a lot of time and effort to create the right input features. This is where most data scientists spend their time today.

The return of neural networks:

In the era of big data, the co-evolution of cloud computing and massively parallel processing infrastructure brought a huge increase in machine processing power in the early 21st century. We are no longer limited to low-power/simple models. For example, two of today’s most popular mainstream machine learning models are random forests and gradient boosted trees. Both are very powerful and can fit nonlinear models to the training data, but data scientists still need to carefully create features to achieve good performance.
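As an illustrative sketch (assuming scikit-learn and a synthetic nonlinear dataset, neither of which comes from the original article), both models can fit a nonlinear relationship out of the box:

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic regression data with a known nonlinear relationship.
X, y = make_friedman1(n_samples=2000, noise=0.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in (RandomForestRegressor(n_estimators=200, random_state=0),
              GradientBoostingRegressor(random_state=0)):
    model.fit(X_train, y_train)
    print(type(model).__name__, "test R^2:", round(model.score(X_test, y_test), 3))
```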

At the same time, computer scientists revisited the use of many layers of neural networks for these human-imitation tasks. This breathed new life into DNNs (deep neural networks) and delivered major breakthroughs in image classification and speech recognition. The main difference with a DNN is that you can feed raw signals (such as RGB pixel values) directly into it without creating any domain-specific input features. Through its multiple layers of neurons (which is why it is called a “deep” neural network), a DNN can “automatically” generate the appropriate features layer by layer and ultimately provide very good predictions. This largely eliminates the effort of feature engineering, which data scientists are happy to see.
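A minimal sketch of this idea, assuming TensorFlow/Keras and the MNIST digits (my choice for illustration, not the author's): the raw 28x28 pixel values go straight into a stack of dense layers, with no hand-crafted features.

```python
import tensorflow as tf

# Raw pixel values, only rescaled to [0, 1]; no edge detection, no hand-crafted features.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),                   # raw 28x28 pixel input
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),    # hidden layers learn their own features
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),  # 10 digit classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=3, validation_data=(x_test, y_test))
```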

DNNs have also evolved into many different network topologies: CNNs (convolutional neural networks), RNNs (recurrent neural networks), LSTMs (long short-term memory), GANs (generative adversarial networks), transfer learning, and so on. All of this is collectively known as deep learning, and it is catching the attention of the entire machine learning community.

Reinforcement learning:

Another key component is how to mimic the way a person (or animal) learns. Consider the very natural perception/action/reward cycle of animal behavior: a person or animal first learns about the environment by sensing what “state” he or she is in. Based on that, he or she chooses an “action” that brings him or her to another “state”, then receives a “reward”, and the cycle repeats. This learning style (called reinforcement learning) is very different from the curve-fitting approach of traditional supervised machine learning. In particular, reinforcement learning learns very quickly, because every new piece of feedback (such as performing an action and receiving a reward) is immediately used to influence subsequent decisions.

Reinforcement learning also provides a smooth integration of prediction and optimization: it maintains a belief about the current state and the probabilities of transitions under different actions, and then decides which action leads to the best outcome.
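To make the state/action/reward loop concrete, here is a tiny tabular Q-learning sketch (my own toy example, not from the article): an agent in a 5-cell corridor learns, from immediate reward feedback, that moving right leads to the goal.

```python
import random

N_STATES = 5                              # cells 0..4; reward for reaching cell 4
alpha, gamma, epsilon = 0.5, 0.9, 0.1     # learning rate, discount, exploration rate
Q = [[0.0, 0.0] for _ in range(N_STATES)] # Q[state][action]; actions: 0 = left, 1 = right

def choose_action(state):
    """Epsilon-greedy: explore sometimes, otherwise take the best-known action."""
    if random.random() < epsilon or Q[state][0] == Q[state][1]:
        return random.randrange(2)
    return 0 if Q[state][0] > Q[state][1] else 1

for episode in range(200):
    state = 0
    while state != N_STATES - 1:
        a = choose_action(state)                      # perceive the state, pick an action
        next_state = max(0, min(N_STATES - 1, state + (1 if a else -1)))
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # The feedback (reward) is used immediately to update the value of (state, action).
        Q[state][a] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][a])
        state = next_state

print([[round(v, 2) for v in row] for row in Q])      # "right" ends up valued highest in every cell
```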

Deep learning + reinforcement learning = AI

Deep learning provides more powerful prediction models than classical ML techniques and generally produces better predictions. Compared with classical optimization models, reinforcement learning provides a faster learning mechanism and is more adaptable to changes in the environment.

Original title: ‘How AI Differs from ML’

Author: Ricky Ho


This article is an abridged translation; for more details, please see the original article.

