Translated/Ali Tao F(X) Team – Cheng Yue

Want to know how Deep Learning works? Here’s a quick guide for everyone.

Artificial intelligence (AI) and machine learning (ML) are some of the hottest topics right now.

The term “artificial intelligence” is thrown around almost every day. You’ll hear good developers say they want to learn about AI. You’ll also hear managers say they want to implement AI in their services. But a lot of the time, these people don’t understand what artificial intelligence is. After reading this article, you will understand the basics of artificial intelligence (AI) and machine learning (ML), and more importantly, you will understand how the most popular type of machine learning, deep learning, works. (This guide is intended for everyone and therefore does not cover advanced mathematics.)

Background

The first step in understanding deep learning is to understand the differences between some of the technical terms.

Artificial intelligence vs. machine learning

Artificial intelligence is the replication of human intelligence in a computer. When AI research first began, researchers tried to replicate human intelligence for specific tasks, like playing games. They introduced a lot of rules that the computer had to follow. The computer had a specific list of possible actions and made decisions based on those rules.

Machine learning refers to the ability of machines to learn using large data sets rather than hard-coded rules. Machine learning allows computers to learn on their own. This type of learning takes advantage of the processing power of modern computers and can easily process large data sets.

Supervised learning vs. unsupervised learning

Supervised learning uses training data in which every input is paired with the expected output. When you train an AI using supervised learning, you give it an input and tell it the expected output. If the AI’s output is wrong, it readjusts its calculations. This process is repeated over the data set until the AI’s errors are as small as we can make them. One example of supervised learning is a weather-forecasting AI. It learns to predict the weather from historical data. The training data has inputs (pressure, humidity, wind speed) and an output (temperature).

Unsupervised learning is machine learning on data sets that have no labels or predefined structure. When you train an AI using unsupervised learning, you let the AI group the data logically on its own. An example of unsupervised learning is a behavior-predicting AI for an e-commerce site. It does not learn from labelled inputs and outputs. Instead, it creates its own classification of the input data. It can then tell you which kinds of users are most likely to buy which products.
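To make the idea of an AI creating its own groups more concrete, here is a minimal sketch. It assumes a made-up list of monthly spending figures for eight users and uses k-means, which is just one of many clustering methods (this article does not name a specific one). The function name, the data and the starting centers are all invented for illustration.

```python
def k_means_1d(values, centers, iterations=10):
    """Assign each number to its nearest center, then move each center to its group's mean."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            # Index of the center closest to this value.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (leave it alone if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical monthly spend per user. Note that there are no labels.
monthly_spend = [12, 15, 14, 210, 190, 205, 11, 198]
centers, groups = k_means_1d(monthly_spend, centers=[0, 100])

print(groups)   # [[12, 15, 14, 11], [210, 190, 205, 198]]
print(centers)  # [13.0, 200.75]
```

Nobody told the program which users are "low spenders" or "high spenders". It found the two groups by itself, which is the essence of unsupervised learning.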

So how does deep learning work?

You are now ready to understand what deep learning is and how it works. Deep learning is a machine learning approach. It allows us to train the AI to predict the output, given a set of inputs. Both supervised and unsupervised learning can be used to train artificial intelligence.

We’ll see how deep learning works by building a hypothetical airfare estimation service. We will use supervised learning to train it. We want our airfare estimator to predict the price from the following inputs (for simplicity, we exclude return tickets):

  • Origin airport
  • Destination airport
  • Departure date
  • Airline

The neural network

Let’s look inside the brain of our AI. Like animals, our estimator AI has neurons in its brain. In diagrams, they are usually drawn as circles. These neurons are connected to each other.


Neurons are divided into three distinct layers:

  • Input layer
  • Hidden layer
  • Output layer

The input layer receives input data. In our example, the input layer has four neurons: origin airport, destination airport, departure date, and airline. The input layer passes the inputs to the first hidden layer. The hidden layers perform mathematical calculations on our inputs. One of the challenges in creating a neural network is deciding how many hidden layers to use and how many neurons to put in each layer. The “deep” in deep learning refers to having more than one hidden layer. The output layer returns the output data. In our case, it gives us the price prediction.


So how does it calculate the price prediction? This is where the magic of deep learning begins. Each connection between neurons is associated with a weight. This weight determines the importance of the input value. The initial weights are set randomly. When predicting the price of a plane ticket, the departure date is one of the most important factors, so after training, the connections from the departure date neuron will carry large weights.
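The role of the weights can be sketched in a few lines. The example below assumes the four inputs have already been turned into numbers (how that is done is outside the scope of this guide), and every value in it is invented for illustration. A neuron multiplies each input by the weight of its connection and adds the results up.

```python
import random

def weighted_sum(inputs, weights):
    """Multiply each input by the weight of its connection and add the results."""
    return sum(x * w for x, w in zip(inputs, weights))

# Hypothetical numeric inputs: origin, destination, departure date, airline.
inputs = [0.2, 0.7, 0.9, 0.4]

# Before training, the weights are random.
random.seed(0)
initial_weights = [random.uniform(-1, 1) for _ in inputs]

# After training, the departure date (third input) might carry the largest weight.
# These numbers are made up, not the result of real training.
trained_weights = [0.1, 0.3, 2.0, 0.2]

print(weighted_sum(inputs, initial_weights))
print(weighted_sum(inputs, trained_weights))  # roughly 2.11, most of it from the date
```

With the made-up trained weights, the departure date contributes 0.9 × 2.0 = 1.8 of the total, far more than any other input.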


Every neuron has an activation function. These functions are difficult to understand without mathematical reasoning. Simply put, one of their purposes is to “standardize” the output of the neuron. Once a set of input data has passed through all the layers of the neural network, the network returns the output data through the output layer. Nothing complicated, right?
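For readers who want to see the whole journey from input to output, here is a minimal sketch of a network with four inputs, one hidden layer of three neurons and one output neuron. It assumes the sigmoid function as the activation (a common choice, but this article does not name one), leaves out details such as biases, and uses invented weights and inputs.

```python
import math

def sigmoid(z):
    """Activation function: squashes any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))

def identity(z):
    """No squashing, used here for the output neuron so it can return a price."""
    return z

def layer_forward(inputs, weights, activation):
    """Compute one layer. Each row of `weights` holds one neuron's incoming weights."""
    return [activation(sum(x * w for x, w in zip(inputs, row))) for row in weights]

def predict(inputs, hidden_weights, output_weights):
    """Pass the inputs through the hidden layer, then through the output layer."""
    hidden = layer_forward(inputs, hidden_weights, sigmoid)
    return layer_forward(hidden, output_weights, identity)[0]

# Hypothetical numeric inputs: origin, destination, departure date, airline.
inputs = [0.2, 0.7, 0.9, 0.4]

# Made-up weights: three hidden neurons with four incoming connections each.
hidden_weights = [
    [0.5, -0.2, 1.0, 0.1],
    [-0.3, 0.8, 0.4, 0.0],
    [0.2, 0.1, -0.6, 0.9],
]
# One output neuron with three incoming connections.
output_weights = [[120.0, 80.0, 40.0]]

price = predict(inputs, hidden_weights, output_weights)
print(price)
```

Because these weights are invented rather than learned, the printed price is meaningless. Making it meaningful is the job of training.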

Training the neural network

Training the AI is the hardest part of deep learning, for two reasons:

  1. You need a big data set.
  2. You need a lot of computing power.

For our airfare estimator, we need to find historical data on ticket prices. Because there are so many possible combinations of airports and departure dates, we need a very large list of ticket prices. To train the AI, we give it the inputs from our data set and compare its outputs with the outputs in the data set. Since the AI has not been trained yet, its outputs will be wrong.

Once we have gone through the entire data set, we can create a function that shows us how wrong the AI’s outputs were compared with the real outputs. This function is called the cost function. Ideally, we want the cost function to be zero. That happens when the AI’s outputs are the same as the outputs in the data set.
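A cost function can be very simple. The sketch below assumes the mean squared error, which is one common choice (this article does not say which cost function to use), and a made-up list of three ticket prices.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between predicted and actual values."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# Hypothetical prices from the data set, and two sets of AI outputs.
actual_prices = [300.0, 450.0, 120.0]
untrained_output = [10.0, 20.0, 5.0]      # an untrained AI is far off
perfect_output = [300.0, 450.0, 120.0]    # the ideal case

print(mean_squared_error(untrained_output, actual_prices))  # 94075.0
print(mean_squared_error(perfect_output, actual_prices))    # 0.0
```

The further the AI’s outputs are from the real prices, the larger the number. It only reaches zero when every output matches.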

How can we reduce the cost function?

We change the weights between neurons. We can change them randomly until we have a very low cost function, but that’s not very efficient. Instead, we’ll use a technique called gradient descent.

Gradient descent is a technique for finding the minimum of a function. In our case, we are looking for the minimum of the cost function. It works by changing the weights in small increments after each pass over the data set. By calculating the derivative (or gradient) of the cost function at the current set of weights, we can see in which direction the minimum lies.
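Here is gradient descent at its smallest, as a sketch with a single weight. It assumes a toy cost function, (w − 3)², whose minimum we already know is at w = 3, so we can check that the method finds it. The learning rate (the size of each small step) and the number of steps are arbitrary choices.

```python
def cost(w):
    """Toy cost function with its minimum at w = 3."""
    return (w - 3.0) ** 2

def cost_derivative(w):
    """Slope of the cost at w. Its sign tells us which way is downhill."""
    return 2.0 * (w - 3.0)

def gradient_descent(start, learning_rate=0.1, steps=100):
    """Repeatedly take a small step in the downhill direction."""
    w = start
    for _ in range(steps):
        w -= learning_rate * cost_derivative(w)
    return w

best_w = gradient_descent(start=0.0)
print(best_w)        # very close to 3
print(cost(best_w))  # very close to 0
```

When the slope is negative the weight increases, and when it is positive the weight decreases, so each step moves toward the minimum.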


To minimize the cost function, you need to iterate through the data set many times. That’s why you need a lot of computing power. Updating the weights with gradient descent is done automatically. That’s the magic of deep learning! Once we have trained our airfare AI, we can use it to predict future prices.
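Putting the pieces together, the sketch below trains the simplest possible "network": a single weight. To keep it short, it assumes one made-up numeric input (flight distance in thousands of kilometres) instead of our four inputs, a tiny invented data set in which the price is exactly 50 times the distance, and the mean squared error as the cost function.

```python
def train(inputs, targets, learning_rate=0.05, epochs=200):
    """Learn a single weight w so that w * input is close to the target."""
    w = 0.0
    history = []  # cost measured at the start of each pass over the data set
    n = len(inputs)
    for _ in range(epochs):
        predictions = [w * x for x in inputs]
        # Cost function: mean squared error over the whole data set.
        cost = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n
        history.append(cost)
        # Derivative of the cost with respect to w.
        gradient = sum(2 * (p - t) * x for p, t, x in zip(predictions, targets, inputs)) / n
        # Gradient descent: a small step in the direction that lowers the cost.
        w -= learning_rate * gradient
    return w, history

# Hypothetical data set: distance (thousands of km) and ticket price.
distances = [1.0, 2.0, 3.0, 4.0]
prices = [50.0, 100.0, 150.0, 200.0]

w, history = train(distances, prices)
print(w)                       # very close to 50
print(history[0], history[-1]) # the cost starts at 18750.0 and ends near 0
print(w * 5.0)                 # predicted price for a new 5,000 km flight
```

A real network has thousands or millions of weights and a far larger data set, which is where the need for computing power comes from, but the loop is the same: predict, measure the cost, adjust the weights, repeat.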

Where can I learn more?

There are many other types of neural networks, such as convolutional neural networks for computer vision and recurrent neural networks for natural language processing. If you want to learn the technical aspects of deep learning, I suggest you take an online course. Currently, one of the best courses for deep learning is Andrew Ng’s Deep Learning Specialization. If you’re not interested in getting a certificate, you don’t even have to pay for it.

To sum up…

  • Deep learning uses neural networks to mimic animal intelligence.

  • There are three types of neuron layers in neural networks: input layer, hidden layer and output layer.

  • Each connection between neurons is associated with a weight that determines the importance of the input value.

  • Each neuron applies an activation function to the data to “standardize” the neuron’s output.

  • To train a neural network, you need a large data set.

  • Iterating through the data set and comparing the AI’s outputs with the real outputs produces a cost function, which measures how far off the AI is.

  • After each iteration through the data set, the weights between neurons are adjusted using gradient descent to reduce the cost function.


