Editor’s note: In the tech world, if you don’t understand “machine learning,” you’re out of the loop. What do you do when colleagues are talking about machine learning and all you can do is nod along, unable to get a word in edgeways? Time for a change! Adam Geitgey has written an easy-to-understand series, “Machine Learning is Fun!”, in five parts, aimed at anyone who is interested in machine learning but does not know where to start. We hope it helps more people learn about machine learning and sparks their interest in it.

I’m sure many of you are tired of wading through long Wikipedia pages and just want a clear, high-level explanation. That is exactly what this article offers!

What is “machine learning”?

Machine learning is the idea that, instead of writing complicated code specific to each problem, you can use generic algorithms to tell you something interesting about a set of data. You don’t write the logic yourself. You feed data to the generic algorithm, and it builds its own logic from that data.

For example, one kind of generic algorithm is a classification algorithm, which sorts data into groups. The same classification algorithm used to recognize handwritten numbers can also distinguish spam from non-spam email without changing a line of code. It is the same algorithm in both cases, but because the training data differs, the classification logic it produces differs.

This machine learning algorithm is a black box that can handle all kinds of classification problems.
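As a rough illustration of the “same algorithm, different data” idea, here is a minimal sketch of a nearest-neighbor classifier in plain Python. The function name, the features, and both toy data sets are invented for illustration; real handwriting or spam data would have far more features.

```python
import math

def nearest_neighbor_classify(training_data, sample):
    """Return the label of the training example closest to `sample`.

    `training_data` is a list of (features, label) pairs, where `features`
    is a tuple of numbers. The function knows nothing about what the
    numbers or the labels mean.
    """
    _, closest_label = min(
        training_data,
        key=lambda pair: math.dist(pair[0], sample),
    )
    return closest_label

# Toy data set 1 (made up): (number of links, number of exclamation marks)
emails = [((0, 0), "not spam"), ((1, 0), "not spam"),
          ((7, 5), "spam"), ((9, 8), "spam")]

# Toy data set 2 (made up): (height in cm, weight in kg) of pets
pets = [((25, 4), "cat"), ((23, 5), "cat"),
        ((60, 30), "dog"), ((55, 25), "dog")]

# The same function handles both problems; only the data changes.
print(nearest_neighbor_classify(emails, (8, 6)))
print(nearest_neighbor_classify(pets, (24, 4)))
```

Nothing in the function refers to email or pets. Swapping the training data is all it takes to point it at a different classification problem.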

Machine learning is an umbrella term that covers many kinds of generic algorithms, not just classification algorithms.

Two classes of machine learning algorithms

For now, you can think of machine learning algorithms as falling into two broad categories: supervised learning and unsupervised learning. The difference between the two is simple, but it matters a great deal.

① Supervised learning

Let’s say you’re a real estate agent. Your business keeps growing, so you hire a bunch of new intern agents to help. Here’s the problem: you can get a rough idea of what a house is worth at a glance, but your interns don’t have your experience, so they can’t price houses as accurately as you do.

To help your interns (and give yourself a break, of course), you decide to write an app that appraises real estate: it estimates a home’s price from its size, neighborhood, and so on, based on what similar homes have sold for.

So you need to keep track of all property transactions in your city over a three-month period. For each transaction, you need to record a number of details, such as the number of rooms, the size of the house, the surrounding area, and most importantly, the final sale price.

This collected data is what we technically call “training data.” We hope to use it to build a program that can value other houses in our area and, ideally, estimate prices for houses outside the area as well.

This approach is called “supervised learning.”

You knew the sale price of each house from the start. In other words, you already had the answer to the problem and could work backward from it to figure out the logic.

To build an app that estimates house prices, you feed the training data about each house into your machine learning algorithm, which works out the math that relates the details of a house to its price.

It’s a bit like the answer key to a math test with all the arithmetic symbols erased.

Oh no! Some naughty student has erased the arithmetic symbols from the teacher’s answer sheet! Can you still work out what kind of math the test was about? Was it addition or subtraction?

To answer that, you need to “do something” with the numbers on the left of each equation to get the answer on the right, and so figure out how they are related.

In supervised learning, you let the computer work out that relationship for you. Once you know the math that links the left and right sides of these equations, you can solve any other problem of the same kind.
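The erased-symbols puzzle can be sketched in a few lines. This toy version, with made-up answer-sheet rows and hypothetical names, simply tries each candidate operation and keeps the ones that reproduce every known answer. Real supervised learning searches a much larger space of formulas, but the principle of working backward from known answers is the same.

```python
import operator

CANDIDATE_OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
}

def find_consistent_operations(examples):
    """Return the symbols whose operation explains every (left, right, answer) row."""
    return [
        symbol
        for symbol, operation in CANDIDATE_OPERATIONS.items()
        if all(operation(left, right) == answer
               for left, right, answer in examples)
    ]

# Made-up answer sheet: the symbol between the two numbers has been erased.
answer_sheet = [(2, 4, 6), (5, 2, 7), (6, 3, 9)]

symbols = find_consistent_operations(answer_sheet)
print(symbols)  # only the operations that fit all three rows remain

# Once the relationship is known, a new problem of the same kind is easy.
new_answer = CANDIDATE_OPERATIONS[symbols[0]](10, 5)
print(new_answer)
```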

② Unsupervised learning

Let’s go back to the real estate agent example. What if even you, the real estate agent, don’t know the sale price of each house?

Even if all you know is the size and location of each house, that’s fine. You can still do useful things with what is called “unsupervised learning.”

Even without the key number you would otherwise try to predict (in this case, the price of a house), machine learning still lets you do interesting things.

It’s a bit like someone handing you a piece of paper with a bunch of numbers written on it and saying, “I don’t know what all these numbers mean, but you should be able to find some hidden pattern in them. Good luck!”

What can you do with this data? For starters, an algorithm could automatically identify different market segments in it. You might find that buyers near the local university tend to prefer small houses with lots of rooms, while buyers in the suburbs prefer houses with lots of floor space and fewer rooms (maybe only three). Understanding the different needs of different customers can be a great guide for your marketing work.
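One common way to find such segments is k-means clustering. The text does not name a specific algorithm, so treat this as one illustrative choice. In this minimal sketch, the house data, the units, and the starting centers are all invented.

```python
import math

def k_means(points, initial_centers, iterations=10):
    """Group `points` around len(initial_centers) centers (plain k-means)."""
    centers = list(initial_centers)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in centers]
        for point in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(point, centers[i]))
            clusters[nearest].append(point)
        # Update step: move each center to the mean of its cluster.
        centers = [
            tuple(sum(values) / len(cluster) for values in zip(*cluster))
            if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters

# Made-up homes: (floor area in hundreds of square feet, number of rooms).
# No prices are involved anywhere.
homes = [(9, 4), (10, 5), (8, 4), (24, 3), (26, 3), (25, 3)]

centers, clusters = k_means(homes, initial_centers=[homes[0], homes[3]])
for center, cluster in zip(centers, clusters):
    print(center, cluster)
```

With this toy data the algorithm separates the small, many-roomed homes from the large three-room ones without ever being told which is which.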

You could also automatically identify outlier houses that are very different from all the others. Maybe those outliers are high-rise buildings, and you can focus your sales effort on those areas.

In the rest of this post, we’ll focus more on supervised learning, but that doesn’t mean unsupervised learning is useless. In fact, as algorithms improve, “unsupervised learning” becomes more important because it works even when data is not labelled with the correct answer.

Note to academics: There are many different types of generic algorithms for machine learning, but these two are a good place to start for beginners.

Using “machine learning” to value a house is cool, but is it really learning?

As a human, your brain can approach almost any situation and learn how to deal with it without explicit instructions. If you sell houses for a long time, you develop an instinctive “feel” for the right price of a house, the best way to market it, the kind of client who would be interested, and so on. The goal of strong AI research is to replicate this ability in computers. But machine learning algorithms are not that advanced yet, and for now they only work on very specific, limited problems. So in this case, a better definition of “learning” might be “figuring out an equation to solve a specific problem based on some example data.” Unfortunately, “machine figuring out an equation to solve a specific problem based on some example data” is not much of a name, so we’ll just call it machine learning.

Of course, AI may well be that powerful 50 years from now, and if you are reading this then, this post will seem a little quaint. If so, please put down this post, have your robot nanny make you a sandwich, and enjoy the convenience and comfort that artificial intelligence brings.

Without further ado, let’s try writing that home appraisal program!

How on earth do we write a program that values a house? Think about it for a moment before you read on.

If you knew nothing about machine learning, you would probably try to write out some basic rules for valuing a house, something like this:
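A hypothetical sketch of what such hand-written rules might look like follows. Every threshold, price, and neighborhood name in it is invented for illustration and is not taken from real data.

```python
def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood):
    """Hand-written appraisal rules; all numbers are invented placeholders."""
    # Start from a price per square foot that depends on the neighborhood.
    if neighborhood == "hipsterton":
        price_per_sqft = 400
    elif neighborhood == "skid row":
        price_per_sqft = 100
    else:
        price_per_sqft = 200

    price = price_per_sqft * sqft

    # Adjust the estimate based on the number of bedrooms.
    if num_of_bedrooms == 0:
        price -= 20000   # assume studio apartments sell for less
    else:
        price += num_of_bedrooms * 1000

    return price

print(estimate_house_sales_price(3, 2000, "hipsterton"))
```

Each new observation about the market means another hand-tuned branch, which is why this style of program is so hard to keep accurate.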

If you fiddle with code like this for hours, you might end up with a passable appraisal program. But it will never be perfect, and it will be hard to keep up to date as prices change.

Wouldn’t it be better if the computer could work out the valuation function for you? As long as the estimates it returns match the actual prices, who cares how the function arrives at them?

Look at it another way: think of the final price of a house as a stew whose ingredients are the number of rooms, the size of the house, and the neighborhood. If you could work out how much each ingredient contributes to the stew (the estimated price), you would have a fairly accurate “stew ratio” that you could use to value any house.

That assumption, though slightly unrealistic, would reduce your complicated, rule-laden function to something as simple as this:
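A minimal sketch of such a simplified function, using the four numbers discussed in the next paragraph as weights. Which number belongs to which input is an assumption, as are the parameter names and the encoding of the neighborhood as a number. The first value is taken to be 0.841231951398213, on the assumption that it is a decimal fraction.

```python
def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood):
    """Weighted sum of the inputs; `neighborhood` is assumed to be numeric."""
    price = 0
    price += num_of_bedrooms * 0.841231951398213  # weight for bedrooms
    price += sqft * 1231.1231231                  # weight for floor area
    price += neighborhood * 2.3242341421          # weight for neighborhood
    price += 201.23432095                         # constant term
    return price

print(estimate_house_sales_price(3, 2000, 1))
```

All the if/else rules are gone. The behavior of the function now lives entirely in four numbers.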

Notice the four numbers: 0.841231951398213, 1231.1231231, 2.3242341421, and 201.23432095. These are our weights. If we could figure out the perfect weights that work for every house, the function could estimate the price of any house.

A dumb way to find the best weights would be something like this:

Step 1:

Start with each weight set to 1.0.

Step 2:

Run every house you have data for through your function and see how far the function’s estimate is from the actual sale price.

Use your function to estimate the price of each house.

For example, if the first house actually sells for $250,000, but your function outputs a price estimate of $178,000, the program’s price estimate is off by $72,000 for that one house.

Now add up the squares of the price deviations for each house in your database.

Let’s say you have 500 home sales in your database, and the squared deviations for all of them add up to $86,123,373. That number is how “wrong” your function currently is. Now divide that total by 500 to get the average squared deviation per house. Call this average the “cost” of your function. If you could get the cost down to zero by adjusting the weights, your function would be perfect: for every house in the data, its estimate would match the actual sale price exactly.
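Step 2 can be sketched as follows. The three sales records and the function name are invented for illustration; the first record reuses the $250,000 sale with a $178,000 estimate from the example above.

```python
def cost(estimated_prices, actual_prices):
    """Average of the squared differences between estimates and actual prices."""
    squared_errors = [
        (estimate - actual) ** 2
        for estimate, actual in zip(estimated_prices, actual_prices)
    ]
    return sum(squared_errors) / len(squared_errors)

# Made-up sales; the first pair is the $178,000 estimate for a $250,000 house.
actual_prices = [250_000, 300_000, 150_000]
estimated_prices = [178_000, 310_000, 150_000]

print(cost(estimated_prices, actual_prices))
```

Squaring makes every deviation positive, so over-estimates and under-estimates cannot cancel each other out, and large misses are penalized more heavily than small ones.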

Step 3:

Repeat Step 2 with every possible combination of weights. Whichever combination brings the “cost” of your function closest to zero is the one you use. Once you find it, your problem is solved.
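A minimal sketch of Step 3 as a brute-force search. Because “all possible” weights cannot really be enumerated, this version assumes a small, hand-picked grid of candidate values. The data, the grid, and the function names are invented for illustration.

```python
from itertools import product

def estimate(weights, house):
    """Weighted sum of a house's features plus a constant term."""
    bedrooms_weight, sqft_weight, base = weights
    bedrooms, sqft = house
    return bedrooms_weight * bedrooms + sqft_weight * sqft + base

def cost(weights, houses, prices):
    """Average squared difference between estimates and actual prices."""
    return sum(
        (estimate(weights, house) - price) ** 2
        for house, price in zip(houses, prices)
    ) / len(houses)

# Made-up data: (bedrooms, square feet) and the matching sale prices.
houses = [(3, 2000), (2, 800), (1, 550), (4, 2000)]
prices = [280_000, 150_000, 115_000, 290_000]

# Try every combination on a small grid and keep the cheapest one.
candidate_grid = product(
    [0, 5_000, 10_000, 15_000],   # candidate weights per bedroom
    [50, 100, 150],               # candidate weights per square foot
    [0, 50_000, 100_000],         # candidate constant terms
)
best_weights = min(candidate_grid, key=lambda w: cost(w, houses, prices))
print(best_weights, cost(best_weights, houses, prices))
```

Even this tiny grid needs 36 cost evaluations, and the count multiplies with every extra weight or finer grid step.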

Three simple steps! Look at what just happened: you took some data, fed it through three very simple, generic steps, and ended up with a function that estimates the price of any house in your area.

With machine learning this powerful: watch out, Zillow!

But here are a few more facts that may surprise you:

1. Over the last 40 years, research in many fields, such as linguistics and translation, has shown that generic learning algorithms that “stir the data stew” (a phrase I just made up, following the stew example above) are more useful than rules that people work out for themselves. Even the “dumbest” machine learning approach eventually beats human experts.

2. The function you end up with knows nothing about what “room size” or “number of rooms” actually mean. It just combines the numbers it is given to get the best estimate.

3. You may well have no idea why a particular set of weights predicts home values. In other words, you have a function that you don’t fully understand but that you can show works.

4. Imagine that, instead of taking parameters like “square footage” and “number of rooms,” the function you just developed took in an array of numbers, each representing the brightness of one pixel in a photo taken by a camera mounted on the roof of your car. Suppose also that, instead of “house price,” its output were called “degrees to turn the steering wheel.” You would have a function that lets the car steer itself. Isn’t that amazing?

What about “trying every possible combination of weights” in Step 3?

Of course, you can’t find the perfect combination by trying every possible set of weights one by one. There are endless numbers to try, so it would literally take forever.

To avoid that, mathematicians have worked out many cleverer ways to find good weights quickly without trying very many of them. Here’s one:

First, write a simple equation that represents Step 2 above:
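One way to write Step 2 as an equation, following the text (sum the squared deviations over the 500 sales, then divide by 500). The names Estimate and Price are assumed here: Estimate_i is the function’s estimate for house i and Price_i is its actual sale price.

```latex
\text{Cost} = \frac{1}{500} \sum_{i=1}^{500} \left( \text{Estimate}_i - \text{Price}_i \right)^2
```

Some presentations divide by an extra factor of 2, which simplifies the derivative without changing which weights give the lowest cost.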

This is your cost function.

Now let’s rewrite the same equation using standard machine learning notation:
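A sketch of the same equation in commonly used notation. The symbols m (the number of houses, 500 here), x^(i) (the features of house i), y^(i) (its actual sale price), and h_θ (the estimating function with weights θ) are standard conventions assumed here rather than taken from the text:

```latex
J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right)^2
```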

θ represents your current weights, and J(θ) represents the cost for those weights. This equation measures how far our price-estimating function currently is from reality.

If we evaluate this cost function for all possible values of the weights and plot the results, we get a graph that looks something like this:

The graph of our cost function looks like a bowl, with the vertical axis representing cost. The lowest point, shown in blue, is where the cost is lowest, which means our function is the least wrong. The highest points are where it is most wrong. So if we find the weights that take us to the lowest point on this graph, the problem is solved.

So we keep adjusting our weights to move toward the lowest point of the graph, like walking down a mountain path. As long as each small adjustment takes us closer to the bottom, we will eventually find the best weights without having to try every possible number.

If you remember a little calculus, you may recall that the derivative of a function gives you the slope of its tangent line at any point. In other words, it tells us which way is downhill at any given point on the graph, and we can use that knowledge to approach the lowest point.

If we calculate the partial derivative of the cost function with respect to each weight, we can subtract that value from the corresponding weight. That takes us one step closer to the bottom of the mountain. Keep doing that and we eventually reach the lowest point and have the best possible weights. (If that didn’t make sense, don’t worry, and read on.)

The above is a high-level summary of one way to find the best weights for your function, known as “gradient descent.” If you’re interested in the details, it is worth digging a little deeper.
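A minimal sketch of gradient descent for the house-price function, in plain Python. The data is made up and generated from known weights (10, 100, 50) so the result can be checked. The learning rate, which scales each subtraction so the steps stay small, is a standard detail assumed here, and all names are hypothetical.

```python
def predict(weights, house):
    """Weighted sum of the features; the last weight is the constant term."""
    return sum(w * x for w, x in zip(weights, house)) + weights[-1]

def gradient(weights, houses, prices):
    """Partial derivative of the mean squared error for each weight."""
    m = len(houses)
    grads = [0.0] * len(weights)
    for house, price in zip(houses, prices):
        error = predict(weights, house) - price
        for j, x in enumerate(house):
            grads[j] += 2 * error * x / m
        grads[-1] += 2 * error / m   # the constant term's "feature" is 1
    return grads

def gradient_descent(houses, prices, learning_rate=0.05, steps=5000):
    weights = [1.0] * (len(houses[0]) + 1)   # Step 1: every weight starts at 1.0
    for _ in range(steps):
        grads = gradient(weights, houses, prices)
        # Walk downhill: move each weight against its partial derivative.
        weights = [w - learning_rate * g for w, g in zip(weights, grads)]
    return weights

# Made-up houses: (bedrooms, size in thousands of square feet).
houses = [(b, s) for b in (1, 2, 3) for s in (1.0, 2.0, 3.0)]
# Made-up prices in thousands of dollars, generated from known weights.
prices = [10 * b + 100 * s + 50 for b, s in houses]

learned = gradient_descent(houses, prices)
print([round(w, 3) for w in learned])
```

Instead of trying every combination, each step uses the slope to decide which way to nudge every weight. With real data the features usually need rescaling first, or the step size has to be tuned.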

When you use machine learning software tools to solve a real problem, you don’t have to worry about all of this. But it’s also important to understand how it actually works.

What else did I gloss over?

In technical terms, the three-step algorithm described above is known as “multiple linear regression.” We are trying to find an equation that fits all of the housing data, and then use that equation to estimate the value of a house we have never seen. It is a remarkably powerful idea, and you can solve real problems with it.

However, while the approach I described works in simple cases, it does not work in every case. One reason is that house prices are not always simple enough to follow a straight line; the relationship can be nonlinear.

Fortunately, there are many ways to handle that. Plenty of other machine learning algorithms can deal with nonlinear data, such as neural networks or support vector machines with kernels, and there are ways to use linear regression more cleverly so that it fits more complicated curves. In every case, the basic idea of finding the best weights still applies.

I also ignored the problem of overfitting. It’s easy to find a set of weights that works perfectly for the houses in your original database but fails for any new house outside it. There are ways to deal with this, such as regularization and using a cross-validation data set. Knowing how to deal with these problems is a key part of mastering machine learning.
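A minimal sketch of how a held-out validation set, one simple form of cross-validation, can expose overfitting. Regularization is not shown. The sales data and both toy models are invented for illustration: one model merely memorizes the training data, the other uses a single price-per-square-foot weight.

```python
def mean_squared_error(model, sales):
    """Average squared difference between the model's estimates and the prices."""
    return sum((model(sqft) - price) ** 2 for sqft, price in sales) / len(sales)

# Made-up sales: (square feet, price in thousands of dollars).
sales = [(800, 160), (1000, 210), (1200, 235), (1500, 310),
         (1800, 350), (2000, 410), (2200, 430), (2500, 510)]

# Hold out every fourth sale for validation and train on the rest.
validation = sales[3::4]
training = [sale for i, sale in enumerate(sales) if i % 4 != 3]

# Model A memorizes the training prices and falls back to the average
# training price for any house it has not seen.
memorized = dict(training)
average_price = sum(price for _, price in training) / len(training)

def memorizing_model(sqft):
    return memorized.get(sqft, average_price)

# Model B learns a single weight: the overall price per square foot.
rate = sum(price for _, price in training) / sum(sqft for sqft, _ in training)

def simple_model(sqft):
    return rate * sqft

for name, model in [("memorizing", memorizing_model), ("simple", simple_model)]:
    print(name,
          mean_squared_error(model, training),
          mean_squared_error(model, validation))
```

The memorizing model has zero cost on the data it was built from, yet it does far worse than the simple model on the held-out sales. A perfect fit to the original data says little about new houses.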

In other words, while the basic concepts are simple, it takes some skill and experience to really master machine learning and make it work. Fortunately, these are skills and experiences that every developer can learn to master.

What’s the magic of machine learning?

Once you have seen how easily machine learning techniques handle a seemingly difficult task (such as handwriting recognition), you start to feel that you could solve any problem with machine learning as long as you have enough data. You just feed in the data and wait for the computer to come up with an equation that works for you.

But it’s important to understand that machine learning only works if the problem is actually solvable with the data you have, so make sure the data you feed in is relevant to the problem.

For example, if you build a model that estimates the value of a home from the types of potted plants in the house, it will never work. The species of potted plants has nothing to do with the price of the house, so the computer cannot derive a relationship between the two.

You can only model relationships that actually exist. So remember, if a human expert could not use the data to solve the problem manually, a computer probably cannot either. Instead, focus on problems that a human could solve but that a computer could solve much faster.

How to learn more about machine learning?

In my opinion, the biggest problem with machine learning right now is that it mostly lives in academia and commercial research and is not yet widely accessible. Laypeople who want to learn more about it are put off by the lack of easy-to-understand material. But this is improving step by step.

Andrew Ng’s free machine learning course on Coursera is extremely valuable, and I highly recommend starting there. It should be accessible to anyone with a computer science degree who remembers even a little math.

You can also download and install SciKit-Learn, a Python framework that implements the standard algorithms, and use it to play around with different machine learning algorithms.

Note: This article was compiled by TupuTech. You can follow the WeChat official account Tuputech for the latest AI news.