Applying machine learning is challenging.

In machine learning, you have to make many decisions on questions that have no single right answer, such as:

· How should the problem be framed?

· What data should be used as input, and what should be the output?

· Which algorithm should be used?

· How should the algorithm be configured?

These problems are a serious challenge for beginners.

After reading this article, you will know:

· How to define a clear learning problem.

· The four decision points to consider when designing a learning system for your problem.

· Three strategies you can use to address the challenge of designing learning systems in practice.

Overview

This article is divided into six parts:

1. Well-posed learning problems

2. Select training data

3. Select the target function

4. Select a representation for the target function

5. Choose a learning algorithm

6. How to design a learning system

Well-posed learning problems

We can define a general learning task in applied machine learning as a program that learns from experience on some task, as judged by a specific performance measure.

Tom Mitchell made this clear in his 1997 book Machine Learning:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

We can take this as a general definition for the types of learning tasks we might be interested in, such as predictive modeling in applied machine learning. Mitchell gives several examples to illustrate it:

· Learning to recognize spoken words

· Learning to drive an autonomous vehicle

· Learning to classify new astronomical structures

· Learning to play world-class backgammon

We can use this definition to frame our own predictive modeling problem. Once it is defined, the task becomes designing a learning system to solve it.
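To make the definition concrete, here is a minimal sketch of a toy learner. It is not from Mitchell's book; the task, the data, and the function names are all made up for illustration. It fixes a task T, a performance measure P, and two amounts of experience E, then compares the performance reached with each.

```python
# Task T: label a number 1 if it is at least 50, else 0.
#         (The learner is never told the value 50.)
# Performance measure P: accuracy on a fixed test set.
# Experience E: labeled examples (x, y).

def true_label(x):
    return 1 if x >= 50 else 0

def learn_threshold(examples):
    """Place the threshold midway between the largest 0-example and the smallest 1-example."""
    largest_zero = max(x for x, y in examples if y == 0)
    smallest_one = min(x for x, y in examples if y == 1)
    return (largest_zero + smallest_one) / 2

def accuracy(threshold, test_xs):
    correct = sum((1 if x >= threshold else 0) == true_label(x) for x in test_xs)
    return correct / len(test_xs)

test_xs = list(range(100))
little_experience = [(10, 0), (70, 1)]
more_experience = little_experience + [(45, 0), (52, 1)]

acc_little = accuracy(learn_threshold(little_experience), test_xs)
acc_more = accuracy(learn_threshold(more_experience), test_xs)
print(acc_little, acc_more)
```

With two examples the learned threshold is 40 and accuracy is 0.90; with four examples the threshold moves to 48.5 and accuracy rises to 0.99. In Mitchell's terms, performance at T, as measured by P, improved with experience E.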

Designing a learning system, such as a machine learning application, involves four design choices:

1. Select training data

2. Select the target function

3. Choose a representation for the target function

4. Choose the learning algorithm

Given unlimited resources, there may be a best set of choices for a given problem, but we do not have unlimited time, compute resources, or knowledge of the domain or of learning systems.

Thus, although we can prepare a well-defined description of the learning problem, it is difficult to design the best possible learning system. The best we can do is to use our knowledge, skills, and available resources to work through the design choices.

Let's look at each design choice in more detail.

Select training data

You must choose the data that the learning system will use as a learning experience.

This is the data from past observations.

The type of training experience available can have a significant impact on the success or failure of the learner.

For a learning problem, you usually have to collect the data you need yourself.

This may mean:

· Scraping documents

· Querying databases

· Processing files

· Collating disparate sources

· Consolidating entities

You need to bring all the data together and put it into a standardized form, so that each observation represents one entity for which an outcome is available.
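A minimal sketch of that last step, assuming two made-up sources that describe the same customers. The field names `customer_id`, `age`, and `churned` are hypothetical, chosen only for illustration.

```python
# Hypothetical raw sources: two extracts that describe the same customers.
accounts = [
    {"customer_id": "C1", "age": "34"},
    {"customer_id": "C2", "age": "51"},
    {"customer_id": "C3", "age": ""},  # missing input value
]
outcomes = [
    {"customer_id": "C1", "churned": "yes"},
    {"customer_id": "C2", "churned": "no"},
]

def build_observations(accounts, outcomes):
    """Join the sources on customer_id and keep one cleaned row per entity."""
    outcome_by_id = {row["customer_id"]: row["churned"] for row in outcomes}
    observations = []
    for row in accounts:
        cid = row["customer_id"]
        if cid not in outcome_by_id or row["age"] == "":
            continue  # skip entities with no known outcome or no usable input
        observations.append({
            "customer_id": cid,
            "age": int(row["age"]),
            "churned": 1 if outcome_by_id[cid] == "yes" else 0,
        })
    return observations

observations = build_observations(accounts, outcomes)
print(observations)
```

Each resulting row is one observation: one entity, its inputs in a consistent type, and its outcome. Dropping incomplete rows is only one possible policy; a real project might impute missing values instead.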

Select the target function

Next, you must choose the framing of the learning problem.

Machine learning is really a problem of learning a mapping function (f) from inputs (X) to outputs (y).

y = f(X)

This function can then be used on new data in the future to predict the most likely output.

The goal of a learning system is to prepare the function that best maps inputs to outputs given the resources available. This problem is called function approximation. The result will be an approximation, meaning that it will have error. We will do our best to minimize this error, but some error will always remain because of noise in the data.
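As a small sketch of function approximation, the example below fits a straight line to a handful of noisy points by ordinary least squares. The data is made up, and a straight line is only one possible choice of approximating function.

```python
# Hypothetical observations: the underlying mapping is roughly y = 2x + 1,
# but each measurement carries some noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(xs, ys)

def f_hat(x):
    """The learned approximation of the unknown mapping f."""
    return slope * x + intercept

mean_squared_error = sum((f_hat(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(slope, intercept, mean_squared_error)
```

The fitted slope and intercept come out at roughly 1.97 and 1.06, close to but not equal to the values used to generate the data, and the mean squared error is small but not zero. That residual is the error the text refers to.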

This step is about choosing exactly what data to use as input to the function, that is, the input features or input variables, and exactly what will be predicted, that is, the output variable.

I often refer to this as the framing of the learning problem. Choosing the inputs and outputs essentially chooses the type of target function that we will seek to approximate.

Select a representation for the target function

Next, you must choose the representation you wish to use for the mapping function.

Think of this as the type of final model you want to use to make predictions. You must choose the form of the model, or its data structure if you like.

Now that we have specified the ideal target function V, we must choose a representation that the learning program will use to describe the function that it will learn.

For example:

· Perhaps your project needs a decision tree that is easy to understand and explain to stakeholders.

· Perhaps your stakeholders prefer a linear model that statisticians can easily interpret.

· Perhaps your stakeholders don't care about anything other than model performance, so all model representations are up for grabs.

The choice of representation limits the types of learning algorithms that you can use to learn mapping functions.
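To show what "representation" means in practice, here is a sketch of the same kind of yes/no decision expressed in two different forms. The features (age and income), the weights, and the thresholds are invented for illustration and were not learned from any data.

```python
# Two hypothetical representations of a function that maps
# (age, income) to a yes/no decision.

# 1. Linear model: the whole model is a list of weights plus a bias.
linear_model = {"weights": [0.02, 0.00001], "bias": -1.2}

def predict_linear(model, features):
    score = model["bias"] + sum(w * x for w, x in zip(model["weights"], features))
    return 1 if score >= 0 else 0

# 2. Decision tree: the model is a nested set of threshold tests.
tree_model = {
    "feature": 0, "threshold": 40,           # is age below 40?
    "left": {"label": 0},
    "right": {
        "feature": 1, "threshold": 30000,    # is income below 30000?
        "left": {"label": 0},
        "right": {"label": 1},
    },
}

def predict_tree(node, features):
    # Walk down the tree until a leaf with a label is reached.
    while "label" not in node:
        branch = "left" if features[node["feature"]] < node["threshold"] else "right"
        node = node[branch]
    return node["label"]

applicant = [45, 52000]
print(predict_linear(linear_model, applicant), predict_tree(tree_model, applicant))
```

The two models are different data structures, so they need different learning algorithms: one adjusts numeric weights, the other chooses features and split points.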

Choose a learning algorithm

Finally, you must choose the learning algorithm that will take the input and output data and learn a model in your preferred representation.

If there are few constraints on the choice of representation, as is often the case, then you may be able to evaluate a range of different algorithms and representations.

If there are some strict constraints on the choice of representation, such as a weighted sum linear model or a decision tree, then the choice of algorithms will be limited to those that can operate on a particular representation.

The choice of algorithm may impose its own constraints, such as specific data preparation transforms like data normalization.
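As one example of such a transform, here is a minimal sketch of min-max normalization, which rescales each input column to the range 0 to 1. The function name and the sample rows are made up for illustration.

```python
def min_max_normalize(rows):
    """Rescale every column of a list of numeric rows to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        # A constant column has no spread, so map it to 0.0 to avoid dividing by zero.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# Hypothetical rows of (age, income): the two columns differ in scale by orders of magnitude.
rows = [[20, 20000], [40, 60000], [60, 100000]]
normalized = min_max_normalize(rows)
print(normalized)
```

Algorithms that rely on distances or on gradient steps tend to be sensitive to feature scale, which is why a transform like this is often tied to the choice of algorithm.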

How to design a learning system

Developing a learning system can be challenging.

No one can tell you the best answer to each decision along the way; the best answer is unknown for your specific learning problem.

Mitchell helps clarify this with a depiction of the choices made in designing a learning system for playing checkers.

Depiction of the choices made in designing a checkers-playing learning system.

From Machine Learning, 1997.

Mitchell comments:

In many ways, these design choices have constrained the learning task. We have restricted the type of knowledge that can be acquired to a single linear evaluation function. Furthermore, we have constrained this evaluation function to depend on only the six specific board features provided. If the true target function V can indeed be represented by a linear combination of these particular features, then our program has a good chance to learn it. If not, then the best we can hope for is that it will learn a good approximation, since a program can certainly never learn anything that it cannot at least represent.
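The linear evaluation function over six board features can be sketched in a few lines. The weight update shown is a least-mean-squares (LMS) step, which is my assumption about a suitable learning rule rather than something stated in this article. The feature values, training value, and learning rate are hypothetical.

```python
def evaluate(weights, features):
    """Linear evaluation: w0 + w1*x1 + ... + w6*x6 for six board features."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def lms_update(weights, features, target, learning_rate=0.01):
    """One least-mean-squares step toward the training value for this board."""
    error = target - evaluate(weights, features)
    new_bias = weights[0] + learning_rate * error
    new_rest = [w + learning_rate * error * x for w, x in zip(weights[1:], features)]
    return [new_bias] + new_rest

# Hypothetical board: six feature values and a training value of 100.
features = [3, 0, 1, 0, 0, 0]
target = 100.0
weights = [0.0] * 7  # w0 (bias) plus one weight per board feature

error_before = abs(target - evaluate(weights, features))
for _ in range(50):
    weights = lms_update(weights, features, target)
error_after = abs(target - evaluate(weights, features))
print(error_before, error_after)
```

Repeated updates shrink the error on this board, but the learner can only ever express weighted sums of the six features. If the true target function is not of that form, no amount of training will recover it exactly, which is the point of the quote.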

In general, you can't work out the answers to these choices analytically, such as what data to use, what algorithm to use, and how to configure the algorithm.

Here are three strategies you can use in practice:

1. Copy: Look to the literature or to experts for learning systems built for problems the same as or similar to yours, and copy the design. Chances are you're not the first person to work on a given type of problem. At worst, copying a design gives you a starting point for your own.

2. Search: List the available options at each decision point and evaluate each empirically to see which works best on your specific data. This is probably the most robust and most widely practiced approach in applied machine learning.

3. Design: After completing many projects with the copy and search methods above, you will develop an intuition for how to design learning systems.
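The second strategy, evaluating each option empirically, can be sketched as a small cross-validation loop. The two candidate models, the data, and the fold scheme below are assumptions chosen to keep the example short; a real project would compare more options with an established library.

```python
def fit_mean(train):
    """Baseline candidate: always predict the mean training output."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_line(train):
    """Second candidate: least-squares straight line."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxx = sum((x - mx) ** 2 for x, _ in train)
    slope = sum((x - mx) * (y - my) for x, y in train) / sxx
    return lambda x: my + slope * (x - mx)

def cross_val_mse(fit, data, k=5):
    """Average held-out squared error over k folds (fold i holds out every k-th row)."""
    fold_errors = []
    for i in range(k):
        test = data[i::k]
        train = [row for j, row in enumerate(data) if j % k != i]
        model = fit(train)
        fold_errors.append(sum((model(x) - y) ** 2 for x, y in test) / len(test))
    return sum(fold_errors) / k

# Hypothetical data with a roughly linear trend plus a small alternating offset.
data = [(x, 3 * x + (1 if x % 2 else -1)) for x in range(20)]

candidates = {"mean baseline": fit_mean, "linear model": fit_line}
scores = {name: cross_val_mse(fit, data) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Each candidate is scored only on rows it did not see during fitting, so the comparison reflects how well it generalizes on this data rather than how well it memorizes it.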

Developing learning systems is not science; it is engineering.

Developing new machine learning algorithms and describing how and why they work is a science, and is usually not required when developing learning systems.

Developing a learning system is very similar to developing software. To get the best results on a new system, you must combine copies of past designs that work, prototypes that show promise, and design experience.

Recommended reading:

Machine Learning, 1997. The book explains these topics more comprehensively.


This article is recommended by Beijing Post @ Love coco – Love life teacher, translated by Ali Yunqi Community organization.

Why Applied Machine Learning Is Hard

By Jason Brownlee

Translator: Altman, edited by Yuan Hu.

This article is an abridged translation. For more details, please refer to the original text.