Math can be the hard part for many people who have found their way into machine learning. In this article, the author lays out the mathematical background needed to build machine learning products or conduct machine learning research, shares experience and advice from machine learning engineers, researchers, and teachers, and points to many course and book resources.

It’s not entirely clear what level of math is required to start learning machine learning, especially for those who didn’t study math or statistics in school.

The goal of this article is to propose the mathematical background needed to build a machine learning product or conduct academic research on machine learning. These recommendations are based on conversations with machine learning engineers, researchers, and educators, as well as my experience in machine learning research and industry.

To lay out these math prerequisites, this article first proposes mindsets and strategies for approaching math education outside the traditional classroom. It then outlines the specific background required for different kinds of machine learning work, since these subjects range from high-school-level statistics and calculus to the latest developments in probabilistic graphical models (PGMs). By the end of this post, I hope you will have a sense of the math you need to learn to be effective in your machine learning work, whatever that may be!

First, I acknowledge that learning styles, frameworks, and resources are unique to each learner’s needs and goals. Your comments on HN are welcome!

A note on math anxiety

It turns out that a lot of people, including engineers, are scared of math. To begin, I want to address the idea of “being good at math”.

In fact, people who are good at math have had a lot of practice doing math. As a result, they are comfortable with getting stuck while doing it. Recent research has shown that a learner’s mindset, rather than innate ability, is the main predictor of a person’s ability to learn math.

To be clear, reaching this state of comfort takes time and effort, but it is certainly not something you are born with. The rest of this article will help you determine the level of mathematical foundation you need and outline strategies for building it.

Getting started: Math and code

As a soft prerequisite, we assume the reader has some basic knowledge of linear algebra and matrix operations (so as not to get lost in the notation) and a rudimentary understanding of probability theory. In addition, you are encouraged to pick up basic programming skills as a tool for learning math in context. After that, you can adjust your focus depending on the type of project you are interested in.

How to study math outside school

I believe that the best way to learn math is to study it full time (i.e. as a student). Outside that environment, you will likely lack the structure, the peer pressure, and the resources that an academic classroom provides.

If you are going to study math outside of school, I recommend organizing study groups or lunch-and-learn seminars as important sources of motivation. In research labs, this can take the form of reading groups. In terms of structure, your group can work through textbook chapters, hold regular discussions, and set up a channel for Q&A.

Culture plays an important role here. This “extra” learning should be encouraged, lest it be lost in the daily grind. In fact, despite the short-term costs of this approach, building a peer-driven learning environment can make you more productive in the long run.

Math and code

Math and code are highly intertwined in machine learning workflows. Code is often built directly from mathematical intuition and even shares some of the notation and syntax of mathematics. In fact, modern data science frameworks such as NumPy make it intuitive and efficient to translate mathematical operations such as matrix and vector products into readable code.
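As a small illustration of how closely NumPy syntax follows the notation, here is a minimal sketch. The matrix A and vector v are made-up values chosen only for this example.

```python
import numpy as np

# Hypothetical values, chosen only for illustration.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])    # a 2x2 matrix
v = np.array([1.0, -1.0])     # a vector of length 2

Av = A @ v        # matrix-vector product, written "A v" on paper
AtA = A.T @ A     # A^T A, the Gram matrix of the columns of A
vv = v @ v        # inner product of v with itself, a scalar

print(Av)
print(AtA)
print(vv)
```

The `@` operator reads almost exactly like the product on paper, which makes it easier to check code against a derivation line by line.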

I encourage readers to use code as a way to reinforce learning. Both mathematics and code depend on precision in understanding and notation. For example, implementing loss functions or optimization algorithms by hand can be a good way to truly understand the underlying concepts.
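One way to practice this is sketched below: a hand-written mean squared error loss, its gradient, and plain gradient descent for a linear model. The function names, the toy data, the learning rate, and the step count are all assumptions made for this illustration, not part of any library.

```python
import numpy as np

def mse_loss(w, X, y):
    """Mean squared error of the linear model X @ w against targets y."""
    residual = X @ w - y
    return np.mean(residual ** 2)

def mse_grad(w, X, y):
    """Gradient of mse_loss with respect to w: (2/n) * X^T (X w - y)."""
    n = X.shape[0]
    return (2.0 / n) * X.T @ (X @ w - y)

def gradient_descent(X, y, lr=0.1, steps=500):
    """Plain gradient descent starting from w = 0 (lr and steps are illustrative)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w = w - lr * mse_grad(w, X, y)
    return w

# Toy, noise-free data generated from known weights (hypothetical values).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
true_w = np.array([2.0, -1.0])
y = X @ true_w

w_hat = gradient_descent(X, y)
print(w_hat, mse_loss(w_hat, X, y))
```

Deriving `mse_grad` on paper first, then checking that the loop recovers `true_w`, ties the calculus to observable behavior.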

Let’s take a concrete example of learning math through code: implementing backpropagation for the ReLU activation in a neural network (yes, even though TensorFlow/PyTorch can do this for us! See medium.com/karpathy/y…). Backpropagation is a highly efficient technique for computing gradients that relies on the chain rule from calculus. To apply the chain rule in this setting, we multiply the upstream derivative by the gradient of ReLU.

First, we visualize the ReLU activation, which is defined as:

$$\mathrm{ReLU}(x) = \max(x, 0)$$

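To calculate the gradient (or, intuitively, the slope), the piecewise function can be written in indicator form as follows (the derivative is undefined at x = 0, where a value is chosen by convention):

$$\frac{d}{dx}\,\mathrm{ReLU}(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x < 0 \end{cases}$$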

NumPy gives us useful, intuitive syntax here: our activation function (blue curve) can be expressed in code, where x is the input and relu is the output:

relu = np.maximum(x, 0)

Next comes the gradient (red curve), where grad holds the upstream gradient:

grad[x < 0] = 0

The meaning of this line might not be obvious unless you first derive the gradient yourself. The code sets every value of the upstream gradient (grad) to 0 wherever the condition [x < 0] holds. Mathematically, this is equivalent to multiplying the upstream gradient by the piecewise ReLU gradient, which zeroes the gradient at every position where the input is less than 0!
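Putting the two lines together, a self-contained sketch might look like the following. The function names relu_forward and relu_backward, the sample inputs, and the finite-difference check are assumptions added for illustration; only the two NumPy expressions come from the text above.

```python
import numpy as np

def relu_forward(x):
    """ReLU activation: element-wise max(x, 0)."""
    return np.maximum(x, 0)

def relu_backward(grad, x):
    """Chain rule: upstream gradient times the ReLU gradient.

    Copies grad so the caller's array is not modified in place.
    """
    grad = grad.copy()
    grad[x < 0] = 0
    return grad

# Hypothetical inputs and upstream gradient.
x = np.array([-2.0, -0.5, 1.0, 3.0])
upstream = np.array([0.1, 0.2, 0.3, 0.4])

out = relu_forward(x)
dx = relu_backward(upstream, x)

# Finite-difference check on the scalar L = sum(upstream * relu(x)).
eps = 1e-6
numeric = np.zeros_like(x)
for i in range(x.size):
    step = np.zeros_like(x)
    step[i] = eps
    plus = np.sum(upstream * relu_forward(x + step))
    minus = np.sum(upstream * relu_forward(x - step))
    numeric[i] = (plus - minus) / (2 * eps)

print(out, dx, numeric)
```

Comparing the analytic gradient against a numerical estimate like this is a common way to catch mistakes in a hand-derived backward pass.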

As you can see above, we can think about code clearly with our basic understanding of calculus. A complete example of a neural network implementation is here: pytorch.org/tutorials/b…

The mathematics needed to build machine learning products

In writing this section, I consulted machine learning engineers to determine where math was most helpful in debugging their systems. Below are examples of questions, each answered by the engineers with the math they used. Don’t worry if you have not seen these before; I hope this section gives you context for the specific kinds of questions you may be interested in.

Which clustering method should I use to visualize high-dimensional customer data? Methods: PCA and t-SNE
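To make the PCA option concrete, here is a minimal sketch that projects data onto its top two principal components using only NumPy's SVD. The helper name pca_project and the synthetic "customer" data are assumptions for illustration; in practice a library implementation such as scikit-learn's PCA or TSNE classes would typically be used instead.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project rows of X onto the top principal components via SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = S[:n_components] ** 2 / (X.shape[0] - 1)
    return X_centered @ components.T, components, explained_variance

# Synthetic data: 200 rows, 10 features, driven mostly by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

Z, components, variance = pca_project(X)
print(Z.shape, variance)
```

The resulting two columns of Z can be passed to any scatter plot. Knowing that they come from an orthogonal projection explains why PCA preserves global structure but can miss nonlinear clusters.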

How should I calibrate the threshold for “blocking” fraudulent user transactions? Methods: Probability calibration
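Before picking a blocking threshold, it can help to check whether predicted probabilities match observed frequencies. The sketch below builds a simple reliability table by binning. The function name reliability_table, the bin count, and the simulated scores are all assumptions for illustration, not a prescribed calibration procedure.

```python
import numpy as np

def reliability_table(y_true, y_prob, n_bins=5):
    """Per bin: (mean predicted probability, observed positive rate, count)."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Digitizing against the interior edges yields bin indices 0..n_bins-1.
    bin_ids = np.digitize(y_prob, edges[1:-1])
    rows = []
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            rows.append((float(y_prob[mask].mean()),
                         float(y_true[mask].mean()),
                         int(mask.sum())))
    return rows

# Simulated scores that are calibrated by construction:
# each label is positive with probability equal to its score.
rng = np.random.default_rng(0)
y_prob = rng.uniform(size=20000)
y_true = (rng.uniform(size=20000) < y_prob).astype(float)

table = reliability_table(y_true, y_prob)
for mean_pred, observed, count in table:
    print(f"{mean_pred:.2f} {observed:.2f} {count}")
```

If the two columns agree, a threshold can be read directly as a probability. If they diverge, the scores need recalibrating before a threshold such as 0.9 means what it appears to mean.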

How should I characterize the bias of my satellite data toward specific regions of the world (Silicon Valley vs. Alaska, for example)? Approach: This is an open research question. Perhaps aim for demographic parity?

Usually, statistical tools and linear algebra can be applied to each of these problems in some way. However, satisfactory answers often require a domain-specific approach. In that case, how do you determine what math you need to learn?

There is a wealth of resources available (for example, scikit-learn for data analysis and Keras for deep learning) to help you start writing code that models your system. While doing so, try to answer the following questions to define your system:

  1. What are the inputs/outputs of the system?
  2. How should the data be prepared to fit the system?
  3. How should you construct features or curate data to help the model generalize?
  4. How should you define a reasonable objective for your problem?

You may be surprised that it can be difficult to define a system! After that, the engineering required for pipeline construction is also very important. In other words, building machine learning products requires a lot of grunt work that does not require a deep mathematical background.

• Best Practices for ML Engineering by Martin Zinkevich, Research Scientist at Google

Learning math on demand starts with diving into a machine learning workflow, where you will probably find yourself getting stuck at certain steps, especially when debugging. When you are stuck, do you know what to look for? Are your weights reasonable? Why won’t your model converge with a particular loss definition? What is the right way to measure the success of a model? At this point, it may help to make assumptions about the data, constrain the optimization differently, or try different algorithms.

Often, you will find mathematical intuitions (such as how to choose loss functions and evaluation metrics) in the modeling/debugging process that may help you make informed engineering decisions. These are opportunities for you to learn!
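As one example of how the choice of evaluation metric matters, the sketch below compares accuracy with precision and recall on imbalanced labels. The data and the always-negative "model" are hypothetical, and the helper functions are written by hand for illustration.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return float(np.mean(y_true == y_pred))

def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical imbalanced data: 5 positives among 100 examples.
y_true = np.array([1] * 5 + [0] * 95)
y_pred = np.zeros(100, dtype=int)   # a "model" that always predicts negative

acc = accuracy(y_true, y_pred)
prec, rec = precision_recall(y_true, y_pred)
print(acc, prec, rec)
```

Here accuracy looks high even though no positive example is ever found, which is exactly the kind of observation that motivates learning the math behind a metric before relying on it.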

Rachel Thomas from fast.ai is a proponent of this “learning on demand” approach. In teaching, she found that it matters more for deep learning students to first get excited about the material. Once that interest is established, math education becomes on-demand.

• Computational Linear Algebra by fast.ai
• YouTube: 3Blue1Brown, Essence of Linear Algebra and Essence of Calculus
• Textbook: Linear Algebra Done Right by Axler
• Textbook: Elements of Statistical Learning by Tibshirani et al.
• Stanford’s CS229 (Machine Learning) course notes

Mathematical knowledge for machine learning research

I now want to describe the mathematical mindsets that are useful for research-oriented work in machine learning. The cynical view of machine learning research points to plug-and-play systems, where more compute is thrown at models to squeeze out higher performance. In some circles, researchers remain skeptical of empirical methods that lack mathematical rigor, such as some deep learning methods.

It is worrying that future research may be built on existing systems and assumptions without extending our fundamental understanding of the field. Researchers need to contribute new foundational building blocks that can be used to derive fresh insights and approaches toward the goals of the field. Building blocks such as convolutional neural networks for image classification may need to be rethought, as Geoff Hinton, the “godfather of deep learning”, argues in his recent paper on Capsule Networks. (Link to paper: arxiv.org/pdf/1710.09…)

Next, we need to ask fundamental questions. This requires a deep fluency with mathematics, built through what Michael Nielsen, author of Neural Networks and Deep Learning, calls “playful exploration.” This process involves thousands of hours of being “stuck”, asking questions, and changing perspective as new problems are explored. “Playful exploration” allows scientists to ask deep, insightful questions that go beyond simple combinations of ideas and architectures.

Of course, you still cannot learn everything there is to learn in machine learning! Doing “playful exploration” well means following your own interests rather than prioritizing the hottest new results.

Machine learning research is a very rich field, with pressing open problems in fairness, interpretability, and accessibility. As in virtually all scientific disciplines, fundamental thinking is not an on-demand process; it takes patience to think through critical problems with the breadth of advanced mathematical frameworks.

• Blog: Do SWEs Need Mathematics? by Keith Devlin
• Confessions of an AI Researcher
• How to Read Mathematics by Shai Simonson and Fernando Gouvea
• A Mathematician’s Lament by Paul Lockhart

Democratize machine learning research

I hope I have not painted “research mathematics” as too abstruse, because ideas built on mathematical constructs should be presented in an intuitive form! Unfortunately, many machine learning papers are still riddled with complex and inconsistent terminology that makes the key intuition hard to discern. As a student, you can do yourself and the field a favor by translating dense papers into digestible ideas through blog posts, tweets, and so on. You can even take distill.pub as an example, a publication dedicated to clear explanations of machine learning research. In other words, treat demystifying technical ideas as a form of “playful exploration”!

In conclusion, I hope this article will provide readers with a starting point for thinking about mathematics education related to machine learning.

Different problems require different levels of intuition, and I encourage readers to figure out what their goal is first. If you want to build machine learning products, seek out peers and study groups organized around problems, and motivate your learning by diving into the end goal. In the research world, a strong mathematical foundation gives you a rich set of tools for advancing the field of machine learning by contributing new foundational building blocks. Math in general, and math in research papers in particular, can be intimidating, but getting stuck is an important part of the learning process.

Good luck!

About the author: Vincent Chen is a computer science student at Stanford University and a research assistant at the university’s Artificial Intelligence Laboratory.

The original link: blog.ycombinator.com/learning-ma…

This article is from Xinzhiyuan, a partner of the cloud community. For related content, follow “AI_era”.