At 9 a.m., I walk in, say good morning to my colleagues, put my food in the fridge, pour a soda, and walk to my desk. I sit down, look at my work notes from yesterday, then open up Slack, read my messages, and open every paper or blog post the team has shared. There’s something to read every day, because the field is changing that fast.

After catching up on the news, I read papers and blog posts, focusing on the ones that confuse me. Usually there’s something in them that helps with my current work. I spend an hour reading, sometimes more, depending on the material. Reading is the most basic and critical skill, and if there’s a better way to do what I’m doing, I’ll learn it and apply it. That saves me time and energy.

At 10 a.m., if a work deadline is looming, I cut back on reading, which is otherwise where a big chunk of my day goes, to catch up. I look back at what I did yesterday and at the steps I wrote down for the rest of the day. My notebook records the flow of my day.

Next comes the data work. Once the data has been processed into the form the model needs, I move on to modeling. I keep training runs very short at first; if there’s progress, I train for longer. If I hit a problem, say the data doesn’t match the problem, I fix that before anything else. And before trying new models, get a baseline. Most of my time is spent making sure the data is processed into the form the model requires.
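To make that concrete, here is a minimal sketch of what “get a baseline” can look like with scikit-learn. The file name, target column, and choice of models are illustrative assumptions, not my actual project:

```python
# A minimal baseline sketch with scikit-learn; "data.csv" and the "target"
# column are placeholders, and features are assumed to be numeric.
import pandas as pd
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv("data.csv")
X, y = df.drop(columns=["target"]), df["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Baseline 1: always predict the most frequent class.
# Any model worth keeping has to beat this number.
dummy = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
print("majority-class baseline:", dummy.score(X_test, y_test))

# Baseline 2: a simple, fast model before anything fancy.
simple = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("logistic regression baseline:", simple.score(X_test, y_test))
```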

4 p.m. is almost here. Time to relax. When I say relax, I mean clean up the code I’ve written and make it legible: add some comments, restructure it. What if someone reads my code? I ask myself this question, and usually the person reading my code is me, because I often forget the thoughts I had while writing it.

That’s an ideal day at work, but not every day goes that way. Sometimes a wonderful idea strikes at 4:37 p.m. and I get back to work. Now that you have an idea of what I do every day, let’s talk about machine learning:

As the wave of artificial intelligence continues to advance, I imagine many readers have, like me, joined a machine learning team. My work covers everything from data collection and processing to modeling and deployed services, across every industry you can think of. After a long time in this position, I’ve found that many things follow patterns. Based on my experience so far, I’ve summarized twelve things an excellent machine learning engineer should pay attention to. I hope that after reading this, it helps your machine learning practice and study!

Spend your time wisely: Data matters!

If you’re familiar with the basic principles of data science, you’ll see that solving practical problems, whatever the coding involved, is essentially dealing with data. It’s amazing how often I forget this, and how often I focus on building better models rather than improving the quality of the data.

Building a bigger model and throwing more compute at the problem can get you better results in the short term. But shortcuts have a way of catching up with you, and the trouble you deferred will land on your desk soon enough.

When you work on your first project, spend lots and lots of time familiarizing yourself with the data. I say lots because you should usually multiply the time you expect to spend by three. It will save you a great deal of time in the long run.

When you get a new piece of data, your goal should be to become the person who knows it best. Look at the distributions, identify the different types of features, find the outliers, and ask why they are outliers. If you can’t describe your data, how can you model it?
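A first pass at answering those questions can be a few lines of pandas. This is a generic sketch; “data.csv” is a placeholder for whatever you’ve just been handed:

```python
# Quick first look at a new dataset; "data.csv" is a placeholder.
import pandas as pd

df = pd.read_csv("data.csv")

df.info()                # column types and non-null counts
print(df.describe())     # distributions of the numeric features
print(df.isna().mean())  # fraction of missing values per column
print(df.nunique())      # cardinality: which features look categorical?

# A crude outlier check: values more than 3 standard deviations from the mean.
numeric = df.select_dtypes("number")
outliers = (numeric - numeric.mean()).abs() > 3 * numeric.std()
print(outliers.sum())    # outlier count per numeric column; now ask *why*
```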


Don’t underestimate the importance of communication


The vast majority of the problems I’ve encountered were not technical problems but communication problems. Yes, technical problems exist all the time, but solving them is what engineers are for. Never underestimate the importance of communication, both inside and outside your company. The worst thing you can do is solve a technical problem that didn’t need to be solved.

Why does this happen?

Externally, most of this happens because there’s a mismatch between what the customer expects machine learning can achieve and what we can actually deliver. Internally, it’s because each of us is responsible for so many parts of the company that it’s hard to stay aligned on the same goal.

Examine yourself daily

Return to the essence of the problem, and do it often. Ask yourself: do your customers understand what you can offer? Do you understand the customer’s problem? Do they know what machine learning can and cannot deliver? What communication style makes it easiest for you to present your work?

For employees within the company

People have designed a great deal of software to solve the problem of internal communication, and the sheer number of tools gives you an idea of how hard the problem is: Asana, Jira, Trello, Slack, Basecamp, Monday, Microsoft Teams.

One of the most effective practices for me is to post an update at the end of each day in the channel related to the project.

The updates include:

  • 3-4 bullet points
  • covering what I worked on
  • why I worked on it
  • and, based on that, what I’m doing next


Is this perfect? Not at all. But it seems to work: it lets me show what I’ve done and what I’m going to do. Having your plans out in the open has the added benefit that people can point it out when a plan won’t work. It doesn’t matter how good an engineer you are; what matters is your ability to communicate what your technology is and what it can deliver, and that ability goes hand in hand with maintaining existing business and creating new business.

Stability > frontier

We once had a natural language problem: grouping written content into different categories. The goal was for users to send a piece of text to the service center and have it automatically classified into one of two categories. If the model’s prediction wasn’t confident enough, the text was handed over to a human, at a volume of roughly 1,000 to 3,000 requests per day.
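The routing logic itself can be simple. Here is a hedged sketch of one way to do it, assuming a scikit-learn-style model with `predict_proba`; the `vectorizer` and the 0.9 threshold are illustrative assumptions, tuned in practice on a validation set:

```python
# Route confident predictions automatically; send the rest to a human.
# `model` and `vectorizer` are assumed scikit-learn-style objects.
def route(texts, model, vectorizer, threshold=0.9):
    probs = model.predict_proba(vectorizer.transform(texts))
    labels = probs.argmax(axis=1)
    confidences = probs.max(axis=1)
    auto, human = [], []
    for text, label, conf in zip(texts, labels, confidences):
        if conf >= threshold:
            auto.append((text, label))   # classified automatically
        else:
            human.append(text)           # not confident enough: human review
    return auto, human
```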

BERT was the most talked-about name of the year. But without Google-scale compute, it would have been very difficult to train a BERT model to do what we needed, and that was before even getting it ready for production. So we chose another method, ULMFiT. It isn’t cutting edge, but it produced good-enough results and was far easier to work with.

Rather than polishing one method to perfection, you often create more value by starting from existing models and doing transfer learning on top of them.
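With today’s fastai library, the ULMFiT approach is only a few lines. This is a rough sketch rather than our production code; the DataFrame and its `text` and `label` columns are made up:

```python
# ULMFiT-style transfer learning with fastai (v2 API); "tickets.csv" and its
# "text"/"label" columns are placeholders, not the article's actual data.
import pandas as pd
from fastai.text.all import *

df = pd.read_csv("tickets.csv")
dls = TextDataLoaders.from_df(df, text_col="text", label_col="label", valid_pct=0.2)

# Start from an AWD-LSTM language model pre-trained on Wikipedia, then
# fine-tune it as a classifier: transfer learning, not training from scratch.
learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy)
learn.fine_tune(4, 1e-2)
```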

Two difficulties in machine learning

There are two bottlenecks in putting machine learning into practice: getting from course results to project results, and getting from a notebook model to a production model (model deployment).

Searching the internet for machine learning courses returns an enormous number of results, and I used many of them to build my own “master’s degree” in AI. But even after finishing several of the best courses, when I started as a machine learning engineer my skills rested on the structured backbone of those courses, and real projects aren’t structured. I lacked specific knowledge, the things online courses can’t teach you: how to query data, how to explore it, how to develop a model from it.

How to improve?

I’ve been lucky enough to work with some of the best talent in Australia, but the real key was that I was willing to learn and willing to be wrong. Being wrong is not the goal, of course, but to be right, you have to figure out what’s wrong. If you’re taking a machine learning course, keep taking it, but apply the knowledge to your own projects so you build real expertise.

How to improve your ability at work?

My knowledge in this area is still shallow, but I’ve noticed a trend: machine learning engineering and software engineering are converging. With the growth of open source platforms like Seldon, Kubeflow, and Kubernetes, machine learning will soon become just another part of the stack.

Building a model in a Jupyter notebook is one thing; getting thousands or even millions of people to use it is another. Judging by recent discussions at Cloud Native events, most people outside the big companies have no idea how to do the latter.
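To give a flavor of the gap: serving even a small model means wrapping it in an API, handling requests, and keeping it alive. Here is a minimal sketch with FastAPI; the pickled `model.pkl` and the request schema are illustrative assumptions, not a production recipe:

```python
# Minimal model-serving sketch with FastAPI; "model.pkl" is a hypothetical
# pickled scikit-learn model taking a flat list of numeric features.
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

with open("model.pkl", "rb") as f:
    model = pickle.load(f)

class PredictRequest(BaseModel):
    features: list[float]

@app.post("/predict")
def predict(req: PredictRequest):
    # One prediction per request; real services add batching, auth, logging...
    return {"prediction": int(model.predict([req.features])[0])}
```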

The 80/20 principle

There’s an 80/20 principle in machine learning too. For us it takes the form of a 20% rule: we spend 20% of our time learning.

That learning time has proved valuable. For example, it was during the 20% learning time that we found ULMFiT was a better fit for us than BERT. The split looks like this:

  • 80% on the core product (machine learning professional services).
  • 20% on new things related to the core product.


If your advantage at work lies in being the best at what you do now, your future work will also depend on continuing to do what you do best, and that means constantly learning.

Papers need intensive reading

This is a rough indicator, but once you’ve explored a few datasets and sets of results, it becomes clear. The idea derives from Price’s law (a cousin of Zipf’s law), which states that half of all papers on a topic are written by a highly productive group roughly equal in size to the square root of the total number of authors: in a field with 10,000 authors, about 100 of them account for half the papers. In other words, out of thousands of submissions per year, you might find 10 groundbreaking papers, and of those 10, five might come from the same institute or author.

How do you keep up with the times? You can’t chase every new breakthrough. You’re better off getting a solid grasp of the fundamentals that have stood the test of time; new breakthroughs are built on top of earlier ones, and exploration proceeds from there.

Be your own skeptic

You can handle the exploration/exploitation problem by being your own skeptic. Exploration versus exploitation is the dilemma between trying new things and reusing the results you already have.

Reusing existing models

It’s easy to run a model you’ve used before, get a high-accuracy result, and report it to the team as a new benchmark. But if you get a surprisingly good result, check it again and again, and get your team to do the same, because you’re an engineer, and you’re a scientist.
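Two checks I would reach for, assuming scikit-learn-style `model`, `X`, and `y`: cross-validate instead of trusting a single split, and smoke-test for leakage by training on shuffled labels:

```python
# Sanity checks on a "too good" result; model, X, y are assumed to exist.
import numpy as np
from sklearn.model_selection import cross_val_score

# 1. Don't trust one lucky train/test split: look at the mean and the spread.
scores = cross_val_score(model, X, y, cv=5)
print(f"cv accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")

# 2. Leakage smoke test: with shuffled labels, accuracy should fall to chance.
#    If it stays high, information is leaking from the features to the labels.
rng = np.random.default_rng(42)
shuffled = rng.permutation(np.asarray(y))
print("shuffled-label accuracy:", cross_val_score(model, X, shuffled, cv=5).mean())
```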

Explore new things

The 20% rule is useful here too, though 70/20/10 may be better: spend 70 percent of your time on the core product, 20 percent building on top of the core product, and 10 percent on exploration, on wild ideas that may well not work. I haven’t tried this split myself, but it’s the direction I’m moving in.

A journey of a thousand miles begins with a single step

You can experiment on your own dataset or on a small, unrelated sample of data, as in the sketch below. The trick to success on a small team is to take one small step and then iterate quickly.
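In code, the small step can be as simple as carving off a sample before committing to full-scale runs; the file name and the 10% fraction are illustrative:

```python
# Prototype the whole pipeline on a small random slice first: mistakes are
# cheap and the feedback loop is fast. "data.csv" is a placeholder.
import pandas as pd

df = pd.read_csv("data.csv")
small = df.sample(frac=0.1, random_state=42)  # 10% slice for quick iteration

# Develop end to end on `small`; only run on `df` once everything works.
```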

Let’s play rubber duck

Most programmers have probably heard of rubber duck debugging: keep a rubber duck on your desk and explain your code to it, line by line, while debugging. It’s related to a familiar phenomenon: a friend starts asking you a question and trails off mid-sentence, because in explaining it they’ve already found the answer. In general, when you try to explain your problem to someone, you naturally push yourself to reorganize your thinking, and that works for programmers too.

The rubber duck method is something my colleague Ron taught me. When you’re stuck on a problem, sitting and staring at the code may solve it, but it may not. In that case, it’s better to restate the problem out loud to a teammate, your rubber duck.

“Ron, I’m trying to iterate through this array, and inside that loop I iterate through another array to track its states, and then I want to combine those states into a list of tuples.” “A loop inside a loop? Why don’t you vectorize it?” “Can I do that?” “Let’s see.” “…”
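For anyone who hasn’t seen the trick Ron is hinting at, here is a toy NumPy example of replacing a loop inside a loop with broadcasting; the arrays are made up:

```python
# Loop-in-a-loop vs. vectorized broadcasting: pairwise sums of two arrays.
import numpy as np

a = np.arange(3)   # [0, 1, 2]
b = np.arange(4)   # [0, 1, 2, 3]

# The loop version: a Python loop inside a loop.
pairs_loop = [[x + y for y in b] for x in a]

# The vectorized version: broadcasting does both loops in compiled code.
pairs_vec = a[:, None] + b[None, :]   # shape (3, 4)

assert (pairs_vec == np.array(pairs_loop)).all()
```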


Transfer learning is important

You don’t have to rebuild models from the ground up; this, again, follows from the convergence of machine learning engineering and software engineering. Unless your data problem is very specific, most major problems look alike: classification, regression, time series prediction, recommendation systems.

Services such as Google’s and Microsoft’s AutoML make machine learning as easy as uploading a dataset and selecting a target variable, though they are still in their infancy. If you’re a developer, all you need is a library like fastai to use state-of-the-art models in a few lines of code, and model zoos such as PyTorch Hub and TensorFlow Hub offer pre-trained models in the same spirit.
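For example, pulling a pre-trained image model from PyTorch Hub really is a couple of lines; this follows the Hub’s documented usage:

```python
# Load a pre-trained ResNet-18 from PyTorch Hub, per the documented example.
import torch

model = torch.hub.load("pytorch/vision:v0.10.0", "resnet18", pretrained=True)
model.eval()  # inference mode: ready to classify suitably preprocessed images
```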

What does that mean? Even though machine learning has become this convenient, you still need to understand the basic principles of data science and machine learning, and, more importantly, how to use them properly.

Math or code? That is the question

For the client problems I deal with, code comes first, and all of the machine learning and data science code is Python. Sometimes I dabble in the math by reading papers and reproducing them, but 99.9 percent of the time the existing frameworks have the math already implemented.

In real life, math is not as important as you might imagine, even though machine learning and deep learning are, in the end, applied mathematics. Still, knowing at a minimum matrix multiplication, some linear algebra, and calculus, especially the chain rule, remains important.
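As a quick self-check, here is the chain rule and a matrix multiply in miniature; the numbers are arbitrary:

```python
# Chain rule by hand: loss = (w*x - y)^2, so dloss/dw = 2*(w*x - y)*x.
import numpy as np

x, y, w = 3.0, 10.0, 2.0
analytic = 2 * (w * x - y) * x   # chain rule: outer derivative times inner

eps = 1e-6                       # central-difference numerical check
numeric = (((w + eps) * x - y) ** 2 - ((w - eps) * x - y) ** 2) / (2 * eps)
assert abs(analytic - numeric) < 1e-4

# And the matrix multiplication behind every dense layer:
A, B = np.random.rand(2, 3), np.random.rand(3, 4)
print((A @ B).shape)             # (2, 4)
```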

Remember, my goal is not to invent a new machine learning algorithm but to show clients whether machine learning can help their business. With a solid foundation, you can build the model that best fits your problem instead of only ever reusing existing ones.

Rapid iteration in the software industry

What worked last year may not work next year. That’s an objective fact, and it’s becoming more true as software engineering and machine learning engineering converge. But now that you’ve joined the machine learning community, let me tell you what stays the same: frameworks change, libraries change, but the underlying statistics, probability theory, and mathematics never change. The biggest challenge remains the same too: how to apply them.

With all that said, I hope these suggestions help machine learning beginners and practitioners alike. Finally, have fun, and enjoy your data journey!