01
On August 14, 2017, after two weeks of waiting, I started my internship at Meituan. I originally planned to leave after three to five months and look for my next internship, but unexpectedly I ended up spending 615 days and nights there: I became a summer intern, passed the return-offer interview, and continued interning after the New Year. The people and the work were all wonderful.
The purpose of this article is to use my internship experience to summarize the basic abilities a new algorithm engineer needs. Of course, this is based only on my own experience. If you already understand this position well, feel free to close the page. If you are still feeling lost, I hope this article helps.
From what I saw during my two years as an intern at Meituan, I think a fresh-graduate algorithm engineer should have the following abilities. Note that this list is aimed at business-oriented algorithm engineers rather than purely research-oriented ones.
1. Basic understanding of algorithms
An algorithm engineer must, first of all, have basic algorithm skills. However, with the development of deep learning, traditional methods such as SVM are now rarely used in practice. At least, I have not used them in the past two years. XGBoost and LightGBM are generally used as the baseline models, and deep learning or reinforcement learning methods are then used to improve on them. Of course, methods such as LR and FM are still in use in the recommendation domain. So it is important to master tree models. Then, depending on the direction of your job search, learn the relevant deep learning models and some of their common tricks, such as dropout, regularization, and batch normalization.
As for mathematics, I feel that the rise of deep learning has started to relax the requirements, but you still have to master the basics: gradient descent, backpropagation, and the derivations of tree models, LR, FM, and so on.
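To make the "LR derivation plus gradient descent" point concrete, here is a minimal sketch in plain Python. It is only my illustration, not code from Meituan: the function names (`sigmoid`, `log_loss`, `train_lr`), the learning rate, and the toy data are all assumptions. The key result of the derivation is that, for log loss with a sigmoid output, the gradient with respect to the linear score is simply `p - y`.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(w, b, xs, ys):
    """Average negative log-likelihood of logistic regression."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
        p = min(max(p, 1e-12), 1 - 1e-12)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(xs)

def train_lr(xs, ys, lr=0.5, epochs=500):
    """Batch gradient descent; hyperparameters are illustrative only."""
    w = [0.0] * len(xs[0])
    b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
            err = p - y  # d(loss)/d(score) for sigmoid + log loss
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Toy, made-up data: one feature, label is 1 when the feature is large.
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [0, 0, 1, 1]
w, b = train_lr(xs, ys)
```

If you can write this from memory and explain where `p - y` comes from, the LR part of most interviews should not be a problem.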
2. Strong coding ability
For an algorithm engineer, Python is certainly the most convenient language, but is Python enough? Of course not. An algorithm engineer covers both algorithms and engineering. Python helps us implement an algorithm, but we also need a language for putting that algorithm online. At Meituan, we use Java. As an intern, I have not had the chance to work on code that goes online, but I will be responsible for online code in the future, so I need to master Java well.
It is important to note that if you think data structures only matter for interviews and not for an algorithm engineer's actual work, you are making a big mistake. It is easy to end up using data structure knowledge without even noticing. For example, the trie is the structure I have used most in the past two years. Mastering data structures is important both for interviews and for your future job.
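For readers who have not used a trie before, here is a minimal sketch in Python. It is a generic textbook-style version, not the code I used at work; the class and method names (`Trie`, `insert`, `contains`, `starts_with`) are my own choices for illustration.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # maps a character to the next TrieNode
        self.is_word = False  # True if a stored word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_word = True

    def _find(self, prefix):
        """Return the node reached by following prefix, or None."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word):
        node = self._find(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):
        """Return all stored words that begin with prefix, sorted."""
        node = self._find(prefix)
        if node is None:
            return []
        results = []
        stack = [(node, prefix)]
        while stack:
            cur, text = stack.pop()
            if cur.is_word:
                results.append(text)
            for ch, child in cur.children.items():
                stack.append((child, text + ch))
        return sorted(results)


trie = Trie()
for w in ["tea", "team", "ten", "to"]:
    trie.insert(w)
```

Prefix lookup costs time proportional to the length of the prefix rather than the number of stored words, which is why the structure shows up so often in things like query completion and dictionary matching.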
3. Data processing and analysis ability
Data processing and analysis skills are a must for a business-oriented algorithm engineer. First, we usually set ourselves a business goal, such as an overall conversion rate, then break it down into stages and look at what happens in each stage. For example, the overall conversion rate of a takeout order can be divided into several stages: exposure, the user entering the merchant page, the user entering the order submission page, and finally the completed payment. Once a problem occurs at any stage, we can quickly locate and solve it. Second, data and features determine the upper limit of machine learning, while models and algorithms only approximate that limit. When using machine learning or deep learning models, we must extract our training data from big data, compute the corresponding features, and analyze the possible relationships between the features and our targets.
During my first two months at Meituan, I had almost no contact with anything model-related. I was mainly responsible for building a complete framework of product data metrics, such as the funnel model and retention rate. It felt boring at the time, but looking back, it was very important for my own growth.
Data sensitivity can be developed by working with data, but the basic tools can be learned in advance. At Meituan, I used Spark SQL and Hive the most; they took up about 70% of my time. For Hive, learn the common functions and expressions, such as concat_ws, row_number, case ... when, if, and get_json_object. For Spark SQL, learn the basic operating principles and how to troubleshoot common problems. First, learn how to deal with data skew: sometimes a whole day is wasted debugging a Spark job because of it. Second, learn how to minimize the resource footprint of your Spark tasks and speed them up. The fewer resources your task occupies and the faster it finishes, the more resources are left for everyone else. Spark programs can be written in several languages, but I recommend learning Scala because it interoperates seamlessly with Java. In addition to Spark and Hive, some knowledge of Excel is also necessary.
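The funnel breakdown described above can be sketched in a few lines of Python. This is only an illustration: the stage names follow the takeout example, but the user counts are made up, and `funnel_report` and `weakest_step` are hypothetical helper names.

```python
def funnel_report(stage_counts):
    """stage_counts: list of (stage_name, user_count) in funnel order.

    Returns rows of (name, count, step_rate, overall_rate), where
    step_rate is relative to the previous stage and overall_rate is
    relative to the first stage.
    """
    report = []
    first = stage_counts[0][1]
    prev = first
    for name, count in stage_counts:
        step_rate = count / prev if prev else 0.0
        overall_rate = count / first if first else 0.0
        report.append((name, count, step_rate, overall_rate))
        prev = count
    return report


def weakest_step(report):
    """Name of the stage with the lowest step conversion rate.

    The first stage is skipped because its step rate is 1.0 by definition.
    """
    return min(report[1:], key=lambda row: row[2])[0]


# Made-up numbers, for illustration only.
stages = [
    ("exposure", 10000),
    ("merchant_page", 2000),
    ("order_page", 500),
    ("payment", 400),
]
report = funnel_report(stages)
```

With a table like this, a drop in the overall conversion rate can be traced to the specific stage whose step rate changed, instead of guessing.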
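One common way to handle data skew is "salting": spread a hot key over several sub-keys, aggregate partially, then merge. The sketch below shows the idea in plain Python rather than in Spark, so it is only a conceptual illustration; `salted_count`, `num_salts`, and the sample data are my own assumptions, and in a real Spark job the two stages would be two separate aggregations.

```python
import random
from collections import Counter


def salted_count(records, num_salts=4, seed=0):
    """Count keys in two stages so that no single key forms one huge group."""
    rng = random.Random(seed)

    # Stage 1: attach a random salt to each key, so a hot key is split
    # across up to num_salts partial groups that could run in parallel.
    partial = Counter()
    for key in records:
        salted_key = (key, rng.randrange(num_salts))
        partial[salted_key] += 1

    # Stage 2: drop the salt and merge the partial counts per original key.
    final = Counter()
    for (key, _salt), count in partial.items():
        final[key] += count

    return partial, final


# Made-up skewed data: one hot key and two rare keys.
records = ["hot"] * 1000 + ["a", "b"]
partial, final = salted_count(records)
```

The final counts are the same as a direct count, but the largest group processed in stage 1 is only a fraction of the hot key's total, which is what relieves the single overloaded task.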
4. Model accumulation and migration ability
In our work, it is also necessary to keep accumulating knowledge of models. There are many models now, such as Transformer, BERT and DIEN, all of which were proposed in recent years. We need to keep following and accumulating them. It is not enough to know what these models do; you also need to understand why they do it that way. Nor is it enough to merely accumulate models: you have to combine them with your own business. If you can quickly come up with a feasible solution to a business problem, your accumulation is adequate; if you cannot think of a suitable model, you still have a long way to go. Also, whenever you learn a new model, reflect on whether some of its ideas can be applied to your business problems to improve existing models.
So, having said all that, how can we keep up with the latest papers despite a busy work life? Personally, I mainly rely on the following channels:
1. The PaperWeekly official account and authors who write original content. (In my personal view, many popular official accounts are much the same: book giveaways, dataset giveaways, voting campaigns, and advertisements. They focus more on gaining followers, and I don't think they bring much value. This is only my personal view; maybe I just haven't seen enough, hee hee.)
2. Collecting terms from study groups. In my own study and discussion groups, people often mention models I don't know, such as the deep tree matching model a few days ago. If you don't understand something, write it down and study it in detail afterwards.
3. Talks and exchanges. At Meituan, some internal talks are very valuable. For example, Zhang Junlin gave a talk on recommendation systems, and Wang Zhongyuan gave a talk on knowledge graphs. If you don't have the chance to attend in-house events like these, sign up for free seminars such as those run by the DataFun community.
5. Product capability
As the saying goes, everyone is a product manager, and we R&D engineers can sometimes take the lead too. We do not have to simply accept every requirement from the product manager; we need our own understanding of the product needs. I have to admit that I am still lacking in this respect. Not everyone is born with product thinking, but as long as you observe more, think more, and ask more, I believe your product sense will improve. The key is accumulation.
6. Soft skills
Soft skills include thinking, communication, expression, cultural literacy, learning ability, teamwork and so on, all of which we need to cultivate deliberately at work.
02
The above is my basic understanding of the algorithm engineer position after two years of interning at Meituan. If you already have a good offer, you can stop reading here. But if you are still preparing for this year's or next year's campus recruitment, I hope you will read on, as I share some of my experience from last year's recruitment season:
1. Written test
1. Multiple-choice questions: these cover a lot of content, including basic programming knowledge (such as C++ and Java), probability theory, machine learning fundamentals, deep learning fundamentals, data structures, and so on.
2. Programming questions:
3. Short-answer questions: these mainly involve deriving formulas by hand and business understanding.
2. Interviews
Last year, I interviewed with more than a dozen companies and received about 10 offers. My general impression is that companies' requirements for algorithm engineers keep getting higher.
1. Internships, papers and competitions: the interviewer will first ask you to introduce yourself, and then ask questions based on the internship, paper and competition experience on your resume. So make sure your resume contains substantive experience and describe it using the STAR method. During this part, the interviewer evaluates your understanding of the algorithms, of the business, and so on.
2. Deep learning/machine learning fundamentals: after discussing the projects on your resume, interviewers often test the fundamentals of some algorithms, such as boosting models. This requires a solid grasp of deep learning/machine learning fundamentals and an in-depth understanding of common models. You also need to be able to derive the simpler formulas (LR, backpropagation in an ordinary neural network, forward propagation of RNN and LSTM, SVM, XGBoost, etc.).
3. Whiteboard coding: the difficulty varies from company to company, but being comfortable with medium-difficulty LeetCode problems is generally enough. My advice is to master the array, linked list, binary tree and dynamic programming topics.
4. Brain teasers: these are usually probability calculation problems.
5. Business understanding: I find this part very difficult. Generally, the interviewer gives you a scenario and asks you to design an end-to-end algorithm pipeline, or asks about your current project, the direction of follow-up work, and so on.
6. Others: some interviewers may also test engineering topics such as multi-processing, multi-threading, Spark/Hive, etc.
3. Interview write-ups
1. Alibaba (all rounds, failed): www.jianshu.com/p/304e1023c…
2. Baidu core search department (three rounds, failed): https://www.jianshu.com/p/02d931d5c1c8
3. Zhihu (three rounds, failed): www.jianshu.com/p/40259bb05…
4. ThoughtWorks (g + sp test): https://www.jianshu.com/p/0b5514908683
5. Meituan return-offer interview (one round, passed): www.jianshu.com/p/bbe21ff40…
6. Bianlifeng (three rounds, passed): https://www.jianshu.com/p/51e2d16f16a5
7. Beike (three rounds + SP interview, passed): www.jianshu.com/p/cd0a809cf…
8. Yidian Zixun (three rounds, passed): https://www.jianshu.com/p/fffc15c9d31d
9. Baidu Feed (failed in the second round): www.jianshu.com/p/65032f77f…
10. Maoyan (three rounds, failed): https://www.jianshu.com/p/c32787be3dc8
11. Sogou (three rounds, passed): www.jianshu.com/p/8a116eb7f…
12. Xiaomi (two rounds, passed): https://www.jianshu.com/p/e34ebebae15f
13. Didi (three rounds, passed): www.jianshu.com/p/bc9d5f820…
14. Pinduoduo (two rounds, passed): https://www.jianshu.com/p/a15bc7d0686a
15. iQIYI (three rounds, passed): www.jianshu.com/p/4ceb5de29…
16. Toutiao (three rounds, passed): https://www.jianshu.com/p/5bc533d1bf62
As you can see, I still have many shortcomings. There were two main reasons for the failed interviews. First, I had not mastered data structure problems well enough and had not learned to generalize from one problem to similar ones, as in the Baidu and Zhihu interviews. Second, I did not understand the business deeply enough and was still at the stage of simply working hard on whatever my manager assigned. My lack of thinking about the business led to failures such as the Alibaba and Maoyan interviews.
4. Recommended materials
Here are some materials to help you review:
1. "Statistical Learning Methods": the classic among classics. I suggest reading it at least twice!
3. Deep Learning 500 Questions: github.com/scutan90/De…
4. SVM: http://blog.pluskid.org/?page_id=683
5. Li Hongyi, deep learning: www.bilibili.com/video/av977…
6. Li Hongyi, reinforcement learning: https://www.bilibili.com/video/av24724071?from=search&seid=11841282802558935758
7. Li Hongyi, machine learning: www.bilibili.com/video/av359…
8. Essence of Linear Algebra: https://www.bilibili.com/video/av44855426?from=search&seid=15873340646320697328
Final words
Having said so much, I hope you will get the offer you want this autumn or in some autumn to come. I also hope you will stay true to your original aspirations and keep forging ahead. Come on!
If you liked this article, don't forget to follow ~
Let’s make progress together with Wenwen!