Hi, everyone. My name is Chris. I worked as an algorithm engineer at a publicly listed game company for five years before joining Alibaba. \
I am currently responsible for algorithm work at Alibaba, covering CV, NLP, architecture, and more, across business lines that include advertising, operations, customer service, and risk control.
\
Why is it so hard to hire algorithm engineers?
\
In the eyes of outsiders, an algorithm engineer picks up a new paper recently published by some big name, or works out a theory of their own, derives the formulas, produces theoretical results, implements them with parallel programming to support large-scale training, and then beats the existing model, raises CTR by 200%, raises revenue by 200%, and earns millions a year. But here's the reality:
\
The ideal algorithm engineer: propose hypotheses -> collect data -> train models -> interpret results.
The algorithm engineer in practice: propose hypotheses -> collect data -> preprocess -> train model -> debug -> debug -> collect new data -> preprocess -> collect more data -> debug -> debug -> debug -> … -> give up
\
As the head of the algorithm department, I have interviewed many candidates. I generally evaluate them on logical thinking, basic algorithms and data structures, mathematics, deep learning, communication skills, and engineering experience.
\
I have found that many people only think they know algorithms: they skim the "watermelon book" (Zhou Zhihua's Machine Learning textbook) and dare to show up for an interview. There are also fresh graduates with a solid mathematical foundation and good algorithm knowledge who may have written fewer than 1,000 lines of code in three years, so their hands-on ability is very poor.
\
After interviewing several young people with excellent resumes, I was surprised to find that many beginners do not understand the actual workflow of a data mining or algorithm engineer, which skews the skills they develop. This is why companies receive more and more resumes but find only one or two usable candidates, whose asking price is 50% over budget and who, even once painfully signed, may still be poached by competitors.
\
So what does the workflow of an algorithm position actually look like?
\
Let's start with the flow of a small NLP project to give you an idea of the broader shape of machine learning projects:
\
1. Understand requirements and obtain data. Meet with product and operations staff to understand the requirements, then pull the large volume of data the company has accumulated, along with data you download or crawl from the Internet. \
\
2. Data preprocessing. Data processing probably accounts for 50%-70% of the total workload. Corpus preprocessing consists of data cleaning, word segmentation, part-of-speech tagging, and stop-word removal.
\
3. Feature engineering. After corpus preprocessing, we need to decide how to represent the segmented words in a form a computer can compute on. Two representations are commonly used to convert segmented Chinese text into numbers: the bag-of-words model and word vectors.
\
4. Feature selection. Building a good feature vector requires selecting suitable, highly expressive features. Feature selection is challenging and relies heavily on experience and domain expertise, although many ready-made feature selection algorithms exist.
\
5. Model training. Different application requirements call for different models. Traditional supervised and unsupervised machine learning models include KNN, SVM, Naive Bayes, decision trees, GBDT, and K-means. Deep learning models include CNN, RNN, LSTM, Seq2Seq, FastText, and TextCNN.
\
6. Evaluation metrics. The trained model must be evaluated before it goes online, to confirm that it generalizes well to text it has not seen.
\
7. Online deployment. The model is trained offline, then deployed online and published as an interface service for business systems to call.
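\
Steps 2, 3, 5, and 6 above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions, not production code: the toy English reviews, the `STOP_WORDS` set, and every function name are invented for the example; a simple regular expression stands in for real Chinese word segmentation; and multinomial Naive Bayes with add-one smoothing is used only because Naive Bayes is one of the models listed.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real project would load a curated one.
STOP_WORDS = {"the", "a", "is", "was", "and", "to", "this"}


def preprocess(text):
    """Step 2: clean, tokenize, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_vocab(token_lists):
    """Collect every distinct training token in a fixed (sorted) order."""
    return sorted({t for tokens in token_lists for t in tokens})


def bag_of_words(tokens, vocab):
    """Step 3: represent a document as a vector of word counts."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]


def train_naive_bayes(X, y):
    """Step 5: multinomial Naive Bayes with add-one (Laplace) smoothing."""
    n_features = len(X[0])
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        totals = [sum(col) for col in zip(*rows)]  # per-word counts in this class
        denom = sum(totals) + n_features
        model[label] = {
            "log_prior": math.log(len(rows) / len(y)),
            "log_like": [math.log((t + 1) / denom) for t in totals],
        }
    return model


def predict(model, x):
    """Return the label with the highest log posterior score."""
    def score(label):
        params = model[label]
        return params["log_prior"] + sum(
            c * ll for c, ll in zip(x, params["log_like"])
        )
    return max(model, key=score)


# Toy labeled data, invented for illustration.
train_texts = [
    "The delivery was fast and the service is great",
    "Great product, fast refund",
    "The app is slow and the support was rude",
    "Slow delivery, rude service",
]
train_labels = ["pos", "pos", "neg", "neg"]
test_texts = ["fast and great support", "rude and slow"]
test_labels = ["pos", "neg"]

train_tokens = [preprocess(t) for t in train_texts]
vocab = build_vocab(train_tokens)
X_train = [bag_of_words(t, vocab) for t in train_tokens]
model = train_naive_bayes(X_train, train_labels)

# Step 6: evaluate on held-out examples before any deployment.
X_test = [bag_of_words(preprocess(t), vocab) for t in test_texts]
predictions = [predict(model, x) for x in X_test]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(predictions, accuracy)
```

Words that never appeared in training are simply ignored at prediction time, because `bag_of_words` only counts vocabulary entries. In a real project a library would replace most of this code, and step 7 would wrap something like `predict` behind a service interface.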
\
From a business-process perspective, a machine learning project basically runs: understand business needs -> survey industry solutions -> check whether they apply -> measure the online effect. It is not hard to see that two things matter a great deal to any specific business: how algorithm engineers raise their machine learning skills through practice, and how the practical application of machine learning and deep learning raises the enterprise's business performance and revenue.
\
I often say that algorithms are just tools; what matters is achieving business goals with a sound understanding of the industry and the product.
\
So the fear that algorithm engineers will be replaced by their own algorithms is ludicrous. Machines can do a lot, but they cannot replace a person's understanding of data, and that is where the algorithm engineer's value lies. Deep learning can replace manual feature extraction to some extent, but at most it solves the problem of feature transformation; it still cannot handle the parts of data cleaning and preprocessing that require domain knowledge.
\
In my experience, an algorithm engineer is a generalist who combines the roles of technologist and product manager.
\
For students and practitioners from other fields, crossing over is an advantage rather than an obstacle. This is especially true if you are an ordinary programmer in another field (physics, engineering, chemistry, medicine, agriculture, satellite image recognition, network security, the social sciences) with a deep theoretical and experimental background in that field and access to large amounts of data. You can then do innovative work there; this is what "AI+" talent means.
\
There are many machine learning courses and textbooks out there, but most of them teach how to build an oven from scratch rather than how to cook and invent recipes. This learning path is not only hard: 90% of learners who follow it never go deep in any one direction, lack core competencies, and do not match what enterprises look for in talent.
\
The industry’s best AI boot camp
\
To help more beginners understand the workflow of machine learning, data analysis, and data mining and find their entry point, I have invited two experts from different areas of artificial intelligence: @Panda Jiang, a data mining engineer at a BAT company, and @Angela, a computer vision expert. Together with @Chris, a senior algorithm engineer at Alibaba, they will hold four consecutive introductory sharing sessions on artificial intelligence, built around their own day-to-day workflows.
\
\
Each of them will teach from their own area of strength: Python data analysis, machine learning theory, the mathematics of machine learning, algorithms, and workflow. Alongside the theory, the concrete workflows of large companies are used to work backward to a learning route. It is a rare entry-level course that aims to give AI enthusiasts and career changers a solid foundation.
Benefit one: sharing sessions
Free introductory sharing sessions on artificial intelligence
Suitable for: entry-level, junior and intermediate students
1. 6. Smart refrigerator
Algorithm workflow at top-tier companies explained, plus a machine learning learning route
2. June 13 at 20:30
Mathematical basis for the derivation of common algorithms in machine learning
3. June 16th at 20:30
Starting from scratch, 90 minutes of introduction to Python data analysis
4. June 18th at 20:30
Alibaba expert: neural network principles and an artificial intelligence employment guide
(* Recordings are available; add the teaching assistant: BT474849)
Learning format: \
\
1. Live broadcast
2. One-on-one homework correction
3. Teaching assistant support
4. Group Q&A
5. Completion test
\
This sharing session will answer the following questions:
\
Am I suited to studying artificial intelligence? I am a medical student; what is the employment situation in medical AI now? How do data analysis, data mining, and algorithm engineer roles differ, and what capability model does each have? How deeply does an algorithm engineer need to understand an algorithm? Are model selection and parameter tuning techniques universal? What are the application scenarios of deep learning algorithms? …… (All your questions will be answered here!)
\
Don't miss an opportunity at the stage when you should be growing fast. Join this training camp: front-line tutors will answer your questions online wholeheartedly, and your peers will supervise and encourage each other. This session also offers free help analyzing the direction of your career development!
\
Praised by the first 300 students!
▲ Feedback on the first session of the course, plus group discussion and Q&A
\
Benefit two: study materials
Video course: Machine Learning from Entry to Actual Combat
\
In addition, the first 500 students who successfully sign up for this course will receive the video course Machine Learning from Entry to Actual Combat, created by tutors from Peking University, Tsinghua University, Shanghai Jiao Tong University, and other well-known universities together with front-line engineers from large tech companies. The video course is worth 1388 yuan. It contains essential, practical videos in five categories: Python basics, data analysis, big data, machine learning, and hands-on projects. The courseware and source code can be downloaded. The catalog follows.
“Machine Learning from Entry to Actual Combat” video course \
Five chapters, 63 lectures
\
Fundamentals of Linux and Python programming
1. Install a VMware virtual machine
2. Install CentOS 6.9
3. Use basic Linux commands
4. Introduction to Python
5. Python installation
6. Install Python
7. First Python program
8. Use of PyCharm
9. Variables, integers, floating point, and strings
10. Null values, Booleans, lists, tuples, dictionaries, sets
11. If conditional statement, input function
12. Loop statements
13. Function introduction, function definition, function call, function parameters
14. The return value of the function
15. Global and local variables
16. Framework of student management system
17. Student management system: writing the add and view modules
18. Student management system: modify and delete, homework
\
Python data analysis
19. Introduction to Python Data Science
20. Introduction to common Python libraries
21. Data analysis environment construction
22. NumPy data types and index handling
23. NumPy API and matrix operations
24. NumPy advanced features and universal functions
25. pandas overview and Series
26. pandas DataFrame in detail
27. DataFrame and Series indexes
\
Big data and data processing
28. What is big data
29. The relationship between big data, artificial intelligence and machine learning
30. Data volume and high concurrency (does high concurrency necessarily mean large data volume?)
31. Introduction to Hadoop: HDFS overview, architecture, hands-on drill
32. Introduction to Hadoop: MapReduce, WordCount example, framework flow
33. Introduction to Spark: environment setup, cluster installation, and example demonstration
\
Introduction to machine learning
34. Introduction to machine learning
35. Machine learning development environment
36. Introduction to machine learning IDEs
37. Basic theory and philosophy of machine learning
38. Machine learning algorithm classification
39. Machine learning common tasks
40. Data cleaning
41. Standardization of data
42. Data standardization in practice with Python and sklearn
43. Similarity measurement in machine learning
44. The KNN algorithm
45. Case: Iris flower data classification based on KNN (sklearn)
46. Case: Iris flower data classification based on KNN (Python)
47. Unary linear regression
48. Multiple linear regression
49. Polynomial regression
50. Sklearn linear regression practice
51. Python linear regression practice
52. Case: Advertising revenue analysis based on linear regression
53. Logistic regression classification algorithm
54. Handling multi-class problems with binary classifiers
55. Case: Iris flower data classification based on logistic regression (sklearn)
56. Case: Iris flower data classification based on logistic regression (Python)
\
Machine learning in actual combat
57. Preface
58. Preparation
59. High-end but generic word clouds
60. DCGAN face image generation
61. Stock price forecast
62. TensorFlow object detection
63. Deep Dream
Of course, any materials are only an aid. The most important thing is to follow the teachers in hands-on practice, learn how front-line developers think about AI, understand the concrete workflow at large companies, and take a solid first step into artificial intelligence! \
\
How to claim the benefits: \
Scan the code to add the teaching assistant on WeChat
Reply “artificial intelligence” to claim them
Bonus 1: AI boot camp place
Bonus 2: AI introductory and advanced video resources