By Zeng Jia, a data analyst at Ant Financial

Introduction: If you are learning data analysis from scratch, how much do you need to learn before you can look for a job? A data analysis expert at Ant Financial starts from three key questions to help you work out systematically how to enter the data analysis industry.

Many people advise newcomers to light up a lot of skill trees: Excel, statistics, SQL, R, Python, Hadoop, machine learning, visualization, and so on. I’m not a big fan of this, for a simple reason: if the purpose is to “find a job”, the core goal is to “get into the industry quickly” rather than to “learn systematically”, so it is not appropriate to study so many fields at the beginning.


Next, let’s “think backwards” and work through the problem step by step.


Question 1

What kind of company do I want to work for?


In my mind, what does a data analyst’s day-to-day work look like?

This question is very important, even more important than what to learn, because positions that share the title “data analyst” can involve very different work: some are similar to a “business consultant”, some to a “data warehouse engineer”, some to a “machine learning engineer”, some are little more than “data checkers”, and so on.


These jobs are all called “data analyst”, yet they not only involve completely different work but also require completely different skill trees. If you don’t know which skill tree each of them requires, you’re likely to be blindsided when you look for a job.



Here’s an example. At some of the big Internet companies (Alibaba, Tencent, etc.), there is a position called “business data analyst”. It sounds fancy, but what if I told you that many of the people in this position don’t need statistics, don’t need Python, and certainly don’t need machine learning? All they need is SQL and “business sense”. Are you surprised and confused? But that is how it is.


It has to be said that not all data analysts need fancy tools. The core goal of most data analysts is to solve problems (which tool is used in the solution is not important), and the key to solving a problem is often understanding the business. This is why many companies emphasize the importance of “business sense” when recruiting. If you think that having many skills is what makes you good, that is typical student thinking.


P.S. You may have another question: what does “business sense” mean for a newcomer? In other words, how can an interviewer tell whether a junior candidate has business sense? According to my observation, domestic Internet companies mainly look at four things:

  1. Whether your school and major are competitive (quantitative or economics majors are preferred);

  2. Whether you have relevant internship experience;

  3. Whether you show logic and structure when describing your experience and answering questions;

  4. Whether your views on current Internet news are interesting and original.



Question 2

If I already have a company in mind, how do I figure out what the job there actually involves?


And, from that, which skills do I need to master?

An easy way to do this is to go directly to a job site and read the JD (job description), but this method does not generalize particularly well. In my opinion, we can judge the skills required by a “data analysis” position along five dimensions: company type, company size, company business, company stage, and company style. Each is described below.




1. Company type

The job of a data analyst is quite different in different types of companies:

  • In traditional companies, a data analyst is closer to a “business analyst”. They don’t need to deal with much raw data. Instead, they integrate and analyze existing data to support business development. For this kind of company, statistics is important, and learning data processing tools such as Excel, R, and Tableau is also necessary.
  • In many Internet companies, data analysts may need to deal with more raw data, so data cleaning tools such as SQL, Python, and Java are more important.
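
To make “dealing with raw data” a little more concrete, here is a minimal Python sketch of the kind of cleaning step an analyst at an Internet company might write. It is only an illustration: the log lines, the field names (`user_id`, `event`, `ts`), and the `clean_events` helper are all hypothetical and not taken from any real system.

```python
import json
from collections import Counter

# Hypothetical raw event log lines; the field names are invented for illustration.
RAW_LINES = [
    '{"user_id": "u1", "event": "click", "ts": "2018-01-01T10:00:00"}',
    '{"user_id": "u2", "event": "CLICK ", "ts": "2018-01-01T10:01:00"}',
    'not valid json',
    '{"user_id": null, "event": "pay", "ts": "2018-01-01T10:02:00"}',
    '{"user_id": "u1", "event": "pay", "ts": "2018-01-01T10:03:00"}',
]


def clean_events(lines):
    """Parse raw lines, drop unusable records, and normalize event names."""
    cleaned = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict) or not record.get("user_id"):
            continue  # skip records without a usable user id
        event = str(record.get("event", "")).strip().lower()
        if not event:
            continue  # skip records without an event name
        cleaned.append(
            {"user_id": record["user_id"], "event": event, "ts": record.get("ts")}
        )
    return cleaned


events = clean_events(RAW_LINES)
# Count how often each normalized event type occurs.
event_counts = Counter(e["event"] for e in events)
print(event_counts)
```

In practice the same logic would more likely be written in SQL or run on a data platform, but the steps (parse, filter, normalize, aggregate) are similar.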

2. Company size

There is a clear difference in the “breadth” of what “data analysts” do at small and large companies.


In large companies, there is a clear division of labor, and “data analyst” actually covers a series of different roles:


  1. Closest to the raw data and furthest from the business is the data warehouse engineer (known by many other names, such as data engineer, data fusion engineer, or ETL engineer). Their work is mainly to clean and pre-process the behavioral data that the technical teams collect from users and merchants so that it becomes structured. This is closer to a technical position, and comparatively speaking, the work is simpler.
  2. Further from the raw data and closer to the business is the business data analyst (also known as Business Intelligence, or BI), whose job is to extract the right business data and produce reports and insightful analysis. This kind of position may need to deal with many complicated metric definitions, and uses SQL and Tableau or Excel depending on the company’s reporting system. More important, though, is providing effective input to the business side. Because this position links data and business, it requires very strong collaboration skills.
  3. Neither close to the raw data nor close to the business is the data mining engineer (with branches such as algorithm engineer and machine learning engineer). These roles often don’t touch the raw data and are not on the front line of the business, but they provide the business with indirect capabilities, such as judgment (e.g., whether two users are classmates), prediction (e.g., predicting which users will generate business risk), and recognition (e.g., judging whether a picture shows a cat). This kind of work is independent and creative, but also demanding.




In small, lean companies, however, the division of “data analyst” roles may be less clear. With limited headcount, the company cannot staff every function, so it wants to hire a “full-stack data analyst” (or, to use a cooler name, a “data scientist”). From data extraction to presentation of results, a “full-stack data analyst” needs to understand every link clearly, so only employees with strong all-round ability are up to the job.


3. Company business

The company’s business has a significant impact on what a “data analyst” job entails:

  • In a company whose business is relatively focused on one vertical, there are relatively few data sources and data types. We don’t need to spend much energy on data preprocessing and can pay more attention to using the data from multiple dimensions and mining valuable information. This work is more exploratory and closer to that of a “data mining engineer”.

  • In a company with a complex, diversified business, the “data analyst” position is more delicate. A diversified business changes faster, so a “data analyst” in the general sense tends not to work on the same business for a long time. (An analyst who is permanently attached to one line of business is usually called “operations”. Don’t think an operations post is not impressive enough: good operations staff are also very good at data analysis and are of great value to the company.) The ability to produce data quickly therefore becomes very important. In this situation, we also need a high-quality data system and, more importantly, “data products”. Tableau is an excellent data product, and many large companies also design their own data products to meet business needs. The need to build data products has created two new positions: “data development engineer” and “data product designer”.


4. Company stage

The stage of the company will affect the direction of the “data analytics” work:

  • In a startup, the data system has not been built yet and there is usually almost no data, so you can’t expect to use models to do fancy analytics. Working with the engineering team to get the right data collected is your first priority. It may seem boring, but it is incredibly important, and if you do it well, you will soon have the opportunity to become one of the company’s most indispensable employees. After all, you are the only outlet for all of the company’s data.
  • In a mature company, the underlying data infrastructure is already well established, and as an entry-level employee you don’t need to change anything. With patience, you can get basically any data you want, but tracking down the right metric definitions can be very time-consuming and laborious, and cleaning the data can take a long time. On the other hand, once you have gathered a large amount of data, why not move up to models? Algorithms, machine learning, everything is available, and you can swim in the ocean of data to your heart’s content.


5. Company style

Finally, a word about corporate style. There are two types of company styles associated with data analysts: “data driven” and “business driven.”


In “data-driven” companies, we look at enough data, find interesting points in it, and then analyze them to decide what to do next. In “business-driven” companies, we decide what business to do first, and then decide what data we want.


This difference in style can make a huge difference to the status of a data analyst. In “data-driven” companies, data analysts have a high status because they determine the company’s KPIs. In “business-driven” companies, data analysts without a good leader can end up as mere “number-crunchers”.


The not-so-good news is that there are very few truly data-driven companies in China today, and this is especially true of some big companies. Although they claim to be “data-driven”, in practice their data analysts are often led by the nose by the business and play a subordinate role. Therefore, if you have the chance, it is best to do an internship before committing to a full-time job, so that you can avoid the pitfalls.


Well, after reading this, you’ll have a pretty good idea of what kind of data analyst you want to be, and you’ll have something to focus your studies on.


Question 3

If I really have no foundation, how do I get started?

Going back to the beginning of this answer: many respondents recommend a pile of books, such as “simple data analysis” and “simple SQL”. These books are certainly good, but according to my observation, self-study through reading suits talented people best, and most people find it hard to learn well from reading alone. For true beginners, online learning platforms are the preferred way to get started quickly from zero. In an estimated 300 hours or so, you can build a good foundation.




As for platform choice, although domestic education platforms are booming, I prefer overseas platforms because they started earlier and are more fully developed.


Coursera is well known for its many courses on data analysis, which suit people with different needs. Most of the lecturers are professors from famous North American universities, and the lectures explain the material well. However, most of the courses are in English, which is something of an obstacle for students whose English is weak.


Udacity is one of the three major online learning platforms, and the quality of its courses is high. Most of the lecturers come from Silicon Valley Internet giants such as Facebook and Google, and the courses are fully localized into Chinese. The course projects and the Chinese-language teaching assistant and tutor services can also help you learn. For example, the Data Analysis course will help you learn Python, SQL, and statistics from scratch.


The advanced courses go on to cover R, Tableau, and more. After a total of about 300 hours of serious study, you can lay a very solid foundation in data analysis. For the “business data analyst” positions at big Internet companies mentioned above, there are also corresponding courses that meet such job requirements without requiring any programming.






Build a foundation first, then take some advanced courses targeted at the company and position you want. With a well-planned “skill tree” growth path like this, as long as your educational background is not too weak, you should be able to qualify for most entry-level data analyst positions at most companies.


Of course, whatever course you take, it is not easy to learn it all. But this is a career choice, after all, so it is definitely worth being prudent and serious about it and investing more energy, or even money.


A Silicon Valley university that lets you master new skills in your spare time

Udacity was founded by Sebastian Thrun, who also founded Google’s self-driving car project. It has partnered with leading companies such as Google, Facebook, and Amazon to create a series of cutting-edge technology courses, and it also provides human project review and one-on-one online Q&A services. It aims to give everyone access to the latest and most in-demand Silicon Valley technology education at a much lower cost than offline education, and to help students become sought-after talent who can drive innovation and change in their companies.