Does a resume full of successful projects really count with an interviewer? Experience says: not necessarily. Haebichan Jung, project director at Towards Data Science and a data scientist at Recurly, recently wrote about his experience. He says that completing many projects, and doing them well, can help at the resume screening stage, but the interviewer may not care about your projects at all and will decide whether you stay or go through an “intelligence test”.
Selected from Towardsdatascience, by Haebichan Jung, Panda W, Zhang Qian.
The “project” in question is a side project: some recent machine learning or deep learning algorithm implemented in a Jupyter Notebook and uploaded to GitHub, in the hope that it will impress the interviewer.
Project mentality (noun): the belief that the more machine learning projects you complete and list on your resume, the better your chances of landing a high-paying data science position. The truth is that a long list of projects doesn’t convince many people that you’re awesome.
PS: Remember, I only applied for data scientist positions in San Francisco, California, so my opinion may not hold for your location or for the position you are applying for. And that’s just me (actually two people, more on that later). But there’s something universal about this story, because I’ve seen so many people around the world fall for the (false) appeal and potential of “projects.”
I spent a full two weeks reading academic papers on AI-generated music and, looking back, I understood about 30% of them. But something in that 30% really troubled me. I don’t think some of the researchers working on AI-generated tunes have a deep understanding of the basics of music. You can tell because they use very complex neural network architectures to create new sounds, but those architectures don’t reflect the way real musicians compose music.
Pop Music Maker takes musical data as input, breaks it down into individual notes, looks for statistical relationships between those notes, and then composes a new pop song based on those statistics.
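To make the idea of “statistical relationships between notes” concrete, here is a minimal sketch. It is not the author’s actual algorithm: it swaps in a simple first-order Markov chain that counts which note tends to follow which and then samples a new sequence from those counts. The function names and the toy note list are hypothetical.

```python
import random
from collections import defaultdict


def build_transitions(notes):
    """Count how often each note is followed by each other note."""
    counts = defaultdict(lambda: defaultdict(int))
    for current, following in zip(notes, notes[1:]):
        counts[current][following] += 1
    return counts


def generate_melody(counts, start, length, seed=0):
    """Sample a new note sequence from the observed transition counts."""
    rng = random.Random(seed)
    melody = [start]
    while len(melody) < length:
        options = counts.get(melody[-1])
        if not options:  # dead end: this note was never followed by anything
            break
        next_notes = list(options)
        weights = [options[n] for n in next_notes]
        melody.append(rng.choices(next_notes, weights=weights, k=1)[0])
    return melody


# Hypothetical toy input; a real system would parse notes from music files.
training_notes = ["C4", "E4", "G4", "E4", "C4", "E4", "G4", "C5", "G4", "E4", "C4"]
transitions = build_transitions(training_notes)
melody = generate_melody(transitions, start="C4", length=8)
print(melody)
```

Every step in the generated melody is a note-to-note move that appeared in the training data, which is the sense in which the output is “based on those statistics.”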
As my project became better known, hundreds of people used my algorithm every day through a Flask website I had set up. The site crashed repeatedly because the AWS EC2 instance I had deployed the code on was too small to handle the traffic. Some people on the Internet started accusing me of being a fraud because they tried my algorithm, only to find that the site didn’t work.
It wasn’t long before the criticism exploded into a full-fledged debate on many social media sites. Some researchers with PhDs angrily pointed out that my Bayesian approach was simply wrong. Others defended me and my work (including Ben Lorica). In short, I had reignited the holy war between the Bayesian and frequentist schools of statistics in some corners of the Internet.
What’s more, the members of the hiring committees never tested me on these projects, because the hiring process isn’t about how many projects you’ve done. Yet I see a lot of candidates for data science jobs who think it is.
“The biggest shortcoming I see in most data scientists is connecting machine learning models to business impact. A lot of very, very smart people will build a very complex five-layer neural network. It makes good predictions and its scores are very high. But when we dig into the business impact of that specific model, they often struggle to answer.”
Before, the idea of an “intelligence test” seemed oppressive and empty to me. But after thinking long and hard about the term “intelligence” as it is used in tech, I’m beginning to understand what it actually means. When I learned what it meant, I realized it had nothing to do with “biology”: anyone can improve through preparation. More importantly, I discovered the secret to passing a data science interview.
- Analytical thinking
- Variable extraction
- Edge case detection
- Process optimization
Analytical thinking can be measured either through practical coding puzzles or through theoretical business/product problems. The interviewer will present you with a question that, at first glance, feels open-ended. This is intentional, because the answer is not the purpose of the test, so it doesn’t really matter whether your solution actually works. The point of the question is to assess your ability to coordinate a multi-step plan to solve a complex problem.
For candidates who want to improve this skill, solve as many LeetCode problems as possible. Also read up on data science product questions. Here is an example of a product problem:
A food delivery company is launching a new app with a new UI. The goal is to boost delivery workers’ earnings by increasing their miles. Please suggest a testing strategy to see if the new app is better than the old one.
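One possible multi-step plan is a randomized A/B test: assign couriers at random to the old or new app, pick a metric, and check whether the difference could be chance. The sketch below is an illustration only, not a model answer from the article. It assumes weekly earnings per courier as the metric, uses made-up numbers, and uses a permutation test as the significance check; the name `permutation_test` is hypothetical.

```python
import random
from statistics import mean


def permutation_test(control, treatment, n_permutations=10_000, seed=0):
    """One-sided p-value for 'treatment mean is higher than control mean'."""
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = list(control) + list(treatment)
    n_control = len(control)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Shuffle the group labels and recompute the difference in means.
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_permutations


# Made-up weekly earnings (USD) for couriers randomly assigned to each app.
old_app = [512, 498, 530, 475, 505, 520, 490, 515]
new_app = [540, 525, 560, 510, 535, 550, 505, 545]

lift, p_value = permutation_test(old_app, new_app)
print(f"observed lift: {lift:.2f}, p-value: {p_value:.4f}")
```

In an interview, the code matters less than the steps it encodes: random assignment, a clearly defined metric, and a check that the observed lift is unlikely to be noise.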
Variable extraction
This kind of thought experiment is usually run by product or other non-data people who don’t know much about data science and want to get a sense of your “intelligence”. Intelligence here refers to your ability to come up with the variables that solve the problem (the ones the interviewer already has in mind).
1. Time (Does rush hour affect the speed of the elevator?)
2. Location (Maybe some floors have more elevators than others?)
3. Technology (Maybe there is a technical problem with the elevators that people don’t notice.)
4. User statistics (Who is in the building? Do visitors use one elevator and workers use another?)
You can improve your intelligence in this area by studying as many different kinds of data as possible, such as temporal data, geographic data, and so on. Anything that broadens your knowledge of data across different domains is worth a try.
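Once variables such as time, location, and user type are on the table, the next step is usually to check each one against the outcome. The sketch below does that for an entirely made-up elevator ride log; the field layout, the rush-hour windows, and wait time as the outcome are all assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean

# Made-up ride log: (hour of day, floor, rider type, wait in seconds).
rides = [
    (8, 1, "worker", 95),
    (9, 1, "worker", 110),
    (9, 12, "visitor", 70),
    (13, 5, "worker", 40),
    (14, 12, "visitor", 35),
    (18, 1, "worker", 100),
    (18, 5, "visitor", 85),
    (22, 12, "worker", 20),
]


def is_rush_hour(hour):
    """Assumed rush-hour windows; adjust to the building in question."""
    return 8 <= hour <= 9 or 17 <= hour <= 18


def mean_wait_by(key_fn):
    """Average wait time for each value of a candidate variable."""
    groups = defaultdict(list)
    for ride in rides:
        groups[key_fn(ride)].append(ride[3])
    return {key: mean(waits) for key, waits in groups.items()}


by_time = mean_wait_by(lambda r: "rush" if is_rush_hour(r[0]) else "off-peak")
by_floor = mean_wait_by(lambda r: r[1])
by_rider = mean_wait_by(lambda r: r[2])
print(by_time, by_floor, by_rider)
```

Each grouping corresponds to one candidate variable from the list above; a variable whose groups differ sharply in average wait is worth raising with the interviewer.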
Edge case detection is a difficult part of the interview process: you will feel uneasy because holes have been found in your logic. You need to stay calm and listen carefully to the hints the hiring manager throws at you. Usually, they already have an answer in mind, and they drop clues to steer you toward it.
How to practice? This is really hard to practice. When it happens, take a deep breath, ask questions, figure out what you need to do, and follow the clues.
Why test process optimization? Because all data science work in industry starts out rough and takes many iterations to improve. But this work can only be done once the first rough version is complete, so I don’t think it’s as high a priority as the first three.
I believe projects can be useful in the early stages of job hunting. In my opinion, projects help in three ways:
1. Build confidence. Many people see completing a project as a necessary prerequisite (an inner sense of ritual) before applying for a corporate job.
2. Practice variable extraction and optimization. Projects let you experiment with many different types of data, and with workflows for optimizing data processing, and so on.
3. Give you a chance to win over the initial recruiter. The initial recruiter’s job is not to conduct an intelligence test, but to screen candidates and pass them on to the interviewer, who then administers the test. Projects also show the initial recruiter your enthusiasm and commitment to data science.
1. Projects don’t help you pass technical quizzes.
2. Projects don’t provide external validation of your potential as a data scientist — only that you can copy or remember existing code very well.
3. Interviewers don’t have time to read pages and pages of your notes. They process hundreds of applications every day. They also have to manage their own teams — which is enough to take up all their working hours.
Still not convinced? Finally, listen to the director of the Data Science Institute at Columbia University:
1. How do you design an algorithm to solve this particular problem?
2. How do you break this particular problem down into smaller parts?
3. How do you define an abstraction layer?
4. How do you define the interfaces between components?
I also asked a senior data scientist at IBM, “What is the most important skill you have as a data scientist?”
The original link: towardsdatascience.com/sorry-proje…