Last month, Kaggle co-founder and CTO Ben Hamner answered a series of questions about Kaggle, machine learning, and artificial intelligence on Quora. The Kaggle team compiled and edited the key points into this summary of Hamner’s eight steps to learning machine learning.


There has never been a better time to learn machine learning and artificial intelligence. The field has advanced rapidly in recent years and produced real results. Experts have open-sourced high-quality software tools and libraries, and new online resources and blog posts keep appearing. Machine learning has generated billions of dollars in industry revenue, bringing unprecedented resources and jobs. But that also means getting started with machine learning can be a bit overwhelming. Here’s how I got started. If you get stuck anywhere in this article, search Kaggle (someone may have hit the same problem before) or ask a question in the Kaggle forums (if no one has asked it yet); it’s a great way to find direction and get unstuck.


1. Pick a problem that interests you


Starting with a problem you want to solve, rather than an intimidating, unstructured list of topics (you can Google a list of machine learning resources, which I won’t provide here), makes it easier to stay focused and learn actively. Solving a problem forces you to go deeper and engage with the material instead of passively reading about machine learning.


There are several criteria for choosing a good first problem:


  • The problem covers an area of personal interest to you

  • Data is readily available and well suited to solving the problem (otherwise most of your time will be wasted)

  • You can work with the data (or a relevant subset of it) comfortably on a single machine

Don’t have a problem in mind? Don’t worry! We offer some great machine learning problems on Kaggle through our Getting Started competition series, such as the Titanic competition (https://www.kaggle.com/c/titanic).


2. Build a quick and dirty end-to-end solution to your problem


It’s very easy to get bogged down in implementation details or in debugging a misbehaving machine learning algorithm, and you want to avoid that at this stage.


Your goal here is to get something very basic working as quickly as possible that covers the problem end to end: reading in the data, processing it into a form suitable for machine learning, training a basic model, producing results, and evaluating its performance.
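To give a sense of how small such a first pass can be, here is a minimal sketch in plain Python. Everything specific in it is an assumption made for illustration: the inline toy CSV, the column names (age, fare, survived), and the choice of a nearest-centroid classifier as the "basic model". In practice you would read your own file and probably use a library.

```python
import csv
import io
import random

# Hypothetical toy data standing in for a real CSV file (values are made up).
RAW_CSV = """age,fare,survived
22,7.3,0
38,71.3,1
26,7.9,0
35,53.1,1
54,51.9,1
27,11.1,0
14,30.1,1
20,8.1,0
"""


def load_rows(text):
    """Read CSV text into a list of (features, label) pairs."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        features = [float(record["age"]), float(record["fare"])]
        rows.append((features, int(record["survived"])))
    return rows


def split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for evaluation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train_centroids(rows):
    """'Train' by computing the mean feature vector of each class."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Predict the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, rows):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(predict(centroids, f) == y for f, y in rows)
    return correct / len(rows)


def run_baseline(text=RAW_CSV):
    """Read, split, train, and evaluate in one pass."""
    train, test = split(load_rows(text))
    return accuracy(train_centroids(train), test)


if __name__ == "__main__":
    print(f"held-out accuracy: {run_baseline():.2f}")
```

The point of the sketch is the shape (load, split, train, evaluate), not the model; each function is a placeholder you can later swap for something better.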


3. Improve and refine your initial solution


Now that you have a working baseline, it’s time to get creative. Try to improve each component of the initial solution and measure the impact to see where it makes sense to spend your time. In many cases, acquiring more data or improving the data cleaning and preprocessing steps has a higher ROI than optimizing the machine learning model itself.
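One common way to measure the impact of a change is to score the old and new variants on the same cross-validation folds. The sketch below assumes that setup; the synthetic data, the two example variants (a majority-class baseline and a one-feature threshold rule), and all function names are hypothetical and only there to make the comparison concrete.

```python
import random

# Hypothetical data: one numeric feature, label is 1 when the feature is >= 6.
ROWS = [([float(i)], int(i >= 6)) for i in range(12)]


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 once and deal them into k folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def cross_val_accuracy(fit, rows, k=4, seed=0):
    """Mean held-out accuracy of `fit` over k folds.

    `fit(train_rows)` must return a function mapping features to a label.
    """
    scores = []
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        train = [row for i, row in enumerate(rows) if i not in held]
        test = [rows[i] for i in held_out]
        model = fit(train)
        correct = sum(model(x) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


def fit_majority(train):
    """Baseline variant: always predict the most common training label."""
    labels = [y for _, y in train]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority


def fit_threshold(train):
    """Candidate variant: cut feature 0 at the midpoint of the two class means."""
    pos = [x[0] for x, y in train if y == 1]
    neg = [x[0] for x, y in train if y == 0]
    if not pos or not neg:  # only one class seen, fall back to the baseline
        return fit_majority(train)
    pos_mean, neg_mean = sum(pos) / len(pos), sum(neg) / len(neg)
    cut = (pos_mean + neg_mean) / 2
    pos_is_high = pos_mean > neg_mean
    return lambda x: int((x[0] > cut) == pos_is_high)


def compare(variants, rows, k=4):
    """Score every variant on identical folds so the numbers are comparable."""
    return {name: cross_val_accuracy(fit, rows, k) for name, fit in variants.items()}


if __name__ == "__main__":
    results = compare({"majority": fit_majority, "threshold": fit_threshold}, ROWS)
    for name, score in results.items():
        print(f"{name}: {score:.2f}")
```

Because both variants see exactly the same folds, a difference in score reflects the change you made rather than a lucky split.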


Part of this step should involve working with the data directly: inspecting individual rows and visualizing distributions to better understand its structure and quirks.
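As a rough illustration of that kind of inspection, the sketch below summarizes one numeric column and prints a crude text histogram using only the standard library. The inline CSV, the column names, and the bin width are made-up assumptions; a plotting library would normally do this job better.

```python
import csv
import io
from collections import Counter
from statistics import mean, median

# Hypothetical toy data; the blank cell stands in for a missing value.
RAW_CSV = """age,fare
22,7.3
,8.1
38,71.3
26,7.9
35,53.1
54,51.9
"""


def column_values(text, name):
    """Return (values, n_missing) for a numeric column, treating blanks as missing."""
    values, missing = [], 0
    for record in csv.DictReader(io.StringIO(text)):
        cell = record[name].strip()
        if cell:
            values.append(float(cell))
        else:
            missing += 1
    return values, missing


def text_histogram(values, bin_width):
    """Bucket values into fixed-width bins and return {bin_start: count}."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def describe(text, name, bin_width=10):
    """Print basic summary statistics and a text histogram for one column."""
    values, missing = column_values(text, name)
    print(f"{name}: n={len(values)} missing={missing} min={min(values)} "
          f"median={median(values)} mean={mean(values):.1f} max={max(values)}")
    for start, count in text_histogram(values, bin_width).items():
        print(f"  {start:>4} to {start + bin_width:<4} {'#' * count}")


if __name__ == "__main__":
    describe(RAW_CSV, "age")
    describe(RAW_CSV, "fare", bin_width=20)
```

Even a summary this crude surfaces missing values and skewed columns, which are the sort of oddities worth knowing about before you tune a model.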


4. Write up and share your solution


The best way to get feedback on your solution is to write it up and share it. Writing forces you to work through the solution again and leads to a better understanding of it. It also lets others understand what you did and give feedback that helps you learn. Finally, it starts your machine learning portfolio, which helps you demonstrate your abilities and land a job.


Kaggle Datasets (https://www.kaggle.com/datasets) and Kaggle Kernels (https://www.kaggle.com/kernels) are effective ways to share your data and solutions, get feedback from others, see how others extend your work, and start building out your Kaggle profile.



5. Repeat steps 1-4 on a range of different problems


Now that you’ve worked through one problem that interests you, do it again several times across a range of different domains.

Did you start with tabular data? Take on a problem involving unstructured text, and then another that deals with images.


Were your first machine learning problems already well defined for you? Much of the creative and valuable work lies in turning a loosely defined business or research goal into a well-defined machine learning problem in the first place. Work through a problem of that kind as well.


Kaggle Competitions (https://www.kaggle.com/competitions) and Kaggle Datasets provide a good starting point for well-defined machine learning problems and for raw data suited to machine learning, respectively.


6. Compete seriously in a Kaggle competition (if you haven’t already)


Trying to produce the best solution to a problem that thousands of other people are working on is a huge learning opportunity: it forces you to iterate on the same problem and lets you discover what actually works.



The individual competition forums are a rich resource on how others are approaching and debugging the problem, kernels provide exploratory insights into the data and simple ways to start solving the problem, and the winners’ posts (http://blog.kaggle.com/category/winners-interviews/) published at the end show which approaches worked best.


Kaggle competitions also provide a unique opportunity to team up with other people. People in the community have different backgrounds and skills, and each person can play both teaching and learning roles. You never know, maybe your future colleagues are in the Kaggle community.


7. Apply machine learning in a professional job


This lets you spend most of your time on machine learning and really improve your skills. Deciding what type of role you want to pursue and building a portfolio of relevant, representative personal projects is a strong starting point. If you’re not ready to interview for a machine learning position, take on new projects and look for consulting opportunities in your current role; participating in civic hackathons and taking up data-related community service opportunities are additional ways to gain a foothold. Professional work requires strong programming skills and is a good way to improve them considerably; the improvement that comes from focused projects has many downstream benefits.


Valuable kinds of professional machine learning work include:


  • Applying machine learning in production systems

  • Focusing on machine learning research to advance the state of the art

  • Using machine learning for exploratory analysis that informs product and business decisions


8. Teach others machine learning


Teaching can help solidify your understanding of the core concepts of machine learning. There are many ways to teach others, so choose the one that works best for you:


  • Write a research paper (https://papers.nips.cc/book/advances-in-neural-information-processing-systems-29-2016)

  • Give a talk

  • Write blog posts (http://blog.kaggle.com/) and tutorials (http://blog.kaggle.com/category/tutorials/)

  • Answer questions on Kaggle, Quora, and other sites

  • Tutor and mentor individuals

  • Share code examples (on Kaggle Kernels and GitHub)

  • Teach a class

  • Write a book (http://www.deeplearningbook.org)


The original link: http://blog.kaggle.com/2017/04/17/the-best-sources-to-study-machine-learning-and-ai-with-ben-hamner-kaggle-cto/