Original text: towardsdatascience.com/guide-to-le…

By Insaf Ashrapov

Kbsc13: The Growth of the Algorithmic Ape

Preface

The resources recommended in this article are based on my own experience and should help you build up your knowledge of computer vision theory. It is best to have a working knowledge of machine learning and Python before studying computer vision theory.


The framework

You don’t have to commit to a framework when you start learning computer vision, but putting your new knowledge into practice is essential.

There are really only two options for the framework:

  • PyTorch: pytorch.org/tutorials/
  • Keras (TensorFlow) : www.tensorflow.org/guide/keras

PyTorch may require more code, but it is more flexible, so I recommend it. More and more deep learning researchers are also adopting it.

Albumentations (an image augmentation library) and Catalyst (a high-level API built on top of PyTorch) are also very helpful and worth using, especially the former.
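
To make the idea of image augmentation concrete, here is a minimal pure-Python sketch. It does not use the Albumentations API: the function name `random_horizontal_flip` and the list-of-rows image format are assumptions made for illustration only. It flips an image and its segmentation mask together, which is the kind of paired transform an augmentation library automates.

```python
import random


def random_horizontal_flip(image, mask, p=0.5, rng=random):
    """Flip image and mask left-to-right together with probability p.

    image and mask are lists of rows (a hypothetical toy format,
    not the array type a real augmentation library expects).
    """
    if rng.random() < p:
        # Reverse every row; the same flip is applied to the mask
        # so that labels stay aligned with the pixels.
        image = [row[::-1] for row in image]
        mask = [row[::-1] for row in mask]
    return image, mask


image = [[1, 2, 3], [4, 5, 6]]
mask = [[0, 0, 1], [0, 1, 1]]

# p=1.0 forces the flip so the effect is easy to inspect.
flipped_image, flipped_mask = random_horizontal_flip(image, mask, p=1.0)
print(flipped_image)
print(flipped_mask)
```

The point of the sketch is only that the image and its labels must be transformed consistently; a real library adds many more transforms and works on arrays.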


Hardware

  • An Nvidia GPU from the 10XX series or newer is sufficient (around $300);
  • Kaggle Kernels: www.kaggle.com/kernels, free weekly… 30 hours;
  • Google Colab: colab.research.google.com/, each session has a 12-hour limit; the weekly limit on free usage is unknown.
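
Whichever of these options you use, it helps to confirm that a GPU is actually visible before training. The following is a small sketch, assuming PyTorch is the chosen framework; the helper name `pick_device` is hypothetical, and the function falls back to the CPU when PyTorch is not installed.

```python
import importlib.util


def pick_device():
    """Return "cuda" if PyTorch is installed and sees a GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed in this environment
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Training would run on: {device}")
```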

Theory & Practice

Online courses

  • CS231n: cs231n.stanford.edu/ is the preferred online course for… The lecture videos are on YouTube, and the course has assignments, but I do not recommend completing them (although they are free);
  • Fast.ai: course.fast.ai/, the second course you should watch. … Fast.ai is also a high-level framework built on top of PyTorch, but its API changes frequently and its documentation is lacking, which is why I do not recommend the framework itself. The course videos, however, are well worth watching for the theory and the interesting techniques.

While taking these courses, I recommend implementing the theory yourself in one of the frameworks recommended above.

Papers and Code

  • arxiv.org/: Free access to the latest papers…
  • paperswithcode.com/sota: Demonstrates deep science…
  • Github.com/topics/comp…

Books

There are not many books you need to read, but I find the following two very useful whether you implement code with PyTorch or Keras:

  • Deep Learning with Python: www.amazon.com/Deep-Learni… by Keras developer and Google AI researcher François Chollet. It is not free, but it is easy to follow and teaches many things you may not have known before;
  • Deep Learning with PyTorch: pytorch.org/deep-learni… by Eli Stevens and Luca Antiga of the PyTorch team.

Kaggle

Website: www.kaggle.com/competition…

Kaggle is a well-known online platform for machine learning competitions of many kinds, and a large share of them are computer vision competitions. You can start competing even before you finish the courses, because many public Kernels (end-to-end code) are available from the start of each competition and can be run for free.


A more difficult learning route

The alternative path, proposed by Sergei Belousov (aka Bes), is much harder, but it teaches you more than how to train a model and predict results: it also prepares you to do your own research.

All you need to do is read and implement every paper listed below, although just reading them is already worthwhile.

Network architectures

  • AlexNet: papers.nips.cc/paper/4824-…
  • ZFNet: arxiv.org/abs/1311.29…
  • VGG16: arxiv.org/abs/1505.06…
  • ResNet: arxiv.org/abs/1704.06…
  • GoogLeNet: arxiv.org/abs/1409.48…
  • Inception: arxiv.org/abs/1512.00…
  • Xception: arxiv.org/abs/1610.02…
  • MobileNet: arxiv.org/abs/1704.04…
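
Two of the papers above, Xception and MobileNet, build on depthwise separable convolutions. As a rough illustration of why that matters, the sketch below compares weight counts for a standard convolution and a depthwise separable one. The function names are hypothetical, biases are ignored, and the layer sizes are arbitrary example values rather than figures taken from either paper.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def separable_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise separable convolution (bias ignored)."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1 x 1 convolution mixing channels
    return depthwise + pointwise


# Arbitrary example layer: 3 x 3 kernel, 32 input channels, 64 output channels.
standard = standard_conv_params(3, 32, 64)
separable = separable_conv_params(3, 32, 64)
print(standard, separable, round(standard / separable, 2))
```

For this example layer the standard convolution needs 18432 weights and the separable one 2336, roughly 7.9 times fewer.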

Semantic segmentation

  • FCN: arxiv.org/abs/1411.40…
  • SegNet: arxiv.org/abs/1511.00…
  • UNet: arxiv.org/abs/1505.04…
  • PSPNet: arxiv.org/abs/1612.01…
  • DeepLab: arxiv.org/abs/1606.00…
  • ICNet: arxiv.org/abs/1704.08…
  • ENet: arxiv.org/abs/1606.02…
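
When implementing these segmentation papers you will need a way to score predictions. A common choice is intersection over union (IoU) between the predicted and ground-truth masks. The pure-Python sketch below assumes binary masks stored as lists of rows; the name `mask_iou` and the convention for two empty masks are assumptions for illustration, not taken from any of the papers.

```python
def mask_iou(pred, target):
    """IoU between two binary masks given as lists of rows of 0/1."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1  # pixel is foreground in both masks
            if p or t:
                union += 1         # pixel is foreground in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = mask_iou(pred, target)
print(score)
```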

Generative adversarial networks (GAN)

  • GAN: arxiv.org/abs/1406.26…
  • DCGAN: arxiv.org/abs/1511.06…
  • WGAN: arxiv.org/abs/1701.07…
  • Pix2Pix: arxiv.org/abs/1611.07…
  • CycleGAN: arxiv.org/abs/1703.10…

Object detection

  • RCNN: arxiv.org/abs/1311.25…
  • Fast-RCNN: arxiv.org/abs/1504.08…
  • Faster-RCNN: arxiv.org/abs/1506.01…
  • SSD: arxiv.org/abs/1512.02…
  • YOLO: arxiv.org/abs/1506.02…
  • YOLO9000: arxiv.org/abs/1612.08…
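
The detectors in these papers are all evaluated by how well predicted boxes overlap ground-truth boxes, again measured with IoU. The following is a minimal sketch, assuming boxes are given as (x1, y1, x2, y2) corner coordinates; the function name `box_iou` is hypothetical.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes in (x1, y1, x2, y2) format."""
    # Corners of the overlapping rectangle, if there is one.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


overlap = box_iou((0, 0, 2, 2), (1, 1, 3, 3))    # boxes share a 1 x 1 square
disjoint = box_iou((0, 0, 1, 1), (2, 2, 3, 3))   # boxes do not touch
print(overlap, disjoint)
```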

Instance segmentation

  • Mask-RCNN: arxiv.org/abs/1703.06…
  • YOLACT: arxiv.org/abs/1904.02…

Pose estimation

  • PoseNet: arxiv.org/abs/1505.07…
  • DensePose: arxiv.org/abs/1802.00…

Summary

This article collects the author’s recommended resources for getting started with computer vision: deep learning frameworks, courses, books, websites for finding papers and code, and the competition site Kaggle.

There is also a more difficult route: reading the classic papers, from network architectures through the common areas of computer vision (detection, segmentation, GANs, and pose estimation). It takes more effort, but the payoff is larger: instead of only knowing how to train a model with a framework and solve a given problem, you also get the chance to move toward research.
