Today I have updated and made available for download both of Andrew Ng’s machine learning and deep learning course notes, which are perfect for getting started with machine learning and deep learning. (Author: Huang Haiguang)

0. Introduction

My classmates and I made a printable version of Dr. Andrew Ng’s machine learning and deep learning course notes and put it on GitHub; you can download and print it. The notes are based on the course videos and subtitles. Thanks to Professor Andrew Ng for providing such practical courses!

2018-04-28: Published Word and Markdown files for the deep learning notes!

So far, the notes have been downloaded more than 1 million times from GitHub.

This update: many students said they could not follow the formulas, so I have added the mathematical foundations as an appendix to the notes for reference.

The notes can be used as supplementary material for undergraduate, master’s, and doctoral courses. Please do not use them for commercial purposes.

If you need to cite the repositories:

Machine learning notes

 fengdu78, Coursera-ML-AndrewNg-Notes, (2018), GitHub repository, https://github.com/fengdu78/Coursera-ML-AndrewNg-Notes

Deep learning notes

fengdu78, deeplearning_ai_books, (2018), GitHub repository, https://github.com/fengdu78/deeplearning_ai_books

Students are free to print the notes. The machine learning notes PDF is 336 pages and the deep learning notes PDF is 781 pages; an online print shop is recommended for printing them double-sided.

GitHub addresses:

  • 1. Personal notes for Ng’s Machine Learning course (16K+ stars)

    Github.com/fengdu78/Co…

  • 2. Notes and resources for Ng’s Deep Learning course (10.5K+ stars)

    Github.com/fengdu78/de…

    Baidu Cloud download (a mirror of my GitHub files; if it gets taken down, reply “1978” in the official WeChat account for a new link):

    Full GitHub download address:

    You don’t need to download everything; just download the notes PDF from this folder.

Link: pan.baidu.com/s/1XHZA0jCI…

Extraction code: I0RT

Print preview of the notes (the current version is even thicker than this, so be prepared)

1. Personal notes for Ng’s Machine Learning course

The repository contains my personal notes, translated subtitles (with videos), and Python code; the notes are also available as Word and Markdown files.

The original course assignments used Octave, which few people use today, so I have reimplemented the course code in Python and released Word and Markdown versions of the notes.
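As a rough illustration of what the Python port looks like, here is a minimal sketch of the course’s first programming exercise (univariate linear regression trained by batch gradient descent) written with NumPy. The function and variable names are my own for illustration and are not taken from the repository:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.02, num_iters=1500):
    """Batch gradient descent for linear regression.
    X: (m, n) design matrix whose first column is all ones;
    y: (m,) target vector. Returns the learned parameters theta."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        # Gradient of the mean-squared-error cost J(theta)
        grad = X.T @ (X @ theta - y) / m
        theta -= alpha * grad
    return theta

# Toy data: y ≈ 1 + 2x plus a little noise
x = np.linspace(0, 10, 50)
y = 1 + 2 * x + 0.5 * np.random.randn(50)
X = np.column_stack([np.ones_like(x), x])  # prepend the intercept column
print(gradient_descent(X, y))  # roughly [1.0, 2.0]
```

The repository’s actual exercises cover the same ground step by step, including feature scaling and the vectorization tricks discussed in Weeks 1 and 2 below.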

The notes PDF contains 336 pages.

In December 2014, I organized several PhD students to translate the subtitles for the Andrew Ng Machine Learning videos. We translated most of the videos and muxed them with Chinese and English subtitles into MKV files. (The subtitled videos are now available for free on NetEase Cloud Classroom as the Ng Machine Learning course.)

The GitHub repository also offers offline video downloads with Chinese and English subtitles. GitHub address (16K+ stars):

Github.com/fengdu78/Co…

Notes directory

Week 1

1. Introduction

1.1 welcome

1.2 What is machine learning?

1.3 Supervised Learning

1.4 Unsupervised learning

2. Linear Regression with One Variable

2.1 Model Representation

2.2 Cost function

2.3 Intuitive understanding of cost function I

2.4 Intuitive understanding of cost functions II

2.5 Gradient descent

2.6 Intuitive understanding of gradient descent

2.7 Linear regression of gradient descent

2.8 What’s Next

3. Linear Algebra Review

3.1 Matrices and vectors

3.2 Addition and scalar multiplication

3.3 Matrix vector multiplication

3.4 Matrix Multiplication

3.5 Properties of matrix multiplication

3.6 Inverse and transpose

Week 2

4. Linear Regression with Multiple Variables

4.1 Multidimensional Features

4.2 Multivariable gradient descent

4.3 Gradient Descent in Practice I: Feature Scaling

4.4 Gradient Descent in Practice II: Learning Rate

4.5 Feature and polynomial regression

4.6 Normal Equation

4.7 Normal Equation and Non-Invertibility (Optional)

5. Octave Tutorial

5.1 Basic Operations

5.2 Moving Data

5.3 Calculating Data

5.4 Drawing Data

5.5 Control statements: for, while, if statements

5.6 Vectorization

5.7 Working on and Submitting Programming Exercises

Week 3

6. Logistic Regression

6.1 Classification Problems

6.2 Hypothesis Representation

6.3 Decision Boundary

6.4 Cost Function

6.5 Simplified cost function and gradient descent

6.6 Advanced Optimization

6.7 Multiclass Classification: One-vs-All

7. Regularization

7.1 The Problem of Overfitting

7.2 Cost Function

7.3 Regularized linear regression

7.4 Regularized logistic regression model

Week 4

8. Neural Networks: Representation

8.1 Nonlinear Hypothesis

8.2 Neurons and the brain

8.3 Model Representation I

8.4 Model Representation II

8.5 Examples and Intuitions I

8.6 Examples and Intuitions II

8.7 Multiclass Classification

Week 5

9. Neural Networks: Learning

9.1 Cost Function

9.2 Back Propagation Algorithm

9.3 Intuitive understanding of the back propagation algorithm

9.4 Implementation Note: Unrolling Parameters

9.5 Gradient Checking

9.6 Random Initialization

9.7 Putting It Together

9.8 Autonomous Driving

Week 6

10. Advice for Applying Machine Learning

10.1 Deciding What to Try Next

10.2 Evaluating a hypothesis

10.3 Model selection and cross validation set

10.4 Diagnostic bias and variance

10.5 Regularization and Bias/Variance

10.6 Learning Curves

10.7 Deciding What to Try Next (Revisited)

11. Machine Learning System Design

11.1 Prioritizing What to Work On

11.2 Error Analysis

11.3 Error Metrics for Skewed Classes

11.4 Trading Off Precision and Recall

11.5 Data for Machine Learning

Week 7

12. Support Vector Machines

12.1 Optimization Objective

12.2 Large Margin Intuition

12.3 The Mathematics Behind Large Margin Classification (Optional)

12.4 Kernels I

12.5 Kernels II

12.6 Using Support Vector Machines

Week 8

13. Clustering

13.1 Unsupervised Learning: Introduction

13.2 K-Means Algorithm

13.3 Optimization Objective

13.4 Random Initialization

13.5 Selecting the Number of clusters

14. Dimensionality Reduction

14.1 Motivation I: Data Compression

14.2 Motivation II: Data Visualization

14.3 Principal Component Analysis Problem Formulation

14.4 Principal Component Analysis Algorithm

14.5 Choosing the Number of Principal Components

14.6 Reconstruction from Compressed Representation

14.7 Advice for Applying PCA

Week 9

15. Anomaly Detection

15.1 Problem Motivation

15.2 Gaussian Distribution

15.3 Algorithm

15.4 Developing and Evaluating an Anomaly Detection System

15.5 Anomaly Detection vs. Supervised Learning

15.6 Selecting Features

15.7 Multivariate Gaussian Distribution (Optional)

15.8 Anomaly Detection using Multivariate Gaussian Distribution (Optional)

16. Recommender Systems

16.1 Problem Formulation

16.2 Content-Based Recommendations

16.3 Collaborative Filtering

16.4 Collaborative Filtering Algorithm

16.5 Vectorization: Low-Rank Matrix Factorization

16.6 Implementation Detail: Mean Normalization

Week 10

17. Large Scale Machine Learning

17.1 Learning with Large Datasets

17.2 Stochastic Gradient Descent

17.3 Mini-Batch Gradient Descent

17.4 Stochastic Gradient Descent Convergence

17.5 Online learning

17.6 Map-Reduce and Data Parallelism

18. Application Example: Photo OCR

18.1 Problem Description and Flowchart

18.2 Sliding Windows

18.3 Getting Lots of Data and Artificial Data Synthesis

18.4 Ceiling Analysis: What Part of the Pipeline to Work on Next

19. Conclusion

19.1 Summary and acknowledgements

Appendix

CS229 Machine Learning Review Materials – Linear Algebra

1. Basic concepts and symbols

2. Matrix multiplication

3. Operations and properties

4. Matrix calculus

CS229 Machine Learning Review Materials – Probability Theory

1. The basics of probability

2. Random variables

3. Two random variables

4. Multiple random variables

5. Other resources

Mathematical Foundations of Machine Learning (Chinese textbooks)

Higher mathematics

Linear algebra

Probability theory and mathematical statistics

2. Notes and resources for Ng’s Deep Learning course

In August 2017, Andrew Ng launched the deeplearning.ai Deep Learning Specialization, whose courses were released one after another. I organized many students to write up the notes together, and we finally compiled them into Word and Markdown files. I have also translated all of deeplearning.ai’s quizzes, which is convenient for students who are not comfortable with English.

The notes PDF contains 781 pages.

The notes (PDF, Word, and Markdown), quizzes, and offline videos are all posted on GitHub (10.5K+ stars) and available for download:

Github.com/fengdu78/de…

Notes directory

Course 1: Neural Networks and Deep Learning

Week 1: Introduction to Deep Learning

1.1 Welcome

1.2 What is a Neural Network?

1.3 Supervised Learning with Neural Networks

1.4 Why is Deep Learning Taking Off?

1.5 About this Course

1.6 Course Resources

Week 2: Basics of Neural Network Programming

2.1 Binary Classification

2.2 Logistic Regression

2.3 Logistic Regression Cost Function

2.4 Gradient Descent

2.5 Derivatives

2.6 More Derivative Examples

2.7 Computation Graph

2.8 Derivatives with a Computation Graph

2.9 Logistic Regression Gradient Descent

2.10 Gradient Descent on m Examples

2.11 Vectorization

2.12 More Examples of Vectorization

2.13 Vectorizing Logistic Regression

2.14 Vectorizing Logistic Regression’s Gradient

2.15 Broadcasting in Python

2.16 A Note on Python/NumPy Vectors

2.17 Quick Tour of Jupyter/iPython Notebooks

2.18 (Optional) Explanation of Logistic Regression Cost Function

Week 3: Shallow Neural Networks

3.1 Neural Network Overview

3.2 Neural Network Representation

3.3 Computing a Neural Network’s Output

3.4 Vectorizing across Multiple Examples

3.5 Justification for Vectorized Implementation

3.6 Activation Functions

3.7 Why Do You Need Nonlinear Activation Functions?

3.8 Derivatives of Activation Functions

3.9 Gradient Descent for Neural Networks

3.10 (Optional) Backpropagation Intuition

3.11 Random Initialization

Week 4: Deep Neural Networks

4.1 Deep L-Layer neural network

4.2 Forward and Backward Propagation

4.3 Forward Propagation in a Deep Network

4.4 Getting Your Matrix Dimensions Right

4.5 Why Deep Representations?

4.6 Building Blocks of Deep Neural Networks

4.7 Parameters vs. Hyperparameters

4.8 What Does This Have to Do with the Brain?

Course 2: Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization

Week 1: Practical Aspects of Deep Learning

1.1 Train/Dev/Test Sets

1.2 Bias/Variance

1.3 Basic Recipe for Machine Learning

1.4 Regularization

1.5 Why Regularization Reduces Overfitting

1.6 Dropout Regularization

1.7 Understanding Dropout

1.8 Other Regularization Methods

1.9 Normalizing inputs

1.10 Vanishing/Exploding Gradients

1.11 Weight Initialization for Deep Networks

1.12 Numerical Approximation of Gradients

1.13 Gradient checking

1.14 Gradient Checking Implementation Notes

Week 2: Optimization Algorithms

2.1 Mini-Batch Gradient Descent

2.2 Understanding Mini-Batch Gradient Descent

2.3 Exponentially Weighted Averages

2.4 Understanding Exponentially Weighted Averages

2.5 Bias Correction in Exponentially Weighted Averages

2.6 Gradient Descent with Momentum

2.7 RMSprop

2.8 Adam Optimization Algorithm

2.9 Learning Rate Decay

2.10 The Problem of Local Optima

Week 3: Hyperparameter Tuning, Batch Normalization and Programming Frameworks

3.1 Tuning Process

3.2 Using an Appropriate Scale to Pick Hyperparameters

3.3 Hyperparameters Tuning in Practice: Pandas vs. Caviar

3.4 Normalizing Activations in a Network

3.5 Fitting Batch Norm into a Neural Network

3.6 Why Does Batch Norm Work?

3.7 Batch Norm at Test Time

3.8 Softmax Regression

3.9 Training a Softmax Classifier

3.10 Deep Learning Frameworks

3.11 TensorFlow

Course 3: Structuring Machine Learning Projects

Week 1: ML Strategy (1)

1.1 Why ML Strategy?

1.2 Orthogonalization

1.3 Single Number Evaluation Metric

1.4 Satisficing and Optimizing Metrics

1.5 Train/Dev/Test Distributions

1.6 Size of dev and test sets

1.7 When to Change Dev/Test Sets and Metrics

1.8 Why Human-Level Performance?

1.9 Avoidable Bias

1.10 Understanding Human-Level Performance

1.11 Surpassing Human-Level Performance

1.12 Improving Your Model Performance

Week 2: ML Strategy (2)

2.1 Carrying Out Error Analysis

2.2 Cleaning Up Incorrectly Labeled Data

2.3 Build Your First System Quickly, Then Iterate

2.4 Training and testing on different distributions

2.5 Bias and Variance with mismatched data distributions

2.6 Addressing Data Mismatch

2.7 Transfer Learning

2.8 Multi-task Learning

2.9 What is End-to-End Deep Learning?

2.10 Whether to Use End-to-End Deep Learning

Course 4: Convolutional Neural Networks

Week 1: Foundations of Convolutional Neural Networks

1.1 Computer Vision

1.2 Edge Detection Example

1.3 More Edge Detection

1.4 Padding

1.5 Strided Convolutions

1.6 Convolutions over Volumes

1.7 One Layer of a Convolutional Network

1.8 A Simple Convolution Network Example

1.9 Pooling Layers

1.10 Convolutional Neural Network Example

1.11 Why Convolutions?

Week 2: Deep Convolutional Models: Case Studies

2.1 Why Look at Case Studies?

2.2 Classic Networks

2.3 Residual Networks (ResNets)

2.4 Why ResNets Work

2.5 Networks in Networks and 1×1 Convolutions

2.6 Inception Network Motivation

2.7 Inception Network

2.8 Using Open-Source Implementations

2.9 Transfer Learning

2.10 Data augmentation

2.11 The State of Computer Vision

Week 3: Object Detection

3.1 Object Localization

3.2 Landmark Detection

3.3 Object Detection

3.4 Convolutional Implementation of Sliding Windows

3.5 Bounding Box Predictions

3.6 Intersection over Union

3.7 Non-Max Suppression

3.8 Anchor Boxes

3.9 Putting It Together: YOLO Algorithm

3.10 Region Proposals (Optional)

Week 4: Special Applications: Face Recognition and Neural Style Transfer

4.1 What is Face Recognition?

4.2 One-shot Learning

4.3 Siamese Network

4.4 Triplet Loss

4.5 Face Verification and Binary Classification

4.6 What is Neural Style Transfer?

4.7 What are Deep ConvNets Learning?

4.8 Cost Function

4.9 Content Cost Function

4.10 Style Cost Function

4.11 1D and 3D Generalizations of Models

Course 5: Sequence Models

Week 1: Recurrent Neural Networks

1.1 Why Sequence Models?

1.2 Mathematical Notation

1.3 Recurrent Neural Network Model

1.4 Backpropagation through time

1.5 Different Types of Recurrent Neural Networks

1.6 Language Model and Sequence Generation

1.7 Sampling novel sequences

1.8 Vanishing Gradients with RNNs

1.9 Gated Recurrent Unit (GRU)

1.10 Long Short-Term Memory (LSTM)

1.11 Bidirectional RNN

1.12 Deep RNNs

Week 2: Natural Language Processing and Word Embeddings

2.1 Word Representation

2.2 Using Word Embeddings

2.3 Properties of Word Embeddings

2.4 Embedding Matrix

2.5 Learning Word Embeddings

2.6 Word2Vec

2.7 Negative Sampling

2.8 GloVe Word Vectors

2.9 Sentiment Classification

2.10 Debiasing Word Embeddings

Week 3: Sequence Models and Attention Mechanism

3.1 Various Sequence to Sequence Architectures

3.2 Picking the Most Likely Sentence

3.3 Beam Search

3.4 Refinements to Beam Search

3.5 Error Analysis in Beam Search

3.6 Bleu Score (optional)

3.7 Attention Model Intuition

3.8 Attention Model

3.9 Speech Recognition

3.10 Trigger Word Detection

3.11 Conclusion and Thank you

Appendix

The Power of Role Models – Andrew Ng’s Interviews with AI Masters

Andrew Ng interviews Geoffrey Hinton

Andrew Ng interviews Ian Goodfellow

Andrew Ng interviews Ruslan Salakhutdinov

Andrew Ng interviews Yoshua Bengio

Andrew Ng interviews Lin Yuanqing

Andrew Ng interviews Pieter Abbeel

Andrew Ng interviews Andrej Karpathy

Deep Learning Notation Guide (translated from the original course)

CS229 Machine Learning Review Materials – Linear Algebra

1. Basic concepts and symbols

2. Matrix multiplication

3. Operations and properties

4. Matrix calculus

CS229 Machine Learning Review Materials – Probability Theory

1. The basics of probability

2. Random variables

3. Two random variables

4. Multiple random variables

5. Other resources

Mathematical Foundations of Machine Learning (Chinese textbooks)

Higher mathematics

Linear algebra

Probability theory and mathematical statistics

Quick download

The personal notes, translated subtitles (with videos), and Python code are all available; the notes come in Word and Markdown formats.

If your connection is fast, download directly from GitHub:

  • 1. Personal notes for Ng’s Machine Learning course (16K+ stars)

    Github.com/fengdu78/Co…

  • 2. Notes and resources for Ng’s Deep Learning course (10.5K+ stars)

    Github.com/fengdu78/de…

If your connection is slow, download the Baidu Cloud mirror instead:

Scan the QR code to follow the official WeChat account and reply “1978” to get the download URL.

It includes videos, notes, code, etc.