Joining the AI industry and landing a high salary is only the beginning of a career. The AI talent landscape keeps moving upmarket and the bar keeps rising; if you do not hold yourself to a high standard, it is easy to be left behind by such a fast-moving field.
To keep pace with the times, we launched the “Machine Learning High-end Training Camp” last year, taught by the college's founding team and well received by students. For this fourth session, we have made significant updates to the content: on the one hand, we have added cutting-edge topics such as graph neural networks; on the other, we have deepened the theory in core areas such as convex optimization and reinforcement learning. To our knowledge, no comparably systematic course is currently available online. The course is still delivered live.
So who is the advanced class for?
- You have worked in the AI industry for years, but feel your technical depth is limited and you have hit a bottleneck;
- You can apply existing models and tools, but find it hard to devise a new model for a given business scenario;
- Your grasp of the optimization theory and cutting-edge techniques behind machine learning is not deep enough;
- You plan to pursue cutting-edge research, or to apply for a master's or doctoral program in AI;
- You plan to join a top AI company such as Google, Facebook, Amazon, Alibaba, or Toutiao;
- You find it difficult to read papers from ICML, IJCAI, and other top conferences and to understand every detail thoroughly.
01 Course Outline
Part One: Convex Optimization and Machine Learning
Week 1: Introduction to convex optimization
- Understanding machine learning from an optimization perspective
- The importance of optimization techniques
- Common convex optimization problems
- Linear programming and Simplex Method
- Two-Stage LP
- Case: Transportation problem explanation
Week 2: Convex functions
- Determining whether a set is convex
- First-order conditions for convexity
- Second-order conditions for convexity
- Operations that preserve convexity
- Quadratic programming problem (QP)
- Case study: least squares problem
- Project assignment: stock portfolio optimization
Week 3: Convex optimization problems
- Common classes of convex optimization problems
- Semidefinite programming problem
- Geometric programming problem
- Optimization of nonconvex functions
- Relaxation
- Integer Programming
- Case: Matching problem in taxi hailing
Week 4: Duality
- Lagrangian dual function
- The geometric meaning of duality
- Weak and Strong Duality
- KKT conditions
- The dual problem of LP, QP and SDP
- Case: Duality derivation and implementation of classical model
- Other applications of duality
Week 5: Optimization techniques
- First and second order optimization techniques
- Gradient Descent
- Subgradient Method
- Proximal Gradient Descent
- Projected Gradient Descent
- SGD and convergence
- Newton’s Method
- Quasi-Newton methods
Part Two: Graph Neural Networks
Week 6: Mathematical foundations
- Vector space and graph theory fundamentals
- Inner Product, Hilbert Space
- Eigenfunctions and eigenvalues
- Fourier transform
- Convolution
- Time Domain, Spectral Domain
- Laplacian, Graph Laplacian
Week 7: Graph neural networks in the spectral domain
- Review of convolutional neural networks
- Mathematical significance of convolution operations
- Graph Convolution
- Graph Filter
- ChebNet
- CayleyNet
- GCN
- Graph Pooling
- Case: GCN-based recommendation
Week 8: Graph neural networks in spatial domains
- Spatial Convolution
- Mixture Model Network (MoNet)
- Attention mechanism
- Graph Attention Networks (GAT)
- Edge Convolution
- Comparison of spatial domain and spectral domain
- Project assignment: Link prediction based on graph neural network
Week 9: Improvements and applications of graph neural networks
- Extension 1: Relative position and graph neural networks
- Extension 2: Integrating edge features: Edge GCN
- Extension 3: Graph neural networks and knowledge graphs: Knowledge GCN
- Extension 4: Skeleton-based action recognition: ST-GCN
- Case study: graph-based text classification
- Case study: Graph-based reading comprehension
Part Three: Reinforcement Learning
Week 10: Fundamentals of reinforcement learning
- Markov Decision Process
- Bellman Equation
- Three approaches: value-based, policy-based, and model-based
- Value-based methods: Q-learning (off-policy) and SARSA (on-policy)
Week 11: Multi-Armed Bandits
- Multi-Armed bandits
- Epsilon-Greedy
- Upper Confidence Bound (UCB)
- Contextual UCB
- LinUCB & Kernel UCB
- Case: applications of bandit algorithms in recommender systems
Week 12: Path planning
- Monte-Carlo Tree Search
- N-step learning
- Approximation
- Reward Shaping
- Combined with Deep learning: Deep RL
- Project assignment: Examples of reinforcement learning in games
Week 13: RL in natural language processing
- Problems with the seq2seq model
- Customized loss with evaluation metrics
- Custom loss incorporating aspect information
- Combining different RL methods with seq2seq models
- Case study: Text generation based on RL
Part Four: Bayesian Methods
Week 14: Introduction to Bayesian methods
- Bayes’ theorem
- From MLE, MAP to Bayesian estimation
- Comparison of ensemble models and Bayesian methods
- Computational intractability
- Introduction to MCMC and variational methods
- Bayesian linear regression
- Bayesian neural networks
- Case study: named entity recognition based on Bayesian LSTM
Week 15: Topic models
- Generative vs. discriminative models
- Latent variable models
- The importance of the prior in Bayesian methods
- Dirichlet and multinomial distributions
- LDA generation process
- Parameters and latent variables in LDA
- Supervised LDA
- Dynamic LDA
- Other LDA variants
- Project assignment: build an unsupervised sentiment analysis model by modifying LDA
Week 16: MCMC Method
- Detailed Balance
- Gibbs sampling for LDA
- Collapsed Gibbs sampling for LDA
- Metropolis-Hastings
- Importance Sampling
- Rejection Sampling
- Large-scale distributed MCMC
- Big data and SGLD
- Case study: Distributed LDA training
Week 17: Variational methods
- Core ideas of variational methods
- KL divergence and derivation of the ELBO
- Mean-field variational methods
- The EM algorithm
- Derivation of variational inference for LDA
- Big data and SVI
- Comparison of variational methods and MCMC
- Variational Autoencoder
- Probabilistic Programming
- Example: Use probabilistic programming tools to train Bayesian models
02 Cases and Projects
Transportation optimization problem: one of the most classic problems in operations research and optimization; similar ideas are widely used in warehouse optimization, matching, and other problems. A minimal LP sketch follows the list below.
Knowledge points involved:
- Linear programming and its implementation
- Two-stage stochastic linear programming and its implementation
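To make the formulation concrete, here is a minimal sketch of a balanced transportation problem solved as a linear program with SciPy; the supplies, demands, and unit costs are invented illustrative numbers, not course data.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: 2 warehouses, 3 customers (balanced: total supply == total demand)
supply = np.array([20, 30])
demand = np.array([10, 25, 15])
cost = np.array([[2.0, 4.0, 5.0],
                 [3.0, 1.0, 7.0]])   # cost[i, j]: unit cost of shipping i -> j

m, n = cost.shape
A_eq, b_eq = [], []
for i in range(m):                   # each warehouse ships out exactly its supply
    row = np.zeros(m * n)
    row[i * n:(i + 1) * n] = 1.0
    A_eq.append(row)
    b_eq.append(supply[i])
for j in range(n):                   # each customer receives exactly its demand
    col = np.zeros(m * n)
    col[j::n] = 1.0
    A_eq.append(col)
    b_eq.append(demand[j])

res = linprog(cost.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq), bounds=(0, None))
print(res.x.reshape(m, n))           # optimal shipment quantities
```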
Route planning in taxi-hailing: we use ride-hailing or food-delivery apps almost every day, and for these applications the core algorithmic problem is matching passengers with vehicles. A matching sketch appears after the list below.
Knowledge points involved:
- Mixed integer linear programming
- Approximation bounds
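As a hedged illustration of the matching core (the course case itself formulates it as a mixed integer linear program), the one-to-one driver-passenger assignment below uses SciPy's Hungarian-algorithm solver; the cost matrix holds made-up pickup times.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical cost[i, j]: estimated minutes for driver i to reach passenger j
cost = np.array([[5, 9, 1],
                 [10, 3, 2],
                 [8, 7, 4]])

drivers, passengers = linear_sum_assignment(cost)   # minimizes total pickup time
for d, p in zip(drivers, passengers):
    print(f"driver {d} -> passenger {p} ({cost[d, p]} min)")
```

The assignment problem is a special MILP whose LP relaxation is integral, which is why this particular case can be solved exactly in polynomial time.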
Duality derivation and implementation of classical machine learning models: gain a deeper understanding of machine learning models and the role of duality through this exercise. The textbook primal-dual pair for the SVM is restated after the list for reference.
Knowledge points involved:
- SVM and LP models
- Duality techniques
- KKT conditions
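For reference, the textbook result this exercise derives: the soft-margin SVM primal and the dual obtained via the Lagrangian and the KKT conditions.

```latex
% Primal (soft-margin SVM)
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad y_i(w^\top x_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0

% Dual
\max_{\alpha}\ \sum_{i=1}^{n}\alpha_i
 - \tfrac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i \alpha_j y_i y_j x_i^\top x_j
\quad\text{s.t.}\quad 0 \le \alpha_i \le C,\ \ \sum_{i=1}^{n}\alpha_i y_i = 0
```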
Text classification based on graph neural networks: after processing a piece of text with a syntactic parser, the text becomes a graph, and a graph convolutional network can then carry out the downstream classification. A one-layer GCN sketch follows the list below.
Knowledge points involved:
- Syntactic parsing
- Graph neural networks
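A minimal NumPy sketch of one GCN propagation step, using the symmetric normalization of Kipf and Welling; the adjacency matrix A, node features H, and weight matrix W are assumed inputs.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN step: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))  # D^{-1/2} as a vector
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)         # ReLU activation
```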
Reading comprehension based on graph neural networks: machine reading comprehension requires a machine to read multiple articles and answer the questions posed. Extracting the key entities and relations, which can then be used to construct a graph, becomes important here. A graph-construction sketch follows the list below.
Knowledge points involved:
- Named entity recognition and relation extraction
- Graph neural networks
- Heterogeneous graphs
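A small sketch of the graph-construction step, assuming (head, relation, tail) triples have already been produced by named entity recognition and relation extraction; the triples here are invented examples.

```python
import networkx as nx

# Hypothetical triples from NER + relation extraction
triples = [("Marie Curie", "born_in", "Warsaw"),
           ("Marie Curie", "won", "Nobel Prize"),
           ("Warsaw", "capital_of", "Poland")]

G = nx.MultiDiGraph()
for head, rel, tail in triples:
    G.add_edge(head, tail, relation=rel)   # typed edges give a heterogeneous graph

print(G.number_of_nodes(), G.number_of_edges())   # 4 nodes, 3 edges
```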
Bandit algorithms in recommender systems: in sequential decision-making applications, bandit algorithms are easy to implement, computationally efficient, alleviate the cold-start problem, and have low annotation requirements (in general, a simple signal such as a user click can serve as the reward). This case describes how bandit algorithms can be applied in a news recommendation system to make content-based recommendations. A UCB sketch follows the list below.
Knowledge points involved:
- Exploration & exploitation
- Epsilon-greedy
- Upper Confidence Bound (UCB)
- LinUCB
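A minimal UCB1 sketch for the setting above, where each arm is a candidate article and the reward is 1 if the user clicks; this is the classic algorithm, not necessarily the course's exact implementation.

```python
import numpy as np

class UCB1:
    def __init__(self, n_arms):
        self.counts = np.zeros(n_arms)   # pulls per arm
        self.values = np.zeros(n_arms)   # running mean reward per arm
        self.t = 0

    def select(self):
        self.t += 1
        if (self.counts == 0).any():                # play every arm once first
            return int(np.argmin(self.counts))
        bonus = np.sqrt(2.0 * np.log(self.t) / self.counts)
        return int(np.argmax(self.values + bonus))  # exploit + explore

    def update(self, arm, reward):                  # reward: 1 if clicked, else 0
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```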
Use probabilistic programming tools to train Bayesian models: similar to PyTorch and TensorFlow, probabilistic programming tools provide automatic inference for Bayesian models. We use LDA and other models as examples to illustrate the use of these tools. A short sketch follows the list below.
Knowledge points involved:
- Probabilistic programming
- Topic models
- MCMC and variational methods
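As a hedged example of the declare-then-infer workflow these tools offer, here is a tiny Bayesian linear regression in PyMC (a simpler stand-in for LDA); the synthetic data and priors are purely illustrative.

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, 50)   # synthetic data

with pm.Model():
    w = pm.Normal("w", mu=0.0, sigma=10.0)      # priors over slope and intercept
    b = pm.Normal("b", mu=0.0, sigma=10.0)
    noise = pm.HalfNormal("noise", sigma=1.0)
    pm.Normal("obs", mu=w * x + b, sigma=noise, observed=y)   # likelihood
    trace = pm.sample(1000, tune=1000)          # MCMC (NUTS) inference
```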
Stock portfolio optimization: in portfolio optimization, we need to design a mix of assets according to the user's risk tolerance. In this project, we make the necessary modifications within the quadratic programming framework, such as adding constraints, adding regularizers to control the sparsity of the portfolio, and incorporating prior information about the investments, finally guiding the model according to pre-defined evaluation criteria. A CVXPY sketch follows the list below.
Knowledge points involved:
- Quadratic programming
- Use of different regularizers
- Constrained optimization
- Incorporating priors
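A minimal CVXPY sketch of the mean-variance quadratic program with two of the modifications mentioned above, a budget constraint and an L1 regularizer for sparsity; the expected returns mu, covariance Sigma, and trade-off weights are made-up placeholders.

```python
import numpy as np
import cvxpy as cp

mu = np.array([0.08, 0.10, 0.12, 0.07, 0.09])    # hypothetical expected returns
Sigma = np.diag([0.04, 0.09, 0.16, 0.03, 0.05])  # hypothetical covariance (diagonal for simplicity)

w = cp.Variable(5)
risk_aversion, sparsity = 5.0, 0.01              # made-up trade-off weights
objective = cp.Maximize(mu @ w
                        - risk_aversion * cp.quad_form(w, Sigma)
                        - sparsity * cp.norm1(w))
# Fully invested; shorting is allowed, since with w >= 0 the L1 term would be constant.
constraints = [cp.sum(w) == 1]
cp.Problem(objective, constraints).solve()
print(np.round(w.value, 3))
```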
03 Instructors
Wen-zhe Li: Founder and CEO of Greedy Technology, an expert in artificial intelligence and knowledge graphs. He was previously chief scientist at a fintech unicorn and a senior engineer at Amazon, responsible for projects in chatbots, quantitative trading, adaptive education, and financial knowledge graphs. He has published more than 15 papers at AAAI, KDD, AISTATS, and other top conferences, won best paper awards at IAAI and IPDPS, and has spoken at industry summits many times. He earned his Ph.D., master's, and bachelor's degrees at USC, TAMU, and Nankai University, respectively.
Dong Yang: Ph.D. from City University of Hong Kong and postdoctoral fellow at UC Merced, working mainly on machine learning, graph convolution, and graph embedding. She has published several papers at ECCV and in Trans. on Cybernetics, Trans. on NSE, INDIN, and other international conferences and journals. She has lectured in advanced courses at Greedy College, where students have consistently praised her teaching.
04 Live lectures, live derivations
Unlike lectures that simply read off slides, the instructor derives everything live, so you can follow the reasoning clearly and understand every detail behind each algorithm and model. More importantly, you can clearly see the relationships between the various models, and everything clicks into place!
▲ From: LDA model explanation
▲ From: Convex optimization
▲ From: Convergence analysis
05 Who is the course for?
College students
- Bachelor's, master's, or doctoral students in computer science or a related field, with basic knowledge of machine learning;
- Want to go deeper into AI in preparation for research or study abroad;
- Want to build depth in AI before entering the workplace and grow into a T-shaped talent.
Working professionals
- Currently working on AI-related projects, with a solid machine learning foundation;
- Hoping to break through the technical ceiling and innovate at the model level;
- Pursuing a career path toward senior engineer, researcher, or scientist.
06 Weekly course schedule
Classes are delivered live, 3-4 sessions per week, including core theory sessions, hands-on sessions, review and consolidation sessions, and paper-reading sessions. The teaching model also draws on the systems used at top American universities. A sample one-week schedule follows for reference.
07 Admission Criteria
1. A bachelor's, master's, or doctoral degree in science or engineering.
2. Currently engaged in AI work.
3. Good Python programming skills.
4. A certain machine learning foundation is required; the course is not suitable for complete beginners.
08 Registration Instructions
1. This course is fee-based.
2. Remaining places are limited.
3. Quality guarantee: an unconditional full refund is available within 7 days of the official course start.
4. Learning this course requires a certain machine learning foundation.
Further details of the course can be obtained from the course advisor.
Add the course consultant on WeChat for registration and course consultation.