Selected from GitHub

Compiled by Heart of the Machine

Contributors: Huang Xiaotian, Jiang Siyuan

Today we found a gem of a project on GitHub: RedditSota has compiled an index of the current best research papers for a range of machine learning tasks. Heart of the Machine gives a brief introduction to the project below.

Project address: https://github.com/RedditSota/state-of-the-art-result-for-machine-learning-problems

The GitHub repository provides the current best results for a wide range of machine learning problems and tries its best to keep them up to date. If you find that the current best result for a problem is outdated or missing, please raise it as an issue (including: paper name, dataset, metric, source code, year), and it will be corrected promptly.

This is an attempt to collect the current best results for all types of machine learning problems. It cannot be done by one person alone, so every reader is invited to participate. If you find a new best result for a dataset, please submit it and update the GitHub project.

Supervised learning

NLP

  • 1. Language modeling

The following shows some of the best current research in language modeling and their performance on different data sets.

DYNAMIC EVALUATION OF NEURAL SEQUENCE MODELS

Paper address: https://arxiv.org/pdf/1709.07432.pdf

Implementation address: https://github.com/benkrause/dynamic-evaluation

Regularizing and Optimizing LSTM Language Models

Paper address: https://arxiv.org/pdf/1708.02182.pdf

Implementation address: https://github.com/salesforce/awd-lstm-lm

Paper: FRATERNAL DROPOUT

Paper address: https://arxiv.org/pdf/1711.00066.pdf

Implementation address: https://github.com/kondiz/fraternal-dropout

Factorization Tricks for LSTM Networks

Paper address: https://arxiv.org/pdf/1703.10722.pdf

Implementation address: https://github.com/okuchaiev/f-lm

Among these four top results in language modeling, FRATERNAL DROPOUT by Yoshua Bengio et al. achieves the current best results on both the PTB and WikiText-2 datasets. In this paper, Bengio et al. propose a technique called fraternal dropout, in which two identical RNNs (with shared parameters) are trained with different dropout masks while the difference between their (pre-softmax) predictions is minimized. The regularization term thus encourages the RNN's representations to be invariant to the dropout mask. Bengio et al. show that their regularization term is upper-bounded by the expectation-linear dropout objective, which addresses the gap between the training and inference phases of dropout.
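
The training trick itself is easy to sketch. Below is a minimal, illustrative PyTorch training step (not the authors' code; `model`, `x`, `y`, and the weight `kappa` are placeholders), which runs the same network twice so that each pass samples its own dropout mask and then penalizes the difference between the two pre-softmax outputs:

```python
import torch
import torch.nn.functional as F

def fraternal_dropout_step(model, x, y, kappa=0.1):
    # Illustrative sketch. `model` is any classifier containing dropout layers and
    # must be in training mode so that dropout is active; `x` are inputs, `y` targets.
    logits_a = model(x)  # two forward passes through the *same* parameters;
    logits_b = model(x)  # each pass samples an independent dropout mask

    # Standard cross-entropy, averaged over the two passes.
    ce = 0.5 * (F.cross_entropy(logits_a, y) + F.cross_entropy(logits_b, y))

    # Fraternal-dropout regularizer: penalize the squared difference between the
    # pre-softmax predictions, encouraging invariance to the dropout mask.
    reg = F.mse_loss(logits_a, logits_b)

    return ce + kappa * reg
```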

In addition, Ben Krause et al. propose using dynamic evaluation to improve the performance of neural sequence models; a sketch follows below. "Regularizing and Optimizing LSTM Language Models", by Stephen Merity et al. of Salesforce, focuses on word-level language modeling and investigates specific problems of regularization and optimization in LSTM models in search of more efficient language modeling methods. Oleksii Kuchaiev et al. of NVIDIA propose two modifications of the LSTM cell with projection (LSTMP) that reduce the number of parameters and speed up training.
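
Dynamic evaluation is simple to picture: while the test sequence is being scored, the model keeps taking small gradient steps on the segments it has just evaluated, so its parameters adapt to the recent history. The sketch below is a simplification under assumed interfaces (`model(inputs, state)` is assumed to return logits and a recurrent state); the paper itself uses an RMS-scaled update rule with decay toward the global parameters rather than plain SGD:

```python
import torch
import torch.nn.functional as F

def dynamic_evaluation(model, segments, lr=1e-4):
    # Rough sketch of dynamic evaluation. `segments` yields consecutive
    # (inputs, targets) chunks of the test text.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    total_loss, total_tokens, state = 0.0, 0, None
    for inputs, targets in segments:
        logits, state = model(inputs, state)
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))
        total_loss += loss.item() * targets.numel()   # score the segment first...
        total_tokens += targets.numel()
        opt.zero_grad()
        loss.backward()                               # ...then adapt the parameters to it
        opt.step()
        state = tuple(s.detach() for s in state)      # stop gradients flowing across segments
    return total_loss / total_tokens                  # average NLL over the test set
```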

  • 2. Machine translation

Attention Is All You Need

Paper address: https://arxiv.org/abs/1706.03762

Implementation addresses: https://github.com/jadore801120/attention-is-all-you-need-pytorch and https://github.com/tensorflow/tensor2tensor

NON-AUTOREGRESSIVE NEURAL MACHINE TRANSLATION

Paper address: https://einstein.ai/static/images/pages/research/non-autoregressive-neural-mt.pdf

Implementation address: not published

For machine translation, the most familiar of these is the work on attention mechanisms by Ashish Vaswani et al. of Google Brain. Their model performs very well on the WMT 2014 English-French and English-German datasets. The paper observes that the dominant sequence transduction models are based on complex RNNs or CNNs in an encoder-decoder configuration, and that the best-performing models also connect the encoder and decoder through an attention mechanism. Google therefore proposes a new and simple network architecture, the Transformer, which is based entirely on attention mechanisms and dispenses with recurrence and convolutions altogether. The experiments on the two machine translation tasks above show that these models not only produce high-quality translations but can also be parallelized, so the training time required is greatly reduced. The paper also shows that the Transformer generalizes well to other tasks and can be successfully applied to English constituency parsing with both large and limited amounts of training data.
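
At the core of the Transformer is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. Below is a minimal sketch of just this operation (multi-head projections, residual connections, and positional encodings are omitted):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, length, d_k) tensors; `mask` optionally marks positions that
    # must not be attended to (e.g. future tokens in the decoder).
    d_k = q.size(-1)
    scores = torch.matmul(q, k.transpose(-2, -1)) / d_k ** 0.5
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float('-inf'))
    weights = F.softmax(scores, dim=-1)   # attention distribution over key positions
    return torch.matmul(weights, v)       # weighted sum of the value vectors
```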

In addition to this paper, researchers from Salesforce and the University of Hong Kong propose a model that avoids autoregression and produces its outputs in parallel, reducing inference latency by an order of magnitude. The paper demonstrates substantial cumulative performance gains on the IWSLT 2016 English-German dataset through a three-step training procedure, and currently achieves state-of-the-art results on WMT 2016 English-Romanian.

  • 3. Text classification

Learning Structured Text Representations

Paper address: https://arxiv.org/abs/1705.09207

Implementation address: not published

Paper: Attentive Convolution

Paper address: https://arxiv.org/pdf/1710.00519.pdf

Implementation address: not published

Yang Liu et al. of the University of Edinburgh propose learning structured text representations. In this paper, they focus on learning structured text representations from data, without a discourse parser or additional annotation resources. Although no implementation code is available yet, the method reaches 68.6% accuracy on the Yelp dataset. The other paper, on attentive convolution, proposes AttentiveConvNet, which broadens the field of view of the convolution operation for text processing.

  • 4. Natural language reasoning

Paper: DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding

Paper address: https://arxiv.org/pdf/1709.04696.pdf

Implementation address: not published

Researchers at the University of Technology Sydney and the University of Washington propose DiSAN, a directional self-attention network for RNN/CNN-free language understanding. The study proposes a novel attention mechanism in which the attention between elements of the input sequence is directional and multi-dimensional, i.e. a feature-wise attention. The study reports 51.72% accuracy on the Stanford Natural Language Inference (SNLI) dataset.
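
As a rough illustration only (not the authors' implementation, which additionally uses scaled activations and a fusion gate), the two ideas can be sketched as follows: the alignment score between two tokens is a vector with one entry per feature, and a directional mask controls which positions may attend to which. The projection matrices `w1`, `w2` and the bias are placeholders:

```python
import torch
import torch.nn.functional as F

def directional_multidim_attention(x, w1, w2, bias, forward=True):
    # x: (batch, n, d) token representations; w1, w2: (d, d); bias: (d,).
    n = x.size(1)
    h1 = torch.matmul(x, w1).unsqueeze(2)   # (batch, n, 1, d)
    h2 = torch.matmul(x, w2).unsqueeze(1)   # (batch, 1, n, d)
    scores = torch.tanh(h1 + h2 + bias)     # (batch, n, n, d): one score per feature

    # Directional mask: with forward=True, token i may attend only to positions j <= i
    # (for simplicity each token may also attend to itself, which avoids empty rows).
    ones = torch.ones(n, n, device=x.device)
    mask = torch.tril(ones) if forward else torch.triu(ones)
    scores = scores.masked_fill(mask.view(1, n, n, 1) == 0, float('-inf'))

    weights = F.softmax(scores, dim=2)              # normalize over attended positions j
    return (weights * x.unsqueeze(1)).sum(dim=2)    # (batch, n, d) feature-wise weighted sum
```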

  • 5. Question answering

Interactive AoA Reader+ (Ensemble)

Data set address: https://rajpurkar.github.io/SQuAD-explorer/

Implementation address: not published

The Stanford Question Answering Dataset (SQuAD) is a new reading comprehension dataset whose questions and answers are created by crowdworkers based on Wikipedia articles. The address above shows the dataset and a leaderboard of the models evaluated on it.

  • 6. Named entity recognition

Paper: Named Entity Recognition in Twitter using Images and Text

Paper address: https://arxiv.org/pdf/1710.11027.pdf

Implementation address: not published

In this paper, Diego Esteves and other researchers at the University of Bonn propose a novel multi-level architecture for named entity recognition on Twitter using images and text, one that does not rely on any language-specific resources or decoding rules. Their new model achieves a strong F-measure of 0.59 on the Ritter dataset.

Computer vision

  • 1. Classification

Paper: Dynamic Routing Between Capsules

Paper address: https://arxiv.org/pdf/1710.09829.pdf

Implementation addresses: https://github.com/gram-ai/capsule-networks, https://github.com/naturomics/CapsNet-Tensorflow, https://github.com/XifengGuo/CapsNet-Keras, https://github.com/soskek/dynamic_routing_between_capsules

High Performance Neural Networks for Visual Object Classification

Paper address: https://arxiv.org/pdf/1102.0183.pdf

Implementation address: not published

In the field of computer vision, one of the most talked-about recent papers is the dynamic routing method for capsules proposed by Geoffrey Hinton et al.; Heart of the Machine has also analyzed the paper and its implementations in detail. In the paper, Hinton introduces the capsule as follows: "A capsule is a group of neurons whose input and output vectors represent the instantiation parameters of a specific type of entity (that is, the probability that a particular object, concept, or entity has certain attributes). We use the length of the input and output vectors to represent the probability that the entity exists, and the orientation of the vectors to represent the instantiation parameters (that is, certain graphical attributes of the entity). Capsules at one level make predictions, via transformation matrices, for the instantiation parameters of capsules at the next level. When multiple predictions agree (this paper uses dynamic routing to make the predictions agree), the higher-level capsule becomes active."
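
The routing-by-agreement procedure described above can be sketched compactly. The code below is an illustrative simplification: it assumes the prediction vectors `u_hat` have already been produced by multiplying the lower-level capsule outputs with the transformation matrices:

```python
import torch
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    # Keeps a vector's direction but squashes its length into (0, 1), so the length
    # can be read as the probability that the entity represented by the capsule exists.
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iterations=3):
    # u_hat: (batch, num_in, num_out, dim_out) predictions of the lower-level capsules
    # for each higher-level capsule.
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)   # routing logits b_ij
    for _ in range(num_iterations):
        c = F.softmax(b, dim=2)                              # coupling coefficients
        s = (c.unsqueeze(-1) * u_hat).sum(dim=1)             # weighted sum over inputs
        v = squash(s)                                        # (batch, num_out, dim_out)
        # Raise the logit when a prediction agrees with the current output capsule.
        b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)
    return v
```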

In addition, Jürgen Schmidhuber et al. proposed high-performance neural networks for visual object classification. In that paper, they present a fast, fully parameterizable GPU implementation of convolutional neural network variants. Although the paper was published in 2011, it still performs very well on the NORB dataset.

Speech

  • 1. ASR

THE MICROSOFT 2017 CONVERSATIONAL SPEECH RECOGNITION SYSTEM

Paper address: https://arxiv.org/pdf/1708.06073.pdf

Implementation address: not published

This paper introduces the 2017 version of Microsoft's conversational speech recognition system. It adds a CNN-BLSTM acoustic model to the existing model architecture and a confusion network rescoring step after system combination. The resulting system achieves a 5.1% word error rate on the Switchboard Hub5'00 test set.

Semi-supervised learning

Computer vision

DISTRIBUTIONAL SMOOTHING WITH VIRTUAL ADVERSARIAL TRAINING

Paper address: https://arxiv.org/pdf/1507.00677.pdf

Implementation address: https://github.com/takerum/vat

Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning

Paper address: https://arxiv.org/pdf/1704.03976.pdf

Implementation address: not published

In the first paper, researchers from Kyoto University propose a new notion of smoothness for statistical models, local distributional smoothness (LDS), which can be used as a regularizer to improve the smoothness of the model distribution. The method not only performs well on supervised and semi-supervised learning tasks on the MNIST dataset, but also reaches test errors of 24.63 and 9.88 on the SVHN and NORB datasets, respectively. These results show that the proposed method outperforms the previous best results on semi-supervised learning tasks.

The second paper proposes a new regularization method based on a virtual adversarial loss: a new measure of the local smoothness of the output distribution. Because the adversarial direction is defined without using labels, i.e. it is "virtual", the method is called virtual adversarial training (VAT). VAT is computationally cheap. In the paper, VAT is applied to supervised and semi-supervised learning on multiple benchmark datasets and achieves an excellent test error of 1.27 on MNIST.
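
The VAT term is simple to sketch: find the small input perturbation that changes the model's prediction the most (approximated with a power iteration), then penalize the divergence between the predictions at the original and perturbed inputs. Because no labels are involved, the term can also be computed on unlabeled data. The following is a rough sketch with placeholder hyperparameters, not the authors' implementation:

```python
import torch
import torch.nn.functional as F

def _l2_normalize(d, eps=1e-8):
    return d / (d.flatten(1).norm(dim=1).view(-1, *([1] * (d.dim() - 1))) + eps)

def vat_loss(model, x, xi=1e-6, epsilon=2.0, power_iterations=1):
    # Prediction at the unperturbed input, treated as a fixed target.
    with torch.no_grad():
        p = F.softmax(model(x), dim=1)

    # Power iteration: estimate the direction that most increases the KL divergence.
    d = _l2_normalize(torch.randn_like(x))
    for _ in range(power_iterations):
        d.requires_grad_(True)
        p_hat = F.log_softmax(model(x + xi * d), dim=1)
        adv_distance = F.kl_div(p_hat, p, reduction='batchmean')
        grad = torch.autograd.grad(adv_distance, d)[0]
        d = _l2_normalize(grad.detach())

    # Local distributional smoothness: divergence at the virtual adversarial point.
    r_adv = epsilon * d
    p_hat = F.log_softmax(model(x + r_adv), dim=1)
    return F.kl_div(p_hat, p, reduction='batchmean')
```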

Unsupervised learning

Computer vision

  • 1. Generative models

PROGRESSIVE GROWING OF GANS FOR IMPROVED QUALITY, STABILITY, AND VARIATION

Paper address: http://research.nvidia.com/sites/default/files/publications/karras2017gan-paper-v2.pdf

Implementation address: https://github.com/tkarras/progressive_growing_of_gans

In this paper, NVIDIA describes a new way of training GANs in which the generator and discriminator are grown progressively: starting from a low resolution, new layers that model increasingly fine details are added as training progresses. This not only speeds up and stabilizes training, but also produces images of unprecedented quality. The paper also proposes a simple method for increasing the variation of the generated images and obtains an inception score of 8.80 on CIFAR-10. An additional contribution is a higher-quality version of the CelebA dataset.
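
The key mechanism, fading a newly added resolution block in smoothly, can be illustrated as below; the module names are placeholders and this is not NVIDIA's implementation. The blending weight `alpha` grows from 0 to 1 during training, so the new block takes over gradually instead of disturbing the already-trained lower-resolution stages:

```python
import torch.nn.functional as F

def generate_at_new_resolution(low_res_features, old_to_rgb, new_block, new_to_rgb, alpha):
    # `old_to_rgb` converts the previous stage's features to an image; `new_block` is the
    # freshly added higher-resolution block with its own `new_to_rgb` output layer.
    upsampled = F.interpolate(low_res_features, scale_factor=2, mode='nearest')
    old_image = old_to_rgb(upsampled)             # image from the previous, stable stage
    new_image = new_to_rgb(new_block(upsampled))  # image from the new, still-training block
    return (1.0 - alpha) * old_image + alpha * new_image
```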
