0. Foreword
I have been working with deep learning for more than a year, and have recently started working on NLP (natural language processing). I am taking this opportunity to write a series of practical courses on deep learning for NLP machine translation.
This series of courses goes from principles and data processing to hands-on practice and application deployment, and includes the following content (still being updated):
- NLP Machine Translation Deep Learning Practical Course Zero (Basic Concepts)
- NLP Machine Translation Deep Learning Practical Course One
- NLP Machine Translation Deep Learning Practical Course Two (RNN + Attention Base)
- NLP Machine Translation Deep Learning Practical Course Three (CNN Base)
- NLP Machine Translation Deep Learning Practical Course Four (Self-Attention Base)
- NLP Machine Translation Deep Learning Practical Course Five (Application Deployment)
For this tutorial, see the blog: me.csdn.net/chinateleco…
1. Development status of NLP machine translation
1.1 Current situation of machine translation
1.1.1 What is Machine Translation?
What is machine translation? Put simply, it is using a computer to convert text in one language into another language. Most of us students are familiar with the idea, so what is the theory behind machine translation? And how does the machine translation of a few decades ago differ from the neural-network-based translation we talk about today? Let us first give a general description of machine translation from the perspective of its historical development. The history of machine translation has roughly gone through three stages:
- Rule-based Machine Translation (1970s)
- Machine Translation based on Statistics (1990s)
- Neural Network-based Machine Translation (2014)
Rule-based Machine Translation (1970s)
The idea of rule-based machine translation first emerged in the 1970s. Based on observations of how human translators work, scientists tried to make computers do the same. The components of these translation systems included:
- a bilingual dictionary (e.g., Russian -> English);
- a set of linguistic rules for each language (e.g., nouns ending in certain suffixes such as -heit, -keit, -ung).
That is essentially all. If necessary, the system can be supplemented with extra rules, such as handling of names, spelling correction, and transliteration of words.
Interested students can look up the details online; here is a general flow chart showing how rule-based machine translation works.
First the sentence structure is adjusted according to the rules, then the corresponding word fragments are looked up in the dictionary to form a new sentence, and finally some grammatical adjustments are made to the generated sentence.
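To make this pipeline concrete, here is a toy Python sketch of the three steps (reorder by rule, dictionary lookup, crude fix-up). The dictionary and the reordering rule are invented purely for illustration and are not real linguistics.

```python
# Toy sketch of the rule-based pipeline: reorder -> dictionary lookup -> fix-up.
# The dictionary and rule below are hypothetical, for illustration only.
DICTIONARY = {"i": "watashi", "love": "aisuru", "you": "anata"}  # toy English -> romanized Japanese

def reorder(words):
    # Toy structural rule: turn subject-verb-object into subject-object-verb.
    if len(words) == 3:
        return [words[0], words[2], words[1]]
    return words

def rule_based_translate(sentence):
    words = sentence.lower().split()
    words = reorder(words)                               # 1. adjust sentence structure by rule
    translated = [DICTIONARY.get(w, w) for w in words]   # 2. look up each fragment in the dictionary
    return " ".join(translated)                          # 3. (real systems add grammar fix-ups here)

print(rule_based_translate("I love you"))  # -> "watashi anata aisuru"
```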
Machine Translation based on Statistics (1990s)
In the early 1990s, the first statistical machine translation system appeared at IBM's research center. Instead of relying on hand-written rules and linguistic knowledge, it analyzed similar texts in the two languages and tried to learn patterns from them.
The idea of the statistical model is to treat translation as a probability problem. The principle is to use a parallel corpus and compute word-for-word statistics. For example, the machine does not know what the English word "knowledge" means, but statistics over the corpus will show that whenever "knowledge" appears in an English sentence, a particular word almost always appears in the corresponding sentence of the other language. This allows machines to learn the meaning of words without manually maintained dictionaries and grammar rules.
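As a minimal sketch of this idea, the snippet below counts, over a made-up parallel corpus, which target-language words co-occur with a given source word; the sentences and words are invented purely for illustration.

```python
# Count which target-language words co-occur with a source word in a toy parallel corpus.
from collections import Counter

parallel_corpus = [
    ("knowledge is power", "wissen ist macht"),
    ("power corrupts",     "macht korrumpiert"),
    ("knowledge grows",    "wissen waechst"),
]

def cooccurrence_counts(source_word, corpus):
    counts = Counter()
    for src_sent, tgt_sent in corpus:
        if source_word in src_sent.split():
            counts.update(tgt_sent.split())   # tally every target word seen alongside source_word
    return counts

# "wissen" appears in every target sentence whose source contains "knowledge",
# so it gets the highest count; the pairing is learned from data alone.
print(cooccurrence_counts("knowledge", parallel_corpus).most_common(3))
```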
The concept is not new; Warren Weaver had proposed a similar idea much earlier, but at the time there was not enough parallel corpus data and computers were too weak to implement it. Where does modern statistical machine translation find its Rosetta Stone? The main source is actually the United Nations, because UN resolutions and announcements are available in the language of every member state. Beyond that, you have to build your own parallel corpus, which is staggeringly expensive given the cost of human translation today.
A large part of the roughly 20 million parallel sentences currently used in our own system comes from the United Nations parallel corpus.
Cms.unov.org/UNCorpus/zh…
Until 2014, the Google Translate everyone knew was based on statistical machine translation. By now it should be clear that statistical translation models could not reach the ideal of the Tower of Babel; in most people's impression, machine translation at that point was still only "usable" rather than truly useful.
Neural Network-based Machine Translation (2014)
Neural networks are not new; in fact, they have been around for more than 80 years. But ever since Geoffrey Hinton addressed the fatal drawback that deep neural networks were too difficult to optimize in 2006, deep learning has kept popping up in our lives with all sorts of remarkable results. In 2015, machines surpassed humans at image recognition for the first time; in 2016, AlphaGo beat the world Go champion; in 2017, speech recognition surpassed human stenographers; in 2018, machines surpassed humans in English reading comprehension for the first time. Of course, the field of machine translation is also flourishing on the rich soil of deep learning.
Yoshua Bengio, in a 2014 paper, laid down the basic framework of deep learning for machine translation for the first time. It mainly used recurrent neural networks (RNN) to let the machine automatically capture word features across sentences and then automatically translate them into another language. As soon as this paper was published, Google treated it as a treasure. Soon, with Google's abundant resources and strong researchers behind it, Google officially announced in 2016 that all statistical machine translation had been taken offline, and neural machine translation became the absolute mainstream of modern machine translation.
This article briefly introduces the general framework of neural machine translation: the encoder-decoder structure. In plain terms, the encoder compresses the source information, and the decoder decodes that information back into something humans can understand, losing as little information as possible along the way. The structure is shown in the figure below:
Figure 1 GNMT machine translation framework
This is the GNMT framework published by Google in 2016, implemented with LSTMs plus an attention mechanism. Interested students can read the paper or search for related blog posts.
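As a rough illustration of the encoder-decoder idea (not GNMT itself; attention, beam search, and training are all omitted), here is a minimal PyTorch sketch with arbitrary vocabulary sizes and dimensions.

```python
# Minimal encoder-decoder sketch: compress the source, then decode into the target.
import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        _, state = self.encoder(self.src_emb(src_ids))            # encoder: compress source into a state
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)   # decoder: unfold state into target tokens
        return self.out(dec_out)                                  # (batch, tgt_len, tgt_vocab) logits

model = TinySeq2Seq(src_vocab=1000, tgt_vocab=1000)
src = torch.randint(0, 1000, (2, 7))   # batch of 2 source sentences, length 7
tgt = torch.randint(0, 1000, (2, 5))   # teacher-forced target inputs, length 5
print(model(src, tgt).shape)           # torch.Size([2, 5, 1000])
```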
Figure 2 Transformer machine translation framework
The Transformer was proposed by Google in a 2017 paper (https://arxiv.org/pdf/1706.03762.pdf). Its architecture differs from all previous machine translation network structures, and relying on the strength of the model alone it achieved state-of-the-art results, better than any machine translation system before it.
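The core building block of the Transformer is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, which can be sketched in a few lines of NumPy (a bare illustration of the formula only, without masking or multiple heads):

```python
# Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # how strongly each query matches each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over the key axis
    return weights @ V                                   # weighted sum of the values

# Toy self-attention: 4 token vectors of dimension 8 attend to one another.
x = np.random.randn(4, 8)
print(scaled_dot_product_attention(x, x, x).shape)       # (4, 8)
```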
1.1.2 Related papers
If you want a deeper understanding of the principles, you will need to read some theoretical papers. If you just want to build such a system, follow the steps in the practice courses that follow and you will be able to build a machine translation system based on the most advanced models in the world.
Here is a rough outline of the theory needed for machine translation, including the following:
- A brief introduction to word embedding vectors: blog.csdn.net/u012052268/…
- Sequence to Sequence Learning with Neural Networks (2014)
- Google's GNMT (2016)
- The self-attention-based Transformer (2017)
1.1.3 Related Conferences
The most famous top conference in machine translation is WMT. All of the world-famous giant companies with their own machine translation engines have placed in this competition's rankings. Since the 2017 edition of the competition, the top teams have all built on Transformer models, optimizing and iterating from there.
The methods and tricks proposed by some of these teams have also been collected and organized by companies with machine translation technology and tried out in their own translation engines.
In addition, important competition solutions from teams at home and abroad are also worth studying as references: www.statmt.org/wmt18/