Recently, Baidu upgraded Ernie to 3.0, a major release of a knowledge-enhanced large model with tens of billions of parameters. The model learns not only vocabulary, structure, and semantics from massive text data, but also from large-scale knowledge graphs.

Ernie 3.0 has set new state-of-the-art results on 54 Chinese NLP benchmarks, and its English model has topped the global leaderboard of SuperGLUE, the internationally authoritative benchmark for complex language understanding, scoring 0.8 percentage points above the human baseline. Ernie 3.0 also combines strong language understanding with the ability to write novels, lyrics, poems, couplets, and other literary forms.

Ernie 3.0 is now available on the official Baidu Wenxin website, where users can experience the different kinds of content it creates and build more creative and valuable applications.

Paper link: https://arxiv.org/pdf/2107.02… Demo link: https://wenxin.baidu.com/wenx…

Ernie 3.0 Knowledge-Enhanced Model: large-scale knowledge introduced into ten-billion-parameter pre-training for the first time

Over the past year, large-scale pre-trained models such as GPT-3 and Switch Transformer have brought new breakthroughs to artificial intelligence. Thanks to their strong generality and excellent transfer ability, they have set off a trend toward ever larger pre-trained models. However, existing large-scale models rely mainly on plain text for learning; they lack guidance from large-scale knowledge, which limits their capabilities.

Ernie 3.0's researchers further explored the potential of large-scale pre-trained models. Building on the distributed training technology of Baidu's PaddlePaddle deep learning platform, they introduced a large-scale knowledge graph into a ten-billion-parameter pre-trained model for the first time, proposing Universal Knowledge-Text Prediction, which jointly trains on massive unsupervised text and a large-scale knowledge graph. By feeding entity-relation triples from the knowledge graph and large-scale text data into the pre-trained model together for joint masked training, the method promotes information sharing between structured knowledge and unstructured text and greatly improves the model's ability to memorize and reason over knowledge.



(Text and knowledge parallel pre-training in Ernie 3.0)
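To make the joint masked training concrete, here is a minimal sketch of the idea as described above: a knowledge-graph triple is concatenated with an aligned sentence, and tokens are masked in both segments, so the model must draw on the text to recover masked knowledge and on the knowledge to recover masked text. The function name, tokens, and masking scheme are illustrative assumptions, not Ernie 3.0's actual implementation.

```python
import random

MASK = "[MASK]"

def build_uktp_example(triple, sentence_tokens, mask_prob=0.15):
    """Concatenate a knowledge-graph triple with its aligned sentence
    and randomly mask tokens in both segments, so the model must use
    the text to recover masked knowledge and vice versa.
    (Hypothetical sketch of the joint-masking idea, not Ernie 3.0 code.)"""
    head, relation, tail = triple
    # Segment 1: the structured triple, flattened into tokens.
    # Segment 2: the unstructured sentence expressing the same fact.
    tokens = [head, relation, tail] + ["[SEP]"] + sentence_tokens

    inputs, labels = [], []
    for tok in tokens:
        if tok != "[SEP]" and random.random() < mask_prob:
            inputs.append(MASK)
            labels.append(tok)   # the model must predict the original token
        else:
            inputs.append(tok)
            labels.append(None)  # not a prediction target
    return inputs, labels

# Example: a triple paired with a sentence that mentions the same fact.
inputs, labels = build_uktp_example(
    ("Andersen", "wrote", "Nightingale"),
    "Nightingale is written by Danish author Andersen".split(),
)
print(inputs)
print(labels)
```

Because the two segments carry overlapping information, a masked entity in the triple can often be recovered from the sentence, and a masked word in the sentence from the triple, which is what drives the information sharing described above.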

Ernie 3.0 Unified Pre-Training Framework: both language understanding and language generation

Baidu's researchers proposed a model framework that combines general semantic representation with task-specific semantic representation. The framework integrates task semantic representation networks of different types, such as auto-encoding and auto-regressive networks, so it can handle language understanding and language generation tasks simultaneously. It supports zero-shot learning without labeled data as well as fine-tuning with labeled data. In addition, Ernie 3.0 adds task semantic representation networks on top of a continual learning framework to accelerate the model's evolution.



(Ernie 3.0 Framework)

The Ernie 3.0 framework has two layers. The first is the universal semantic representation network, which learns basic, general knowledge from the data. The second is the task semantic representation network, which builds on the universal representation to learn task-specific knowledge. The task representation networks can be realized with auto-encoding or auto-regressive structures, and they interact with and reinforce one another through the shared bottom layer. During training, each task representation network learns only the pre-training tasks of its own category, while the universal representation network learns from all pre-training tasks.
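As a rough illustration of this two-layer layout, the sketch below wires a shared "universal" encoder to two small task networks: an auto-encoding branch with bidirectional attention for understanding, and a causally masked branch for generation. Layer counts, dimensions, and class names are assumptions chosen for brevity; this is not Ernie 3.0's actual architecture or code.

```python
import torch
import torch.nn as nn

class Ernie3Sketch(nn.Module):
    """Hypothetical two-layer layout: shared universal network plus
    separate task representation networks (sketch, not Ernie 3.0)."""

    def __init__(self, vocab=30000, dim=256, shared_layers=4, task_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        # Universal semantic representation network: shared by all tasks.
        self.universal = nn.TransformerEncoder(layer, shared_layers)
        # Task semantic representation networks on top of the shared one:
        # auto-encoding (understanding) and auto-regressive (generation).
        self.nlu_head = nn.TransformerEncoder(layer, task_layers)
        self.nlg_head = nn.TransformerEncoder(layer, task_layers)
        self.lm_out = nn.Linear(dim, vocab)

    def forward(self, token_ids, task):
        h = self.universal(self.embed(token_ids))  # trained on all tasks
        if task == "understanding":
            h = self.nlu_head(h)                   # bidirectional attention
        else:
            # Causal mask so generation attends only to the left context.
            n = token_ids.size(1)
            mask = nn.Transformer.generate_square_subsequent_mask(n)
            h = self.nlg_head(h, mask=mask)
        return self.lm_out(h)

model = Ernie3Sketch()
tokens = torch.randint(0, 30000, (2, 16))          # a toy batch
print(model(tokens, task="understanding").shape)   # (2, 16, 30000)
print(model(tokens, task="generation").shape)
```

The key design point mirrored here is that gradients from every task flow through the shared bottom network, while each task head only ever sees its own category of pre-training task.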

Ernie 3.0 Results: 54 Chinese NLP benchmarks refreshed at one stroke

Baidu's researchers comprehensively evaluated Ernie 3.0's effectiveness and general capability on 54 public Chinese natural language processing datasets, covering sentiment analysis, opinion extraction, reading comprehension, text summarization, dialogue generation, mathematical operations, and more. Ernie 3.0 achieved the best results to date, with significant improvements of more than 3 percent on over 20 different types of natural language processing tasks.



(Ernie 3.0 results under the fine-tuning paradigm)

In practical applications, annotated data is often scarce. Baidu's researchers therefore also tested Ernie 3.0 under the zero-shot learning paradigm, where it likewise achieved significant improvements over existing Chinese large models on most tasks.



(Ernie 3.0 in zero-shot learning)
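For readers unfamiliar with the zero-shot setup, the toy sketch below shows the general recipe: the task is posed as natural-language text, and the model picks the label whose completed prompt it scores as most likely, with no labeled examples and no fine-tuning. The scoring function is a dummy placeholder and the template is hypothetical; this illustrates the evaluation paradigm in general, not Ernie 3.0's actual interface.

```python
def score(text: str) -> float:
    """Placeholder for a language model's log-likelihood of `text`.
    A real system would return log P(text) under the pre-trained model."""
    return -len(text)  # dummy value so the example runs end to end

def zero_shot_sentiment(review: str) -> str:
    # The task is expressed purely as text; no labeled data is used.
    template = "Review: {r} Sentiment: {label}"
    candidates = ["positive", "negative"]
    # Choose whichever label makes the completed prompt most probable.
    return max(candidates,
               key=lambda lab: score(template.format(r=review, label=lab)))

print(zero_shot_sentiment("The plot was gripping and the prose beautiful."))
```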

Ernie 3.0's English Model Tops SuperGLUE: 0.8 percentage points above human level

Beyond the impressive results of the Chinese model, Ernie 3.0's English model has surpassed Google's T5, OpenAI's GPT-3, and other large models on SuperGLUE, the internationally authoritative benchmark for complex language understanding, taking first place worldwide with a score 0.8 percentage points above the human baseline.

SuperGLUE is a complex language understanding benchmark published jointly by Google DeepMind, Facebook Research, New York University, the University of Washington, and other leading institutions. It is designed to drive progress on difficult tasks such as commonsense reasoning, causal judgment, context disambiguation, and coreference resolution.



(Ernie 3.0 tops SuperGLUE worldwide)

In fact, as far back as December 2019, Ernie topped the GLUE global leaderboard for the first time, with an average score of 90 across nine tasks. Taking the top spot on SuperGLUE this time proves Ernie's strength once again.



(Ernie tops GLUE worldwide)

Writing novels, lyrics, and classical Chinese prose: Ernie 3.0's literary creation and knowledge mastery improve significantly

Ernie 3.0 also shows a marked improvement in literary creation: by learning from massive text and knowledge, it can produce literary works without any task-specific training.

Ernie 3.0's mastery of knowledge has also improved greatly. Enhanced with the knowledge graph, the model has stronger knowledge memorization and reasoning abilities.

Now that these capabilities are live, you can click the Demo address to see for yourself what Ernie 3.0 can do.

Since its debut in 2019, Ernie has achieved a series of technological breakthroughs in language understanding, text generation, cross-modal semantic understanding, and other fields, winning more than ten world championships in authoritative open semantic evaluations. In 2020, Wenxin won the SAIL Award, the highest honor of the World Artificial Intelligence Conference (WAIC).

Wenxin Ernie is now widely used in Internet products such as search, feeds, and smart speakers, and is delivered through Baidu AI Cloud to sectors including industry, energy, finance, telecommunications, media, and education, helping drive intelligent upgrades across industries. The release of Ernie 3.0 will further improve application performance and create greater economic and social value.

If you want to experience Ernie 3.0's literary and knowledge abilities for yourself, click the link below to visit the Demo.

Demo link: https://wenxin.baidu.com/wenx…

Baidu Natural Language Processing (NLP) takes "understanding language, being intelligent, and changing the world" as its mission. It develops core natural language processing technologies, builds leading technology platforms and innovative products, serves users worldwide, and makes a complex world simpler.