Abstract: On April 25, Huawei Cloud released the Pangu series of super-large-scale pre-training models, including the world's largest vision (CV) pre-training model with 3 billion parameters, and the world's largest Chinese language (NLP) pre-training model, jointly developed with Recurrent AI and Peng Cheng Laboratory, with 100 billion parameters trained on 40 TB of data. In the future, Huawei Cloud will also release super-large pre-training models for multimodal, scientific computing, and other domains.
This article is shared from the Huawei Cloud community post "HDC.Cloud 2021 | Huawei Cloud Releases the World's Largest Pre-training Models, Opening a New Mode of Industrialized AI Development", original author: Technology Torchbearer.
"Pre-training large models is an important approach to solving the customization and fragmentation problems in AI application development," said Tian Qi, Chief Scientist of Artificial Intelligence at Huawei Cloud and IEEE Fellow. "The Huawei Cloud Pangu large models enable one large AI model to be generalized and replicated at scale across many scenarios, reduce the dependence on data annotation, and, together with the ModelArts platform, move AI development from a workshop model to a new industrialized model."
Tian Qi, Chief Scientist of Artificial Intelligence at Huawei Cloud and IEEE Fellow
The world's largest Chinese language pre-training model, breaking three CLUE world records
The Pangu NLP large model is the world's largest Chinese language pre-training model, with 100 billion parameters, jointly developed by Huawei Cloud, Recurrent AI, and Peng Cheng Laboratory. During pre-training it learned from 40 TB of Chinese text data, and its performance in application scenarios is further improved through few-shot tuning on industry data.
The Pangu NLP large model has achieved breakthroughs in three areas:
First, leading language understanding and generation ability: on CLUE, the authoritative Chinese language understanding evaluation benchmark, the Pangu NLP large model ranked first on the overall leaderboard as well as on the classification and reading-comprehension sub-leaderboards, breaking three world records. Its overall score of 83.046 leads the industry on many sub-tasks, a large step toward human-level performance (85.61).
Pangu NLP ranks first on the CLUE overall leaderboard
In the NLPCC 2018 text summarization task, the Pangu NLP large model achieved an industry-best average ROUGE score of 0.53, exceeding the second-place result by 60%.
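ROUGE scores like the one above measure n-gram overlap between a generated summary and a human-written reference. As a rough illustration only (not necessarily the exact ROUGE variant or tokenization used in the NLPCC evaluation), a minimal ROUGE-1 F1 computation looks like this:

```python
from collections import Counter

def rouge_1(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: unigram overlap between a reference and a candidate summary."""
    ref = Counter(reference.split())
    cand = Counter(candidate.split())
    overlap = sum((ref & cand).values())  # clipped count of shared unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(round(rouge_1("the cat sat on the mat", "the cat lay on the mat"), 4))  # → 0.8333
```

Production evaluations typically average ROUGE-1, ROUGE-2, and ROUGE-L over the whole test set, which is where a single "average ROUGE" figure such as 0.53 comes from.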
Second, the Pangu NLP large model accumulates a large amount of general knowledge during pre-training and is capable of both understanding and generation. In addition to end-to-end generation in the style of GPT-3, the large model can also identify user intent through few-shot learning and translate it into knowledge-base and database queries. This modular combination of capabilities supports embedding industry knowledge bases and databases, connecting industry expertise and enabling rapid adaptation and extension to all scenarios. For example, in the financial customer-service scenario jointly built by Huawei Cloud and Recurrent AI, the Pangu NLP large model better empowers the sales process, helping service staff quickly raise their business skills and reshaping the consumer experience.
Third, the Pangu NLP large model adopts a "large model plus few-shot" tuning route and surpasses the GPT series on few-shot learning tasks. For example, when producing semantic labels in a customer-demand-analysis scenario, the Pangu NLP large model needs only one tenth of the sample size the GPT-series models require to reach the target result, a tenfold improvement in AI production efficiency.
3 billion parameters: the world's largest visual pre-training model
The Pangu CV large model is currently the industry's largest visual pre-training model, with more than 3 billion parameters. It is the first to combine image discrimination and generation abilities, meeting the needs of both low-level image processing and high-level semantic understanding, while facilitating fine-tuning with industry knowledge and fast adaptation to various downstream tasks. Its performance is excellent: its few-shot classification accuracy on the ImageNet 1% and 10% labeled subsets reaches the industry state of the art (SOTA).
The Pangu CV large model is aimed at the problem that AI engineering is difficult to generalize and replicate, creating a new industrialized mode of AI development and greatly reducing R&D cost. It also provides model pre-training, fine-tuning, deployment, and iteration functions, forming a complete closed loop for AI development and greatly improving development efficiency. To date, the Pangu CV large model has been validated on more than 100 practical tasks in fields such as medical imaging, finance, and industrial quality inspection, not only greatly improving detection accuracy but also saving more than 90% of R&D cost on average.
The Pangu CV large model powers intelligent UAV inspection
State Grid Chongqing Yongchuan Power Supply Company is among the earliest grid enterprises in China to apply intelligent UAV inspection technology. Traditional development of UAV-inspection AI models faces two major challenges: first, how to efficiently annotate massive data; second, with as many as hundreds of defect types, dozens of AI recognition models are required, driving up development costs.
Huawei Cloud cooperated with State Grid Chongqing Yongchuan Power Supply Company on developing UAV-inspection AI models, where the Huawei Cloud Pangu CV large model showed strong advantages over the traditional development mode.
In terms of data annotation, the Pangu CV large model uses an efficient development mode that trains on massive unlabeled power-industry data and then fine-tunes on a small number of labeled samples. It is the first pre-training model built specifically for the power industry, improving sample-screening efficiency by about 30 times and screening quality by about 5 times. Taking the 50,000 high-definition images collected in Yongchuan every day as an example, this can save 170 person-days of manual labeling.
In terms of model versatility, combined with Pangu's automatic data augmentation and category-adaptive loss-function optimization strategies, a single model can handle hundreds of defect types and replace more than 20 of the original small models, greatly reducing model maintenance cost, increasing average accuracy by 18.4%, and cutting model development cost by 90%.
The support behind the Pangu large models
The Pangu NLP large model involves 100 billion parameters and 40 TB of training data, which poses great challenges to algorithms, computing power, massive data processing, and parallel optimization.
In terms of algorithms, Huawei Cloud's algorithm team and Recurrent AI's NLP team worked together to break through the difficulties of fine-tuning very large models.
In terms of computing power, Pengcheng Cloud Brain II, China's largest AI training cluster, built by Peng Cheng Laboratory, demonstrated powerful AI compute and data throughput during Pangu NLP model training, laying a solid foundation for the Pangu models.
In terms of parallel optimization, Huawei's underlying software, training framework, and ModelArts platform were jointly optimized to fully release computing power and achieve optimal full-stack performance. First, for underlying operator performance, techniques such as operator quantization and operator fusion based on Huawei CANN improved single-operator performance by more than 30%. Second, Huawei MindSpore innovatively adopted multi-dimensional automatic hybrid parallelism combining pipeline parallelism, model parallelism, and data parallelism, greatly reducing manual coding work and improving cluster linearity by 20%. The Huawei Cloud ModelArts platform provides exascale compute scheduling and dynamic routing planning based on the physical network topology, delivering optimal network communication for large-scale model training. In addition, with ModelArts' efficient massive-data processing, the 40 TB of text data was processed in only 7 days.
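To give a flavor of one dimension of this hybrid strategy, data parallelism splits each global batch across devices, has every device compute a gradient on its shard, then averages the gradients (the "all-reduce" step) before updating the shared weights. The toy sketch below illustrates the idea on a one-parameter linear model in plain Python; it is a conceptual illustration, not the MindSpore API, and the shard layout and learning rate are invented for the example:

```python
def local_grad(w, batch):
    # Gradient of mean squared error for the model y = w * x on one data shard.
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def data_parallel_step(w, shards, lr):
    # Each "device" computes its local gradient independently (in parallel
    # on real hardware), then the gradients are averaged (all-reduce) and
    # every replica applies the same update to the shared weight.
    grads = [local_grad(w, shard) for shard in shards]
    g = sum(grads) / len(grads)
    return w - lr * g

data = [(x, 3.0 * x) for x in range(1, 9)]  # synthetic data with true w = 3
shards = [data[0:4], data[4:8]]             # split across two "devices"
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards, lr=0.01)
print(round(w, 3))  # → 3.0
```

Pipeline and model parallelism, by contrast, split the model itself (by layers, or within a layer's weight matrices) rather than the data; a hybrid scheme like the one described above combines all three dimensions so that a 100-billion-parameter model can fit and train efficiently across a large cluster.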
To date, Huawei Cloud AI has been deployed in more than 600 projects across more than 10 industries in China, supporting intelligent upgrades in cities, transportation, healthcare, steel, textiles, energy, finance, and other sectors. In the future, Huawei Cloud will continue to drive industrial intelligent upgrading through technological innovation.