Abstract: VEGA is a full-pipeline AutoML algorithm suite developed by Huawei Noah's Ark Lab. It provides core automation capabilities across the machine learning workflow, including architecture search, hyperparameter optimization, data augmentation, and model compression.
This article is shared from the post "VEGA: Introduction to Noah's High-Performance Open Source AutoML Algorithm Set" by Kourei.
VEGA is a full-pipeline AutoML algorithm set developed by Huawei Noah's Ark Lab. It provides core automation capabilities across the machine learning workflow, such as architecture search, hyperparameter optimization, data augmentation, and model compression. At present, most of the included algorithms have been integrated into Huawei's DaVinci full-stack AI solution (CANN + MindSpore), and preliminary tests show considerable advantages over GPU. The next version of Vega is expected to add support for DaVinci.
As an automated machine learning tool tailored for researchers and algorithm engineers, VEGA was released inside Huawei in December 2019 and has supported AutoML research by various teams within Noah's Ark Lab (computer vision, recommendation and search, and fundamental AI research), producing more than 20 algorithms published at top AI conferences (CVPR, ICCV, ECCV, AAAI, ICLR, NeurIPS). The following is a brief introduction to representative AutoML algorithms:
Automated Network Architecture Search (NAS)
Efficient Classification Network Search Scheme Based on Hardware Constraints (CARS)
Different application scenarios impose different computing-resource constraints, and thus naturally have different requirements on the search results. In addition, although NAS methods based on evolutionary algorithms have achieved good performance, they require every sample of every generation to be trained repeatedly for evaluation, which greatly limits search efficiency. In this paper, we propose CARS, a multi-objective, efficient neural architecture search method based on continuous evolution. CARS maintains an optimal solution set of models and uses the models in this set to update the parameters of the supernetwork; when the evolutionary algorithm generates the next generation, the networks' parameters can be inherited directly from the supernetwork, which greatly improves evolution efficiency. In a single search, CARS can obtain a series of models with different sizes and accuracies, so that users can select the model matching the resource constraints of their practical application. Related work is published at CVPR 2020: arxiv.org/abs/1909.04… .
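To make the continuous-evolution idea concrete, here is a minimal, self-contained sketch of the loop CARS describes: a Pareto set of architectures is maintained under two objectives (accuracy up, cost down), and offspring are evaluated cheaply because they would inherit weights from a shared supernetwork. The bit-string encoding, the dummy `evaluate` proxy, and all numbers are illustrative assumptions, not the released implementation.

```python
# Minimal sketch of CARS-style continuous evolution (illustrative only).
# Architectures are bit-strings selecting supernet ops; accuracy/cost come
# from a hypothetical evaluator that reuses shared supernet weights.
import random

def dominates(a, b):
    """a dominates b if it is no worse on both objectives and better on one
    (objectives: maximize accuracy, minimize cost)."""
    return (a["acc"] >= b["acc"] and a["cost"] <= b["cost"] and
            (a["acc"] > b["acc"] or a["cost"] < b["cost"]))

def pareto_front(population):
    """Keep only non-dominated individuals (the maintained solution set)."""
    return [p for p in population
            if not any(dominates(q, p) for q in population if q is not p)]

def evaluate(arch, supernet_weights):
    """Hypothetical proxy evaluation: a real system would run the sub-network
    with weights inherited from the supernet instead of retraining it."""
    acc = sum(arch) / len(arch) + random.uniform(-0.05, 0.05)   # dummy accuracy
    cost = sum(arch)                                            # dummy FLOPs proxy
    return {"arch": arch, "acc": acc, "cost": cost}

def mutate(arch):
    child = list(arch)
    child[random.randrange(len(child))] ^= 1    # flip one op choice
    return child

supernet_weights = {}        # shared weights, updated by the Pareto set each step
population = [evaluate([random.randint(0, 1) for _ in range(8)], supernet_weights)
              for _ in range(16)]

for generation in range(20):
    parents = pareto_front(population)
    # In CARS the supernet parameters are updated using the models in this set;
    # offspring then inherit those weights directly, avoiding retraining.
    children = [evaluate(mutate(random.choice(parents)["arch"]), supernet_weights)
                for _ in range(16)]
    population = pareto_front(parents + children)

print(sorted((round(p["acc"], 3), p["cost"]) for p in population))
```

In a real run, `evaluate` would assemble the sub-network from the supernet's shared weights and measure validation accuracy and FLOPs, and the final Pareto set directly gives the size/accuracy trade-off curve mentioned above.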
Lightweight Super-Resolution Network Structure Search (ESR-EA)
Noah proposed a lightweight architecture search algorithm for super-resolution networks that constructs efficient basic modules for super-resolution from the perspectives of channel, convolution, and feature scale. Based on these efficient modules, the algorithm takes the number of model parameters, computation, and model accuracy as objectives and uses a multi-objective evolutionary optimization algorithm to search for lightweight super-resolution network structures. The algorithm can thus compress the redundancy of super-resolution networks along three dimensions: channel, convolution, and feature scale. Experimental results show that, with the same number of parameters or the same computation, the ESRN found by this algorithm performs better on the standard test sets (Set5, Set14, B100, Urban100) than hand-designed network structures (CARN, etc.). In addition, the algorithm can reduce computation while preserving accuracy, meeting the latency and power constraints of mobile devices. The related paper was published at AAAI 2020: www.aaai.org/Papers/AAAI… .
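As a rough illustration of the three objectives mentioned above, the sketch below encodes a super-resolution block by channel width, kernel size, and feature scale, and computes the parameter/computation objectives that a multi-objective evolutionary search would trade off against accuracy. The `Block` encoding and the FLOP formula are simplified assumptions for demonstration only.

```python
# Illustrative sketch (assumptions, not the paper's code): score a candidate
# super-resolution network on the three objectives named in the text:
# parameters, computation, and (proxy) accuracy.
from dataclasses import dataclass

@dataclass
class Block:
    channels: int        # channel dimension of the block
    kernel: int          # convolution kernel size
    downscale: bool      # compute at half feature resolution to save FLOPs

def count_params(blocks):
    return sum(b.channels * b.channels * b.kernel * b.kernel for b in blocks)

def count_flops(blocks, h=64, w=64):
    flops = 0
    for b in blocks:
        scale = 0.25 if b.downscale else 1.0      # half resolution => 1/4 the positions
        flops += count_params([b]) * h * w * scale
    return flops

def objectives(blocks, proxy_psnr):
    """Triple used by the evolutionary search: params and FLOPs are minimized
    while the accuracy proxy (PSNR) is maximized."""
    return count_params(blocks), count_flops(blocks), proxy_psnr

candidate = [Block(32, 3, False), Block(16, 3, True), Block(32, 5, True)]
print(objectives(candidate, proxy_psnr=31.8))
```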
End-to-end Detection Network Architecture Search Scheme (SM-NAS)
An existing object detection model can be decoupled into several main parts: the backbone, the neck, the RPN, and the RCNN head. Each part can use different modules and structural designs, so balancing the computational cost and accuracy of different combinations is an important problem. Existing object detection NAS methods (NAS-FPN, DetNAS, etc.) focus only on searching better designs for individual modules, such as the backbone or the feature fusion network, without considering the system as a whole. To solve this problem, we propose a two-stage, structural-to-modular neural architecture search strategy named SM-NAS. Specifically, in the structural stage, a coarse search over model architectures determines the architecture best suited to the current task (e.g., a one-stage or two-stage detector, which type of backbone to use, etc.) as well as the matching input image size. In the modular stage, the backbone module is refined in detail to further improve model performance. For the search strategy, we adopt an evolutionary algorithm and jointly optimize model efficiency and model performance, using non-dominated sorting to construct the Pareto front and obtain a series of network structures that are simultaneously optimal on multiple objectives. In addition, we explore an effective training strategy that makes networks converge faster without ImageNet pretraining than with it, so that the performance of an arbitrary backbone can be evaluated more quickly and accurately. On the COCO dataset, the models obtained by our search clearly lead traditional object detection architectures in both speed and accuracy. For example, our E2 model is twice as fast as Faster R-CNN and reaches 40% mAP (1% higher), while our E5 model matches Mask R-CNN's speed with 46% mAP (a 6% improvement). This work is published at AAAI 2020: arxiv.org/abs/1911.09… .
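The sketch below illustrates the two-stage structural-to-modular idea in the simplest possible form: a coarse search first fixes the high-level detector configuration, then a finer search refines backbone details under that choice. The search spaces, the random `evaluate` proxy, and the sample budgets are assumptions for illustration; the real system evaluates detectors on COCO and ranks candidates with non-dominated sorting.

```python
# Illustrative two-stage "structural to modular" search skeleton (assumed
# search spaces and budgets, not the configuration from the paper).
import random

STRUCTURAL_SPACE = {
    "detector":   ["one-stage", "two-stage"],
    "backbone":   ["resnet-like", "mobile-like"],
    "input_size": [512, 800],
}
MODULAR_SPACE = {
    "depths": [[2, 2, 2, 2], [3, 4, 6, 3], [3, 4, 23, 3]],   # per-stage blocks
    "widths": [48, 64, 96],                                  # base channels
}

def sample(space):
    return {k: random.choice(v) for k, v in space.items()}

def evaluate(cfg):
    """Stand-in for training/validating a detector; returns (speed, accuracy)."""
    return random.random(), random.random()

# Stage 1: coarse structural search fixes detector type, backbone family, input size.
structural = max((sample(STRUCTURAL_SPACE) for _ in range(20)),
                 key=lambda cfg: evaluate(cfg)[1])

# Stage 2: fine-grained modular search refines the backbone under that structure.
best = max(({**structural, **sample(MODULAR_SPACE)} for _ in range(50)),
           key=lambda cfg: evaluate(cfg)[1])
print(best)
```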
Efficient Detection Network Backbone Architecture Search Scheme (SP-NAS)
We use neural architecture search (NAS) techniques to automatically design task-specific backbone networks and bridge the domain gap between classification and detection tasks. Common deep learning object detectors typically reuse a backbone designed and trained for the ImageNet classification task. The existing algorithm DetNAS turns the search for a detection backbone into selecting the optimal sub-network from a pre-trained, weight-sharing supernetwork. However, such a pre-determined supernetwork cannot reflect the actual performance of the sampled substructures, and its search space is very small. We instead design a flexible, task-oriented detection backbone with a two-stage search algorithm (serial-to-parallel search) named SP-NAS. Specifically, the serial search phase aims to efficiently find the serial sequence with the best receptive-field ratio and output channels in the feature hierarchy, using a "swap, expand, reignite" search strategy. The parallel search phase then automatically searches for and assembles several substructures, together with the previously generated backbone, into a more powerful backbone with a parallel structure. We verified the effect of SP-NAS on multiple detection datasets: the searched architectures achieve SOTA results, including first place on the public EuroCityPersons pedestrian detection leaderboard (LAMR: 0.042), and outperform DetNAS and Auto-FPN in both accuracy and speed. Relevant work is published at CVPR 2020: openaccess.thecvf.com/content_CVP… .
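Below is a hedged sketch of what the serial-phase mutations could look like on a backbone encoded as blocks-per-stage; the operation names follow the "swap, expand" description above, and the encoding is an assumption made for illustration rather than the paper's exact representation.

```python
# Assumed encoding: a serial backbone as a list of block counts per stage.
import random

def swap(stages):
    """Move one block between adjacent stages, changing how depth (and hence
    receptive field) is allocated across resolutions without changing total depth."""
    i = random.randrange(len(stages) - 1)
    out = list(stages)
    if out[i] > 1:
        out[i] -= 1
        out[i + 1] += 1
    return out

def expand(stages):
    """Add a block to one stage, growing network capacity."""
    i = random.randrange(len(stages))
    out = list(stages)
    out[i] += 1
    return out

backbone = [3, 4, 6, 3]          # blocks per stage of the serial backbone
for _ in range(5):
    backbone = random.choice([swap, expand])(backbone)
print(backbone)

# The parallel phase would then assemble several such serial backbones into
# parallel branches and fuse their features; that step is omitted here.
```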
Automatic Training (AutoTrain)
Training Regularization beyond Google (Disout)
In order to extract important features from a given dataset, deep neural networks usually contain a large number of trainable parameters. On the one hand, these parameters boost the performance of deep networks; on the other hand, they introduce overfitting. Dropout-based approaches therefore disable certain elements of the output feature maps during training to reduce co-adaptation between neurons. Although these methods improve the generalization of the resulting model, merely deciding whether or not to discard an element, as dropout does, is not the best solution. We therefore study the empirical Rademacher complexity of the intermediate layers of deep neural networks and propose a feature-map perturbation method (Disout). During training, randomly selected elements in the feature maps are replaced with specific values derived by exploring an upper bound on the generalization error. Experiments show that the proposed feature-map perturbation method yields higher accuracy on multiple image datasets. This work is published at AAAI 2020: arxiv.org/abs/2002.11… .
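A minimal numpy sketch of the perturbation idea follows, assuming a placeholder perturbation value: randomly chosen feature-map positions are replaced rather than simply zeroed. In the actual method the replacement value is derived from the generalization-error bound discussed above.

```python
# Minimal numpy sketch of feature-map perturbation (placeholder value, not the
# bound-derived value used in the paper).
import numpy as np

def disout_like(feature_map, dist_prob=0.1, alpha=1.0, training=True):
    if not training:
        return feature_map
    mask = np.random.rand(*feature_map.shape) < dist_prob       # positions to perturb
    perturbation = alpha * feature_map.std()                    # placeholder value
    out = feature_map.copy()
    out[mask] = perturbation                                    # replace, don't just zero
    return out

x = np.random.randn(2, 8, 4, 4).astype(np.float32)              # N, C, H, W
print(disout_like(x).shape)
```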
Suppressing Automatic Data Augmentation Noise with Knowledge Distillation (KD+AA)
The main idea of this algorithm is to address some disadvantages of automatic data augmentation (AA). AA searches for the best augmentation policy for the whole dataset; from a global perspective it makes the data more diverse and yields a better final model. However, AA is relatively coarse and does not optimize for individual images, so it has certain drawbacks: when the augmentation is strong, it can easily cause semantic confusion for some images (that is, the image's semantics change because too much discriminative information is removed). Clearly, it is then inappropriate to keep using the original hard label to supervise training on such images. To solve this problem, we use knowledge distillation (KD): a pre-trained model produces soft labels that indicate what the appropriate label for an image processed by AA should be. The algorithm is simple and effective; combined with a large model, it reaches 85.8% on ImageNet, the best performance at the time. This work is published at ECCV 2020: arxiv.org/abs/2003.11… .
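The following is a small sketch of the kind of distillation loss this approach relies on: the student is supervised partly by the original hard label and partly by the teacher's soft label for the augmented image. It uses the standard KD formulation with a temperature; the temperature and weighting values are illustrative, not the paper's settings.

```python
# Standard knowledge-distillation loss applied to AA-augmented batches
# (illustrative hyperparameters).
import torch
import torch.nn.functional as F

def kd_aa_loss(student_logits, teacher_logits, hard_labels, T=4.0, alpha=0.5):
    """Combine cross-entropy on the original hard label with a KL term toward
    the teacher's soft label for the augmented image, so semantically damaged
    augmentations are not forced onto the old label."""
    ce = F.cross_entropy(student_logits, hard_labels)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    return (1 - alpha) * ce + alpha * kd

# Dummy usage: teacher logits would come from a pre-trained model run on the
# same augmented batch produced by the augmentation policy.
student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(kd_aa_loss(student_logits, teacher_logits, labels).item())
```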
Automated Data Generation (AutoData)
Low-Cost Image Enhancement Data Acquisition Scheme Based on a Generative Model (CycleSR)
For image enhancement tasks (take super-resolution as an example), paired data is hard to obtain in real scenes, so most academic research trains algorithms on synthetically generated pairs; however, models trained on synthetic data often perform poorly in real scenes. To solve this problem, we proposed a novel algorithm: it takes the synthetic low-quality image as a bridge and converts the synthetic image domain into the real-scene image domain through unsupervised image translation, while the translated images are used to train the image enhancement network in a supervised manner. The algorithm is flexible enough to integrate any unsupervised translation model and image enhancement model. In this method, the image translation network and the supervised network are trained jointly to achieve better learning and super-resolution performance. The proposed method achieves good performance on the NTIRE 2017 and NTIRE 2018 datasets, even comparable to supervised methods; it was adopted by the AITA-NOAH team in the NTIRE 2020 Real World Super Resolution challenge and achieved first place in LPIPS and second place in MOS on Track 1. Relevant papers were published in the CVPR 2020 Workshop on NTIRE: openaccess.thecvf.com/content_CVP… .
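The toy loop below illustrates the structure described above, assuming single-convolution stand-ins for both networks: a translator maps the synthetic low-quality image toward the real domain, and the super-resolution network is trained in a supervised way on the translated image paired with the original high-resolution target. The adversarial and cycle losses of the unsupervised translation model are omitted for brevity.

```python
# Toy sketch of joint translation + supervised SR training (stand-in models,
# not the released code; a real system uses a CycleGAN-style translator).
import torch
import torch.nn as nn
import torch.nn.functional as F

translator = nn.Conv2d(3, 3, 3, padding=1)     # synthetic-LR -> realistic-LR (stand-in)
sr_net = nn.Conv2d(3, 3, 3, padding=1)         # super-resolution network (stand-in)
opt = torch.optim.Adam(list(translator.parameters()) + list(sr_net.parameters()), lr=1e-4)

hr = torch.rand(4, 3, 64, 64)                                       # ground-truth HR
synthetic_lr = F.interpolate(hr, scale_factor=0.5, mode="bicubic")   # synthetic LR input

for step in range(3):
    realistic_lr = translator(synthetic_lr)     # domain translation (adversarial losses omitted)
    sr = F.interpolate(sr_net(realistic_lr), scale_factor=2, mode="bicubic")
    loss = F.l1_loss(sr, hr)                    # supervised SR loss on the translated pair
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(step, loss.item())
```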
Automatic Network Compression (AutoCompress)
Automatic Neural Network Compression Based on Evolutionary Strategies
This technology compresses neural networks automatically. Taking the compressed model's recognition accuracy, computation, storage footprint, speed, and other indicators as objectives, it uses a multi-objective evolutionary optimization algorithm to apply mixed-bit quantization, sparsification, pruning, and other compression operations to the network, searching for the optimal compression hyperparameters for each layer and producing a non-dominated solution set of models with excellent compression performance, which can meet users' different needs on different indicators. The technology is suitable for both high-performance cloud servers and mobile devices with limited computing power: for cloud servers it can provide high-accuracy models whose computation and memory consumption stay within a given range, and for mobile devices it can reduce computation and memory consumption while preserving accuracy, meeting their latency and power constraints. Related papers are published in KDD 2018: www.kdd.org/kdd2018/acc… .
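To make the per-layer search space concrete, here is an assumed encoding in which each layer receives a quantization bit-width and a pruning ratio, plus the simple size and proxy-accuracy measures a multi-objective evolutionary algorithm could use to build the non-dominated set described above. Layer names, sizes, and the accuracy proxy are illustrative only.

```python
# Illustrative per-layer compression encoding and objective measures
# (assumed values, not the actual tool's configuration).
import random

LAYERS = [("conv1", 1_800_000), ("conv2", 3_600_000), ("fc", 4_000_000)]  # name, param count
BITS = [2, 4, 8]
PRUNE = [0.0, 0.25, 0.5]

def sample_candidate():
    return [{"layer": n, "bits": random.choice(BITS), "prune": random.choice(PRUNE)}
            for n, _ in LAYERS]

def model_size_bits(cand):
    return sum(p * (1 - c["prune"]) * c["bits"]
               for (_, p), c in zip(LAYERS, cand))

def proxy_accuracy(cand):
    """Stand-in for real evaluation: heavier compression -> lower proxy accuracy."""
    penalty = sum((8 - c["bits"]) / 8 + c["prune"] for c in cand) / len(cand)
    return max(0.0, 0.95 - 0.1 * penalty)

population = [sample_candidate() for _ in range(8)]
for cand in population:
    print(round(model_size_bits(cand) / 8e6, 2), "MB", round(proxy_accuracy(cand), 3))
```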
This open-source release is the initial stable version; cutting-edge algorithms will continue to be added in the future, along with support for new algorithms and DaVinci. The open-source repository is at github.com/huawei-noah… ; please try it and give your feedback.
Vega has the following advantages:
- **High-performance Model Zoo:** presets a large number of Noah's leading deep learning models, providing top-performing models on the ImageNet/MSCOCO/NuScenes/NTIRE datasets. These models represent Noah's latest work on AutoML and can be used directly: github.com/huawei-noah… .
- **Hardware-affinity model optimization:** to achieve hardware affinity, Vega defines the Evaluator module, which can deploy models directly to devices for inference, supporting simultaneous evaluation on mobile phones, DaVinci chips, and other devices.
- **Benchmark:** a Benchmark tool is provided to help you reproduce the Vega algorithms.
- **Support for multiple stages of the deep learning lifecycle, flexibly invoked through pipeline orchestration:** built-in components include architecture search, hyperparameter optimization, loss function design, data augmentation, full training, and so on. Each component is called a step; multiple steps can be chained in series into an end-to-end scheme, making it convenient to test different ideas, enlarge the searchable space, and find better models (a minimal sketch of this chaining idea follows the list).
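As a generic illustration of the pipeline idea (this is not the Vega configuration syntax or API), the sketch below chains hypothetical steps so that each consumes the model description produced by the previous one:

```python
# Hypothetical pipeline chaining: NAS -> hyperparameter optimization -> full training.
def nas_step(_):             return {"arch": "searched-arch"}
def hpo_step(model):         return {**model, "lr": 0.1, "epochs": 120}
def fully_train_step(model): return {**model, "weights": "trained.pth"}

pipeline = [nas_step, hpo_step, fully_train_step]
result = None
for step in pipeline:
    result = step(result)    # each step builds on the previous step's output
print(result)
```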
Finally, VEGA provides plenty of sample documentation to help developers get started quickly. For complete documentation in both English and Chinese, please refer to: github.com/huawei-noah… .