On September 10, 2018, Tencent AI Lab announced that at the end of September it would open-source the “Tencent ML-Images” project, which consists of ML-Images, a multi-label image dataset, and ResNet-101, a deep residual network that the lab describes as the most accurate among comparable deep learning models in the industry.
Open-sourcing the project releases foundational capabilities that Tencent AI Lab has accumulated in computer vision. It gives AI researchers and engineers ample high-quality training data and an easy-to-use, high-performance deep learning model, and is intended to promote the shared development of the AI industry.
The ML-Images dataset released by Tencent AI Lab contains 18 million images and more than 11,000 common object categories. It is the largest public multi-label image dataset in the industry, large enough for the needs of most research institutions and small and medium-sized enterprises. Tencent AI Lab will also provide ResNet-101, a deep residual network trained on ML-Images. According to the lab, the model has excellent visual representation ability and generalization performance, and the highest accuracy among comparable models in the industry. It is expected to support visual tasks on both images and video, and to help advance image classification, object detection, object tracking, semantic segmentation, and related tasks.
Deep learning, typified by deep neural networks, has demonstrated excellent performance in many fields, especially computer vision, where it is applied to important tasks such as image and video classification, understanding, and generation. However, realizing the full visual representation ability of deep learning requires several foundational capabilities: sufficient high-quality training data, strong model architectures and training methods, and powerful computing resources.
Technology companies take AI infrastructure very seriously and have built large image datasets for internal use only, such as Google’s JFT-300M and Facebook’s Instagram dataset. Neither these datasets nor the models trained on them have been made public, so for most research institutions and small and medium-sized enterprises the barrier to these foundational AI capabilities is very high.
Until now, the largest publicly available multi-label image dataset in the industry has been Google’s Open Images, which contains 9 million training images and more than 6,000 object categories. Tencent AI Lab’s open source ML-Images dataset includes 18 million training images and more than 11,000 common object categories, and may become a new benchmark dataset for the industry. In addition to the dataset, the Tencent AI Lab team will describe the following in detail in this open source project:
1) The construction method for a large-scale multi-label image dataset, including image sources, the candidate category set, semantic relationships between categories, and image annotation. While building ML-Images, the team made full use of the semantic relationships between categories to help label images accurately.
2) The training method for deep neural networks on ML-Images. The team’s carefully designed loss function and training method effectively suppress the negative effects of class imbalance on model training in large-scale multi-label datasets.
3) The ResNet-101 model trained on ML-Images, which has excellent visual representation ability and generalization performance. Through transfer learning, the model achieves a top-1 classification accuracy of 80.73% on the ImageNet validation set, exceeding the accuracy of Google’s comparable model in the same transfer learning setting. Notably, ML-Images is only about 1/17 the size of JFT-300M (18 million versus 300 million images), which demonstrates the high quality of ML-Images and the effectiveness of the training method. A detailed comparison is shown in the following table.
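Points 1) and 2) above can be illustrated with a small, self-contained Python sketch. It is not the team's actual annotation pipeline or loss function, which this article does not specify. It assumes a hypothetical toy category hierarchy (PARENT) and uses a common generic heuristic for class imbalance, weighting the positive term of each class by its negative-to-positive ratio. All function and variable names are invented for illustration.

```python
import math

# Hypothetical toy hierarchy (child -> parent); NOT taken from ML-Images.
PARENT = {"husky": "dog", "dog": "animal", "cat": "animal"}


def expand_with_ancestors(labels, parent=PARENT):
    """Return the labels plus every ancestor implied by the hierarchy."""
    expanded = set(labels)
    for label in labels:
        node = label
        while node in parent:
            node = parent[node]
            expanded.add(node)
    return expanded


def positive_weights(label_sets, classes):
    """Per-class weight max(1, #negatives / #positives).

    Rare classes get a larger weight on their positive examples.
    This is a generic heuristic, assumed here for illustration only.
    """
    n = len(label_sets)
    weights = {}
    for c in classes:
        pos = sum(1 for s in label_sets if c in s)
        weights[c] = max(1.0, (n - pos) / pos) if pos else 1.0
    return weights


def weighted_bce(logits, targets, weights):
    """Mean class-weighted binary cross-entropy for one image.

    logits: dict mapping class name -> raw score
    targets: set of class names present in the image
    """
    total = 0.0
    for c, z in logits.items():
        p = 1.0 / (1.0 + math.exp(-z))          # sigmoid, one per class
        p = min(max(p, 1e-12), 1.0 - 1e-12)     # avoid log(0)
        if c in targets:
            total += -weights[c] * math.log(p)  # up-weighted positive term
        else:
            total += -math.log(1.0 - p)         # unweighted negative term
    return total / len(logits)


# Toy usage: four images, each annotated with its most specific label only.
classes = ["animal", "dog", "husky", "cat"]
raw_labels = [{"husky"}, {"cat"}, {"dog"}, {"cat"}]
label_sets = [expand_with_ancestors(s) for s in raw_labels]
weights = positive_weights(label_sets, classes)
loss = weighted_bce({c: 0.0 for c in classes}, label_sets[0], weights)
```

In this toy data the rare class "husky" appears in only one of four images, so a missed "husky" is penalized more heavily than a missed "animal". A production system would typically use a framework's built-in weighted loss rather than this hand-written version.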
By open-sourcing “Tencent ML-Images”, Tencent AI Lab demonstrates Tencent’s investment in building foundational AI capabilities and its vision of advancing the whole industry by opening up those capabilities.
The deep learning model from the “Tencent ML-Images” project already plays an important role in a number of Tencent products, such as the image quality evaluation and recommendation features of “Tiantian Express”.
As shown in the figure below, the quality of the cover images on Tiantian Express news items has improved significantly.
Figure: Cover images before and after optimization
In addition, the Tencent AI Lab team has transferred the ResNet-101 model trained on Tencent ML-Images to many other visual tasks, including image object detection, image semantic segmentation, video object segmentation, and video object tracking. These transfer tasks further verify the model’s strong visual representation ability and generalization performance. The “Tencent ML-Images” project is expected to play an important role in more vision-related products in the future.
Since first releasing open source projects on GitHub (https://github.com/Tencent) in 2016, Tencent has open-sourced 57 projects covering areas such as artificial intelligence, mobile development, and mini programs. To contribute further to the open source community, Tencent has successively joined Hyperledger, LF Networking, and the Open Networking Foundation, and has become a founding premier member of the LF Deep Learning Foundation and a platinum member of the Linux Foundation. As the embodiment of Tencent’s “open” strategy in technology, Tencent Open Source will continue to promote the sharing, reuse, and open-sourcing of its research and development, make Tencent’s R&D strength available externally, provide technical support to open source communities in China and abroad, and inject new vitality into research and development.