High-Performance Deep Learning Inference Platform OpenPPL Is Officially Open Source

Today we talk about the origins of OpenPPL, which begin with SensePPL:

What is SensePPL?

SensePPL is a multi-backend deep learning inference and deployment engine built by the SenseTime HPC team since 2015. Models trained on a training platform can be converted to ONNX and other standard formats for rapid inference deployment with SensePPL.

SensePPL loads and transforms the model, generates a directed graph and an execution plan, performs graph-level optimizations, and at runtime calls a deeply tuned operator library for inference computation. The core framework and operator library are entirely developed in-house, with almost no third-party dependencies.
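The pipeline described above (build a directed graph, derive an execution plan, apply graph-level optimizations, then dispatch to operators) can be sketched in miniature. The following is an illustrative Python sketch, not OpenPPL's actual API: it topologically sorts a tiny op graph into an execution plan and applies one hypothetical graph-level rewrite, fusing adjacent Conv+ReLU nodes into a single fused op.

```python
from collections import defaultdict, deque

def topo_sort(nodes, edges):
    """Order ops so every node runs after its inputs (the execution plan)."""
    indeg = {n: 0 for n in nodes}
    succ = defaultdict(list)
    for src, dst in edges:
        succ[src].append(dst)
        indeg[dst] += 1
    queue = deque(n for n in nodes if indeg[n] == 0)
    plan = []
    while queue:
        n = queue.popleft()
        plan.append(n)
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    return plan

def fuse_conv_relu(plan):
    """Graph-level optimization: merge consecutive Conv+ReLU into one fused op."""
    fused, i = [], 0
    while i < len(plan):
        if plan[i].startswith("Conv") and i + 1 < len(plan) \
                and plan[i + 1].startswith("ReLU"):
            fused.append(plan[i] + "+" + plan[i + 1])
            i += 2
        else:
            fused.append(plan[i])
            i += 1
    return fused

# A toy linear network: Input -> Conv1 -> ReLU1 -> Pool -> Conv2 -> ReLU2 -> Output
nodes = ["Input", "Conv1", "ReLU1", "Pool", "Conv2", "ReLU2", "Output"]
edges = [("Input", "Conv1"), ("Conv1", "ReLU1"), ("ReLU1", "Pool"),
         ("Pool", "Conv2"), ("Conv2", "ReLU2"), ("ReLU2", "Output")]

plan = topo_sort(nodes, edges)
print(fuse_conv_relu(plan))
# -> ['Input', 'Conv1+ReLU1', 'Pool', 'Conv2+ReLU2', 'Output']
```

Fusing operators like this reduces memory traffic between ops, which is one of the main wins a graph-level optimizer delivers before the tuned operator library ever runs.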

The origins of OpenPPL

SensePPL has been used and refined inside SenseTime for many years, accumulating extensive deep learning inference technology and business practice in the CV field. Naturally, this also produced a number of proprietary customizations strongly tied to the company's business, as with many closed-source technologies in the industry.

When we decided to give back to the technical community and go open source, we chose the ONNX model format as the standard for the open-source version of SensePPL, so that more developers could adopt our inference engine smoothly, and we rebuilt SensePPL's core framework around it.

The new PPL is born from open source, and we call it "OpenPPL", signaling that we will embrace open source and industry standardization.

Website: openppl.ai

OpenPPL features

OpenPPL is a fresh start, with its first release beginning at v0.1. It includes basic fp32 support for the x86 architecture, as well as fp16 support for NVIDIA's Turing GPU architecture.
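As a quick illustration of what the fp16 data type trades for speed, Python's standard struct module can round-trip a value through IEEE 754 half and single precision. This is a generic demonstration of the formats themselves, unrelated to OpenPPL's internals:

```python
import struct

def roundtrip(value, fmt):
    """Encode a Python float in the given IEEE 754 format and decode it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

x = 0.1
fp32 = roundtrip(x, "f")  # single precision: 23-bit mantissa
fp16 = roundtrip(x, "e")  # half precision: 10-bit mantissa

print(fp32)  # 0.10000000149011612
print(fp16)  # 0.0999755859375
```

Half precision keeps only about 3 decimal digits, which is why fp16 inference relies on networks being tolerant of reduced precision, and why it roughly halves memory bandwidth relative to fp32.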

At the same time, we will develop inference solutions on these two architectures for the core networks of OpenMMLab. These two architectures cover a fair portion of deployment requirements in the cloud and server space, but are still far from sufficient.

Over the next six months to a year, we will iterate OpenPPL toward a commercially usable v1.0. v1.0 will include, but not be limited to, the following features:

1. x86 CPU: x86 processors remain the cornerstone of cloud and server scenarios and the most widely deployed cloud computing architecture. OpenPPL will be further refined and, depending on market feedback, may support additional x86 instruction sets; it will also support AMD's Zen architecture and several domestic x86 processors.

2. NVIDIA GPU: We will continue to substantially optimize operator performance and framework support on GPUs; support lower-precision int8/int4 inference on Turing and similar architectures, and open-source the associated quantization toolchain; and support NVIDIA's latest Ampere architecture.

These will greatly improve OpenPPL's usability on CUDA architectures.
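To give a flavor of what a quantization toolchain does, here is a generic sketch of symmetric per-tensor int8 quantization. The scheme and names are illustrative only, not OpenPPL's actual quantization design:

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats into [-127, 127] ints."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the int8 codes."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Per-element quantization error is bounded by scale / 2
max_err = max(abs(a - b) for a, b in zip(weights, approx))
print(q)  # [50, -127, 2, 100]
```

A real toolchain additionally calibrates scales from activation statistics and decides which layers can tolerate int8 (or int4) without unacceptable accuracy loss; the core float-to-int mapping, however, looks much like the above.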

3. ARM server: ARM is the architecture SensePPL has supported and optimized for the longest, but it has always been positioned as our flagship for mobile and IoT scenarios.

The ARM architecture offers excellent performance per watt and a strong ecosystem. With the rapid improvement of ARM processor performance, ARM servers have finally crossed the threshold of large-scale adoption in cloud computing, and they represent a future direction for cloud data centers.

OpenPPL will carry its support for ARM processors from the mobile space into the cloud and server space, with initial support for the ARMv8/v9 architectures in v1.0.

Longer-term planning

OpenPPL will absorb the needs of the industry, maintain and expand the set of supported operators and models over the long term, and continuously optimize the model inference pipeline. Beyond inference of the model itself, post-processing and serving techniques will also be introduced.

At present, OpenPPL is still a traditional directed-graph representation plus operator library, which limits how far operator fusion and graph optimization can go. The HPC team has extensive experience in areas such as automatic code generation, and will continue to bring these technologies into OpenPPL to make its model optimization more thorough.
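To see why code generation goes beyond a fixed operator library: instead of dispatching to two pre-built elementwise operators in sequence, a generator can emit one fused loop as source code and compile it at runtime. The following toy Python sketch is purely hypothetical and far simpler than any real codegen stack:

```python
def generate_fused_kernel(ops):
    """Emit Python source for a single loop applying a chain of elementwise ops."""
    # Build the fused expression, e.g. ((x * 2.0) + 1.0) for mul-then-add
    expr = "x"
    for name, const in ops:
        symbol = {"mul": "*", "add": "+"}[name]
        expr = f"({expr} {symbol} {const})"
    src = (
        "def fused(xs):\n"
        f"    return [{expr} for x in xs]\n"
    )
    namespace = {}
    exec(src, namespace)  # compile the generated source at runtime
    return namespace["fused"]

# Fuse "multiply by 2, then add 1" into a single pass over the data
kernel = generate_fused_kernel([("mul", 2.0), ("add", 1.0)])
print(kernel([0.0, 1.0, 2.0]))  # [1.0, 3.0, 5.0]
```

The fused kernel reads and writes the data once instead of materializing an intermediate array per operator; a fixed operator library cannot do this for arbitrary op chains, which is exactly the gap automatic code generation targets.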

We will also track the industry's progress and introduce more technology and support, such as Transformer-based models, which have recently become very popular. AI back-end architectures are also diversifying, and many AI processors have captured considerable market share. We will expand cooperation with more AI chips and processors in the industry as demand requires, transferring our accumulated technology on NVIDIA GPUs and CPUs to support more scenarios and chips.

At the same time, SensePPL's accumulated technology for edge devices will also be gradually opened up in the 1.0 release. We hope to establish in-depth cooperation with more upstream and downstream organizations and vendors in the industry.

  • GitHub repository: ppl.nn
  • Author: Gao Yang