Abstract: Using the development of a single operator as an example, this article walks you through the whole process of operator development and testing.
Why custom operators
Deep learning algorithms are composed of computing units that we call operators (Op for short). Formally, an operator is a mapping from a function space to a function space, O: X→X; in a broad sense, any operation performed on a function can be regarded as an operator. For our purposes, the operators we develop are the computation functions used in a network model. In Caffe, an operator corresponds to the computation logic inside a Layer: for example, the convolution computation in the Convolution Layer is an operator, and the weighted-sum computation in the Fully-connected Layer (FC Layer) is also an operator.
Ascend model conversion workflow
In the vast majority of cases, because the Ascend AI software stack already supports most common operators, developers do not need to develop custom operators at all: they only need to provide a deep learning model file, convert it with the Offline Model Generator (OMG) to obtain an offline model file, and then use the process orchestrator (Matrix) to build the actual application. So why develop a custom operator? Because an operator may turn out to be unsupported during model conversion. For example, the Ascend AI software stack may not support an operator used in the model, a developer may want to modify the computation logic of an existing operator, or a developer may want to implement an operator their own way to improve computation performance. In such cases, custom operator development is needed.
TBE operator development
The Ascend AI software stack provides the TBE operator development framework, on which developers can develop custom operators in Python. First, what is TBE? TBE stands for Tensor Boost Engine. It is an operator development tool built by Huawei whose output runs on the Neural-network Processing Unit (NPU). It is an extension of the well-known open-source project Tensor Virtual Machine (TVM) and provides a Python API for operator development. In this development practice, NPU refers specifically to the Ascend AI processor.
There are two approaches to operator development with TBE: domain-specific language development (DSL development) and TVM primitive development (TIK development). DSL development is relatively simple and suitable for entry-level developers: the TBE tool provides an automatic optimization mechanism and a ready-made scheduling process, so developers only need knowledge of neural networks and the TBE DSL to describe the target computation, which is then compiled into a dedicated kernel. TIK development is harder and suits developers who are familiar with TVM programming and the Da Vinci architecture: the interfaces are lower level, and the developer must control the data-flow and operator scheduling on the hardware themselves. As an introduction, we will use DSL development.
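To make the contrast concrete, here is a rough sketch of what DSL-style code looks like, assuming the te.lang.cce module and the auto-schedule helper that ship with TBE (exact module paths can vary across CANN releases); TIK code, by contrast, would manage buffers and data movement on the Da Vinci cores explicitly.

# Flavor of DSL-style development: declarative compute, automatic scheduling.
import te.lang.cce
from te import tvm
from topi import generic

data = tvm.placeholder((28, 28), name="data", dtype="float16")
with tvm.target.cce():
    res = te.lang.cce.vabs(data)      # one call describes the whole element-wise computation
    sch = generic.auto_schedule(res)  # TBE derives the hardware schedule automatically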
TBE operator development process
Let's take a look at the development process, using a simple single operator as an example.
- Goal:
Develop a Sqrt operator in TBE DSL mode
- Determine operator function:
Sqrt takes the square root of every element of the input Tensor: y = √x.
- Determine the computing interface to use:
According to the computation description APIs currently supported by the TBE framework, the calculation process of the Sqrt operator can be expressed with the formula below.
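One way to write it, assuming only the exponent and logarithm primitives of the TBE DSL are used (a direct square-root call may also be available in newer releases):

y = sqrt(x) = e^(0.5 · ln(x))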
The operator implementation can be divided into the following steps:
1) Operator input parameters
Shape: the shape of the input Tensor, expressed as a list or a tuple, for example (3, 2, 3) or (4, 10).
Dtype: the data type of the input Tensor, expressed as a string, such as "float32", "float16", "int8", etc.
2) Create the input Tensor placeholder
data = tvm.placeholder(shape, name="data", dtype=input_dtype)
placeholder() is an API of the TVM framework. It acts as a stand-in for the values the operator will receive when it executes, much like the %d and %s format placeholders in C, and it returns a Tensor object. Its input parameters shape, name, and dtype become properties of the returned Tensor object.
3) Define the calculation process (steps 3 to 5 are illustrated together in the sketch after this list)
4) Define the scheduling process
5) Operator construction
6) Test verification
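Putting the pieces together, here is a minimal end-to-end sketch of steps 3 to 5, modeled on the classic te.lang.cce DSL samples. This is a sketch under assumptions: exact imports, helper names, and build-config keys can differ between CANN/TBE releases, and the kernel name sqrt_cce is only an illustrative choice.

# Minimal sketch of the Sqrt operator: compute, schedule, build.
import te.lang.cce
from te import tvm
from topi import generic

def sqrt_cce(shape, dtype="float16", kernel_name="sqrt_cce"):
    # Step 2): placeholder for the input Tensor
    data = tvm.placeholder(shape, name="data", dtype=dtype)

    with tvm.target.cce():
        # Step 3): calculation process, sqrt(x) = exp(0.5 * ln(x)),
        # composed from the element-wise DSL calls vlog, vmuls and vexp
        log_x = te.lang.cce.vlog(data)
        half_log = te.lang.cce.vmuls(log_x, 0.5)
        res = te.lang.cce.vexp(half_log)

        # Step 4): scheduling process, derived automatically by TBE
        sch = generic.auto_schedule(res)

    # Step 5): operator construction; emits the kernel (.o) and its descriptor (.json)
    config = {"print_ir": False,
              "need_build": True,
              "name": kernel_name,
              "tensor_list": [data, res]}
    te.lang.cce.cce_build_code(sch, config)

if __name__ == "__main__":
    sqrt_cce((4, 10), "float16", "sqrt_4_10_float16")

Running such a script produces the kernel binary and operator description that the test step below then checks.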
Wait a minute. It's not over yet. Only after the correctness of the operator's function has been verified in the simulation environment is the development of the custom operator complete.
ST Test Process
We need to use the ST test (System Test) to verify the correctness of the operator logic and to check whether the .o and .json files are generated correctly in the simulation environment. Want to know exactly how the test is run? Rather than reading a long and dry text description, it is more intuitive to come to the sandbox lab and try it yourself.
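As a taste of what the verification involves, here is a small sketch, with hypothetical file names, of how golden data for the Sqrt kernel could be prepared with NumPy and later compared against the simulator's output; the actual ST tooling and its directory layout depend on your CANN installation.

# Prepare input and golden (expected) data for the Sqrt ST test.
import numpy as np

shape, dtype = (4, 10), "float16"
input_data = np.random.uniform(1.0, 10.0, size=shape).astype(dtype)  # keep inputs positive for sqrt
expected = np.sqrt(input_data.astype("float32")).astype(dtype)       # golden output

input_data.tofile("sqrt_input.bin")   # hypothetical file names read by the test harness
expected.tofile("sqrt_expected.bin")

# After the kernel runs in the simulator, its output is loaded back and compared against
# `expected` with a tolerance suited to float16, e.g. np.allclose(out, expected, atol=1e-3).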
Now that you have read this far, are you more confident about developing custom operators? Why not come to Huawei Cloud Institute to take the course, do the experiments, and verify it for yourself? Here it is: operator development based on the Ascend AI processor.
→ Go to Huawei Cloud Institute to pick up more new skills