
Welcome to my public account [Jizhi Vision]

Hello everyone, I am Jizhi Vision. This article introduces how to implement a custom operator Plugin in TensorRT8.

When we accelerate a deep learning model with Nvidia TensorRT, we need to serialize the original model (from PyTorch, TensorFlow, Caffe, or Darknet) into a TensorRT Engine, and then deserialize that Engine for inference. One difficulty in this process is that operators TensorRT does not support natively often have to be written by hand, and there are generally two ways to do so:

(1) splice fine-grained operations that TensorRT already supports into the new operator;

(2) use a Plugin to develop the custom operator, with the computation implemented in CUDA C;

This article mainly introduces the second method: developing custom operators with Plugin in TensorRT8. Since Plugin development changed significantly in TensorRT8, it is worth a dedicated introduction.

Implementing a Plugin in TensorRT8 requires writing two classes: custom_plugin and create_plugin, where create_plugin is responsible for creating instances of custom_plugin. So let's start.

1, Construct custom_plugin

The custom_plugin class inherits from IPluginV2Ext, IPluginV2, or IPluginV2DynamicExt.

class ClipPlugin : public IPluginV2
{
public:
    ClipPlugin(const std::string name, float clipMin, float clipMax);
    ClipPlugin(const std::string name, const void* data, size_t length);
    // ClipPlugin does not make sense without arguments, so the default constructor is deleted.
    ClipPlugin() = delete;
    int getNbOutputs() const noexcept override;
    Dims getOutputDimensions(int index, const Dims* inputs, int nbInputDims) noexcept override;
    int initialize() noexcept override;
    void terminate() noexcept override;
    size_t getWorkspaceSize(int) const noexcept override
    {
        return 0;
    };
    int enqueue(int batchSize, const void* const* inputs, void* const* outputs, void* workspace,
        cudaStream_t stream) noexcept override;   // the actual computation, implemented in CUDA C
    size_t getSerializationSize() const noexcept override;
    void serialize(void* buffer) const noexcept override;
    void configureWithFormat(const Dims* inputDims, int nbInputs, const Dims* outputDims, int nbOutputs, DataType type,
        PluginFormat format, int maxBatchSize) noexcept override;
    bool supportsFormat(DataType type, PluginFormat format) const noexcept override;
    const char* getPluginType() const noexcept override;
    const char* getPluginVersion() const noexcept override;

    void destroy() noexcept override;
    nvinfer1::IPluginV2* clone() const noexcept override;   // called by addPluginV2, which invokes the constructor
    void setPluginNamespace(const char* pluginNamespace) noexcept override;
    const char* getPluginNamespace() const noexcept override;

private:   // operator arguments
    const std::string mLayerName;
    float mClipMin, mClipMax;
    size_t mInputVolume;
    std::string mNamespace;
};
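The enqueue method above is where the kernel actually runs on the GPU. As an illustration only (not code from the original article), a clip operator simply clamps every element into [clipMin, clipMax]; a minimal CPU reference of the computation that the CUDA kernel launched from enqueue would perform might look like this:

```cpp
#include <algorithm>
#include <cstddef>

// Reference implementation of the clip computation: every input
// element is clamped into [clipMin, clipMax]. In the real plugin the
// same element-wise logic runs as a CUDA kernel launched from
// enqueue() on the given cudaStream_t.
void clipReference(const float* input, float* output, size_t n,
                   float clipMin, float clipMax)
{
    for (size_t i = 0; i < n; ++i)
        output[i] = std::min(std::max(input[i], clipMin), clipMax);
}
```

The CUDA version replaces the loop with one thread per element; the clamp expression itself is unchanged.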

2, Construct create_plugin

class ClipPluginCreator : public IPluginCreator
{
public:
    ClipPluginCreator();
    const char* getPluginName() const noexcept override;
    const char* getPluginVersion() const noexcept override;
    const PluginFieldCollection* getFieldNames() noexcept override;
    IPluginV2* createPlugin(const char* name, const PluginFieldCollection* fc) noexcept override;   // custom_plugin is constructed inside this method
    IPluginV2* deserializePlugin(const char* name, const void* serialData, size_t serialLength) noexcept override;
    void setPluginNamespace(const char* pluginNamespace) noexcept override;
    const char* getPluginNamespace() const noexcept override;

private:
    static PluginFieldCollection mFC;
    static std::vector<PluginField> mPluginAttributes;
    std::string mNamespace;
};
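createPlugin receives the operator's attributes through a PluginFieldCollection and uses them to construct custom_plugin. Below is a simplified sketch of that attribute-parsing loop; the two structs are stand-ins for the real nvinfer1::PluginField and nvinfer1::PluginFieldCollection (reduced to the members used here), so the pattern can be seen without the TensorRT headers:

```cpp
#include <cstring>

// Simplified stand-ins for nvinfer1::PluginField and
// nvinfer1::PluginFieldCollection, for illustration only.
struct PluginField { const char* name; const void* data; };
struct PluginFieldCollection { int nbFields; const PluginField* fields; };

// Sketch of the parsing loop inside createPlugin(): walk the field
// list, match each attribute by name, and read out its value.
void parseClipFields(const PluginFieldCollection* fc, float& clipMin, float& clipMax)
{
    for (int i = 0; i < fc->nbFields; ++i)
    {
        const PluginField& f = fc->fields[i];
        if (std::strcmp(f.name, "clipMin") == 0)
            clipMin = *static_cast<const float*>(f.data);
        else if (std::strcmp(f.name, "clipMax") == 0)
            clipMax = *static_cast<const float*>(f.data);
    }
    // a real createPlugin would then do:
    //     return new ClipPlugin(name, clipMin, clipMax);
}
```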

3, Call

After custom_plugin and create_plugin are constructed, custom_plugin is called in the network definition like this:

nvinfer1::IPluginV2 *clip = new ClipPlugin(layerName, clipMin, clipMax);
nvinfer1::IPluginV2Layer *clipLayer = m_network->addPluginV2(&Layers[inputName], 1, *clip);

Here ClipPlugin is the custom_plugin constructed above.

The above shares the method of implementing a custom operator Plugin in TensorRT8. I hope it is helpful for your study.


“[Model reasoning] TensorRT8 custom operator Plugin implementation method”