This is the first article of the new TensorRT series. Why "new"? Because two earlier articles already covered TensorRT, back at version 5.0. It has been a while since I last wrote about TensorRT, and I'm glad to start again with the new version.
Everything explained here is based on TensorRT 7.0.
TensorRT changed quite a bit at version 7 and added many new features, but its core workings remain the same.
- TensorRT is used to accelerate deep learning
- Speed up neural Network with TensorRT (Read ONNX model and run)
This article mainly explains:
- How to use TensorRT custom plugins
- How to add your own custom operator
After reading it you should be able to avoid quite a few pitfalls. Do come back and visit often.
Preface
As TensorRT continues to evolve (v5 -> v6 -> v7), the way plugins are used keeps being updated, and the plugin interfaces keep changing as well, for example from IPluginV2IOExt in v5 to IPluginV2DynamicExt in v6. We don't know whether new APIs will appear in the future, but that is nothing to worry about: TensorRT's backward compatibility is quite good, so there is no need to fear that an old plugin will stop working on a new version.
Current plugin-API:
The main purpose of a TensorRT plugin is to let us implement operators that TensorRT does not currently support. We implement our own op through the interfaces TensorRT provides, so the plugin's life cycle also has to follow TensorRT's rules.
A quick overview
As of this writing, the master branch of the TensorRT plugin repository is at version 7.2:
Github.com/NVIDIA/Tens…
TensorRT has open-sourced its plugin section, and quite a few plugins are already available there. You can read the source code and learn how to write a plugin by imitating them.
If you want to add your own operator, you can add it to the official plugin library, recompile the library, and replace the original libnvinfer_plugin.so.7 with the one you just built. Alternatively, write a plugin of your own in the same style as the official ones, compile it into a .so, and link that dynamic library in your TensorRT inference project.
In the rest of this article, the IPlugin we need to write is referred to as the plugin op for short.
Start writing the plugin
If you are interested, take a look at TensorRT's official documentation first; the purpose of this article is to help you step into as few pits as possible.
First, following the layout of the official plugins (any of them will do as a template):
Prepare your own plugin files, custom.cpp and custom.h, copy the official code over and replace it with your own implementation. Use the latest IPluginV2DynamicExt class as the interface.
We need to write two classes:
- MyCustomPlugin, which inherits from IPluginV2DynamicExt and contains the concrete implementation of the plugin
- MyCustomPluginCreator, which inherits from BaseCreator and is the plugin factory class used to create the plugin on demand
By the way, the plugin class inherits IPluginV2DynamicExt in order to support dynamic shapes; other plugin interfaces such as IPluginV2IOExt are largely similar to it.
// Inherit IPluginV2DynamicExt
class MyCustomPlugin final : public nvinfer1::IPluginV2DynamicExt
class MyCustomPluginCreator : public BaseCreator
MyCustomPlugin plug-in class
Overview:
class MyCustomPlugin final : public nvinfer1::IPluginV2DynamicExt
{
public:
    MyCustomPlugin(int in_channel,
                   const std::vector<float>& weight,
                   const std::vector<float>& bias);
    MyCustomPlugin(int in_channel,
                   nvinfer1::Weights const& weight,
                   nvinfer1::Weights const& bias);
    MyCustomPlugin(void const* serialData, size_t serialLength);
    MyCustomPlugin() = delete;
    ~MyCustomPlugin() override;

    int getNbOutputs() const override;
    DimsExprs getOutputDimensions(int outputIndex, const nvinfer1::DimsExprs* inputs, int nbInputs, nvinfer1::IExprBuilder& exprBuilder) override;
    int initialize() override;
    void terminate() override;
    size_t getWorkspaceSize(const nvinfer1::PluginTensorDesc* inputs, int nbInputs, const nvinfer1::PluginTensorDesc* outputs, int nbOutputs) const override;
    int enqueue(const nvinfer1::PluginTensorDesc* inputDesc, const nvinfer1::PluginTensorDesc* outputDesc,
                const void* const* inputs, void* const* outputs,
                void* workspace,
                cudaStream_t stream) override;
    size_t getSerializationSize() const override;
    void serialize(void* buffer) const override;
    bool supportsFormatCombination(int pos, const nvinfer1::PluginTensorDesc* inOut, int nbInputs, int nbOutputs) override;
    const char* getPluginType() const override;
    const char* getPluginVersion() const override;
    void destroy() override;
    nvinfer1::IPluginV2DynamicExt* clone() const override;
    void setPluginNamespace(const char* pluginNamespace) override;
    const char* getPluginNamespace() const override;
    DataType getOutputDataType(int index, const nvinfer1::DataType* inputTypes, int nbInputs) const override;
    void attachToContext(cudnnContext* cudnn, cublasContext* cublas, nvinfer1::IGpuAllocator* allocator) override;
    void detachFromContext() override;
    void configurePlugin(const nvinfer1::DynamicPluginTensorDesc* in, int nbInputs,
                         const nvinfer1::DynamicPluginTensorDesc* out, int nbOutputs) override;

private:
    int _in_channel;
    std::vector<float> _weight;
    std::vector<float> _bias;
    float* _d_weight;
    float* _d_bias;
    bool _initialized;
    const char* mPluginNamespace;
    std::string mNamespace;
};
Member variables
If your plugin has weights (such as weight and bias) and parameters (such as kernel size and padding in conv), you need to define them as private member variables.
Taking MyCustomPlugin as an example, suppose it has two weights, weight and bias, and one parameter, in_channel:
private:
    int _in_channel;                // parameter
    std::vector<float> _weight;     // weight stored in CPU memory
    std::vector<float> _bias;       // bias stored in CPU memory
    float* _d_weight;               // weight stored in GPU memory
    float* _d_bias;                 // bias stored in GPU memory
    bool _initialized;
    cudnnHandle_t _cudnn_handle;
    const char* mPluginNamespace;
    std::string mNamespace;
Constructors and destructors
There are typically three constructors.
The first is used during the parse phase: PluginCreator calls it when creating the plugin, passing in the weights and parameters.
The second is used to copy the plugin during the clone phase.
The third is used during the deserialize phase: it takes the serialized weights and parameters and creates the plugin from them.
Take our MyCustomPlugin for example:
MyCustomPlugin(int in_channel, nvinfer1::Weights const& weight, nvinfer1::Weights const& bias);
MyCustomPlugin(int in_channel, const std::vector<float>& weight, const std::vector<float>& bias);
MyCustomPlugin(void const* serialData, size_t serialLength);
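For reference, a minimal sketch of the third (deserialization) constructor; it assumes the serialize_value / deserialize_value helpers used by the official plugins (the same helpers that appear in the serialize section later on):
// Sketch only: restore the members in the same order serialize() writes them.
// deserialize_value is assumed to be the helper paired with serialize_value.
MyCustomPlugin::MyCustomPlugin(void const* serialData, size_t serialLength)
{
    deserialize_value(&serialData, &serialLength, &_in_channel);
    deserialize_value(&serialData, &serialLength, &_weight);
    deserialize_value(&serialData, &serialLength, &_bias);
    _initialized = false;
}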
The destructor needs to call terminate, which frees the GPU memory the op allocated earlier:
MyCustomPlugin::~MyCustomPlugin()
{
    terminate();
}
Note that the default constructor needs to be deleted:
MyCustomPlugin() = delete;
getNbOutputs
Returns the number of tensors the plugin op outputs. MyCustomPlugin, for example, only outputs one tensor, so it returns 1:
// MyCustomPlugin returns one output.
int MyCustomPlugin::getNbOutputs() const
{
    return 1;
}
initialize
This function is executed before the plugin is about to run.
It mainly initializes parameters that need memory allocated in advance, generally things required by the CUDA operations (for example, a conv op needs its weights and bias in GPU memory before it can run the convolution). If our operator needs such parameters, this is where we allocate GPU memory for them.
Note that if the plugin op needs a relatively large chunk of GPU memory, it is better not to allocate it yourself here; instead, use the workspace pointer that TensorRT passes in through the official interface. The reason is that if the plugin is used many times in a network and each instance allocates a lot of GPU memory itself, TensorRT will allocate that memory once per instance when building the network, which can easily lead to running out of GPU memory.
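To make this concrete, here is a hedged sketch of initialize/terminate for MyCustomPlugin that copies the CPU-side weight vectors into GPU memory and frees them again (error checking omitted; cudaMalloc/cudaMemcpy/cudaFree are plain CUDA runtime calls):
// Sketch: allocate GPU memory for weight/bias and copy them over from the CPU-side vectors.
int MyCustomPlugin::initialize()
{
    if (_initialized)
        return 0;
    cudaMalloc(reinterpret_cast<void**>(&_d_weight), _weight.size() * sizeof(float));
    cudaMalloc(reinterpret_cast<void**>(&_d_bias), _bias.size() * sizeof(float));
    cudaMemcpy(_d_weight, _weight.data(), _weight.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(_d_bias, _bias.data(), _bias.size() * sizeof(float), cudaMemcpyHostToDevice);
    _initialized = true;
    return 0;
}

// terminate() frees whatever initialize() allocated; the destructor calls it (see above).
void MyCustomPlugin::terminate()
{
    if (!_initialized)
        return;
    cudaFree(_d_weight);
    cudaFree(_d_bias);
    _initialized = false;
}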
getOutputDataType
Generally, the plugin op returns the same data type as its input:
nvinfer1::DataType InstanceNormalizationPlugin::getOutputDataType(
    int index, const nvinfer1::DataType* inputTypes, int nbInputs) const
{
    ASSERT(inputTypes && nbInputs > 0 && index == 0);
    return inputTypes[0];
}
getWorkspaceSize
This function returns the actual size in bytes of the intermediate GPU memory the plugin op needs, so that the space can be requested through the official TensorRT interface rather than allocated by hand.
Work out here how much GPU memory the op needs at runtime; TensorRT will then provide that space directly and we do not have to allocate it ourselves.
size_t MyCustomPlugin::getWorkspaceSize(const nvinfer1::PluginTensorDesc* inputs, int nbInputs, const nvinfer1::PluginTensorDesc* outputs, int nbOutputs) const
{
    // Compute the amount of intermediate GPU memory the op needs during its forward pass
size_t need_num;
return need_num * sizeof(float);
}
enqueue
This is where the plugin op actually runs. Our own CUDA kernels go here (a pure C++ implementation also works, but since it runs on the CPU it will be slower). As usual, it takes the inputs, computes the outputs, and writes them to the corresponding pointers.
int enqueue(const nvinfer1::PluginTensorDesc* inputDesc, const nvinfer1::PluginTensorDesc* outputDesc,
            const void* const* inputs, void* const* outputs, void* workspace, cudaStream_t stream)
{
    // If fun is an intermediate buffer you need, take it from the workspace that TensorRT allocated for you
    float* fun = static_cast<float*>(workspace);
}
Note that if our operation needs intermediate buffers in GPU memory, we obtain them through the workspace pointer; the snippet above shows the basic usage.
The .cu implementation defaults to FP32. If a plugin op does not support FP16, TensorRT automatically switches to FP32 when it reaches that op and switches back once the op has finished.
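Putting it together, a hedged sketch of what MyCustomPlugin's enqueue might look like; my_custom_op_kernel is a hypothetical kernel name, not a real one, and stands in for your own .cu implementation:
// Sketch only: read shapes from the descriptors, take intermediate buffers from the
// workspace, and launch a (hypothetical) CUDA kernel on the stream TensorRT gives us.
int MyCustomPlugin::enqueue(const nvinfer1::PluginTensorDesc* inputDesc,
                            const nvinfer1::PluginTensorDesc* outputDesc,
                            const void* const* inputs, void* const* outputs,
                            void* workspace, cudaStream_t stream)
{
    const float* input = static_cast<const float*>(inputs[0]);
    float* output      = static_cast<float*>(outputs[0]);

    // Any intermediate buffer comes out of the workspace sized by getWorkspaceSize() above.
    float* intermediate = static_cast<float*>(workspace);

    // Total element count of the output, computed from its runtime dims.
    int count = 1;
    for (int i = 0; i < outputDesc[0].dims.nbDims; ++i)
    {
        count *= outputDesc[0].dims.d[i];
    }

    // Hypothetical kernel launch; grid/block sizes are placeholders:
    // my_custom_op_kernel<<<(count + 255) / 256, 256, 0, stream>>>(
    //     input, _d_weight, _d_bias, intermediate, output, count);

    return 0;
}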
getOutputDimensions
When TensorRT runs with dynamic shapes, the batch dimension must be explicit; that is, the dimensions TensorRT handles change from the three-dimensional [3,-1,-1] to [1,3,-1,-1]. The latest onnx-tensorrt also requires an explicit batch size, and that batch dimension is available inside getOutputDimensions.
In the old IPluginV2 class, getOutputDimensions was defined as follows:
virtual Dims getOutputDimensions(int index, const Dims* inputs, int nbInputDims) TRTNOEXCEPT = 0;
Copy the code
The new IPluginV2DynamicExt class defines it as follows:
virtual DimsExprs getOutputDimensions(int outputIndex, const DimsExprs* inputs, int nbInputs, IExprBuilder& exprBuilder) = 0;
What we need to do in this member function is derive the op's output dimensions from its input dimensions. Note that although the output dimensions depend on the input dimensions, they must be determined before the op actually runs. If the output dimensions of our plugin op can only be known by actually executing it, this function cannot express that.
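As a hedged example, assuming our plugin's output keeps the shape of its first input (a common case); more complex shapes can be built with exprBuilder.constant() and exprBuilder.operation():
// Sketch: the single output has exactly the same dimensions as input 0.
nvinfer1::DimsExprs MyCustomPlugin::getOutputDimensions(
    int outputIndex, const nvinfer1::DimsExprs* inputs, int nbInputs,
    nvinfer1::IExprBuilder& exprBuilder)
{
    // outputIndex is always 0 here, since getNbOutputs() returns 1.
    return inputs[0];
}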
set/getPluginNamespace
Sets the namespace for this plugin; if you do not set one, the default is "". Note that plugins with the same name in the same namespace will conflict.
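These two members are usually just a setter/getter pair; a minimal sketch using the mNamespace member declared above:
// Sketch: store and return the namespace string.
void MyCustomPlugin::setPluginNamespace(const char* pluginNamespace)
{
    mNamespace = pluginNamespace;
}

const char* MyCustomPlugin::getPluginNamespace() const
{
    return mNamespace.c_str();
}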
PluginFieldCollection
This is both a member variable and the return type of the getFieldNames member function. Its main job is to pass the weights and parameters the plugin op needs. It is not used during actual engine inference, but during parsing (e.g. in caffe2trt or onnx2trt).
When these parsers parse the op, its weights and parameters go through the chain model -> TensorRT engine -> TensorRT runtime.
For example, in onnx-tensorrt we use DEFINE_BUILTIN_OP_IMPORTER to register the op; the ONNX model is then parsed and built node by node according to the registered ops. If we call our op my_custom_op, then inside DEFINE_BUILTIN_OP_IMPORTER(my_custom_op) we do the following:
DEFINE_BUILTIN_OP_IMPORTER(my_custom_op)
{
    ASSERT(inputs.at(0).is_tensor(), ErrorCode::kUNSUPPORTED_NODE);
    ...
    const std::string pluginName = "CUSTOM-OP";
    const std::string pluginVersion = "001";

    // f holds the weights and parameters the op needs, read from the onnx model
    std::vector<nvinfer1::PluginField> f;
    f.emplace_back("in_channel", &in_channel, nvinfer1::PluginFieldType::kINT32, 1);
    f.emplace_back("weight", kernel_weights.values, nvinfer1::PluginFieldType::kFLOAT32, kernel_weights.count());
    f.emplace_back("bias", bias_weights.values, nvinfer1::PluginFieldType::kFLOAT32, bias_weights.count());

    // Fetch the plugin from the plugin registry and pass in the weights and parameters
    nvinfer1::IPluginV2* plugin = importPluginFromRegistry(ctx, pluginName, pluginVersion, node.name(), f);

    RETURN_FIRST_OUTPUT(ctx->network()->addPluginV2(tensors.data(), tensors.size(), *plugin));
}
Inside the importPluginFromRegistry function, you can see that the parameters are handed to the plugin's createPlugin through the fc variable:
nvinfer1::IPluginV2* importPluginFromRegistry(IImporterContext* ctx, const std::string& pluginName,
    const std::string& pluginVersion, const std::string& nodeName,
    const std::vector<nvinfer1::PluginField>& pluginFields)
{
    const auto mPluginRegistry = getPluginRegistry();
    const auto pluginCreator
        = mPluginRegistry->getPluginCreator(pluginName.c_str(), pluginVersion.c_str(), "ONNXTRT_NAMESPACE");

    if (!pluginCreator)
    {
        return nullptr;
    }

    // Pack the weight and parameter information to pass to the plugin
    nvinfer1::PluginFieldCollection fc;
    fc.nbFields = pluginFields.size();
    fc.fields = pluginFields.data();

    return pluginCreator->createPlugin(nodeName.c_str(), &fc);
}
In the steps above, pluginName and pluginVersion are used to look up the MyCustomPluginCreator, whose createPlugin member function is what we need to write (described below).
configurePlugin
Configures the plugin op: checks that the number and types of inputs and outputs are correct. Through this configuration TensorRT can also select the appropriate algorithms to tune the model.
I have not tried the automatic tuning myself; the execution code of a hand-written plugin is usually fixed, and the so-called tuning step probably matters more for the official ops.
The configurePlugin function below simply checks the number of inputs and outputs and their types.
void MyCustomPlugin::configurePlugin(
    const nvinfer1::DynamicPluginTensorDesc* inputs, int nbInputs,
    const nvinfer1::DynamicPluginTensorDesc* outputs, int nbOutputs)
{
    // Validate input arguments
    assert(nbOutputs == 1);
    assert(nbInputs == 2);
    assert(mType == inputs[0].desc.type);
}
clone
Clones this plugin object to TensorRT's builder, network or engine. This member function calls the second constructor mentioned above:
MyCustomPlugin(int in_channel, const std::vector<float>& weight, const std::vector<float>& bias);
The weights and arguments of the plugin to be cloned are passed to the constructor.
IPluginV2DynamicExt* MyCustomPlugin::clone() const
{
    auto plugin = new MyCustomPlugin{_in_channel, _weight, _bias};
    plugin->setPluginNamespace(mPluginNamespace);
    return plugin;
}
The clone member function is mainly there to pass on the constant weights and parameters and to duplicate the plugin as many times as needed, so that it can be used by different engines, builders or networks.
getSerializationSize
Returns how many bytes need to be written to buffer during serialization.
size_t MyCustomPlugin::getSerializationSize() const
{
    return (serialized_size(_in_channel) +
            serialized_size(_weight) +
            serialized_size(_bias));
}
supportsFormatCombination
TensorRT calls this method to ask whether the input/output at index pos supports the format and data type given by inOut[pos].format and inOut[pos].type.
Return true if the plugin supports the format/data type at inOut[pos]. If the answer depends on other inputs/outputs, the plugin may base its result on the formats/data types in inOut[0..pos-1], which are guaranteed to already be set to values the plugin supports. The function does not need to inspect inOut[pos+1..nbInputs+nbOutputs-1]; the decision for pos must be based only on inOut[0..pos].
bool MyCustomPlugin::supportsFormatCombination(
    int pos, const nvinfer1::PluginTensorDesc* inOut, int nbInputs, int nbOutputs)
{
    // Suppose there is one input and one output
    assert(0 <= pos && pos < 2);
    const auto* in = inOut;
    const auto* out = inOut + nbInputs;
    switch (pos)
    {
    case 0:
        return in[0].type == DataType::kFLOAT &&
               in[0].format == nvinfer1::TensorFormat::kLINEAR;
    case 1:
        return out[0].type == in[0].type &&
               out[0].format == nvinfer1::TensorFormat::kLINEAR;
    }
    return false;
}
serialize
Serialize the required data into buffer in sequence.
void MyCustomPlugin::serialize(void* buffer) const
{
    serialize_value(&buffer, _in_channel);
    serialize_value(&buffer, _weight);
    serialize_value(&buffer, _bias);
}
attachToContext
If the op needs additional resources, such as a cublas handle, it can directly use the one provided internally by TensorRT:
void MyCustomPlugin::attachToContext(cudnnContext* cudnnContext, cublasContext* cublasContext, IGpuAllocator* gpuAllocator)
{
    mCublas = cublasContext;
}
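The matching detachFromContext is where anything grabbed in attachToContext would be released; in this sketch there is nothing we own, so it just drops the reference (mCublas is the member assumed in the snippet above):
// Sketch: the cublas handle is owned by TensorRT, so we only clear our reference to it.
void MyCustomPlugin::detachFromContext()
{
    mCublas = nullptr;
}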
MyCustomPluginCreator plug-in factory class
Overview:
class MyCustomPluginCreator : public BaseCreator
{
public:
    MyCustomPluginCreator();
    ~MyCustomPluginCreator() override = default;

    const char* getPluginName() const override;             // not discussed in detail
    const char* getPluginVersion() const override;          // not discussed in detail
    const PluginFieldCollection* getFieldNames() override;  // not discussed in detail
    IPluginV2DynamicExt* createPlugin(const char* name, const nvinfer1::PluginFieldCollection* fc) override;
    IPluginV2DynamicExt* deserializePlugin(const char* name, const void* serialData, size_t serialLength) override;

private:
    static PluginFieldCollection mFC;
    static std::vector<PluginField> mPluginAttributes;
    std::string mNamespace;
};
The constructor
Create an empty mPluginAttributes and use it to initialize mFC.
MyCustomPluginCreator::MyCustomPluginCreator()
{
    mPluginAttributes.emplace_back(PluginField("in_channel", nullptr, PluginFieldType::kINT32, 1));
    mPluginAttributes.emplace_back(PluginField("weight", nullptr, PluginFieldType::kFLOAT32, 1));
    mPluginAttributes.emplace_back(PluginField("bias", nullptr, PluginFieldType::kFLOAT32, 1));
    mFC.nbFields = mPluginAttributes.size();
    mFC.fields = mPluginAttributes.data();
}
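The three members marked "not discussed in detail" in the overview are essentially one-liners. A sketch follows, where the "CUSTOM-OP" / "001" strings are simply the name and version assumed in the onnx importer example earlier:
// Sketch: return the plugin name/version used for registry lookup, and the field
// collection built in the constructor above.
const char* MyCustomPluginCreator::getPluginName() const
{
    return "CUSTOM-OP";
}

const char* MyCustomPluginCreator::getPluginVersion() const
{
    return "001";
}

const PluginFieldCollection* MyCustomPluginCreator::getFieldNames()
{
    return &mFC;
}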
createPlugin
This member function creates the plugin from the PluginFieldCollection: it pulls out the weights and parameters the op needs one by one and then calls the first constructor mentioned above:
MyCustomPlugin(int in_channel, nvinfer1::Weights const& weight, nvinfer1::Weights const& bias);
to create the plugin.
MyCustomPlugin example:
IPluginV2DynamicExt* MyCustomPluginCreator::createPlugin(const char* name, const nvinfer1::PluginFieldCollection* fc)
{
    int in_channel;
    std::vector<float> weight;
    std::vector<float> bias;
    const PluginField* fields = fc->fields;
    for (int i = 0; i < fc->nbFields; ++i)
    {
        const char* attrName = fields[i].name;
        if (!strcmp(attrName, "in_channel"))
        {
            ASSERT(fields[i].type == PluginFieldType::kINT32);
            in_channel = *(static_cast<const int32_t*>(fields[i].data));
        }
        else if (!strcmp(attrName, "weight"))
        {
            ASSERT(fields[i].type == PluginFieldType::kFLOAT32);
            int size = fields[i].length;
            weight.reserve(size);
            const auto* w = static_cast<const float*>(fields[i].data);
            for (int j = 0; j < size; j++)
            {
                weight.push_back(*w);
                w++;
            }
        }
        else if (!strcmp(attrName, "bias"))
        {
            ASSERT(fields[i].type == PluginFieldType::kFLOAT32);
            int size = fields[i].length;
            bias.reserve(size);
            const auto* w = static_cast<const float*>(fields[i].data);
            for (int j = 0; j < size; j++)
            {
                bias.push_back(*w);
                w++;
            }
        }
    }

    Weights weightWeights{DataType::kFLOAT, weight.data(), (int64_t) weight.size()};
    Weights biasWeights{DataType::kFLOAT, bias.data(), (int64_t) bias.size()};

    MyCustomPlugin* obj = new MyCustomPlugin(in_channel, weightWeights, biasWeights);
    obj->setPluginNamespace(mNamespace.c_str());
    return obj;
}
deserializePlugin
This function is called by onnx-tensorrt's conversion op TRT_PluginV2: it reads the serialized data stored in the onnx model and deserializes the plugin into the network.
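A minimal sketch of deserializePlugin: it simply hands the serialized buffer to the deserialization constructor written earlier:
// Sketch: build the plugin back from the serialized byte buffer.
IPluginV2DynamicExt* MyCustomPluginCreator::deserializePlugin(
    const char* name, const void* serialData, size_t serialLength)
{
    auto* obj = new MyCustomPlugin(serialData, serialLength);
    obj->setPluginNamespace(mNamespace.c_str());
    return obj;
}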
Some official plugin considerations
There are a few minor issues to watch out for when using the official plugins.
Topk problem
The official TopK plugin supports at most k <= 3840; otherwise it reports:
[TensorRT] ERROR: Parameter check failed at: ../builder/Layers.cpp::TopKLayer::3137, condition: k > 0 && k <= MAX_TOPK_K
Related questions: github.com/tensorflow/…
BatchedNMS problem
The official BatchedNMS supports a topK of at most 4096 and will crash with anything larger. You can modify the source code to raise this limit, but there are still bugs:
void (*kernel[])(const int, const int, const int, const int, const float,
                 const bool, const bool, float*, T_SCORE*, int*,
                 T_SCORE*, int*, bool)
    = {P(1), P(2), P(3), P(4), P(5), P(6), P(7), P(8), P(9), P(10),
       P(11), P(12), P(13), P(14), P(15), P(16)};
About plugin registration
A brief description of the plugin registration process.
When the NvInferRuntimeCommon.h header is included, you get access to getPluginRegistry, which holds all the registered IPluginCreators; at run time we fetch the one we need with the getPluginCreator function.
There are two ways to register a plugin. The first can be seen in the official plugin code:
extern "C" {
bool initLibNvInferPlugins(void* logger, const char* libNamespace)
{ initializePlugin<nvinfer1::plugin::GridAnchorPluginCreator>(logger, libNamespace); initializePlugin<nvinfer1::plugin::NMSPluginCreator>(logger, libNamespace); initializePlugin<nvinfer1::plugin::ReorgPluginCreator>(logger, libNamespace); .return true;
}
Copy the code
Where the initializePlugin function executes the addPluginCreator function:
template <typename CreatorType>
void initializePlugin(void* logger, const char* libNamespace)
{
PluginCreatorRegistry::getInstance().addPluginCreator<CreatorType>(logger, libNamespace);
}
The addPluginCreator function then executes getPluginRegistry()->registerCreator to register the pluginCreator, which completes the registration task:
void addPluginCreator(void* logger, const char* libNamespace)
{
    ...
    if (mRegistryList.find(pluginType) == mRegistryList.end())
    {
        bool status = getPluginRegistry()->registerCreator(*pluginCreator, libNamespace);
        if (status)
        {
            mRegistry.push(std::move(pluginCreator));
            mRegistryList.insert(pluginType);
            verboseMsg = "Plugin creator registration succeeded - " + pluginType;
        }
        else
        {
            errorMsg = "Could not register plugin creator: " + pluginType;
        }
    }
    else
    {
        verboseMsg = "Plugin creator already registered - " + pluginType;
    }
    ...
}
The other way is to register directly via REGISTER_TENSORRT_PLUGIN:
//!
//! \brief Return the plugin registry
//!
// When the NvInferRuntimeCommon.h header is included, getPluginRegistry becomes available
extern "C" TENSORRTAPI nvinfer1::IPluginRegistry* getPluginRegistry();

namespace nvinfer1
{

template <typename T>
class PluginRegistrar
{
public:
    PluginRegistrar() { getPluginRegistry()->registerCreator(instance, ""); }

private:
    T instance{};
};

#define REGISTER_TENSORRT_PLUGIN(name) \
    static nvinfer1::PluginRegistrar<name> pluginRegistrar##name {}

} // namespace nvinfer1
That is, if we have already put REGISTER_TENSORRT_PLUGIN(BatchedNMSPluginCreator); in the plugin's .h file, there is no need to write an initLibNvInferPlugins()-style function like the official one that registers each plugin in turn.
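For the plugin written in this article, that registration would simply be the following line, placed for example in custom.h or custom.cpp:
// Register MyCustomPluginCreator with the global TensorRT plugin registry at static-initialization time.
REGISTER_TENSORRT_PLUGIN(MyCustomPluginCreator);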
Communication
If you share these interests, Lao Pan would be happy to hear from you; if you like this content, follows and support are welcome. The blog publishes an in-depth original article every week; follow the public account "Oldpan Blog" so you don't miss the latest posts. Lao Pan has also put together some of his own collected material that may help you: reply "888" on the public account to get his learning roadmap and an article index, with more waiting for you to dig into. If you don't want to miss Lao Pan's latest updates, check out the mystery link.