After the earlier round of transfer learning, the model is actually better than the generic model in some respects, but it is still not good enough to go straight into production. So here I read through the PaddleOCR source code to get a sense of the pre-processing and post-processing it provides.

1. Preprocessing

Preprocessing can likewise be divided by task into cls (classification), DB detection, EAST detection, and rec (recognition). The FAQ does not point to a single folder for it, so I looked for the code manually. The relevant directory in the PaddleOCR repository is:

  • \PaddleOCR\tools



    Reading the source code there:

    predict_cls.py contains only one class, class TextClassifier(object), which has two important functions:

    1. resize_norm_img(self, img) scales and pads the image: small images are padded and large images are shrunk, so that every image in an input batch ends up in a similar size range.
    2. __call__() returns img_list (the list of images), cls_res (the classification results, including labels and scores), and elapse (the elapsed time, accumulated as elapse += time.time() - starttime).
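To make the scale-and-pad idea concrete, here is a minimal pure-Python sketch of the arithmetic involved. It is not PaddleOCR's code: the helper names, the 48 x 192 target size, and the mapping of pixel values to [-1, 1] are all assumptions chosen for illustration.

```python
import math


def resized_width(h, w, target_h=48, target_w=192):
    """Width after scaling to target_h with the aspect ratio kept, capped at target_w.

    target_h and target_w are assumed values, not read from any config.
    """
    ratio = w / float(h)
    return min(target_w, int(math.ceil(target_h * ratio)))


def normalize_pixel(value):
    """Map a 0..255 pixel value to the range [-1, 1] (assumed normalization)."""
    return (value / 255.0 - 0.5) / 0.5


def pad_row(row, target_w, fill=0.0):
    """Right-pad one already-normalized row with `fill` up to target_w values."""
    return row + [fill] * (target_w - len(row))
```

A narrow image keeps its aspect ratio and is padded on the right, while a very wide image is squeezed down to the maximum width, so every image in the batch ends up with the same shape.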

The predict_det.py file

It too is essentially one class with six functions, such as order_points_clockwise, and these are likewise control functions for detection.


The predict_rec.py file mainly selects among the different decoding methods, sets up the character set involved, and resizes the image. It does not involve traditional computer-vision operations such as blurring, erosion, dilation, or edge extraction.


  • In addition, there is PaddleOCR\deploy\cpp_infer\include\preprocess_op.h, but the preprocessing there is also very basic. This header comes with a corresponding .cpp file; the pre-processing and post-processing in it are the ones used after deployment, that is, at inference time.

  • The preprocessing here, which is mostly driven by configuration files, is not quite what I had in mind, namely image preprocessing applied before images are fed into the model.

The preprocessing in PaddleOCR’s deploy folder mainly includes the following (all written in C++ for speed):

  1. void Permute::Run (rearrange): rearranges the pixels of the input image, mainly the channel layout, converting OpenCV's interleaved height x width x channel format into the channel-first layout the model takes as input.

     Note: the double colon (::) is the C++ scope resolution operator; Permute::Run is the Run member function of the Permute class.

  2. void Normalize::Run (normalization): divides each pixel value by 255 and then normalizes it, which shrinks the values and keeps them within the same order of magnitude.
  3. void ResizeImgType0::Run: when the height or width of the image exceeds max_size_len, the image is rescaled.
  4. void CrnnResizeImg::Run
  5. void ClsResizeImg::Run
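The first three steps are easier to picture with a small example. Below is a rough pure-Python sketch of the ideas on nested lists. It is not a port of the C++ code: the function names, the per-channel mean and std parameters, and the choice to scale by the longer side are assumptions made for illustration.

```python
def normalize(img_hwc, mean, std, is_scale=True):
    """Optionally scale pixels to 0..1, then apply (x - mean) / std per channel."""
    out = []
    for row in img_hwc:
        new_row = []
        for px in row:
            new_px = []
            for c, value in enumerate(px):
                scaled = value / 255.0 if is_scale else float(value)
                new_px.append((scaled - mean[c]) / std[c])
            new_row.append(new_px)
        out.append(new_row)
    return out


def permute(img_hwc):
    """Rearrange an H x W x C (interleaved) image into C x H x W (planar)."""
    channels = len(img_hwc[0][0])
    return [[[px[c] for px in row] for row in img_hwc] for c in range(channels)]


def resize_ratio(h, w, max_size_len):
    """Scale factor that brings the longer side down to max_size_len (1.0 if it already fits)."""
    longest = max(h, w)
    return float(max_size_len) / longest if longest > max_size_len else 1.0
```

For a 1 x 2 image with three channels, permute turns [[[0, 128, 255], [255, 0, 0]]] into three 1 x 2 planes, one per channel.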

2. Post-processing

Is there post-processing that corresponds to the decoding part of the model?

Some of the post-processing for detection lives under the ppocr/postprocess path.

Looking in this folder, it mainly holds post-processing for these cases: cls classification, DB detection, EAST detection, and rec recognition.



Among them,

  • The post-processing file for cls classification contains only one class, class ClsPostProcess(object), which is used to convert between text labels and text indexes.

  • The post-processing file for DB detection is also a single class, class DBPostProcess(object), which has several functions:

    • boxes_from_bitmap() retrieves boxes from the binary image and returns the coordinates and scores of the detected boxes;
    • unclip() controls the distance between a box and its text by expanding the box outward according to an expansion factor (the unclip ratio) and returning the expanded box;
    • get_mini_boxes() returns the smallest enclosing box;
    • box_score_fast() returns a box’s score;
    • __call__() returns a list of all boxes.
  • The post-processing file for rec recognition is larger, with four classes:

  • class BaseRecLabelDecode(object): the main purpose of this class is to define the character set. It specifies the supported character set (Chinese, English, etc.) and special characters, and converts between text labels and text indexes.
  • class CTCLabelDecode(BaseRecLabelDecode): this class and the following two are similar to the base class above. They handle decoding text, converting between text labels and text indexes, and determining the character set, special characters, ignored characters, and so on.
  • class AttnLabelDecode(BaseRecLabelDecode)
  • class SRNLabelDecode(BaseRecLabelDecode)
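As an illustration of what "decoding text" means for the CTC case, here is a minimal sketch of standard greedy CTC decoding: collapse consecutive repeated indexes, drop the blank index, and map what remains to characters. This is the textbook procedure, not a copy of CTCLabelDecode; the function name, the toy character set, and the assumption that index 0 is the blank are all made up for the example.

```python
def ctc_greedy_decode(indices, charset, blank=0):
    """Collapse consecutive repeats, skip the blank index, and join the characters.

    `indices` holds the per-timestep argmax of the model output;
    `charset[i]` is the character for index i (index `blank` is a placeholder).
    """
    chars = []
    prev = None
    for idx in indices:
        # Emit a character only when the index changes and is not the blank.
        if idx != prev and idx != blank:
            chars.append(charset[idx])
        prev = idx
    return "".join(chars)


# Toy character set for the example; a real one would be loaded from a dictionary file.
toy_charset = ["<blank>", "a", "b", "c"]
decoded = ctc_greedy_decode([1, 1, 0, 1, 2, 2, 0, 3], toy_charset)
```

Note that the blank between the two runs of index 1 is what allows a doubled letter to survive the collapse.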

In fact, the DB detection post-processing above is similar to the C++ post-processing written for deployment.
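Both versions revolve around the same two ideas: score each candidate box by the mean probability inside it, drop boxes below a threshold, and expand the survivors by a distance tied to the unclip ratio. The sketch below shows that logic in simplified form. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max) rather than polygons, and the function names are hypothetical; the distance formula area * ratio / perimeter is the one commonly used with DB.

```python
def box_score(pred, x_min, y_min, x_max, y_max):
    """Mean probability inside an axis-aligned box (bounds inclusive)."""
    values = [
        pred[y][x]
        for y in range(y_min, y_max + 1)
        for x in range(x_min, x_max + 1)
    ]
    return sum(values) / len(values)


def filter_boxes(pred, boxes, box_thresh):
    """Keep (box, score) pairs whose mean score is at least box_thresh."""
    kept = []
    for box in boxes:
        score = box_score(pred, *box)
        if score >= box_thresh:
            kept.append((box, score))
    return kept


def unclip_distance(area, perimeter, unclip_ratio):
    """Offset distance by which a box outline is pushed outward."""
    return area * unclip_ratio / perimeter


# Toy 2 x 3 probability map: a confident region on the left, background on the right.
toy_pred = [[1.0, 0.5, 0.0],
            [0.5, 1.0, 0.0]]
kept = filter_boxes(toy_pred, [(0, 0, 1, 1), (2, 0, 2, 1)], box_thresh=0.6)
```

Raising box_thresh discards more low-confidence boxes, and raising the unclip ratio makes the remaining boxes sit further from the text.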

Post-processing in deployment

The file is located at PaddleOCR\deploy\cpp_infer\src\postprocess_op.cpp and contains the following functions:

  1. void PostProcessor::GetContourArea: gets the contour area.
  2. cv::RotatedRect PostProcessor::UnClip
  3. float **PostProcessor::Mat2Vec: converts the image matrix to an array of type float.
  4. std::vector<std::vector<int>> PostProcessor::OrderPointsClockwise: orders points clockwise (left to right, top to bottom).
  5. std::vector<std::vector<float>> PostProcessor::Mat2Vector: converts the image matrix to a vector of type float and returns it.
  6. bool PostProcessor::XsortFp32(std::vector<float> a, std::vector<float> b): a comparison function for sorting vectors of floats by their first element (the x coordinate). It returns whether a[0] is less than b[0], and false when the two are equal.
  7. bool PostProcessor::XsortInt(std::vector<int> a, std::vector<int> b): very similar to the above, except that the floats become integers.
  8. std::vector<std::vector<float>> PostProcessor::GetMiniBoxes: returns the smallest box.
  9. float PostProcessor::BoxScoreFast(std::vector<std::vector<float>> box_array, cv::Mat pred): returns the score of a box.
  10. std::vector<std::vector<std::vector<int>>> PostProcessor::BoxesFromBitmap(const cv::Mat pred, const cv::Mat bitmap, const float &box_thresh, const float &det_db_unclip_ratio): this is the DB (differentiable binarization) step. It depends on box_thresh (boxes scoring below this threshold are not shown) and det_db_unclip_ratio (the expansion coefficient of the text box, which affects the size of the text box).
  11. std::vector<std::vector<std::vector<int>>> PostProcessor::FilterTagDetRes(std::vector<std::vector<std::vector<int>>> boxes, float ratio_h, float ratio_w, cv::Mat srcimg)
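The x-sort comparators and the clockwise ordering work together, which a few lines of Python can show. This is a sketch of the general technique for four corner points, not a line-by-line translation of the C++: sort by x, split into a left pair and a right pair, then order each pair by y. The function name is a hypothetical Python counterpart.

```python
def order_points_clockwise(pts):
    """Return four [x, y] points as top-left, top-right, bottom-right, bottom-left.

    Assumes image coordinates (y grows downward) and exactly four points.
    """
    # Sort by x only, which is what an x-sort comparator provides.
    box = sorted(pts, key=lambda p: p[0])
    # The two leftmost and two rightmost points, each ordered top to bottom.
    left = sorted(box[:2], key=lambda p: p[1])
    right = sorted(box[2:], key=lambda p: p[1])
    return [left[0], right[0], right[1], left[1]]


ordered = order_points_clockwise([[10, 10], [0, 0], [10, 0], [0, 10]])
```

With the corners in a fixed order, later steps can index them directly, for example treating ordered[0] as the top-left corner.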