PaddleOCR implements OCR technology using Baidu’s PaddlePaddle framework.

PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and apply them in practice.

The project's standout strengths are that it can recognize vertical text and that it has a high recognition rate.

Github project address: github.com/PaddlePaddl…

Summary of the project (mainly from the project's README):

In the project's tutorials, the Installation and Quick Start sections are easy to find and are enough for testing and basic use. For a deeper understanding of the code, see the later sections on the algorithms, models, inference, data, and so on.

Requirements

  • PaddleOCR working environment:
    • PaddlePaddle 2.0.0
    • Python 3.7
    • glibc 2.23 (glibc 2.27 was used in my tests)
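
To compare a machine against these requirements, we can print the local versions first. This is a minimal sketch using only the Python standard library; the helper name `describe_environment` is made up for illustration, and `platform.libc_ver()` may return empty strings on systems that do not use glibc.

```python
import platform
import sys


def describe_environment():
    """Return the local Python and C library versions as a dict."""
    # libc_ver() inspects the running interpreter binary; on Linux it
    # usually reports ("glibc", "<version>").
    libc_name, libc_version = platform.libc_ver()
    return {
        "python": "{}.{}".format(sys.version_info.major, sys.version_info.minor),
        "libc": libc_name,
        "libc_version": libc_version,
    }
```

Calling `print(describe_environment())` inside the activated environment shows at a glance whether Python 3.7 and a recent enough glibc are in place.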

Installation instructions: github.com/PaddlePaddl…

  • The website recommends installing with Docker (I did not use this method).
  • Alternatively, install the library environment directly.

Test

python3 tools/infer/predict_system.py --image_dir="./doc/imgs/007.png" \
    --det_model_dir="./inference/ch_ppocr_server_v2.0_det_infer/" \
    --rec_model_dir="./inference/ch_ppocr_server_v2.0_rec_infer/" \
    --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" \
    --use_angle_cls=True --use_space_char=True

Practical steps

First of all, the GPU version requires CUDA 10.1 and cuDNN 7.6.5. Here we use a ready-made environment (the easy-to-learn intelligent platform), which avoids the usual environment installation problems and gets the project running quickly. The drawback is that it costs money.

1. Configure the environment

1.0 Skip the purchase process of the easy learning intelligent machine

This step is covered in detail in my previous blog post: juejin.cn/post/696575…

1.1 Open the terminal and switch to the specified environment

The environment we need is py37-cuda101, and the command to switch to it is:

conda activate py37-cuda101

1.2 Install PaddlePaddle

You can find the installation commands in installation.md.

python3 -m pip install paddlepaddle-gpu==2.0.0 -i https://mirror.baidu.com/pypi/simple
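
After the install finishes, it is worth confirming that the package can actually be imported. The sketch below is only an illustration: `paddle_status` is a hypothetical helper name, and it reads the package attributes defensively so that it also runs on a machine where PaddlePaddle is not installed.

```python
import importlib
import importlib.util


def paddle_status():
    """Report whether PaddlePaddle is importable, its version, and CUDA support."""
    if importlib.util.find_spec("paddle") is None:
        return {"installed": False, "version": None, "cuda": None}
    paddle = importlib.import_module("paddle")
    # Read attributes defensively in case a different build is installed.
    cuda_check = getattr(paddle, "is_compiled_with_cuda", None)
    return {
        "installed": True,
        "version": getattr(paddle, "__version__", None),
        "cuda": cuda_check() if callable(cuda_check) else None,
    }
```

For the GPU setup in this post we would expect `version` to be `2.0.0` and `cuda` to be `True`. PaddlePaddle's own install guide also suggests running `paddle.utils.run_check()` as a fuller self-test.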

1.3 Download PaddleOCR project

The download is fast and can be done directly with git clone.

git clone https://github.com/PaddlePaddle/PaddleOCR

1.4 Enter the project and install the environment

Installing from requirements.txt may fail several times because of network instability. Just retry until it succeeds.

cd PaddleOCR
pip3 install -r requirements.txt
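
Since the install may need several attempts, the retrying can be automated. The following is a rough sketch, not part of PaddleOCR: `run_with_retries` and its parameters are names chosen for this example, and the `runner` argument exists only so the logic can be exercised without touching the network.

```python
import subprocess
import time


def run_with_retries(command, attempts=5, delay_seconds=3.0, runner=subprocess.call):
    """Run `command` until it exits with status 0 or the attempts run out.

    Returns the 1-based number of the attempt that succeeded, or 0 if
    every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if runner(command) == 0:
            return attempt
        if attempt < attempts:
            # Give a flaky network a moment before trying again.
            time.sleep(delay_seconds)
    return 0
```

From inside the PaddleOCR directory, `run_with_retries(["pip3", "install", "-r", "requirements.txt"])` repeats the install up to five times and reports which attempt worked.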

2. Model download

For this part, refer to the Quick Start section.

The model list there shows two kinds of models: a small, lightweight one intended for mobile, and a larger, heavier one intended for the server side.

2.1 Download the model as instructed

2.1.1 Create a folder and enter it
mkdir inference && cd inference
2.1.2 Download the model and decompress it

# Right-click the model in the list to copy its download link, then download it with wget
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar
# After the download is complete, decompress the archive with tar
tar -xvf ch_ppocr_server_v2.0_det_infer.tar
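
The same download-and-decompress step can be scripted when several models are needed. This is a minimal standard-library sketch; `fetch_and_extract` is a hypothetical helper, and it should only be pointed at archives from a trusted source, because `extractall` writes whatever paths the archive contains.

```python
import os
import tarfile
import urllib.request


def fetch_and_extract(url, dest_dir="."):
    """Download a .tar archive into dest_dir and unpack it there.

    Returns the path of the downloaded archive.
    """
    os.makedirs(dest_dir, exist_ok=True)
    # Name the local file after the last path component of the URL.
    archive_path = os.path.join(dest_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, archive_path)
    with tarfile.open(archive_path) as archive:
        archive.extractall(dest_dir)
    return archive_path
```

For example, `fetch_and_extract("https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar", "inference")` mirrors the wget and tar commands above.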
2.1.3 Directory tree structure after download
├── ch_ppocr_mobile_v2.0_det_infer
│   ├── inference.pdiparams
│   ├── inference.pdiparams.info
│   └── inference.pdmodel
├── ch_ppocr_mobile_v2.0_rec_infer
    ├── inference.pdiparams
    ├── inference.pdiparams.info
    └── inference.pdmodel
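
Before running inference, we can check that each model folder really contains the three files shown in the tree. The helper below is a sketch written for this post (`missing_model_files` and `EXPECTED_FILES` are illustrative names), and it assumes every model directory follows the same layout.

```python
import os

# File names taken from the directory tree above.
EXPECTED_FILES = (
    "inference.pdiparams",
    "inference.pdiparams.info",
    "inference.pdmodel",
)


def missing_model_files(inference_root, model_names):
    """Map each model directory name to the expected files it lacks."""
    missing = {}
    for name in model_names:
        model_dir = os.path.join(inference_root, name)
        absent = [
            file_name
            for file_name in EXPECTED_FILES
            if not os.path.isfile(os.path.join(model_dir, file_name))
        ]
        if absent:
            missing[name] = absent
    return missing
```

An empty dict means every listed model is complete. Otherwise the result names the folder and the files to download again.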

Test the model

The following commands run the full pipeline: text detection, angle classification, and recognition.

Change the parameters to the paths that match your project.

  • Parameters
    • image_dir: the path of a single image or image set
    • det_model_dir: the path to the detection inference model
    • rec_model_dir: the path to the recognition inference model
    • use_angle_cls: whether to use the direction classifier
    • cls_model_dir: the path to the direction classifier model
    • use_space_char: whether to predict the space char
  • Result
    • saved to the ./inference_results folder by default
# Predict a single image specified by image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/007.png" \
    --det_model_dir="./inference/ch_ppocr_server_v2.0_det_infer/" \
    --rec_model_dir="./inference/ch_ppocr_server_v2.0_rec_infer/" \
    --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" \
    --use_angle_cls=True --use_space_char=True

# Predict an image set specified by image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" \
    --det_model_dir="./inference/ch_ppocr_server_v2.0_det_infer/" \
    --rec_model_dir="./inference/ch_ppocr_server_v2.0_rec_infer/" \
    --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" \
    --use_angle_cls=True --use_space_char=True

# If you want to use the CPU for prediction, set the use_gpu parameter to False
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/007.png" \
    --det_model_dir="./inference/ch_ppocr_server_v2.0_det_infer/" \
    --rec_model_dir="./inference/ch_ppocr_server_v2.0_rec_infer/" \
    --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" \
    --use_angle_cls=True --use_space_char=True --use_gpu=False
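
Because the three commands differ only in a couple of parameters, it can help to assemble them in code. The sketch below only builds the argument list from the parameters described above; `build_predict_command` is a hypothetical helper and not part of PaddleOCR.

```python
def build_predict_command(image_dir, det_model_dir, rec_model_dir, cls_model_dir,
                          use_angle_cls=True, use_space_char=True, use_gpu=True,
                          python="python3"):
    """Build the argument list for tools/infer/predict_system.py."""
    command = [
        python,
        "tools/infer/predict_system.py",
        "--image_dir={}".format(image_dir),
        "--det_model_dir={}".format(det_model_dir),
        "--rec_model_dir={}".format(rec_model_dir),
        "--cls_model_dir={}".format(cls_model_dir),
        "--use_angle_cls={}".format(use_angle_cls),
        "--use_space_char={}".format(use_space_char),
    ]
    # Mirror the commands above: the flag is only added for CPU prediction.
    if not use_gpu:
        command.append("--use_gpu=False")
    return command
```

The returned list can be passed to `subprocess.run(command, check=True)` from the PaddleOCR root directory. Keeping the arguments as a list avoids shell quoting problems with the paths.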

Results

Group 1 (almost all correct)

Group 2 (almost all correct)

Group 3 (vertical text is recognized very well)