PaddleOCR implements OCR technology using Baidu's PaddlePaddle framework.
PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and put them into practice.
The most powerful part of this project is that it can recognize vertical text and has a high recognition rate.
GitHub project address: github.com/PaddlePaddl…
Project summary (mainly from the project README):
The tutorial part of the README contains the Installation and Quick Start sections, which are enough for testing and basic use. For a deeper understanding of the code, see the sections on the algorithms, the models, inference, the data, and so on.
Requirements
- PaddleOCR working environment:
- PaddlePaddle 2.0.0
- Python 3.7
- glibc 2.23 (my tests used glibc 2.27)
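Before installing anything, it can help to compare the machine against the requirements listed above. The following is a minimal sketch, not part of PaddleOCR: `parse_version` and `check_environment` are hypothetical helper names, and treating Python 3.7 and glibc 2.23 as minimum versions is an assumption (the list does not say whether newer versions are accepted, although glibc 2.27 worked in my tests).

```python
import platform
import sys

# Assumption: the versions in the requirements list are treated as minimums.
REQUIRED_PYTHON = (3, 7)
REQUIRED_GLIBC = (2, 23)


def parse_version(text):
    """Turn a string such as '2.27' into the tuple (2, 27)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def check_environment():
    """Report the Python and glibc versions and whether they meet the minimums."""
    lib, version = platform.libc_ver()  # ('glibc', '2.27') on typical Linux systems
    glibc = parse_version(version) if lib == "glibc" else None
    python = tuple(sys.version_info[:2])
    return {
        "python": python,
        "python_ok": python >= REQUIRED_PYTHON,
        "glibc": glibc,
        "glibc_ok": glibc is not None and glibc >= REQUIRED_GLIBC,
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(key, value)
```

On non-Linux systems `platform.libc_ver()` usually returns empty strings, so `glibc_ok` is reported as False there.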
Installation instructions: github.com/PaddlePaddl…
- The official instructions recommend installing with Docker (I did not use it).
- Alternatively, install the required libraries directly into your environment.
Test
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/007.png" --det_model_dir="./inference/ch_ppocr_server_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_server_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True
Practical steps
The GPU version requires CUDA 10.1 and cuDNN 7.6.5. Here we use a ready-made environment (the easy-to-learn intelligent platform), which avoids most environment installation problems and lets us set up the project quickly. Disadvantage: it costs money.
1. Configure the environment
1.0 Skip the purchase process for the easy-to-learn intelligent platform machine
That process is well documented in my previous blog post: juejin.cn/post/696575…
1.1 Open the terminal and switch to the specified environment
The environment we need is py37-cuda101, and the switch command is:
conda activate py37-cuda101
1.2 Install PaddlePaddle
You can find the installation commands in installation.md
python3 -m pip install paddlepaddle-gpu==2.0.0 -i https://mirror.baidu.com/pypi/simple
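After the install command finishes, a quick check from Python shows whether the package is usable. This is a minimal sketch: `paddle_status` is a hypothetical helper name, while `paddle.utils.run_check()` is PaddlePaddle's own installation self-test.

```python
import importlib.util


def paddle_status():
    """Return None if paddle cannot be imported, else a dict with its version."""
    if importlib.util.find_spec("paddle") is None:
        return None
    import paddle
    return {"version": paddle.__version__}


if __name__ == "__main__":
    status = paddle_status()
    if status is None:
        print("paddle is not installed in the active environment")
    else:
        print("paddle version:", status["version"])
        import paddle
        # Runs a small computation and reports whether the install works.
        paddle.utils.run_check()
```

If the reported version is not 2.0.0, the install probably went into a different environment than the one activated in step 1.1.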
1.3 Download PaddleOCR project
This download is fast and can be done directly using git clone
git clone https://github.com/PaddlePaddle/PaddleOCR
1.4 Enter the project and install the environment
Installing requirements.txt may fail several times because of network fluctuations. Just retry until it succeeds.
cd PaddleOCR
pip3 install -r requirements.txt
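Since the install step may need several attempts, the retrying can be automated. The sketch below is one possible approach and not part of the project: `run_with_retries` is a hypothetical helper, and the attempt count and delay are arbitrary values chosen for illustration.

```python
import subprocess
import sys
import time


def run_with_retries(cmd, attempts=5, delay=3.0):
    """Run cmd until it exits with code 0.

    Returns the number of the successful attempt (starting at 1),
    or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if subprocess.run(cmd).returncode == 0:
            return attempt
        if attempt < attempts:
            time.sleep(delay)  # wait a little before trying again
    return None
```

From inside the PaddleOCR directory it could be called as `run_with_retries([sys.executable, "-m", "pip", "install", "-r", "requirements.txt"])`. pip skips packages that are already installed, so repeated runs only redo the part that failed.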
2. Model download
For this part we can read the Quick Start section
The figure above lists two models: one is small and lightweight (the mobile version); the other is larger and heavier (the server version).
2.1 Download the model as instructed
2.1.1 Create a folder and enter it
mkdir inference && cd inference
2.1.2 Download the model and decompress it
Right-click a model in the list to copy its download link, then download it with wget:
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar
After the download is complete, decompress it with the tar command:
tar -xvf ch_ppocr_server_v2.0_det_infer.tar
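The wget and tar steps above have to be repeated for each model, so they can also be scripted. The following is a minimal sketch using only the Python standard library. `model_url` and `fetch_model` are hypothetical helper names. Only the detection model URL appears in this post; the sketch assumes the recognition and classifier archives (whose names are taken from the test command) sit under the same URL prefix, so check the links in the model list before relying on it.

```python
import tarfile
import urllib.request
from pathlib import Path

# Prefix taken from the detection model link used above.
BASE_URL = "https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/"

MODELS = [
    "ch_ppocr_server_v2.0_det_infer",
    "ch_ppocr_server_v2.0_rec_infer",  # assumption: same URL pattern as det
    "ch_ppocr_mobile_v2.0_cls_infer",  # assumption: same URL pattern as det
]


def model_url(name):
    """Build the download URL for a model archive."""
    return BASE_URL + name + ".tar"


def fetch_model(name, dest="inference"):
    """Download name.tar into dest (if not already there) and extract it."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / (name + ".tar")
    if not archive.exists():
        urllib.request.urlretrieve(model_url(name), str(archive))
    with tarfile.open(str(archive)) as tar:
        tar.extractall(str(dest))
    return dest / name
```

Calling `fetch_model(name)` for each entry in `MODELS` from the PaddleOCR directory would leave the extracted folders under ./inference, which is where the test command expects them.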
2.1.3 Directory tree structure after download
├── ch_ppocr_mobile_v2.0_det_infer
│   ├── inference.pdiparams
│   ├── inference.pdiparams.info
│   └── inference.pdmodel
├── ch_ppocr_mobile_v2.0_rec_infer
    ├── inference.pdiparams
    ├── inference.pdiparams.info
    └── inference.pdmodel
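A missing file in one of these folders only shows up later as an error at prediction time, so it is worth checking the layout right after extraction. This is a small sketch under the assumption that every model folder holds exactly the three files shown in the tree; `missing_files` is a hypothetical helper name.

```python
from pathlib import Path

# The three files each extracted model folder contains in the tree above.
EXPECTED_FILES = (
    "inference.pdiparams",
    "inference.pdiparams.info",
    "inference.pdmodel",
)


def missing_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (model_dir / name).is_file()]
```

For example, `missing_files("inference/ch_ppocr_mobile_v2.0_det_infer")` returns an empty list when the folder is complete.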
Test the model
The following commands run the full pipeline: text detection, angle classification, and recognition.
Change the parameters to the paths used in your project.
- parameter
- image_dir: the path of a single image or image set
- det_model_dir: the path to the detection inference model
- rec_model_dir: the path to the recognition inference model
- use_angle_cls: whether to use the direction classifier
- cls_model_dir: the path to the direction classifier model
- use_space_char: whether to predict the space char
- result
- saved to the ./inference_results folder by default
# Predict a single image specified by image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/007.png" --det_model_dir="./inference/ch_ppocr_server_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_server_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True

# Predict the image set specified by image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./inference/ch_ppocr_server_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_server_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True

# If you want to use the CPU for prediction, set the use_gpu parameter to False
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/007.png" --det_model_dir="./inference/ch_ppocr_server_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_server_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True --use_gpu=False
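The invocations above differ only in image_dir and use_gpu, so the long argument list can be assembled in one place. The sketch below is a thin wrapper around the same command line, not a PaddleOCR API: `build_command` is a hypothetical helper, and the flags it emits are exactly those used in the commands above.

```python
import sys


def build_command(image_dir, det_dir, rec_dir, cls_dir, use_gpu=True):
    """Assemble the predict_system.py command line as an argument list."""
    cmd = [
        sys.executable,
        "tools/infer/predict_system.py",
        f"--image_dir={image_dir}",
        f"--det_model_dir={det_dir}",
        f"--rec_model_dir={rec_dir}",
        f"--cls_model_dir={cls_dir}",
        "--use_angle_cls=True",
        "--use_space_char=True",
    ]
    if not use_gpu:
        # Only added for CPU prediction, as in the third command above.
        cmd.append("--use_gpu=False")
    return cmd
```

Running it from the PaddleOCR directory would look like `subprocess.run(build_command("./doc/imgs/007.png", "./inference/ch_ppocr_server_v2.0_det_infer/", "./inference/ch_ppocr_server_v2.0_rec_infer/", "./inference/ch_ppocr_mobile_v2.0_cls_infer/"), check=True)`, after `import subprocess`. Passing the arguments as a list avoids shell quoting problems with paths.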
Results
Group 1 (almost all correct)