1 Preface

Hello everyone, I am Xiao Zhang ~

This issue introduces a GitHub project for OCR text recognition. In previous tutorials, I wrote two articles on implementing OCR recognition in Python:

The first covered a Python package that performs OCR text recognition in a few lines of code. Tesseract is based on traditional machine learning methods; it recognizes English characters well, but Chinese characters poorly

The second covered EasyOCR, a GitHub project for text recognition and a Python library based on deep learning techniques

EasyOCR is built on deep learning, and its recognition quality is better than Tesseract's. It supports more than 70 languages, and in addition to text recognition it can also detect text block regions and draw boxes around them on the original image

However, testing showed that the library is not very accurate at recognizing some road signs

2 PaddleOCR introduction

This article introduces another GitHub project for OCR recognition, called PaddleOCR, a sub-project of Paddle. PaddleOCR is based on deep learning, so it needs trained weight files to run, but we don't need to worry about this because the official project provides them ~

This section is a brief introduction to the PaddleOCR project; if you are only interested in the steps, you can skip to section 3

PaddleOCR performs very well in tests. The following two images are taken from the PaddleOCR introduction

Figure 1

Figure 2

To test the project's recognition performance, I found a picture of a coupon online. The text in the picture is complicated: some of it is vertical, some is italic, Chinese and English are mixed, and there are even decimal points

The final test result is as follows. Despite the complexity of the text in the picture on the left, essentially all of it is recognized, which is very nice 👍

The PaddleOCR model has several characteristics

  • PaddleOCR was released on May 14, 2020, and its features have been continuously improved over the project's iterations;

  • During recognition, PaddleOCR completes three tasks in sequence: detection, direction classification and text recognition;

  • PaddleOCR provides two categories of pre-trained weights, distinguished by file size:

    • Lightweight: the three weights (detection + classification + recognition) total 9.4 MB, suitable for mobile and server deployment;
    • The other category: the three weights (detection + classification + recognition) total 143.4 MB, suitable for server deployment;
    • Whichever category is used, the recognition quality is said to be comparable to commercial products; this tutorial uses the lightweight weights for testing;
  • Supports multi-language recognition, currently more than 80 languages;

  • Besides recognizing Chinese, English and numbers, it can also cope with complex situations such as slanted fonts and text containing decimal points;

  • Provides a rich set of OCR-related tools that make it convenient to build our own datasets for training:

    • Semi-automatic data annotation tool;
    • Data synthesis tools;
  • Supports pip installation and is easy to use;
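The three tasks in the list above (detection, direction classification, text recognition) can be pictured as a simple chain. The sketch below only illustrates that ordering. `run_pipeline` and the three stand-in stage functions are hypothetical names made up for this example, not PaddleOCR's API, and they operate on a fake dictionary instead of a real image.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # hypothetical (x, y, width, height) text region


def run_pipeline(
    image: dict,
    detect: Callable[[dict], List[Box]],
    classify: Callable[[dict, Box], int],
    recognize: Callable[[dict, Box, int], str],
) -> List[Tuple[Box, int, str]]:
    """Run detection, then direction classification, then recognition per box."""
    results = []
    for box in detect(image):                # 1. find the text regions
        angle = classify(image, box)         # 2. decide how each region is oriented
        text = recognize(image, box, angle)  # 3. read the text in the region
        results.append((box, angle, text))
    return results


# Stand-in stages working on a fake "image", so the sketch runs without any model.
fake_image = {(0, 0, 50, 10): ("SALE", 0), (0, 20, 50, 10): ("9.5", 180)}


def fake_detect(image):
    return sorted(image)  # pretend every key is a detected box


def fake_classify(image, box):
    return image[box][1]  # pretend the stored angle was predicted


def fake_recognize(image, box, angle):
    return image[box][0]  # pretend the stored string was read


for box, angle, text in run_pipeline(fake_image, fake_detect, fake_classify, fake_recognize):
    print(box, angle, text)
```

The point of the sketch is only the order of the stages: each detected box is passed through classification before it is recognized.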

3 PaddleOCR use

With that brief introduction out of the way, here's a step-by-step guide to using PaddleOCR

3.1 Environment Introduction

The test environment used this time:

  • OS: Win10;
  • Python: 3.7.9;

3.2 Installing PaddlePaddle 2.0

PaddleOCR needs to run under PaddlePaddle 2.0. Make sure PaddlePaddle 2.0 is installed before you start.

pip3 install --upgrade pip
python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
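To confirm that the installation worked, a small check like the one below may help. It is a minimal sketch: the helper names (`parse_version`, `installed_paddle_version`, `is_at_least`) are my own, and the only thing it assumes about PaddlePaddle is that the `paddle` module exposes a `__version__` string.

```python
import importlib


def parse_version(version: str) -> tuple:
    """Turn '2.0.0' into (2, 0, 0); a non-numeric suffix such as 'rc1' is ignored."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def installed_paddle_version():
    """Return paddle.__version__, or None if paddle cannot be imported."""
    try:
        paddle = importlib.import_module("paddle")
    except ImportError:
        return None
    return getattr(paddle, "__version__", None)


def is_at_least(version: str, minimum: str = "2.0.0") -> bool:
    """Compare two dotted version strings numerically, padding the shorter one with zeros."""
    a, b = parse_version(version), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


version = installed_paddle_version()
if version is None:
    print("paddlepaddle is not installed in this environment")
else:
    print("paddlepaddle", version, "OK" if is_at_least(version) else "is older than 2.0.0")
```

PaddlePaddle's own install guide also suggests calling `paddle.utils.run_check()` in a Python shell as a fuller self-test.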

3.2 Cloning the PaddleOCR repository

Use git clone, or GitHub's Download option, to get the project repository onto your local PC

git clone https://github.com/PaddlePaddle/PaddleOCR

Here I’m using the git command

3.3 Installing PaddleOCR's third-party dependencies

Go to the PaddleOCR folder

cd PaddleOCR

Install the third-party dependencies

pip3 install -r requirements.txt

If this step reports an error, it is recommended to install the project in a virtual environment. If you use a virtual environment, remember to install the PaddlePaddle package in it as well

python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
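If you go the virtual-environment route, the environment can also be created from Python with the standard-library `venv` module. This is just one possible way to do it; the helper name `create_env` and the folder name `paddleocr-env` are examples chosen here, not something the PaddleOCR project prescribes.

```python
import venv
from pathlib import Path


def create_env(env_dir: str, with_pip: bool = True) -> Path:
    """Create a virtual environment in env_dir and return the path of its pyvenv.cfg."""
    venv.create(env_dir, with_pip=with_pip)  # with_pip=True also bootstraps pip inside it
    return Path(env_dir) / "pyvenv.cfg"


# Example call (the folder name is only an example):
# create_env("paddleocr-env")
```

After creating the environment, activate it (on Windows, `paddleocr-env\Scripts\activate`) and then run the two pip commands inside it.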

3.4 Downloading the weight files

The download links for the weights are posted below; download them to your local machine.

Detection weights

https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar

Direction classification weights

https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar

Recognition weights

https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar
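As a scripted alternative to fetching the three archives by hand, something like the following could be used. It is a sketch under a few assumptions: the helper names (`model_dir_name`, `extract_archive`, `fetch_weights`) are invented for this example, the URLs are the three listed above, and each archive is assumed to unpack into a folder named after itself (for example `ch_ppocr_mobile_v2.0_det_infer`).

```python
import tarfile
import urllib.request
from pathlib import Path

BASE_URL = "https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/"
ARCHIVES = [
    "ch_ppocr_mobile_v2.0_det_infer.tar",  # detection
    "ch_ppocr_mobile_v2.0_cls_infer.tar",  # direction classification
    "ch_ppocr_mobile_v2.0_rec_infer.tar",  # recognition
]


def model_dir_name(archive_name: str) -> str:
    """'ch_ppocr_mobile_v2.0_det_infer.tar' -> 'ch_ppocr_mobile_v2.0_det_infer'."""
    if archive_name.endswith(".tar"):
        return archive_name[: -len(".tar")]
    return archive_name


def extract_archive(archive_path: Path, dest: Path) -> None:
    """Unpack one .tar archive into dest, creating dest if it is missing."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path) as tar:
        tar.extractall(dest)


def fetch_weights(dest: Path = Path("inference")) -> None:
    """Download the three archives and unpack them under dest (needs network access)."""
    dest.mkdir(parents=True, exist_ok=True)
    for name in ARCHIVES:
        archive_path = dest / name
        urllib.request.urlretrieve(BASE_URL + name, str(archive_path))
        extract_archive(archive_path, dest)
        print("ready:", dest / model_dir_name(name))
```

Calling `fetch_weights()` from inside the PaddleOCR folder should leave the three model folders under `inference`. Nothing is downloaded until the function is actually called.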

After downloading the archives, extract them, create an inference folder and put the extracted folders into it, then put the inference folder into the PaddleOCR folder. The final directory tree is as follows:

3.5 Using PaddleOCR

Once the environment above is configured, you can use PaddleOCR to recognize text. Open a terminal in the PaddleOCR project environment and enter one of the following three commands, depending on your situation, to run text recognition

1. Use GPU, recognize a single image

python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True

2. Use GPU, recognize multiple images

python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True

3. Do not use GPU, recognize a single image

python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True --use_gpu=False

Two of the parameters need to be configured. Parameter description:

  • image_dir -> path of the image, or folder of images, to recognize;
  • det_model_dir -> path of the folder that holds the detection model weights;
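The three commands above differ only in image_dir and in whether use_gpu is switched off, so they can be generated by one small helper. The sketch below is an assumption-laden convenience wrapper, not part of PaddleOCR: `build_command` and `run_ocr` are names invented here, the flags are the ones used in the commands above, and `sys.executable` is used in place of `python3` so that the script runs with the same interpreter (and virtual environment) that launches it.

```python
import subprocess
import sys


def build_command(image_dir: str, inference_root: str = "./inference", use_gpu: bool = True) -> list:
    """Assemble the predict_system.py command line as a list of arguments."""
    root = inference_root.rstrip("/")
    cmd = [
        sys.executable, "tools/infer/predict_system.py",
        f"--image_dir={image_dir}",
        f"--det_model_dir={root}/ch_ppocr_mobile_v2.0_det_infer/",
        f"--rec_model_dir={root}/ch_ppocr_mobile_v2.0_rec_infer/",
        f"--cls_model_dir={root}/ch_ppocr_mobile_v2.0_cls_infer/",
        "--use_angle_cls=True",
        "--use_space_char=True",
    ]
    if not use_gpu:
        cmd.append("--use_gpu=False")  # the flag is only added for CPU runs, as in command 3
    return cmd


def run_ocr(image_dir: str, repo_dir: str = ".", use_gpu: bool = True):
    """Run the command from inside the cloned PaddleOCR folder (repo_dir)."""
    return subprocess.run(build_command(image_dir, use_gpu=use_gpu), cwd=repo_dir, check=True)


# Show the CPU-only, single-image command without executing it.
print(" ".join(build_command("./doc/imgs/11.jpg", use_gpu=False)))
```

Passing a folder such as `./doc/imgs/` as image_dir gives the multi-image variant; `run_ocr` only works once the repository, dependencies and weights from the earlier steps are in place.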

PaddleOCR recognizes a picture quickly: about two or three seconds using only the CPU

4 Data and source code acquisition

For convenience, I have packaged the test data and project code together. After downloading the package, complete the following two steps and it is ready to use (see section 3.5 for usage).

  • Create a virtual environment;
  • Install the dependencies with pip;

python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
# dependencies
pip3 install -r requirements.txt

How to obtain: follow the WeChat official account Xiao Zhang Python (this account) and reply with the keyword 210612 in the account's chat

5 Summary

PaddleOCR is one application of the Paddle framework. Besides PaddleOCR, Paddle has many other interesting models and, most importantly, the developers provide pre-trained weight files, which lowers the barrier to use

Later on, I also plan to pick some interesting projects and walk you through running them, step by step, in blog posts

So that’s it for PaddleOCR; give me a thumbs-up if it was helpful

Thank you for reading, and we’ll see you next time