1. Preface
Hello everyone, I am Xiao Zhang ~
This issue introduces a GitHub project for OCR text recognition. In previous tutorials, I wrote two articles on implementing OCR in Python:
A Python package that does OCR text recognition in a few lines of code. Tesseract is based on traditional machine learning methods; it works well for English characters but poorly for Chinese characters.
EasyOCR, a GitHub project for text recognition, a Python library based on deep learning.
EasyOCR is built on deep learning and recognizes text better than Tesseract. It supports more than 70 languages and, in addition to recognizing text, can detect text block regions and draw boxes around them on the original image.
However, testing showed that the library is not very accurate on some road signs.
2. PaddleOCR introduction
This article introduces another GitHub project for OCR, called PaddleOCR, which is a sub-project of Paddle. PaddleOCR is based on deep learning, so it needs trained weight files to run, but there is no need to worry about that because the project officially provides them.
This section is a brief introduction to the PaddleOCR project. If you are only interested in the steps, you can skip to section 3.
PaddleOCR performs very well in tests. The following two images are taken from the PaddleOCR introduction:
Figure 1
Figure 2
To test the project's recognition performance, I found a picture of a coupon online. The text in it is complicated: some is vertical, some is italic, Chinese and English are mixed, and there is even a decimal point.
The final test result is shown below. However complex the text in the left picture, nearly all of it is recognized, which is very nice 👍
PaddleOCR has several notable characteristics:
- PaddleOCR was released on May 14, 2020, and its features have been improved continuously with each iteration since;
- Recognition completes three tasks in sequence: detection, direction classification, and text recognition;
- PaddleOCR provides pre-trained weights in two categories according to file size:
  - One is lightweight: the three sets of weights (detection + classification + recognition) total 9.4 MB, suitable for mobile and server deployment;
  - The other is larger: the three sets of weights (detection + classification + recognition) total 143.4 MB, suitable for server deployment;
  - Whether the model is lightweight or not, the recognition quality is comparable to commercial products; this tutorial uses the lightweight weights for testing;
- It supports multi-language recognition, currently more than 80 languages;
- Besides recognizing Chinese, English, and numbers, it also copes with complex cases such as slanted fonts and text containing decimal points;
- It provides a wealth of OCR-related tools that make it convenient to build your own training datasets:
  - A semi-automatic data annotation tool;
  - Data synthesis tools;
- It supports installation with pip and is easy to use;
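The three-stage flow mentioned in the list above (detection, then direction classification, then recognition) can be pictured with a small sketch. Everything in it is hypothetical: the `Region` type and the stub functions `fake_detect`, `fake_classify`, and `fake_recognize` are stand-ins invented for illustration and are not PaddleOCR APIs. Only the order of the stages comes from the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Region:
    """A detected text block (hypothetical type, not a PaddleOCR class)."""
    box: Tuple[int, int, int, int]  # x, y, width, height
    rotated: bool = False           # filled in by the direction classifier
    text: str = ""                  # filled in by the recognizer


def run_pipeline(
    image: object,
    detect: Callable[[object], List[Region]],
    classify_direction: Callable[[object, Region], bool],
    recognize: Callable[[object, Region], str],
) -> List[Region]:
    """Run the three stages in order: detection -> direction -> recognition."""
    regions = detect(image)                                  # 1. find text blocks
    for region in regions:
        region.rotated = classify_direction(image, region)  # 2. check orientation
        region.text = recognize(image, region)               # 3. read the characters
    return regions


# Stub stages so the sketch runs without any model weights.
def fake_detect(image):
    return [Region(box=(0, 0, 100, 20)), Region(box=(0, 30, 100, 20))]


def fake_classify(image, region):
    return region.box[1] > 0  # pretend the second block is rotated


def fake_recognize(image, region):
    return "rotated text" if region.rotated else "upright text"


results = run_pipeline("image placeholder", fake_detect, fake_classify, fake_recognize)
for r in results:
    print(r.box, r.rotated, r.text)
```

Passing the stages in as callables keeps the sketch independent of any real model; in the actual project each stage is backed by its own weight file.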
3. Using PaddleOCR
With that brief introduction out of the way, here is a step-by-step guide to using PaddleOCR.
3.1 Environment Introduction
The test environment used this time:
- OS: Win10;
- Python: 3.7.9;
3.2 Installing PaddlePaddle 2.0
PaddleOCR needs to run under PaddlePaddle 2.0. Make sure PaddlePaddle 2.0 is installed before you start.
pip3 install --upgrade pip
python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
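After installing, it can help to confirm which PaddlePaddle version pip actually put in the current environment. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `is_expected` are made up for this example. On Python 3.7, as used in this tutorial, `importlib.metadata` is not available, so the sketch falls back to the `importlib_metadata` backport, which is assumed to be installed in that case.

```python
try:
    from importlib import metadata         # Python 3.8+
except ImportError:                         # Python 3.7
    import importlib_metadata as metadata   # backport: pip install importlib-metadata


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def is_expected(dist_name, prefix):
    """True if the distribution is installed and its version starts with prefix."""
    version = installed_version(dist_name)
    return version is not None and version.startswith(prefix)


print("paddlepaddle:", installed_version("paddlepaddle"))
print("is 2.0.x:", is_expected("paddlepaddle", "2.0."))
```

If the first line prints `None`, the package was most likely installed into a different interpreter or virtual environment than the one running the check.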
3.2 Cloning the PaddleOCR repository
Use git clone, or the Download button on GitHub, to get the project repository onto your local PC
git clone https://github.com/PaddlePaddle/PaddleOCR
Here I’m using the git command
3.3 Installing PaddleOCR's third-party dependencies
Go to the PaddleOCR folder
cd PaddleOCR
Install third-party dependencies
pip3 install -r requirements.txt
If this step fails, it is recommended to install the project in a virtual environment. If you use a virtual environment, remember to install the PaddlePaddle package in it as well
python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
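For readers who have not created a virtual environment before, here is a minimal sketch using Python's standard `venv` module. The helper name `create_env` and the folder name `paddle_env` in the usage note are assumptions chosen for illustration, not something the project prescribes.

```python
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment and return the path of its config file."""
    env_path = Path(env_dir)
    # Roughly what `python3 -m venv env_dir` does on the command line.
    venv.create(env_path, with_pip=with_pip)
    return env_path / "pyvenv.cfg"
```

Calling `create_env("paddle_env")` creates the environment in the current folder. After activating it (on Win10, `paddle_env\Scripts\activate`), run the two pip install commands again so that PaddlePaddle and the dependencies land inside the environment.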
3.4 Downloading the weight files
The links to the weight files are listed below; download each of them to your local machine.
Detection weights
https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
Direction classification weights
https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar
Recognition weights
https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar
After downloading, extract the three archives, create an inference folder, put the extracted folders into it, and place the inference folder in the PaddleOCR project directory. The resulting directory tree is as follows:
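The download-and-extract step can also be scripted. The sketch below uses only the standard library and the three URLs listed above. The helper names (`model_dir`, `fetch_and_extract`, `fetch_all`) are made up for this example, and it assumes each archive unpacks into a folder named after the archive, which matches the paths used by the commands in section 3.5.

```python
import tarfile
import urllib.request
from pathlib import Path

BASE_URL = "https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/"
ARCHIVES = [
    "ch_ppocr_mobile_v2.0_det_infer.tar",  # detection
    "ch_ppocr_mobile_v2.0_cls_infer.tar",  # direction classification
    "ch_ppocr_mobile_v2.0_rec_infer.tar",  # recognition
]


def model_dir(archive_name, dest="inference"):
    """Folder expected after extraction (assumes it is named after the archive)."""
    return Path(dest) / Path(archive_name).stem


def fetch_and_extract(archive_name, dest="inference"):
    """Download one archive into dest and unpack it there (needs network access)."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    tar_path = dest_path / archive_name
    urllib.request.urlretrieve(BASE_URL + archive_name, str(tar_path))
    with tarfile.open(str(tar_path)) as tar:
        tar.extractall(str(dest_path))
    return model_dir(archive_name, dest)


def fetch_all(dest="inference"):
    """Download and unpack all three weight archives."""
    return [fetch_and_extract(name, dest) for name in ARCHIVES]
```

Run `fetch_all()` from the PaddleOCR project root so that the inference folder ends up where the commands in section 3.5 expect it.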
3.5 Using PaddleOCR
Once the environment above is configured, you can use PaddleOCR to recognize text. Open a terminal in the PaddleOCR project directory and enter whichever of the following three commands fits your situation
1. Use the GPU to recognize a single image
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True
2. Use the GPU to recognize multiple images
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True
3. Recognize a single image without the GPU
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True --use_gpu=False
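Two parameters deserve a note. Parameter description:
image_dir -> path of the image, or of the folder of images, to recognize;
det_model_dir -> path of the folder holding the detection weights (rec_model_dir and cls_model_dir point to the recognition and direction classification weights in the same way);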
PaddleOCR recognizes an image quickly: two or three seconds using only the CPU
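The three commands above differ only in the image path and the GPU flag, so they can be assembled from a small Python helper. This is a sketch under stated assumptions: `build_command` and `run_ocr` are hypothetical names, the script path and flags are copied from the commands above, and the script must be run from the PaddleOCR project root with the weights already in ./inference.

```python
import subprocess
import sys


def build_command(image_dir, use_gpu=True, model_root="./inference"):
    """Assemble the predict_system.py command line as an argument list."""
    cmd = [
        sys.executable, "tools/infer/predict_system.py",
        f"--image_dir={image_dir}",
        f"--det_model_dir={model_root}/ch_ppocr_mobile_v2.0_det_infer/",
        f"--rec_model_dir={model_root}/ch_ppocr_mobile_v2.0_rec_infer/",
        f"--cls_model_dir={model_root}/ch_ppocr_mobile_v2.0_cls_infer/",
        "--use_angle_cls=True",
        "--use_space_char=True",
    ]
    if not use_gpu:
        cmd.append("--use_gpu=False")  # same flag as the CPU-only command
    return cmd


def run_ocr(image_dir, use_gpu=True):
    """Run the script; raises CalledProcessError if it exits with an error."""
    return subprocess.run(build_command(image_dir, use_gpu), check=True)


# Show the CPU-only command for a single image without running it.
print(" ".join(build_command("./doc/imgs/11.jpg", use_gpu=False)[1:]))
```

Passing the arguments as a list avoids shell quoting problems, which is handy on Win10 where quoting rules differ from Linux shells.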
4. Data and source code acquisition
For convenience, I have packaged the test data and project code together. After downloading the package, complete the following two steps and it is ready to use (see section 3.5 for usage).
- Create a virtual environment;
- Install the dependencies with pip;
python3 -m pip install paddlepaddle==2.0.0 -i https://mirror.baidu.com/pypi/simple
# Install dependencies
pip3 install -r requirements.txt
How to get it: follow the WeChat official account Xiao Zhang Python (this account) and reply in the background with the keyword 210612
5. Summary
PaddleOCR is one application built on the Paddle framework. Besides PaddleOCR, Paddle offers many other interesting models, and the developers provide pre-trained weight files, which lowers the barrier to use
Later on, I also plan to pick some interesting projects and walk you through running them, step by step, in blog posts
So that's it for PaddleOCR. Give me a thumbs-up if it was helpful
Thank you for reading, and see you next time