Overview

If you work on OCR, you should know about this open-source OCR project: PaddleOCR.

Here is how PaddleOCR has fared on GitHub in the few months since it was open-sourced this year:

  • In July, the 8.6M ultra-lightweight model was released, and the project ranked first on GitHub’s global daily Trending list.
  • In August, a state-of-the-art algorithm from the top conference CVPR 2020 was open-sourced, and the project made the GitHub Trending list again.
  • In September, the PP-OCR algorithm was released and the 3.5M ultra-lightweight model was open-sourced; the project then ranked first on the Papers with Code trending list.
  • On October 28, the project was back on GitHub’s daily Trending list.

GitHub developers have naturally taken notice. The project has now passed 6K stars and is still growing. How did it get here? Let’s take a look.

First, the features of the repo, which is genuinely packed with substance. Straight to the official introduction:

In terms of quantity, PaddleOCR has released three series of models that cover a range of mobile and server scenarios. Multi-language support is no problem either, and all training code and models are open-sourced without reservation. The 3.5M ultra-lightweight text recognition model is the lightest OCR model currently available in open source.

In terms of quality, can such a lightweight model really deliver? Skip the advertising and look at the results.

Let’s start with recognition results on a few common scenes:

Train tickets, forms, metal nameplates, flipped images, and foreign-language text are all handled well.

For a 3.5M model to reach this recognition accuracy is genuinely impressive!



Paper download link: arxiv.org/abs/2009.09…

Try PaddleOCR’s 3.5M ultra-lightweight OCR model
  • Quick try on PC: open the web page and select an image to see the results in real time.


  • Mobile app installation

PaddleOCR has published a text recognition app demo on Baidu Brain EasyEdge.

Sample results are as follows (the download QR code is on the GitHub home page).

Comparison with other open-source repos

A quick comparison of the core capabilities of the current mainstream open-source OCR repos:

In terms of performance metrics:

  • The test set covers real OCR application scenarios, including contracts, license plates, nameplates, train tickets, test sheets, forms, certificates, street-view text, business cards, and digital displays. It contains 300 images with an average of 17 text boxes per image. On this set, PaddleOCR’s F1-score is above 0.5, which is a very good result.
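For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall. The sketch below shows the computation only. The box counts are hypothetical, chosen to match the stated test-set size (300 images × 17 boxes = 5,100 ground-truth boxes), and are not PaddleOCR’s actual results; the rule for deciding when a predicted box counts as a match is also left out.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when undefined."""
    predicted = true_positives + false_positives      # boxes the model output
    ground_truth = true_positives + false_negatives   # boxes in the annotations
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / ground_truth if ground_truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not measured results):
# 5,100 ground-truth boxes, of which 3,000 are correctly recognized,
# plus 1,500 spurious predictions.
ground_truth_boxes = 300 * 17
tp, fp = 3000, 1500
fn = ground_truth_boxes - tp
print(f"F1 = {f1_score(tp, fp, fn):.3f}")
```

Because F1 penalizes both missed boxes and spurious ones, a system cannot score well by simply predicting more boxes.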

In terms of functional completeness:

  • Pretrained model size: EasyOCR currently has no ultra-lightweight model. The latest chineseocr_lite model is about 4.7M, while PaddleOCR provides a 3.5M model, the lightest currently known in the industry.
  • pip installation: currently supported only by PaddleOCR and EasyOCR.
  • Custom training: in real business scenarios, a pretrained model often fails to meet requirements, so custom training and fine-tuning are needed. chineseocr_lite and EasyOCR do not support them.
  • Deployment: the EasyOCR model is too large to be suitable for on-device deployment. Both chineseocr_lite and PaddleOCR can be deployed on device.

Developers can choose the open-source solution that best fits their actual needs.

The repo also explains how PaddleOCR’s 3.5MB ultra-lightweight model was achieved.

The 3.5M ultra-lightweight model is built on PP-OCR, an ultra-lightweight OCR system composed mainly of three parts: DB text detection, detection box rectification, and CRNN text recognition. The system uses 19 effective strategies to tune and slim down the model in each module, covering eight aspects: backbone network selection and adjustment, prediction head design, data augmentation, learning rate schedule, regularization parameter selection, use of pretrained models, automatic model pruning, and quantization. PaddleSlim, PaddlePaddle’s model compression library, provides the core technical support for PaddleOCR’s ultra-lightweight model. Compressing the ultra-lightweight model from 8.1M to 3.5M reduces the model size by 56.79% ((8.1 − 3.5) / 8.1), speeds up the detection model by 21%, and also improves overall model accuracy.
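The three-stage structure described above can be sketched as follows. This illustrates the data flow only and is not PaddleOCR’s implementation: the function name `run_pipeline`, the `OcrResult` type, and the axis-aligned box format are hypothetical, and the three stages are passed in as plain callables standing in for the DB detector, the box rectifier, and the CRNN recognizer.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

# Simplifying assumption: an axis-aligned box (x_min, y_min, x_max, y_max).
# Real text detectors often return quadrilaterals instead.
Box = Tuple[int, int, int, int]


@dataclass
class OcrResult:
    box: Box   # where the text was found
    text: str  # what the recognizer read there


def run_pipeline(
    image: Any,
    detect: Callable[[Any], List[Box]],   # stage 1: text detection
    rectify: Callable[[Any, Box], Any],   # stage 2: crop and correct each box
    recognize: Callable[[Any], str],      # stage 3: text recognition
) -> List[OcrResult]:
    """Run detection once, then rectify and recognize each detected box."""
    results = []
    for box in detect(image):
        crop = rectify(image, box)
        results.append(OcrResult(box=box, text=recognize(crop)))
    return results
```

Keeping the stages as separate callables mirrors the modular design in the text: each module can be optimized or swapped without touching the other two.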

In addition to the 3.5M ultra-lightweight OCR model, PaddleOCR offers multi-language pretrained models (English, German, French, Korean, Japanese) and supports custom training and a rich set of deployment options.

If you like the project and want to support open-source work, please give it a star.

To learn more, join the PaddleOCR technical exchange group and get first-hand technical support.

Note: the editor found that there are now six groups, so developers should hurry and hop on.

Scan the QR code to add the Paddle assistant and reply [OCR] once you are verified, and you will be invited to join the group!

Recruitment Notice

On November 7, Baidu AI Fast Track comes to Chengdu, and the PaddleOCR R&D team will be there in person. OCR developers are welcome to come to Chengdu and sign up for the “Open Source Framework Advanced Camp”. We will gather at Chengdu Jinkai International Apartment, Zhicheng Hall. We will also visit Xi’an (November 14), Wuhan, Xiamen, Beijing, and other cities in the following weeks.

In addition, if you are interested in common NLP tasks, an introduction to ERNIE semantic understanding technology and its platform, and topics such as optimization techniques, data imbalance, use of unlabeled data, and text encoding, you are welcome to join the EasyDL zero-threshold model training camp. The ERNIE lecturer team will be waiting for you in Zhihe Hall!

For more information about PaddlePaddle, see below.

Official website: www.paddlepaddle.org.cn

PaddleOCR Project Address:

GitHub: github.com/PaddlePaddl…

Gitee: gitee.com/paddlepaddl…

PaddleSlim Project Address:

GitHub: github.com/PaddlePaddl…

Gitee: gitee.com/paddlepaddl…

PP-OCR technical paper:

Download link: arxiv.org/abs/2009.09…

Register to attend the “Open Source Framework Advanced Camp” at Chengdu Jinkai International Apartment.