The title

Xiaoxin Discovery: a deep learning OCR service

Introduction

I recently came across PaddlePaddle, a deep learning framework open-sourced by Baidu. Compared with TensorFlow and PyTorch, its main appeal is its ecosystem: it ships with many ready-made tools, which makes it easier to understand and use. This time we use PaddleHub to publish a text-recognition (OCR) model, so that we can build an OCR service with a minimum of code and get a feel for Paddle. In follow-up posts I will explore Paddle step by step and put together a practical guide.

Versions and Downloads

PaddlePaddle 2.0.0rc1 (Download address), Python 3.7

Set up the environment

I am on a Mac, so I use the Mac environment. PaddlePaddle also provides Docker images, which is very convenient.

Two package mirrors are also worth knowing: Baidu (https://mirror.baidu.com/pypi/simple) and Tsinghua (https://pypi.tuna.tsinghua.edu.cn/simple).

Run the following commands one at a time. I will not show the output here because I have already installed everything; the installation went smoothly.

$ pip install paddlepaddle==2.0.0rc -i https://mirror.baidu.com/pypi/simple
$ pip install paddlehub==2.0.0rc0 -i https://mirror.baidu.com/pypi/simple
$ pip install shapely -i https://pypi.tuna.tsinghua.edu.cn/simple
$ pip install pyclipper -i https://pypi.tuna.tsinghua.edu.cn/simple

The OCR module relies on the third-party libraries Shapely and PyClipper, so install both before using it.

Two pretrained models are available for loading: chinese_ocr_db_crnn_server and chinese_ocr_db_crnn_mobile. We want high accuracy here. Let's test with this picture first. Download it from https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/c555f23d746a4fdc83118b3d44a004d5~tplv-k3u1fbpfcp-zoom-1.image and save it as test_ocr.jpg.

$ hub run chinese_ocr_db_crnn_mobile --input_path test_ocr.jpg --visualization=True --output_dir='ocr_result'

To explain the command above: it runs locally and recognizes the text in a single image; it is not yet a service. The command and the image must be in the same folder, which you create yourself, so run the command from the folder that contains the picture. The parameters are: --input_path, the path of the input image; --output_dir, the output directory, created in the current directory; --visualization=True, save a visualized result image.
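To run the same command over several images from a script, one option is to assemble the argument list in Python. This is only a minimal sketch: it assumes the hub command-line tool is on the PATH, the helper names build_hub_run_command and run_ocr are hypothetical, and the flags are exactly the ones used in the command above.

```python
import subprocess
from typing import List


def build_hub_run_command(module: str, input_path: str,
                          output_dir: str = "ocr_result",
                          visualization: bool = True) -> List[str]:
    """Assemble the argv list for the `hub run` command shown above.

    Hypothetical helper: it only builds the list, it does not run anything.
    """
    return [
        "hub", "run", module,
        "--input_path", input_path,
        f"--visualization={visualization}",
        f"--output_dir={output_dir}",
    ]


def run_ocr(module: str, input_path: str) -> int:
    """Run the command and return its exit code (needs PaddleHub installed)."""
    return subprocess.run(build_hub_run_command(module, input_path)).returncode
```

Passing the arguments as a list avoids shell quoting issues, which is why the quotes around ocr_result are not needed here.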

Screenshot of the run.

The command-line output is not very intuitive, so it is easier to look at the output image directly.

In my opinion the accuracy is quite high. Let's try again with an ID card. The picture address is http://ww1.sinaimg.cn/large/8a53ebb9ly1gn147b5eq4j20n20fe76n.jpg; save it as test_ocr1.jpg.

$ hub run chinese_ocr_db_crnn_mobile --input_path test_ocr1.jpg --visualization=True --output_dir='ocr_result'

The results.

The next step is to start a service and expose an interface.

$ hub serving start -m chinese_ocr_db_crnn_server

-m specifies which module to load, i.e. the pretrained model. -p sets the port; the default is 8866.

Once the service starts, it prints some prompts in the console that are worth paying attention to.

To test the OCR interface, send a POST request to http://127.0.0.1:8866/predict/chinese_ocr_db_crnn_server. The request must carry the header "Content-Type": "application/json".

The test picture is http://ww1.sinaimg.cn/large/8a53ebb9ly1gn14laee68j20iw0ck40o.jpg

In the request body, images is a list of base64-encoded images. A base64 string is too large to paste here; you can convert a picture on a site such as https://tool.lu/base64image/. Remember that the leading header, data:image/jpeg;base64, must not be included. Remove it and send only the base64 data, otherwise you will get an object error.
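The encoding step and the request described above can also be done in code. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the URL, header and images field are the ones given in this article. It has not been checked against every PaddleHub version.

```python
import base64
import json
import urllib.request
from typing import Iterable


def strip_data_uri_prefix(encoded: str) -> str:
    """Drop a leading 'data:image/...;base64,' header if one is present."""
    if encoded.startswith("data:") and "," in encoded:
        return encoded.split(",", 1)[1]
    return encoded


def encode_image_bytes(raw: bytes) -> str:
    """Base64-encode raw image bytes into the plain string the service expects."""
    return base64.b64encode(raw).decode("ascii")


def build_predict_request(images_b64: Iterable[str],
                          module: str = "chinese_ocr_db_crnn_server",
                          host: str = "127.0.0.1",
                          port: int = 8866) -> urllib.request.Request:
    """Build (but do not send) the POST request for the predict interface."""
    url = f"http://{host}:{port}/predict/{module}"
    body = json.dumps({"images": list(images_b64)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the service running, reading a local picture with open(path, "rb").read(), passing the bytes through encode_image_bytes, and handing the request to urllib.request.urlopen should return the JSON response; the local file name is up to you.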

The request body has this shape:

{
    "images": [
        ""
    ]
}

The response has the following structure:

{
    "msg": "",
    "results": [
        {
            // If there is no recognition result, data is [].
            "data": [
                {
                    "confidence": "", // Confidence of the recognized text
                    "text": "", // The recognized text
                    // Pixel coordinates of the text box in the original image, a 4*2 matrix
                    // giving the lower-left, lower-right, upper-right and upper-left vertices in turn.
                    "text_box_position": []
                }
            ],
            "save_path": "" // Path of the saved result image; '' if no image is saved
        }
    ],
    "status": "000"
}
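Once the response has been parsed into a dictionary, pulling out the recognized text is a short loop. The sketch below assumes the response layout shown above and assumes that confidence arrives as a number or a numeric string; both function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def extract_texts(response: Dict) -> List[Tuple[str, float]]:
    """Return (text, confidence) pairs from a parsed predict response."""
    pairs = []
    for result in response.get("results", []):
        # data is an empty list when nothing was recognized
        for item in result.get("data", []):
            pairs.append((item["text"], float(item["confidence"])))
    return pairs


def bounding_rect(box: Sequence[Sequence[float]]) -> Tuple[float, float, float, float]:
    """Collapse the 4*2 text_box_position into (x_min, y_min, x_max, y_max)."""
    xs = [point[0] for point in box]
    ys = [point[1] for point in box]
    return min(xs), min(ys), max(xs), max(ys)
```

The axis-aligned rectangle is a simplification: a text box may be tilted, in which case the four original vertices carry more information.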

Conclusion

Of course, this is only a first attempt at getting started, and the pretrained models used here are the officially provided ones. After further study, you can use PaddlePaddle to build your own training, prediction and recognition models.

Reference links

PaddleHub one-click OCR recognition
PaddlePaddle official website
