Xiaoxin Discovers Deep Learning: An OCR Service
Introduction
I stumbled upon a deep learning framework, PaddlePaddle, which Baidu has open-sourced. Compared with TensorFlow and PyTorch, its main appeal is the whole ecosystem: it ships with many ready-made tools, which makes it easier to understand and use. This time we simply use PaddleHub to publish a text-recognition (OCR) module, so that we can build an OCR service with a minimum of code and feel the charm of Paddle. In follow-up posts I will gradually explore Paddle and write up a set of hands-on notes.
Versions and Downloads
PaddlePaddle 2.0.0rc1 (download address)
Python 3.7
Set up the environment
I am on a Mac, so I use the Mac environment. PaddlePaddle also provides Docker images, which is very convenient.
Here are two package sources: the Baidu mirror, mirror.baidu.com/pypi/simple, and the Tsinghua mirror, pypi.tuna.tsinghua.edu.cn/simple
Run the following commands one at a time. I won't show the output here because I had already installed everything once; there were no big problems and it went smoothly.
$ pip install paddlepaddle==2.0.0rc -i https://mirror.baidu.com/pypi/simple
$ pip install paddlehub==2.0.0rc0 -i https://mirror.baidu.com/pypi/simple
$ pip install shapely -i https://pypi.tuna.tsinghua.edu.cn/simple
$ pip install pyclipper -i https://pypi.tuna.tsinghua.edu.cn/simple
This module depends on the third-party libraries Shapely and PyClipper, so install both before using it.
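Before moving on, it can help to confirm that everything installed above is actually importable. The following is a minimal sketch using only the standard library; the helper name missing_modules and the REQUIRED list are my own, and I am assuming the usual import names (paddle, paddlehub, shapely, pyclipper) for the four packages.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for the packages installed with pip above.
REQUIRED = ["paddle", "paddlehub", "shapely", "pyclipper"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If anything is reported as missing, rerun the corresponding pip command before trying the OCR module.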
Two pretrained models are available for loading: chinese_ocr_db_crnn_server and chinese_ocr_db_crnn_mobile. We want something with high accuracy. Let's test this picture first. Download it from https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/c555f23d746a4fdc83118b3d44a004d5~tplv-k3u1fbpfcp-zoom-1.image and name it test_ocr.jpg
$ hub run chinese_ocr_db_crnn_mobile --input_path test_ocr.jpg --visualization=True --output_dir='ocr_result'
To explain the command above: it runs locally and recognizes the text in one image; it is not yet a service. The command and the image live in the same folder, one we created ourselves, so the command is executed in that folder and the picture is read from it. The parameters are:
input_path: path of the input image
output_dir: output directory, created under the current directory
visualization=True: save the recognition result as an image
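If you would rather drive the same command from a script, here is a minimal sketch that only wraps the command line shown above with subprocess. The function names hub_run_args and run_ocr are hypothetical helpers of mine, not part of PaddleHub, and run_ocr assumes the hub command is on your PATH.

```python
import subprocess


def hub_run_args(module, image_path, output_dir="ocr_result", visualization=True):
    """Build the argument list for the `hub run` command used in this post."""
    return [
        "hub", "run", module,
        "--input_path", image_path,
        f"--visualization={visualization}",
        f"--output_dir={output_dir}",
    ]


def run_ocr(module, image_path, **kwargs):
    """Run the command and return its standard output (needs PaddleHub installed)."""
    result = subprocess.run(
        hub_run_args(module, image_path, **kwargs),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, run_ocr("chinese_ocr_db_crnn_mobile", "test_ocr.jpg") would execute the same command as above from the current folder.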
Screenshot of the run.
Is the command-line output not intuitive enough? Then look at the output image saved in the output directory.
Personally, I think the accuracy is quite high. Let's try an ID card next. Picture address: http://ww1.sinaimg.cn/large/8a53ebb9ly1gn147b5eq4j20n20fe76n.jpg (saved here as test_ocr1.jpg)
$ hub run chinese_ocr_db_crnn_mobile --input_path test_ocr1.jpg --visualization=True --output_dir='ocr_result'
The result is written to ocr_result as before.
The next step is to start a service and expose an interface.
$ hub serving start -m chinese_ocr_db_crnn_server
-m selects which module to load, i.e. the pretrained model. -p sets the port; the default is 8866.
Once it starts, a prompt is printed above; it is worth paying attention to.
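To check from a script that the service is really listening, a plain TCP connection attempt is enough. This is a minimal sketch with the standard library; is_port_open is my own helper name, and the defaults simply reuse the 127.0.0.1 address and the default port 8866 mentioned above.

```python
import socket


def is_port_open(host="127.0.0.1", port=8866, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat all of these as "not open".
        return False
```

This only tells you that something accepts connections on the port; it does not prove that the OCR module loaded correctly.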
To test the OCR interface, send a POST request to http://127.0.0.1:8866/predict/chinese_ocr_db_crnn_server with the header "Content-Type": "application/json".
The picture used this time is http://ww1.sinaimg.cn/large/8a53ebb9ly1gn14laee68j20iw0ck40o.jpg
In the request body, images holds the images as base64 strings. The base64 text is too large to paste here; you can convert a picture on a site such as https://tool.lu/base64image/. Remember that the leading marker data:image/jpeg;base64, must not be included. Strip it and keep only the base64 data, otherwise you will get an object error.
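Instead of a website, the conversion can be done locally. The sketch below uses only the standard library; the two function names are mine. One reads a file and returns bare base64, the other removes a data: prefix from a string copied out of a converter.

```python
import base64


def strip_data_uri_prefix(value):
    """Drop a leading 'data:<mime>;base64,' prefix if present; return the raw base64 text."""
    if value.startswith("data:") and "," in value:
        return value.split(",", 1)[1]
    return value


def image_file_to_base64(path):
    """Read an image file and return its contents as a base64 string without any prefix."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")
```

Either way, what goes into images is the bare base64 string.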
Request body:
{
  "images": [""]
}
Response:
{
  "msg": "",
  "results": [
    {
      "data": [
        {
          "confidence": "", // confidence of the recognized text
          "text": "", // the recognized text
          "text_box_position": [] // pixel coordinates of the text box in the original image, a 4*2 matrix giving the lower-left, lower-right, upper-right and upper-left vertices in turn
        }
      ],
      "save_path": "" // '' if no image is saved
    }
  ],
  "status": "000"
}
If there is no recognition result, data is [].
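Putting the request and the response together, a small client might look like the sketch below. It assumes the service is running at the address given earlier and that the response has the shape shown above. The function names are my own, and only the standard library is used.

```python
import json
import urllib.request

# URL taken from the text above; change host or port if your service differs.
OCR_URL = "http://127.0.0.1:8866/predict/chinese_ocr_db_crnn_server"


def build_ocr_request(images_base64, url=OCR_URL):
    """Build a POST request whose JSON body is {"images": [...]}."""
    body = json.dumps({"images": images_base64}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def extract_texts(response):
    """Return (text, confidence) pairs from a decoded response dictionary."""
    pairs = []
    for result in response.get("results", []):
        for item in result.get("data", []):
            pairs.append((item["text"], item["confidence"]))
    return pairs


def recognize(images_base64, url=OCR_URL):
    """Send the images to a running service and return the extracted pairs."""
    with urllib.request.urlopen(build_ocr_request(images_base64, url)) as resp:
        return extract_texts(json.load(resp))
```

Calling recognize with a list containing one bare base64 string sends the same request as described above; an empty list comes back when data is [].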
Conclusion
Of course, this is only a first attempt at getting started. The pretrained models used here are officially provided. After studying further, you can use PaddlePaddle to build your own training, prediction, and recognition models.
Reference links
PaddleHub one-click OCR recognition
PaddlePaddle official website