Python OCR in under 100 lines of code: ID cards, text, and all kinds of fonts
In this post I walk through a simple OCR demo built with Python that recognizes handwriting, printed text, ID cards, and more. The code is attached and free to download!
How big is OCR in my mind? Put it this way: in my heart it is about as big as Jack Ma. However big that is, it is time to keep my earlier promise and finish several more articles within the month.
Follow my WeChat official account to get new posts pushed to you first; reply via the menu for more goodies and surprises.
Recently I was involved in checking some certificates and paper documents, and I wanted to photograph the paper documents and cross-check them against the text. That reminded me of an earlier project in which I called the Youdao Wisdom Cloud API to do document translation. Looking at its OCR APIs, Youdao provides a variety of recognition interfaces: handwriting, printed text, tables, whole-question recognition, shopping receipts, ID cards, business cards, and so on. So this time I again used the Youdao Wisdom Cloud API to build a small demo and tried these functions out, partly as practice and partly as preparation for features I may need in the future.
(1) Handwriting recognition results
(2) Printed text recognition results
(3) Business card recognition results
I used a business card template found online; the recognition looks reasonably accurate.
(4) ID card recognition results (also using a template)
(5) Table recognition results:
(The returned JSON is super long, >_< emmm…)
(6) Whole-question recognition results:
(Formula recognition also works; the result JSON is long and not very readable, so I won't paste it here.)
First, you need to create an instance and an application on your Youdao Wisdom Cloud personal page, bind the application to the instance, and obtain the application's ID and key. The registration and application-creation steps are described in detail in the first article mentioned above.
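The signing code later in the demo refers to APP_KEY and APP_SECRET; here is a minimal sketch of how these credentials might be kept (the constant names match the signing code below, but keeping them as module-level constants in ocrtools.py is my assumption):

# ocrtools.py (assumed location)
# credentials from the Youdao Wisdom Cloud console, obtained after
# binding the application to the instance
APP_KEY = 'your-app-id'         # application ID
APP_SECRET = 'your-app-secret'  # application key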
The following describes the specific code development process:
This demo is developed with Python 3 and consists of three files: maindow.py, ocrprocesser.py, and ocrtools.py.
For the interface, to keep development simple, the Python tkinter library is used to let the user select the file to recognize and the recognition type, and to display the recognition results. ocrprocesser.py calls the appropriate API based on the selected type, completes the recognition, and returns the result; ocrtools.py wraps the various Youdao OCR APIs and dispatches the calls by category.
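The numeric recognition type used by the dispatcher below comes from the combobox selection. A sketch of that mapping, assuming indices 0-5 in the order used by get_ocr_result (the original name img_type_dict suggests a dict; the exact structure and labels here are my assumption):

# recognition types offered in the combobox; the selected index is used
# as img_type (0-5), matching the branches of get_ocr_result below
img_type_dict = ('handwriting', 'print', 'idcard', 'namecard', 'table', 'problem')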
(1) Developing the interface
Part of the interface code is as follows, using Tkinter grid to arrange elements.
root = tk.Tk()
root.title("netease youdao ocr test")
frm = tk.Frame(root)
frm.grid(padx='50', pady='50')
btn_get_file = tk.Button(frm, text='select image', command=get_files)
btn_get_file.grid(row=0, column=0, padx='10', pady='20')
text1 = tk.Text(frm, width='40', height='5')
text1.grid(row=0, column=1)
combox = ttk.Combobox(frm, textvariable=tk.StringVar(), width=38)
combox["value"] = img_type_dict
combox.current(0)
combox.bind("<<ComboboxSelected>>", get_img_type)
combox.grid(row=1, column=1)
label = tk.Label(frm, text=" ")
label.grid(row=2, column=0)
text_result = tk.Text(frm, width='40', height='10')
text_result.grid(row=2, column=1)
btn_sure = tk.Button(frm, text="start", command=ocr_files)
btn_sure.grid(row=3, column=1)
btn_clean = tk.Button(frm, text="clean", command=clean_text)
btn_clean.grid(row=3, column=2)
root.mainloop()
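The callbacks wired up above (get_files, get_img_type, clean_text) are not shown in the post; a minimal sketch of what they might look like, assuming ocr_model is the ocrprocesser instance:

from tkinter import filedialog

def get_files():
    # let the user pick one or more images and remember their paths
    paths = filedialog.askopenfilenames(title='select image')
    ocr_model.img_paths = list(paths)
    text1.insert(tk.END, '\n'.join(paths))

def get_img_type(event):
    # the combobox index doubles as the recognition type (0-5)
    ocr_model.img_type = combox.current()

def clean_text():
    # clear both the file list and the result box
    text1.delete('1.0', tk.END)
    text_result.delete('1.0', tk.END)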
The ocr_files method
The ocr_files() event handler bound to btn_sure passes the file paths and the recognition type on to ocrprocesser.py:
def ocr_files():
    if ocr_model.img_paths:
        ocr_result = ocr_model.ocr_files()
        text_result.insert(tk.END, ocr_result)
    else:
        # requires tkinter.messagebox to be imported
        tk.messagebox.showinfo("prompt", "no file")
The main method in ocrprocesser.py is ocr_files(), which base64-encodes the image and calls the wrapped API.
def ocr_files(self):
    for img_path in self.img_paths:
        img_file_name = os.path.basename(img_path).split('.')[0]
        f = open(img_path, 'rb')
        img_code = base64.b64encode(f.read()).decode('utf-8')
        f.close()
        print(img_code)
        ocr_result = self.ocr_by_netease(img_code, self.img_type)
        print(ocr_result)
    return ocr_result
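ocr_by_netease() itself is not listed in the post; presumably it is just a thin wrapper that forwards to the dispatcher in ocrtools.py. A sketch under that assumption (the class name OcrProcesser and the attribute defaults are hypothetical):

from ocrtools import get_ocr_result

class OcrProcesser:
    def __init__(self):
        self.img_paths = []
        self.img_type = 0

    def ocr_by_netease(self, img_code, img_type):
        # delegate to the per-type dispatcher in ocrtools.py
        return get_ocr_result(img_code, img_type)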
(2) get_ocr_result
After reading through and organizing the API documentation, the OCR interfaces can be roughly divided into four entry points: handwriting/print recognition, ID card/business card recognition, table recognition, and whole-question recognition. Each interface has its own URL, and the request parameters are not all the same, so the demo first branches on the recognition type:
def get_ocr_result(img_code, img_type):
    if img_type == 0 or img_type == 1:
        return ocr_common(img_code)
    elif img_type == 2 or img_type == 3:
        return ocr_card(img_code, img_type)
    elif img_type == 4:
        return ocr_table(img_code)
    elif img_type == 5:
        return ocr_problem(img_code)
    else:
        return "error:undefined type!"
(3) Developing the ordinary text recognition function
The data fields are then filled in according to the parameters each interface requires; the return values of the different interfaces get some simple parsing before being returned:
def ocr_common(img_code):
    YOUDAO_URL = 'https://openapi.youdao.com/ocrapi'
    data = {}
    data['detectType'] = '10012'
    data['imageType'] = '1'
    data['langType'] = 'auto'
    data['img'] = img_code
    data['docType'] = 'json'
    data = get_sign_and_salt(data, img_code)
    response = do_request(YOUDAO_URL, data)['regions']
    result = []
    for r in response:
        for line in r['lines']:
            result.append(line['text'])
    return result
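do_request() is not shown either; it is most likely a small helper that POSTs the signed form data and decodes the JSON response. A sketch, assuming the form-encoded POST style used by Youdao's open APIs:

import requests

def do_request(url, data):
    # send the signed form-encoded payload and return the parsed JSON result
    headers = {'Content-Type': 'application/x-www-form-urlencoded'}
    response = requests.post(url, data=data, headers=headers)
    return response.json()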
(4) Developing the card recognition function (ID card / business card)
def ocr_card(img_code, img_type):
    YOUDAO_URL = 'https://openapi.youdao.com/ocr_structure'
    data = {}
    if img_type == 2:
        data['structureType'] = 'idcard'
    elif img_type == 3:
        data['structureType'] = 'namecard'
    data['q'] = img_code
    data['docType'] = 'json'
    data = get_sign_and_salt(data, img_code)
    return do_request(YOUDAO_URL, data)
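Unlike ocr_common(), ocr_card() and ocr_table() return the raw JSON object. To display it more readably in the tkinter Text widget, the demo could pretty-print it with a small helper such as this hypothetical format_result():

import json

def format_result(ocr_result):
    # turn a structured (dict/list) result into an indented string before
    # inserting it into the result Text widget; ensure_ascii=False keeps
    # Chinese characters readable
    if isinstance(ocr_result, (dict, list)):
        return json.dumps(ocr_result, ensure_ascii=False, indent=2)
    return str(ocr_result)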
(5) Developing the table recognition function
def ocr_table(img_code):
    YOUDAO_URL = 'https://openapi.youdao.com/ocr_table'
    data = {}
    data['type'] = '1'
    data['q'] = img_code
    data['docType'] = 'json'
    data = get_sign_and_salt(data, img_code)
    return do_request(YOUDAO_URL, data)
(6) Developing the whole-question recognition function
def ocr_problem(img_code):
    YOUDAO_URL = 'https://openapi.youdao.com/ocr_formula'
    data = {}
    data['detectType'] = '10011'
    data['imageType'] = '1'
    data['img'] = img_code
    data['docType'] = 'json'
    data = get_sign_and_salt(data, img_code)
    response = do_request(YOUDAO_URL, data)['regions']
    result = []
    for r in response:
        for line in r['lines']:
            for l in line:
                result.append(l['text'])
    return result
(7) Signing the request: get_sign_and_salt
get_sign_and_salt() adds the required signature and related fields to data:
def get_sign_and_salt(data, img_code):
    data['signType'] = 'v3'
    curtime = str(int(time.time()))
    data['curtime'] = curtime
    salt = str(uuid.uuid1())
    signStr = APP_KEY + truncate(img_code) + salt + curtime + APP_SECRET
    sign = encrypt(signStr)
    data['appKey'] = APP_KEY
    data['salt'] = salt
    data['sign'] = sign
    return data
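truncate() and encrypt() are the two helpers required by Youdao's v3 signing scheme: long inputs are shortened before signing, and the assembled sign string is hashed with SHA-256. A sketch based on that convention (these helpers are not shown in the post):

import hashlib

def truncate(q):
    # v3 input truncation: keep short strings as-is, otherwise use
    # first 10 chars + total length + last 10 chars
    if q is None:
        return None
    size = len(q)
    return q if size <= 20 else q[:10] + str(size) + q[size - 10:]

def encrypt(sign_str):
    # SHA-256 hex digest of the assembled sign string
    return hashlib.sha256(sign_str.encode('utf-8')).hexdigest()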
Overall, the service is quite capable and supports many types of documents. The one thing missing is automatic classification: each type of image has to be sent to its own interface, and the interfaces cannot be mixed at all. For example, during development I submitted a business card image to the ID card API and it returned "Items not found!". This makes the API a bit more troublesome for developers to call, though it presumably also improves recognition accuracy to some degree, and I suspect it makes billing per interface easier :P.
Project address: github.com/LemonQH/Wor…