AIWIN Handwritten OCR Recognition Contest based on PaddleOCR

1. Background

A bank's daily business involves recognizing and entering many kinds of vouchers, such as ID cards, checks, and statements. In the past this was done mainly by hand, which is inefficient and labor-intensive. In recent years OCR has been gradually replacing manual entry because it runs automatically and needs little human intervention. In practice, however, OCR still has problems. When recognizing the various fields of a document, handwriting differs widely between writers, field positions are not fixed, the text carries little semantic context, and the voucher background interferes with the characters. As a result, recognition accuracy is low and a lot of manual correction is needed, which affects the bank's daily data entry.

2. Task

This competition provides a data set of handwritten image slices. The data comes from real business scenarios and has been desensitized by slicing. Participating teams apply recognition techniques to obtain the corresponding recognition results. That is:

Input: handwritten image slice data set

Output: the corresponding recognition results

The competition consists of two independent tasks, each with its own training set, test set, and modeling environment, summarized as follows:

  • Task 1: The training set and test set are open for download. Teams may model offline, or online in the Notebook environment or the Terminal container environment (offline), and output the recognition results to complete the task.
  • Task 2: The training set cannot be downloaded. Teams must model online in the Terminal container environment (offline) and submit the model. The system then feeds the test set, which is invisible to the teams, to the model and outputs the recognition results to complete the task.

3. Data overview

  • Task 1
      Training set (including the validation set; split it yourself): 8,000 images with two kinds of information, year and amount
      Test set: 2,000 images
  • Task 2
      Training set (including the validation set; split it yourself): 30,000 images with five kinds of information: bank name, year, month, day, and amount
      Test set, split into A and B lists: A list 5,000 images; B list 5,000 images

The original handwritten images fall into three categories: bank name; year, month, and day; and amount, as shown below:

The image slices may contain a certain amount of interfering information. Examples are as follows:

II. Environment setup

PaddleOCR is a powerful, fast OCR toolkit that works out of the box.

# Download the PaddleOCR code from Gitee or from GitHub
!git clone --depth=1
# Upgrade pip
!pip install -U pip
# Install dependencies
%cd ~/PaddleOCR
%pip install -r requirements.txt
/home/aistudio/PaddleOCR
Looking in indexes:
Collecting shapely
  Downloading shapely-1.8.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.1 MB)
[... package installation output truncated ...]
Successfully installed PyWavelets-1.2.0 cssselect-1.1.0 cssutils-2.3.0 imgaug-0.4.0 lmdb-1.3.0 lxml-4.7.1 opencv-contrib-python- premailer-3.10.0 pyclipper-1.3.0.post2 python-levenshtein-0.12.2 scikit-image-0.19.1 shapely-1.8.0 tifffile-2021.11.2

III. Data preparation

The main tasks are:

  • Data decompression
  • Det data set formatting and data set partitioning
  • Rec data set formatting and data set partitioning
# Decompress the data (the archive contains the training set and the test set)
%cd ~
!unzip -qoa 2021A_T1_Task1_数据集含训练集和测试集.zip
# View the data set
from PIL import Image
Image.open("训练集/amount/images/8bb39447774eb21a01777a9efa890543.jpg")

1. Amount data processing

%cd ~
# View the labels
!head 训练集/amount/gt.json
{"8bb39426774ee53f017770203bab0bc5.jpg": "伍佰贰拾元整", "8bb39447760a31c801762283f9dd63cb.jpg": "贰仟零壹元整", "8bb1943d774eb211017784b7af783c23.jpg": "叁佰元整", "8bb194277657bb0501768d5379a4262b.jpg": "伍佰叁拾元叁角捌分", "8bb3942b7657bb83017674d349786868.jpg": "伍佰元整", "8bb1943d760a31b70176275a31832557.jpg": "壹万贰千贰佰元整", "8bb19437760a2b5f017641f9743b41b4.jpg": "叁万叁千壹佰伍拾元捌角陆分", "8bb1941c7657bb01017674b446cc2a2e.jpg": "贰仟零壹拾元陆角整", "8bb39441760a31b601764a13149e3008.jpg": "玖仟伍佰零伍元整"
import glob, codecs, json, os
import numpy as np

amount_jpgs = glob.glob('./训练集/amount/images/*.jpg')
lines = codecs.open('./训练集/amount/gt.json', encoding='utf-8').readlines()
lines = ''.join(lines)
# The file ends with a trailing comma before the closing brace; remove it so that json can parse it
amount_gt = json.loads(lines.replace(',\n}', '}'))
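The `replace` call above works around a trailing comma before the closing brace, which a strict JSON parser rejects. A slightly more tolerant variant is sketched below. It assumes the trailing comma is the only defect in the file and that no label text itself contains a comma followed by `}`; the function name `load_gt_text` is hypothetical.

```python
import json
import re


def load_gt_text(text):
    """Parse a gt.json-style string, tolerating a trailing comma before '}'."""
    # Drop a comma that is followed only by whitespace and a closing brace.
    cleaned = re.sub(r",\s*}", "}", text)
    return json.loads(cleaned)


sample = '{"a.jpg": "伍佰元整",\n"b.jpg": "叁佰元整",\n}'
print(load_gt_text(sample))
```

Unlike the exact-string replacement, the regular expression also copes with extra spaces or a Windows line ending between the comma and the brace.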
%cd ~/
# Divide train and eval
# Write to the list files: every 10th sample goes to validation, the rest to training
f_train = open("./训练集/amount/train_list.txt", 'w', encoding='utf-8')
f_val = open("./训练集/amount/val_list.txt", 'w', encoding='utf-8')

for i, key in enumerate(amount_gt):
    if i % 10 == 0:
        f_val.write(key + '\t' + amount_gt[key] + '\n')
    else:
        f_train.write(key + '\t' + amount_gt[key] + '\n')

f_train.close()
f_val.close()
!head ./训练集/amount/train_list.txt
8bb39447760a31c801762283f9dd63cb.jpg	贰仟零壹元整
8bb1943d774eb211017784b7af783c23.jpg	叁佰元整
8bb194277657bb0501768d5379a4262b.jpg	伍佰叁拾元叁角捌分
8bb3942b7657bb83017674d349786868.jpg	伍佰元整
8bb1943d760a31b70176275a31832557.jpg	壹万贰千贰佰元整
8bb19437760a2b5f017641f9743b41b4.jpg	叁万叁千壹佰伍拾元捌角陆分
8bb1941c7657bb01017674b446cc2a2e.jpg	贰仟零壹拾元陆角整
8bb39441760a31b601764a13149e3008.jpg	玖仟伍佰零伍元整
8bb3943c774eb20601775cf697f0456b.jpg	壹万叁千贰佰叁拾元整
8bb194207657bb1201765fd7645934b5.jpg	叁仟零叁拾贰角整
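The split rule used for the list files (every 10th sample goes to validation, everything else to training, one "file name, tab, text" line per sample) can be written as a small function that is easy to check on toy data. This is a minimal sketch; the name `split_every_nth` and the toy labels are made up for illustration.

```python
def split_every_nth(labels, n=10):
    """Send every n-th sample (index 0, n, 2n, ...) to validation, the rest to training."""
    train, val = [], []
    for i, (name, text) in enumerate(labels.items()):
        (val if i % n == 0 else train).append(f"{name}\t{text}")
    return train, val


labels = {f"img_{i}.jpg": "伍佰元整" for i in range(25)}
train_lines, val_lines = split_every_nth(labels)
print(len(train_lines), len(val_lines))  # 22 3
```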
# Build the character dictionary from the amount labels
s = ''
for x in amount_gt:
    s += amount_gt[x]
char_list = list(set(list(s)))

with open('./训练集/amount/vocabulary.txt', 'w', encoding='utf-8') as up:
    for x in char_list:
        up.write(x + '\n')
!cat ./训练集/amount/vocabulary.txt
玖 佰 零 壹 贰 叁 肆 伍 陆 柒 捌 仟 万 拾 元 角 分 整
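One caveat with building the dictionary from a `set`: a Python set has no guaranteed order, so regenerating the file can put the characters in a different order, and a recognizer trained against one order of the dictionary should be used with that same file. The sketch below writes the characters in sorted order so the result is reproducible; `build_vocabulary` is a hypothetical helper, not part of PaddleOCR.

```python
def build_vocabulary(labels):
    """Return the distinct characters used in the label texts, in sorted order."""
    chars = set()
    for text in labels.values():
        chars.update(text)
    return sorted(chars)


vocab = build_vocabulary({"a.jpg": "伍佰元整", "b.jpg": "叁佰元整"})
print(vocab)
```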

2. Date data processing

%cd ~
date_jpgs = glob.glob('./训练集/date/images/*.jpg')

lines = codecs.open('./训练集/date/gt.json', encoding='utf-8').readlines()
lines = ''.join(lines)
date_gt = json.loads(lines.replace(',\n}', '}'))
# Divide train and eval
# Write to the list files: every 10th sample goes to validation, the rest to training
f_train = open("./训练集/date/train_list.txt", 'w', encoding='utf-8')
f_val = open("./训练集/date/val_list.txt", 'w', encoding='utf-8')

for i, key in enumerate(date_gt):
    if i % 10 == 0:
        f_val.write(key + '\t' + date_gt[key] + '\n')
    else:
        f_train.write(key + '\t' + date_gt[key] + '\n')

f_train.close()
f_val.close()
# Build the character dictionary from the date labels
s = ''
for x in date_gt:
    s += date_gt[x]
char_list = list(set(list(s)))

with open('./训练集/date/vocabulary.txt', 'w', encoding='utf-8') as up:
    for x in char_list:
        up.write(x + '\n')
!cat ./训练集/date/vocabulary.txt
玖 零 贰 壹 叁 柒 陆 捌 肆 伍

IV. Amount training and evaluation

Use PaddleOCR/configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml as the reference configuration.

1. Amount model configuration

Global:
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_chinese_common_v2.0
  save_epoch_step: 3
  # evaluation is run every 2000 iterations, starting from iteration 0
  eval_batch_step: [0, 2000]
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/imgs_words/ch/word_1.jpg
  # for data or label process
  character_dict_path: /home/aistudio/训练集/amount/vocabulary.txt
  max_text_length: 25
  infer_mode: False
  use_space_char: True
  save_res_path: ./output/rec/predicts_chinese_common_v2.0.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: 'L2'
    factor: 0.00004

Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: ResNet
    layers: 34
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 256
  Head:
    name: CTCHead
    fc_decay: 0.00004

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: SimpleDataSet
    data_dir: /home/aistudio/训练集/amount/images
    label_file_list: ["/home/aistudio/训练集/amount/train_list.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - RecAug:
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 320]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: True
    batch_size_per_card: 256
    drop_last: True
    num_workers: 8

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/aistudio/训练集/amount/images
    label_file_list: ["/home/aistudio/训练集/amount/val_list.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 320]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 256
    num_workers: 8
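The configuration ties two settings to the data: the character dictionary (`character_dict_path`) and `max_text_length: 25`. A label that is longer than that limit, or that uses a character missing from the dictionary, does not fit these settings, so it can be worth checking the label files before training. The following is a minimal sketch under that assumption; `check_labels` and the toy data are hypothetical and not part of PaddleOCR.

```python
def check_labels(labels, vocabulary, max_text_length=25):
    """Return (name, reason) pairs for labels that do not fit the settings."""
    allowed = set(vocabulary)
    problems = []
    for name, text in labels.items():
        if len(text) > max_text_length:
            problems.append((name, "too long"))
        unknown = sorted(set(text) - allowed)
        if unknown:
            problems.append((name, "unknown characters: " + "".join(unknown)))
    return problems


vocab = list("零壹贰叁肆伍陆柒捌玖拾佰仟万元角分整")
labels = {"ok.jpg": "伍佰元整", "bad.jpg": "伍百元整"}
print(check_labels(labels, vocab))
```

An empty result means every label can be expressed with the dictionary and fits within the length limit.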

2. Download the pretrained model

%cd ~/PaddleOCR/
# server model
!wget 0/ch/ch_ppocr_server_v2.0_rec_pre.tar
!tar -xf ch_ppocr_server_v2.0_rec_pre.tar
/home/aistudio/PaddleOCR
--2022-01-09 19:54:27--
Resolving (,
2409:8c04:1001:1002:0:ff:b001:368a
Connecting to (||:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 490184704 (467M) [application/x-tar]
Saving to: 'ch_ppocr_server_v2.0_rec_pre.tar'

ch_ppocr_server_v2. 100%[===================>] 467.48M  48.1MB/s    in 13s

2022-01-09 19:54:40 (35.6 MB/s) - 'ch_ppocr_server_v2.0_rec_pre.tar' saved [490184704/490184704]

3. Amount model training

# Overwrite the configuration file
!cp -f ../rec_chinese_common_train_v2.0.yml ./configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml
# server model
%cd ~/PaddleOCR/
!python tools/train.py -c ./configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml

Training log

[2022/01/09 20:21:27] root INFO: epoch: [35/500], iter: 480, lr: 0.000992, loss: 0.322123, acc: 0.972652, norm_edit_dis: 0.994506, reader_cost: 0.57496s, batch_cost: 0.89715s, samples: 1280, ips: 142.67360
[2022/01/09 20:21:43] root INFO: epoch: [35/500], iter: 489, lr: 0.000992, loss: 0.256427, acc: 0.974606, norm_edit_dis: 0.995029, reader_cost: 0.00015s, batch_cost: 0.55269s, samples: 2304, ips: 416.86967
[2022/01/09 20:21:47] root INFO: save model in ./output/rec_chinese_common_v2.0/latest
[2022/01/09 20:21:55] root INFO: epoch: [36/500], iter: 490, lr: 0.000992, loss: 0.249966, acc: 0.974606, norm_edit_dis: 0.995029, reader_cost: 0.00045s, batch_cost: 0.61365s, samples: 256, ips: 36.59348
[2022/01/09 20:22:13] root INFO: epoch: [36/500], iter: 500, lr: 0.000991, loss: 0.246262, acc: 0.968746, norm_edit_dis: 0.995142, reader_cost: 0.00045s, batch_cost: 0.61365s, samples: 2560, ips: 417.17631
[2022/01/09 20:22:15] root INFO: cur metric, acc: 0.9699975750060625, norm_edit_dis: 0.9909632884925447, fps: 456.3268190347603
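To follow training progress beyond reading the raw log, the per-step lines can be parsed into numbers and then plotted or compared between runs. The sketch below assumes the line layout shown in the log above; `parse_train_line` is a hypothetical helper.

```python
import re

LOG_PATTERN = re.compile(
    r"epoch: \[(?P<epoch>\d+)/\d+\], iter: (?P<iter>\d+), "
    r"lr: (?P<lr>[\d.]+), loss: (?P<loss>[\d.]+), acc: (?P<acc>[\d.]+)"
)


def parse_train_line(line):
    """Extract epoch, iter, lr, loss and acc from one training log line, or None."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "epoch": int(fields["epoch"]),
        "iter": int(fields["iter"]),
        "lr": float(fields["lr"]),
        "loss": float(fields["loss"]),
        "acc": float(fields["acc"]),
    }


line = ("[2022/01/09 20:22:13] root INFO: epoch: [36/500], iter: 500, lr: 0.000991, "
        "loss: 0.246262, acc: 0.968746, norm_edit_dis: 0.995142")
print(parse_train_line(line))
```

Lines that do not describe a training step, such as the "save model" messages, return `None` and can simply be skipped.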

4. Amount model evaluation

# server model
!python -m paddle.distributed.launch tools/eval.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml \
    -o Global.checkpoints=./output/rec_chinese_common_v2.0/best_accuracy.pdparams
[2022/01/09 20:25:46] root INFO: Architecture : 
[2022/01/09 20:25:46] root INFO:     Backbone : 
[2022/01/09 20:25:46] root INFO:         layers : 34
[2022/01/09 20:25:46] root INFO:         name : ResNet
[2022/01/09 20:25:46] root INFO:     Head : 
[2022/01/09 20:25:46] root INFO:         fc_decay : 4e-05
[2022/01/09 20:25:46] root INFO:         name : CTCHead
[2022/01/09 20:25:46] root INFO:     Neck : 
[2022/01/09 20:25:46] root INFO:         encoder_type : rnn
[2022/01/09 20:25:46] root INFO:         hidden_size : 256
[2022/01/09 20:25:46] root INFO:         name : SequenceEncoder
[2022/01/09 20:25:46] root INFO:     Transform : None
[2022/01/09 20:25:46] root INFO:     algorithm : CRNN
[2022/01/09 20:25:46] root INFO:     model_type : rec
[2022/01/09 20:25:46] root INFO: Eval : 
[2022/01/09 20:25:46] root INFO:     dataset : 
[2022/01/09 20:25:46] root INFO:         data_dir : /home/aistudio/训练集/amount/images
[2022/01/09 20:25:46] root INFO:         label_file_list : ['/home/aistudio/训练集/amount/val_list.txt']
[2022/01/09 20:25:46] root INFO:         name : SimpleDataSet
[2022/01/09 20:25:46] root INFO:         transforms : 
[2022/01/09 20:25:46] root INFO:             DecodeImage : 
[2022/01/09 20:25:46] root INFO:                 channel_first : False
[2022/01/09 20:25:46] root INFO:                 img_mode : BGR
[2022/01/09 20:25:46] root INFO:             CTCLabelEncode : None
[2022/01/09 20:25:46] root INFO:             RecResizeImg : 
[2022/01/09 20:25:46] root INFO:                 image_shape : [3, 32, 320]
[2022/01/09 20:25:46] root INFO:             KeepKeys : 
[2022/01/09 20:25:46] root INFO:                 keep_keys : ['image', 'label', 'length']
[2022/01/09 20:25:46] root INFO:     loader : 
[2022/01/09 20:25:46] root INFO:         batch_size_per_card : 256
[2022/01/09 20:25:46] root INFO:         drop_last : False
[2022/01/09 20:25:46] root INFO:         num_workers : 8
[2022/01/09 20:25:46] root INFO:         shuffle : False
[2022/01/09 20:25:46] root INFO: Global : 
[2022/01/09 20:25:46] root INFO:     cal_metric_during_train : True
[2022/01/09 20:25:46] root INFO:     character_dict_path : /home/aistudio/训练集/amount/vocabulary.txt
[2022/01/09 20:25:46] root INFO:     checkpoints : ./output/rec_chinese_common_v2.0/best_accuracy.pdparams
[2022/01/09 20:25:46] root INFO:     debug : False
[2022/01/09 20:25:46] root INFO:     distributed : False
[2022/01/09 20:25:46] root INFO:     epoch_num : 500
[2022/01/09 20:25:46] root INFO:     eval_batch_step : [100, 100]
[2022/01/09 20:25:46] root INFO:     infer_img : doc/imgs_words/ch/word_1.jpg
[2022/01/09 20:25:46] root INFO:     infer_mode : False
[2022/01/09 20:25:46] root INFO:     log_smooth_window : 20
[2022/01/09 20:25:46] root INFO:     max_text_length : 25
[2022/01/09 20:25:46] root INFO:     pretrained_model : ./ch_ppocr_server_v2.0_rec_pre/best_accuracy
[2022/01/09 20:25:46] root INFO:     print_batch_step : 10
[2022/01/09 20:25:46] root INFO:     save_epoch_step : 3
[2022/01/09 20:25:46] root INFO:     save_inference_dir : None
[2022/01/09 20:25:46] root INFO:     save_model_dir : ./output/rec_chinese_common_v2.0
[2022/01/09 20:25:46] root INFO:     save_res_path : ./output/rec/predicts_chinese_common_v2.0.txt
[2022/01/09 20:25:46] root INFO:     use_gpu : True
[2022/01/09 20:25:46] root INFO:     use_space_char : True
[2022/01/09 20:25:46] root INFO:     use_visualdl : False
[2022/01/09 20:25:46] root INFO: Loss : 
[2022/01/09 20:25:46] root INFO:     name : CTCLoss
[2022/01/09 20:25:46] root INFO: Metric : 
[2022/01/09 20:25:46] root INFO:     main_indicator : acc
[2022/01/09 20:25:46] root INFO:     name : RecMetric
[2022/01/09 20:25:46] root INFO: Optimizer : 
[2022/01/09 20:25:46] root INFO:     beta1 : 0.9
[2022/01/09 20:25:46] root INFO:     beta2 : 0.999
[2022/01/09 20:25:46] root INFO:     lr : 
[2022/01/09 20:25:46] root INFO:         learning_rate : 0.001
[2022/01/09 20:25:46] root INFO:         name : Cosine
[2022/01/09 20:25:46] root INFO:         warmup_epoch : 5
[2022/01/09 20:25:46] root INFO:     name : Adam
[2022/01/09 20:25:46] root INFO:     regularizer : 
[2022/01/09 20:25:46] root INFO:         factor : 4e-05
[2022/01/09 20:25:46] root INFO:         name : L2
[2022/01/09 20:25:46] root INFO: PostProcess : 
[2022/01/09 20:25:46] root INFO:     name : CTCLabelDecode
[2022/01/09 20:25:46] root INFO: Train : 
[2022/01/09 20:25:46] root INFO:     dataset : 
[2022/01/09 20:25:46] root INFO:         data_dir : /home/aistudio/训练集/amount/images
[2022/01/09 20:25:46] root INFO:         label_file_list : ['/home/aistudio/训练集/amount/train_list.txt']
[2022/01/09 20:25:46] root INFO:         name : SimpleDataSet
[2022/01/09 20:25:46] root INFO:         transforms : 
[2022/01/09 20:25:46] root INFO:             DecodeImage : 
[2022/01/09 20:25:46] root INFO:                 channel_first : False
[2022/01/09 20:25:46] root INFO:                 img_mode : BGR
[2022/01/09 20:25:46] root INFO:             RecAug : None
[2022/01/09 20:25:46] root INFO:             CTCLabelEncode : None
[2022/01/09 20:25:46] root INFO:             RecResizeImg : 
[2022/01/09 20:25:46] root INFO:                 image_shape : [3, 32, 320]
[2022/01/09 20:25:46] root INFO:             KeepKeys : 
[2022/01/09 20:25:46] root INFO:                 keep_keys : ['image', 'label', 'length']
[2022/01/09 20:25:46] root INFO:     loader : 
[2022/01/09 20:25:46] root INFO:         batch_size_per_card : 256
[2022/01/09 20:25:46] root INFO:         drop_last : True
[2022/01/09 20:25:46] root INFO:         num_workers : 8
[2022/01/09 20:25:46] root INFO:         shuffle : True
[2022/01/09 20:25:46] root INFO: profiler_options : None
[2022/01/09 20:25:46] root INFO: train with paddle 2.0.2 and device CUDAPlace(0)
[2022/01/09 20:25:46] root INFO: Initialize indexs of datasets:['/home/aistudio/训练集/amount/val_list.txt']
W0109 20:25:46.365166  6137] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0109 20:25:46.370265  6137] device: 0, cuDNN Version: 7.6.
[2022/01/09 20:25:52] root INFO: resume from ./output/rec_chinese_common_v2.0/best_accuracy
[2022/01/09 20:25:52] root INFO: metric in ckpt ***************
[2022/01/09 20:25:52] root INFO: acc:0.9699975750060625
[2022/01/09 20:25:52] root INFO: norm_edit_dis:0.9909632884925447
[2022/01/09 20:25:52] root INFO: fps:456.3268190347603
[2022/01/09 20:25:52] root INFO: best_epoch:36
[2022/01/09 20:25:52] root INFO: start_epoch:37

eval model::   0%|          | 0/2 [00:00<?, ?it/s]
eval model::  50%|█████     | 1/2 [00:01<00:01,  1.72s/it]
eval model:: 100%|██████████| 2/2 [00:02<00:00,  1.29s/it]
eval model:: 100%|██████████| 2/2 [00:02<00:00,  1.11s/it]
[2022/01/09 20:25:54] root INFO: metric eval ***************
[2022/01/09 20:25:54] root INFO: acc:0.9699975750060625
[2022/01/09 20:25:54] root INFO: norm_edit_dis:0.9909632884925447
[2022/01/09 20:25:54] root INFO: fps:439.9205808964122
INFO 2022-01-09 20:25:56,825] Local processes completed.
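The evaluation reports two numbers: `acc`, the share of slices whose predicted text matches the label exactly, and `norm_edit_dis`, which gives partial credit based on edit distance. The sketch below shows one simplified way such metrics can be computed. It is an illustration under that reading of the metric names, not PaddleOCR's implementation, which may differ in details such as text normalization.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming, one row)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def rec_metrics(pairs):
    """Return (acc, norm_edit_dis) for a list of (prediction, target) pairs."""
    correct, dist_sum = 0, 0.0
    for pred, target in pairs:
        correct += pred == target
        dist_sum += edit_distance(pred, target) / max(len(pred), len(target), 1)
    n = max(len(pairs), 1)
    return correct / n, 1 - dist_sum / n


pairs = [("叁佰元整", "叁佰元整"), ("万叁千元整", "壹万叁千元整")]
acc, ned = rec_metrics(pairs)
print(acc, ned)
```

In the toy example the second prediction misses one character, so it counts as wrong for `acc` but only loses one sixth of a point in the normalized edit distance.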

V. Date training and evaluation

# Copy the date configuration file to the home directory
!cp ./configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0_date.yml ~/rec_chinese_common_train_v2.0_date.yml

1. Date model configuration

Global:
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_chinese_common_v2.0_date
  save_epoch_step: 3
  # evaluation is run every 100 iterations, starting after iteration 100
  eval_batch_step: [100, 100]
  cal_metric_during_train: True
  pretrained_model: ./ch_ppocr_server_v2.0_rec_pre/best_accuracy
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/imgs_words/ch/word_1.jpg
  # for data or label process
  character_dict_path: /home/aistudio/训练集/date/vocabulary.txt
  max_text_length: 25
  infer_mode: False
  use_space_char: True
  save_res_path: ./output/rec/predicts_chinese_common_v2.0.txt

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: 'L2'
    factor: 0.00004

Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: ResNet
    layers: 34
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 256
  Head:
    name: CTCHead
    fc_decay: 0.00004

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: SimpleDataSet
    data_dir: /home/aistudio/训练集/date/images
    label_file_list: ["/home/aistudio/训练集/date/train_list.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - RecAug:
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 320]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: True
    batch_size_per_card: 256
    drop_last: True
    num_workers: 8

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/aistudio/训练集/date/images
    label_file_list: ["/home/aistudio/训练集/date/val_list.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 320]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 256
    num_workers: 8
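The `image_shape: [3, 32, 320]` setting means every slice reaches the network as a 3-channel image of height 32 and width 320. One common way to get there, and only an assumption about what the resize step does in this configuration, is to scale the image to height 32 while keeping its aspect ratio, cap the width at 320, and pad whatever is left. The sketch below computes the resulting width; `resized_width` is a hypothetical helper.

```python
import math


def resized_width(width, height, target_h=32, max_w=320):
    """Width after scaling to target_h with the aspect ratio kept, capped at max_w."""
    scaled = math.ceil(target_h * width / height)
    return min(scaled, max_w)


for w, h in [(200, 64), (640, 64), (1500, 60)]:
    new_w = resized_width(w, h)
    print((w, h), "->", new_w, "padding:", 320 - new_w)
```

Under this scheme a slice wider than ten times its height is squeezed horizontally, which is worth keeping in mind for long amount strings.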

2. Date model training

# server model
%cd ~/PaddleOCR/
!python tools/train.py -c ./configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0_date.yml

Training log

[2022/01/09 21:15:41] root INFO: Initialize indexs of datasets:['/home/aistudio/训练集/date/train_list.txt']
[2022/01/09 21:15:56] root INFO: epoch: [69/500], iter: 410, lr: 0.000963, loss: 0.282951, acc: 0.978514, norm_edit_dis: 0.991211, reader_cost: 0.46780s, batch_cost: 0.81243s, samples: 1536, ips: 189.06251
[2022/01/09 21:16:06] root INFO: epoch: [69/500], iter: 413, lr: 0.000962, loss: 0.272608, acc: 0.979490, norm_edit_dis: 0.992269, reader_cost: 0.00150s, batch_cost: 0.34571s, samples: 1536, ips: 444.29798
[2022/01/09 21:16:10] root INFO: save model in ./output/rec_chinese_common_v2.0_date/latest
[2022/01/09 21:16:39] root INFO: epoch: [70/500], iter: 419, lr: 0.000961, loss: 0.259662, acc: 0.983397, norm_edit_dis: 0.993701, reader_cost: 0.51926s, batch_cost: 1.21329s, samples: 3072, ips: 253.19623
[2022/01/09 21:17:09] root INFO: epoch: [71/500], iter: 425, lr: 0.000960, loss: 0.241655, acc: 0.985350, norm_edit_dis: 0.995117, reader_cost: 0.00232s, batch_cost: 0.57741s, samples: 2560, ips: 443.35555

3. Date model evaluation

# server model
!python -m paddle.distributed.launch tools/eval.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0_date.yml \
    -o Global.checkpoints=./output/rec_chinese_common_v2.0_date/best_accuracy.pdparams
[2022/01/09 21:17:27] root INFO: Architecture : 
[2022/01/09 21:17:27] root INFO:     Backbone : 
[2022/01/09 21:17:27] root INFO:         layers : 34
[2022/01/09 21:17:27] root INFO:         name : ResNet
[2022/01/09 21:17:27] root INFO:     Head : 
[2022/01/09 21:17:27] root INFO:         fc_decay : 4e-05
[2022/01/09 21:17:27] root INFO:         name : CTCHead
[2022/01/09 21:17:27] root INFO:     Neck : 
[2022/01/09 21:17:27] root INFO:         encoder_type : rnn
[2022/01/09 21:17:27] root INFO:         hidden_size : 256
[2022/01/09 21:17:27] root INFO:         name : SequenceEncoder
[2022/01/09 21:17:27] root INFO:     Transform : None
[2022/01/09 21:17:27] root INFO:     algorithm : CRNN
[2022/01/09 21:17:27] root INFO:     model_type : rec
[2022/01/09 21:17:27] root INFO: Eval : 
[2022/01/09 21:17:27] root INFO:     dataset : 
[2022/01/09 21:17:27] root INFO:         data_dir : /home/aistudio/训练集/date/images
[2022/01/09 21:17:27] root INFO:         label_file_list : ['/home/aistudio/训练集/date/val_list.txt']
[2022/01/09 21:17:27] root INFO:         name : SimpleDataSet
[2022/01/09 21:17:27] root INFO:         transforms : 
[2022/01/09 21:17:27] root INFO:             DecodeImage : 
[2022/01/09 21:17:27] root INFO:                 channel_first : False
[2022/01/09 21:17:27] root INFO:                 img_mode : BGR
[2022/01/09 21:17:27] root INFO:             CTCLabelEncode : None
[2022/01/09 21:17:27] root INFO:             RecResizeImg : 
[2022/01/09 21:17:27] root INFO:                 image_shape : [3, 32, 320]
[2022/01/09 21:17:27] root INFO:             KeepKeys : 
[2022/01/09 21:17:27] root INFO:                 keep_keys : ['image', 'label', 'length']
[2022/01/09 21:17:27] root INFO:     loader : 
[2022/01/09 21:17:27] root INFO:         batch_size_per_card : 256
[2022/01/09 21:17:27] root INFO:         drop_last : False
[2022/01/09 21:17:27] root INFO:         num_workers : 8
[2022/01/09 21:17:27] root INFO:         shuffle : False
[2022/01/09 21:17:27] root INFO: Global : 
[2022/01/09 21:17:27] root INFO:     cal_metric_during_train : True
[2022/01/09 21:17:27] root INFO:     character_dict_path : /home/aistudio/训练集/date/vocabulary.txt
[2022/01/09 21:17:27] root INFO:     checkpoints : ./output/rec_chinese_common_v2.0_date/best_accuracy.pdparams
[2022/01/09 21:17:27] root INFO:     debug : False
[2022/01/09 21:17:27] root INFO:     distributed : False
[2022/01/09 21:17:27] root INFO:     epoch_num : 500
[2022/01/09 21:17:27] root INFO:     eval_batch_step : [100, 100]
[2022/01/09 21:17:27] root INFO:     infer_img : doc/imgs_words/ch/word_1.jpg
[2022/01/09 21:17:27] root INFO:     infer_mode : False
[2022/01/09 21:17:27] root INFO:     log_smooth_window : 20
[2022/01/09 21:17:27] root INFO:     max_text_length : 25
[2022/01/09 21:17:27] root INFO:     pretrained_model : ./ch_ppocr_server_v2.0_rec_pre/best_accuracy
[2022/01/09 21:17:27] root INFO:     print_batch_step : 10
[2022/01/09 21:17:27] root INFO:     save_epoch_step : 3
[2022/01/09 21:17:27] root INFO:     save_inference_dir : None
[2022/01/09 21:17:27] root INFO:     save_model_dir : ./output/rec_chinese_common_v2.0_date
[2022/01/09 21:17:27] root INFO:     save_res_path : ./output/rec/predicts_chinese_common_v2.0.txt
[2022/01/09 21:17:27] root INFO:     use_gpu : True
[2022/01/09 21:17:27] root INFO:     use_space_char : True
[2022/01/09 21:17:27] root INFO:     use_visualdl : False
[2022/01/09 21:17:27] root INFO: Loss : 
[2022/01/09 21:17:27] root INFO:     name : CTCLoss
[2022/01/09 21:17:27] root INFO: Metric : 
[2022/01/09 21:17:27] root INFO:     main_indicator : acc
[2022/01/09 21:17:27] root INFO:     name : RecMetric
[2022/01/09 21:17:27] root INFO: Optimizer : 
[2022/01/09 21:17:27] root INFO:     beta1 : 0.9
[2022/01/09 21:17:27] root INFO:     beta2 : 0.999
[2022/01/09 21:17:27] root INFO:     lr : 
[2022/01/09 21:17:27] root INFO:         learning_rate : 0.001
[2022/01/09 21:17:27] root INFO:         name : Cosine
[2022/01/09 21:17:27] root INFO:         warmup_epoch : 5
[2022/01/09 21:17:27] root INFO:     name : Adam
[2022/01/09 21:17:27] root INFO:     regularizer : 
[2022/01/09 21:17:27] root INFO:         factor : 4e-05
[2022/01/09 21:17:27] root INFO:         name : L2
[2022/01/09 21:17:27] root INFO: PostProcess : 
[2022/01/09 21:17:27] root INFO:     name : CTCLabelDecode
[2022/01/09 21:17:27] root INFO: Train : 
[2022/01/09 21:17:27] root INFO:     dataset : 
[2022/01/09 21:17:27] root INFO:         data_dir : /home/aistudio/训练集/date/images
[2022/01/09 21:17:27] root INFO:         label_file_list : ['/home/aistudio/训练集/date/train_list.txt']
[2022/01/09 21:17:27] root INFO:         name : SimpleDataSet
[2022/01/09 21:17:27] root INFO:         transforms : 
[2022/01/09 21:17:27] root INFO:             DecodeImage : 
[2022/01/09 21:17:27] root INFO:                 channel_first : False
[2022/01/09 21:17:27] root INFO:                 img_mode : BGR
[2022/01/09 21:17:27] root INFO:             RecAug : None
[2022/01/09 21:17:27] root INFO:             CTCLabelEncode : None
[2022/01/09 21:17:27] root INFO:             RecResizeImg : 
[2022/01/09 21:17:27] root INFO:                 image_shape : [3, 32, 320]
[2022/01/09 21:17:27] root INFO:             KeepKeys : 
[2022/01/09 21:17:27] root INFO:                 keep_keys : ['image', 'label', 'length']
[2022/01/09 21:17:27] root INFO:     loader : 
[2022/01/09 21:17:27] root INFO:         batch_size_per_card : 512
[2022/01/09 21:17:27] root INFO:         drop_last : True
[2022/01/09 21:17:27] root INFO:         num_workers : 8
[2022/01/09 21:17:27] root INFO:         shuffle : True
[2022/01/09 21:17:27] root INFO: profiler_options : None
[2022/01/09 21:17:27] root INFO: train with paddle 2.0.2 and device CUDAPlace(0)
[2022/01/09 21:17:27] root INFO: Initialize indexs of datasets:['/home/aistudio/训练集/date/val_list.txt']
W0109 21:17:27.794884 15547] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0109 21:17:27.800077 15547] device: 0, cuDNN Version: 7.6.
[2022/01/09 21:17:33] root INFO: resume from ./output/rec_chinese_common_v2.0_date/best_accuracy
[2022/01/09 21:17:33] root INFO: metric in ckpt ***************
[2022/01/09 21:17:33] root INFO: acc:0.991173555371896
[2022/01/09 21:17:33] root INFO: norm_edit_dis:0.9953431509515168
[2022/01/09 21:17:33] root INFO: fps:443.3070479891199
[2022/01/09 21:17:33] root INFO: best_epoch:67
[2022/01/09 21:17:33] root INFO: start_epoch:68

eval model::   0%|          | 0/2 [00:00<?, ?it/s]
eval model::  50%|█████     | 1/2 [00:01<00:01,  1.59s/it]
eval model:: 100%|██████████| 2/2 [00:01<00:00,  1.17s/it]
eval model:: 100%|██████████| 2/2 [00:01<00:00,  1.02it/s]
[2022/01/09 21:17:35] root INFO: metric eval ***************
[2022/01/09 21:17:35] root INFO: acc:0.991173555371896
[2022/01/09 21:17:35] root INFO: norm_edit_dis:0.9953431509515168
[2022/01/09 21:17:35] root INFO: fps:441.1924785284007
INFO 2022-01-09 21:17:38,107] Local processes completed.

VI. Result prediction

Modify tools/infer_rec.py as follows:

with open(save_res_path, "w") as fout:
    for file in get_image_file_list(config['Global']['infer_img']):
        logger.info("infer_img: {}".format(file))
        with open(file, 'rb') as f:
            img = f.read()
            data = {'image': img}
        batch = transform(data, ops)
        if config['Architecture']['algorithm'] == "SRN":
            encoder_word_pos_list = np.expand_dims(batch[1], axis=0)
            gsrm_word_pos_list = np.expand_dims(batch[2], axis=0)
            gsrm_slf_attn_bias1_list = np.expand_dims(batch[3], axis=0)
            gsrm_slf_attn_bias2_list = np.expand_dims(batch[4], axis=0)

            others = [
                paddle.to_tensor(encoder_word_pos_list),
                paddle.to_tensor(gsrm_word_pos_list),
                paddle.to_tensor(gsrm_slf_attn_bias1_list),
                paddle.to_tensor(gsrm_slf_attn_bias2_list)
            ]
        if config['Architecture']['algorithm'] == "SAR":
            valid_ratio = np.expand_dims(batch[-1], axis=0)
            img_metas = [paddle.to_tensor(valid_ratio)]

        images = np.expand_dims(batch[0], axis=0)
        images = paddle.to_tensor(images)
        if config['Architecture']['algorithm'] == "SRN":
            preds = model(images, others)
        elif config['Architecture']['algorithm'] == "SAR":
            preds = model(images, img_metas)
        else:
            preds = model(images)
        post_result = post_process_class(preds)
        info = None
        if isinstance(post_result, dict):
            rec_info = dict()
            for key in post_result:
                if len(post_result[key][0]) >= 2:
                    rec_info[key] = {
                        "label": post_result[key][0][0],
                        "score": float(post_result[key][0][1]),
                    }
            info = json.dumps(rec_info)
        else:
            if len(post_result[0]) >= 2:
                info = post_result[0][0] + "\t" + str(post_result[0][1])

        if info is not None:
            logger.info("\t result: {}".format(info))
            # Write only the file name and the recognized text, one image per line
            fout.write(os.path.basename(file) + "\t" + post_result[0][0] + "\n")
logger.info("success!")
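
In the script above, `post_process_class(preds)` turns the network output into text; with the CTC head configured earlier, that step is a CTC decode. The idea can be sketched as a greedy decode over per-time-step class indices: collapse consecutive repeats, then drop the blank symbol. This is a simplified illustration, not PaddleOCR's `CTCLabelDecode`, and it assumes index 0 is the blank.

```python
def ctc_greedy_decode(indices, vocabulary, blank=0):
    """Collapse repeated indices, drop blanks, and map the rest to characters.

    Index 0 is assumed to be the CTC blank, so character k of `vocabulary`
    has class index k + 1.
    """
    chars = []
    previous = None
    for idx in indices:
        if idx != previous and idx != blank:
            chars.append(vocabulary[idx - 1])
        previous = idx
    return "".join(chars)


vocab = ["叁", "佰", "元", "整"]
# Hypothetical per-time-step argmax indices for one image slice
steps = [0, 1, 1, 0, 2, 2, 2, 0, 0, 3, 4, 4]
print(ctc_greedy_decode(steps, vocab))  # 叁佰元整
```

A blank between two identical indices keeps both characters, which is how a doubled character survives the collapsing step.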

1. Amount prediction

# server model
!python tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml \
    -o Global.infer_img="/home/aistudio/测试集/amount/images" \
    Global.checkpoints=./output/rec_chinese_common_v2.0/best_accuracy

Output log

[2022/01/09 21:33:25] root INFO: infer_img: /home/aistudio/测试集/amount/images/8bb1941c760a2c1d01762c63a2220571.jpg
[2022/01/09 21:33:25] root INFO: result: 叁佰元整
[2022/01/09 21:33:25] root INFO: infer_img: /home/aistudio/测试集/amount/images/8bb1941c760a2c1d01762cc9ce6e6029.jpg
[2022/01/09 21:33:25] root INFO: result: 万叁千元整
[... prediction output truncated ...]
[2022/01/09 21:33:25] root INFO: result: 柒佰肆拾壹元柒角叁分
!head ./output/rec/predicts_chinese_common_v2.0.txt
8bb1941c760a2c1d017626c361da6c4d.jpg	壹仟零陆拾伍佰元整
8bb1941c760a2c1d01762b943a624421.jpg	壹拾壹仟元整
8bb1941c760a2c1d01762c63a2220571.jpg	叁佰元整
8bb1941c760a2c1d01762cc9ce6e6029.jpg	万叁千元整
8bb1941c760a2c1d017640031a39469f.jpg	玖仟叁千壹佰伍拾陆元陆伍角分
8bb1941c760a2c1d0176415a9ec807fe.jpg	伍万叁千元整
8bb1941c760a2c1d017645417b3b5c5e.jpg	叁万叁千壹佰伍陆伍元伍角玖分
8bb1941c760a2c1d017646019cf7387d.jpg	玖仟零伍角整元整
8bb1941c760a2c1d01764b77a53735a0.jpg	叁佰拾万元陆角叁分
8bb1941c7657bb0101765edc72b01d52.jpg	壹万贰千贰佰伍佰伍元整

2. Date prediction

# server model
!python tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0_date.yml \
    -o Global.infer_img="/home/aistudio/测试集/date/images" \
    Global.checkpoints=./output/rec_chinese_common_v2.0_date/best_accuracy

Output log

[2022/01/09 21:34:48] root INFO: infer_img: /home/aistudio/测试集/date/images/0_8bb1941c7a248693017a421e33087817.jpg
[2022/01/09 21:34:48] root INFO: result: 二零二一
[2022/01/09 21:34:48] root INFO: infer_img: /home/aistudio/测试集/date/images/0_8bb1941c7a248693017a5653c6d32277.jpg
[2022/01/09 21:34:48] root INFO: result: 二零二一
[... prediction output truncated ...]
[2022/01/09 21:34:48] root INFO: result: 二零一
!head ./output/rec/predicts_chinese_common_v2.0_date.txt
0_8bb1941c7a248693017a377ec36606b7.jpg	二零二一
0_8bb1941c7a248693017a421e33087817.jpg	二零二一
0_8bb1941c7a248693017a5653c6d32277.jpg	二零二一
0_8bb1941c7a248693017a56bc0a886d10.jpg	二零二一
0_8bb1941c7a248693017a8955798e0c20.jpg	二零二一
0_8bb1941c7a248693017aa87d11e203b6.jpg	二零二一
0_8bb
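
The first date predictions shown above are all the same string, so a quick tally of the predicted texts is a cheap sanity check before using the result files. The sketch below counts predictions from lines in the "file name, tab, text" layout written by the modified script; `tally_predictions` is a hypothetical helper and the sample lines are copied from the output above.

```python
from collections import Counter


def tally_predictions(lines):
    """Count how often each predicted text occurs in 'name<TAB>text' lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if "\t" not in line:
            continue  # skip blank or truncated lines
        _, text = line.split("\t", 1)
        counts[text] += 1
    return counts


sample = [
    "0_8bb1941c7a248693017a377ec36606b7.jpg\t二零二一\n",
    "0_8bb1941c7a248693017a421e33087817.jpg\t二零二一\n",
    "0_8bb\n",
]
print(tally_predictions(sample).most_common(3))
```

To apply it to a real result file, pass the open file object, since iterating over a text file yields its lines.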

The prediction results are saved in ./output/rec/predicts_chinese_common_v2.0.txt and ./output/rec/predicts_chinese_common_v2.0_date.txt. The competition is now closed.