HyperAI

Tutorial Details | Using PaddleOCR to Detect and Identify Container Numbers

3 years ago
Information
Jiaxin Sun

Content at a glance: Container number detection based on PaddleOCR shortens the time for recording container numbers and improves port loading and unloading efficiency.
Keywords: PaddleOCR, text recognition, online tutorial

According to data released in March this year by Alphaliner, an international shipping consulting and analysis agency, Shanghai Port topped the 2021 list of the top 30 container ports by throughput with a "report card" of 47.025 million TEUs.
Compared with the same period of the previous year, Shanghai Port's container throughput grew by 8.1%, putting it nearly 10 million TEUs ahead of its nearest competitor, Singapore.

The world's top 100 container ports completed a total container throughput of 676 million TEUs in 2021. Such a large number of containers has increased the pressure on container number identification. The traditional method of identifying and recording container numbers by hand is costly and inefficient, and it relies on outdated working conditions.

As the economy and society develop, introducing artificial intelligence into port operations has become key to transforming and upgrading traditional ports amid market competition.

This article covers everything from environment preparation to model training, and demonstrates how to use PaddleOCR to detect and identify container numbers.

Directly view the code tutorial:
https://openbayes.com/console/open-tutorials/containers/XJsxhLTnKNu

Using a small amount of data to detect and recognize container numbers

The container number is the number of the container that carries the exported goods. It is a required item when filling out the consignment note. The standard container number follows the ISO 6346 (1995) standard and consists of 11 characters. Taking the container number CBHU 123456 7 as an example, it consists of 3 parts:

The first part consists of 4 English letters. The first three letters identify the owner and operator of the container, and the fourth letter indicates the container type. CBHU, for example, indicates a standard container owned and operated by COSCO.

The second part consists of 6 digits. It is the container registration code, which uniquely identifies the container.

The third part is the check digit, which is calculated from the preceding 4 letters and 6 digits according to the verification rules and is used to detect whether an error occurred when the number was recorded or transmitted.
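The verification rule can be sketched in a few lines of Python. This is a minimal illustration that assumes the check-digit calculation defined in ISO 6346: letters map to values from 10 upward with multiples of 11 skipped, each of the first ten characters is weighted by a power of two, and the sum is taken modulo 11. The function names are illustrative and do not come from the tutorial.

```python
import string


def letter_values():
    """Map A-Z to ISO 6346 values: start at 10 and skip multiples of 11."""
    values, value = {}, 10
    for letter in string.ascii_uppercase:
        if value % 11 == 0:  # 11, 22 and 33 are skipped
            value += 1
        values[letter] = value
        value += 1
    return values


LETTER_VALUES = letter_values()


def check_digit(first_ten):
    """Compute the check digit from the 4 letters and 6 digits."""
    first_ten = first_ten.replace(" ", "").upper()
    if len(first_ten) != 10:
        raise ValueError("expected 4 letters followed by 6 digits")
    total = 0
    for position, char in enumerate(first_ten):
        value = LETTER_VALUES[char] if char.isalpha() else int(char)
        total += value * 2 ** position  # weights 1, 2, 4, ..., 512
    return total % 11 % 10  # a remainder of 10 is written as 0


def is_valid(container_number):
    """Check an 11-character container number against its check digit."""
    number = container_number.replace(" ", "").upper()
    if len(number) != 11 or not number[10].isdigit():
        return False
    return check_digit(number[:10]) == int(number[10])
```

For instance, check_digit("CSQU305438") returns 3, so CSQU 305438 3 passes the check. Applying the same rule to CBHU 123456 yields 1, so the sample number above is best read as an illustration of the layout rather than a registered number.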

Containers piled up at the port waiting to be shipped
This tutorial uses PaddleOCR for container number detection and recognition. The detection model and the recognition model are each trained on a small amount of data and then chained together to detect and recognize container numbers.

Environment Preparation

  1. Start a "model training" container in the OpenBayes console. Select PaddlePaddle 2.3 for environment and RTX 3090 or other GPU type for resources.

If you do not have a platform account, please visit the following address to register first: https://openbayes.com/console/signup

  2. Open a Terminal window in Jupyter, then execute the following commands:
cd PaddleOCR-release-2.5 # enter the PaddleOCR-release-2.5 folder
pip install -r requirements.txt # install the dependencies PaddleOCR requires
python setup.py install # install PaddleOCR

Dataset Introduction

This tutorial uses the Container Number-OCR Dataset, which contains 3003 container images with a resolution of 1920×1080.

To view the dataset details, please visit:
https://openbayes.com/console/open-tutorials/datasets/BzuGVEOJv2T/3

  1. The annotation format for training the PaddleOCR detection model is as follows, with fields separated by "\t":
" image file name               image annotation encoded with json.dumps "
ch4_test_images/img_61.jpg    [{"transcription": "MASA", "points": [[310, 104], [416, 141], [418, 216], [312, 179]]}, {...}]

The image annotation information before json.dumps encoding is a list containing multiple dictionaries. The points in the dictionary represent the coordinates (x, y) of the four points of the text box, arranged clockwise from the point in the upper left corner.

transcription represents the text in the current text box. When its content is "###", it means that the text box is invalid and will be skipped during training.
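The detection label format can be illustrated with a short Python sketch that writes and reads one label line. The file name and box coordinates are made up for the example, and the helper names are not part of PaddleOCR.

```python
import json


def make_det_label(image_path, boxes):
    """Build one label line: image path, a tab, then the JSON-encoded boxes."""
    annotation = [{"transcription": text, "points": points} for text, points in boxes]
    return image_path + "\t" + json.dumps(annotation, ensure_ascii=False)


def parse_det_label(line):
    """Split a label line into the image path and the boxes kept for training."""
    image_path, encoded = line.rstrip("\n").split("\t", 1)
    boxes = json.loads(encoded)
    # Boxes transcribed as "###" are invalid and are skipped during training.
    valid = [box for box in boxes if box["transcription"] != "###"]
    return image_path, valid


line = make_det_label(
    "images/example.jpg",  # hypothetical file name
    [
        # Four corner points, clockwise from the upper-left corner.
        ("CBHU 123456 7", [[310, 104], [416, 104], [416, 141], [310, 141]]),
        ("###", [[10, 10], [20, 10], [20, 20], [10, 20]]),
    ],
)
path, boxes = parse_det_label(line)
```

Here path is the image file name and boxes contains only the first annotation, because the second is marked "###".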

  2. The annotation format for training the PaddleOCR recognition model is as follows, with fields separated by "\t":
" image file name            image annotation "

train_data/rec/train/word_001.jpg   简单可依赖
train_data/rec/train/word_002.jpg   用科技让复杂的世界更简单

Data Collation

3.1 Data preparation required for detection model

Split the images in the dataset into training and validation sets at a ratio of roughly 2:1 (the first 2,000 label lines go to the training set and the remainder to the validation set) by running the following code:

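from tqdm import tqdm

filename = "all_label.txt"

# Read every annotation line (one image per line).
with open(filename, encoding="utf-8") as f:
    lines = f.readlines()

# The first 2000 lines form the training set; the rest form the validation set.
with open("det_train_label.txt", "w", encoding="utf-8") as t, \
        open("det_eval_label.txt", "w", encoding="utf-8") as v:
    for count, line in enumerate(tqdm(lines)):
        if count < 2000:
            t.write(line)
        else:
            v.write(line)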

3.2 Data preparation required for recognition model

Using the annotations from the detection step, crop each image so that only the text region remains, and use the crops as the recognition data.
For the complete code, see: https://openbayes.com/console/open-tutorials/containers/XJsxhLTnKNu
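One possible way to derive recognition samples from the detection labels is sketched below. It is not the tutorial's own cropping code: it assumes an axis-aligned bounding box around the four annotated points is good enough, uses a hypothetical naming scheme for the cropped files, and relies on Pillow only in the optional save_crops helper.

```python
import json


def bounding_box(points):
    """Axis-aligned box (left, upper, right, lower) around the four corner points."""
    xs = [int(x) for x, _ in points]
    ys = [int(y) for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def rec_samples(det_line):
    """Turn one detection label line into (crop box, recognition label line) pairs."""
    image_path, encoded = det_line.rstrip("\n").split("\t", 1)
    stem = image_path.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    samples = []
    for index, box in enumerate(json.loads(encoded)):
        if box["transcription"] == "###":  # invalid box, nothing to recognize
            continue
        crop_name = f"{stem}_{index}.jpg"  # hypothetical naming scheme
        label_line = crop_name + "\t" + box["transcription"]
        samples.append((bounding_box(box["points"]), label_line))
    return samples


def save_crops(det_line, image_root, crop_dir):
    """Write the crops to disk; assumes Pillow is installed and crop_dir exists."""
    from PIL import Image  # imported lazily so the helpers above need no extra packages

    image_path = det_line.split("\t", 1)[0]
    with Image.open(f"{image_root}/{image_path}") as image:
        for box, label_line in rec_samples(det_line):
            crop_name = label_line.split("\t", 1)[0]
            image.crop(box).convert("RGB").save(f"{crop_dir}/{crop_name}")
```

The label lines returned by rec_samples can be written to a file such as rec_train_label.txt, with the crop names kept relative to the folder that holds the cropped images.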

Experiment

Since the data is relatively small, in order to achieve better and faster model convergence, the PP-OCRv3 model in PaddleOCR is used for detection and recognition.

PP-OCRv3 builds on PP-OCRv2: its end-to-end Hmean in Chinese scenes is 5% higher than that of PP-OCRv2, and the end-to-end performance of the English and digit model is 11% higher.

Please refer to the PP-OCRv3 technical report for details of the optimizations.

4.1 Detection Model

4.1.1 Detection model configuration

PaddleOCR provides many detection models; the models and their configuration files can be found under PaddleOCR-release-2.5/configs/det. If we choose the model ch_PP-OCRv3_det_student, its configuration file path is: PaddleOCR-release-2.5/configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_student.yml

Before use, necessary settings need to be made, such as training parameters, dataset path, etc. Some key configurations are shown below:

# Key training parameters
use_gpu: true # whether to use the GPU
epoch_num: 1200 # number of training epochs
save_model_dir: ./output/ch_PP-OCR_V3_det/ # directory where models are saved
save_epoch_step: 200 # save the model every 200 epochs
eval_batch_step: [0, 100] # run validation every 100 training iterations
pretrained_model: ./PaddleOCR-release-2.5/pretrain_models/ch_PP-OCR_V3_det/best_accuracy.pdparams # path to the pretrained model
# Training set paths
Train:
  dataset:
    name: SimpleDataSet
    data_dir: /input0/images # image folder path
    label_file_list:
      - ./det_train_label.txt # label file path

4.1.2 Model Fine-tuning

Run the following command in the notebook to fine-tune the model, where -c passes in the configured model file path:

%run PaddleOCR-release-2.5/tools/train.py \
    -c PaddleOCR-release-2.5/configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_student.yml

With the default hyperparameters, after 385 epochs of training on the training set, the model ch_PP-OCRv3_det_student reached an hmean of 96.96% on the validation set, with no significant improvement thereafter:

[2022/10/11 06:36:09] ppocr INFO: best metric, hmean: 0.969551282051282, precision: 0.9577836411609498,
recall: 0.981611681990265, fps: 20.347745459258228, best_epoch: 385
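The hmean in the log is the harmonic mean of precision and recall. As a quick sanity check, the value can be recomputed from the two numbers printed in the same log line; the function name below is illustrative.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the log line above.
value = hmean(0.9577836411609498, 0.981611681990265)
print(f"{value:.4%}")
```

The result agrees with the logged hmean of about 0.9696.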

4.2 Recognition Model

4.2.1 Recognition model configuration

PaddleOCR provides many recognition models; the models and their configuration files can be found under PaddleOCR-release-2.5/configs/rec.

If we choose the model ch_PP-OCRv3_rec_distillation, its configuration file path is:
PaddleOCR-release-2.5/configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml

Necessary settings need to be made before use, such as training parameters, dataset path, etc. Some key configurations are shown below:

# Key training parameters
use_gpu: true # whether to use the GPU
epoch_num: 1200 # number of training epochs
save_model_dir: ./output/rec_ppocr_v3_distillation # directory where models are saved
save_epoch_step: 200 # save the model every 200 epochs
eval_batch_step: [0, 100] # run validation every 100 training iterations
pretrained_model: ./PaddleOCR-release-2.5/pretrain_models/PPOCRv3/best_accuracy.pdparams # path to the pretrained model
# Training set paths
Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./RecTrainData/ # image folder path
    label_file_list:
      - ./rec_train_label.txt # label file path

4.2.2 Model Fine-tuning

Run the following command in the notebook to fine-tune the model, where -c passes in the configured model file path:

%run PaddleOCR-release-2.5/tools/train.py \
    -c PaddleOCR-release-2.5/configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml

With the default hyperparameters, after 136 epochs of training on the training set, the model ch_PP-OCRv3_rec_distillation reached an accuracy of 96.11% on the validation set, with no significant improvement thereafter:

[2022/10/11 20:04:28] ppocr INFO: best metric, acc: 0.9610600272522444, norm_edit_dis: 0.9927426548965615,
Teacher_acc: 0.9540291998159589, Teacher_norm_edit_dis: 0.9905629345025616, fps: 246.029195787707, best_epoch: 136
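To make the two recognition metrics in the log more concrete, here is a small sketch of how they are commonly defined: acc as the share of predictions that match the label exactly, and norm_edit_dis as one minus the average edit distance normalized by string length. This is an assumption about the definitions, not a copy of PaddleOCR's implementation, which may differ in details.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def rec_metrics(predictions, labels):
    """Exact-match accuracy and 1 minus the mean normalized edit distance."""
    pairs = list(zip(predictions, labels))
    count = max(len(pairs), 1)
    correct = sum(pred == target for pred, target in pairs)
    distance = sum(
        levenshtein(pred, target) / max(len(pred), len(target), 1)
        for pred, target in pairs
    )
    return correct / count, 1 - distance / count
```

A prediction that is wrong in a single character lowers acc as much as a completely wrong one, but barely moves norm_edit_dis, which is why the logged norm_edit_dis is higher than acc.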

Results

5.1 Detection Model Inference

Run the following command in the notebook to use the fine-tuned model to detect text in the test image:

  • Global.infer_img is the image path or image folder path
  • Global.pretrained_model is the fine-tuned model
  • Global.save_res_path is the path to save the inference results
%run PaddleOCR-release-2.5/tools/infer_det.py \
    -c PaddleOCR-release-2.5/configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_student.yml \
    -o Global.infer_img="/input0/images" Global.pretrained_model="./output/ch_PP-OCR_V3_det/best_accuracy" Global.save_res_path="./output/det_infer_res/predicts.txt"

5.2 Recognition Model Inference

Run the following command in the notebook to use the fine-tuned model to recognize text in the test images:

  • Global.infer_img is the image path or image folder path
  • Global.pretrained_model is the fine-tuned model
  • Global.save_res_path is the path to save the inference results
%run PaddleOCR-release-2.5/tools/infer_rec.py \
    -c PaddleOCR-release-2.5/configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml \
    -o Global.infer_img="./RecEvalData/" Global.pretrained_model="./output/rec_ppocr_v3_distillation/best_accuracy" Global.save_res_path="./output/rec_infer_res/predicts.txt"

5.3 Cascaded Inference with the Detection and Recognition Models

5.3.1 Model Conversion

Before cascaded inference, the trained and saved detection and recognition models must each be converted into an inference model by running the conversion command once per model, where:

  • -c passes in the path to the configuration file of the model to be converted
  • -o Global.pretrained_model is the model file to be converted
  • Global.save_inference_dir is the storage path for the converted inference model
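The conversion step might look like the following Python sketch, which builds the two export commands from the parameters listed above. It assumes PaddleOCR's export script sits at tools/export_model.py inside the PaddleOCR-release-2.5 folder, and the output directories are inferred from the paths used in the cascading command in 5.3.2; adjust both if your layout differs.

```python
import subprocess
import sys

# Assumed location of PaddleOCR's export script.
EXPORT_SCRIPT = "PaddleOCR-release-2.5/tools/export_model.py"


def export_command(config, trained_model, inference_dir):
    """Build the argument list for converting one trained model."""
    return [
        sys.executable, EXPORT_SCRIPT,
        "-c", config,
        "-o",
        f"Global.pretrained_model={trained_model}",
        f"Global.save_inference_dir={inference_dir}",
    ]


det_cmd = export_command(
    "PaddleOCR-release-2.5/configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_student.yml",
    "./output/ch_PP-OCR_V3_det/best_accuracy",
    "./output/det_inference/",  # inferred output directory
)
rec_cmd = export_command(
    "PaddleOCR-release-2.5/configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml",
    "./output/rec_ppocr_v3_distillation/best_accuracy",
    "./output/rec_inference/",  # inferred output directory
)


def run_exports():
    """Run both conversions; requires the PaddleOCR environment prepared earlier."""
    for command in (det_cmd, rec_cmd):
        subprocess.run(command, check=True)
```

Calling run_exports() in the notebook performs both conversions. The cascading command later reads the recognition model from a Student subfolder, which suggests that the distillation model is exported with one subfolder per sub-model.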

5.3.2 Cascaded Model Inference

After the conversion is completed, the detection and recognition models can be chained with the cascading tool that PaddleOCR provides, which combines any trained detection model and any trained recognition model into a two-stage text recognition system.

The input image goes through four main stages (text detection, detection box rectification, text recognition, and score filtering) to output the text locations and recognition results.
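The four stages can be pictured as a simple loop. The sketch below is only a schematic: detect, rectify and recognize are caller-supplied placeholder functions rather than PaddleOCR APIs, and the 0.5 threshold is an assumed default.

```python
def run_pipeline(image, detect, rectify, recognize, drop_score=0.5):
    """Chain the four stages; detect, rectify and recognize are supplied by the caller."""
    results = []
    for box in detect(image):                # 1. text detection
        crop = rectify(image, box)           # 2. detection box rectification
        text, score = recognize(crop)        # 3. text recognition
        if score >= drop_score:              # 4. score filtering
            results.append((box, text, score))
    return results
```

Each returned tuple pairs a text location with its recognized string and confidence; low-confidence results are dropped before output.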

The command is as follows, where:

  • image_dir is the path to a single image or a folder of images
  • det_model_dir is the path to the detection inference model
  • rec_model_dir is the path to the recognition inference model

The visualized recognition results are saved in the ./inference_results folder by default.

%run PaddleOCR-release-2.5/tools/infer/predict_system.py \
--image_dir="OCRTest" \
--det_model_dir="./output/det_inference/" \
--rec_model_dir="./output/rec_inference/Student/"

Visualization results example
To view the full tutorial, visit:
https://openbayes.com/console/open-tutorials/containers/XJsxhLTnKNu

About PaddleOCR and OpenBayes

PaddleOCR is an OCR tool library based on Baidu PaddlePaddle. It includes an ultra-lightweight Chinese OCR system with a total model size of only 8.6M, and supports training, service deployment, and on-device deployment for multiple text detection and text recognition algorithms.

For more information, please visit:
https://github.com/PaddlePaddle/PaddleOCR

About OpenBayes

OpenBayes is a leading machine intelligence research institution in China. It provides a number of basic services related to AI development, including computing power containers, automatic modeling, and automatic parameter tuning.

OpenBayes has also launched many mainstream public resources such as datasets, tutorials, and models, so that developers can quickly learn and build ideal machine learning models.

Visit openbayes.com and register now to enjoy 600 minutes/week of RTX 3090 and 300 minutes/week of free CPU computing time.

Note: The weekly gift resources will arrive every Monday afternoon

