Third-party Models Offline Inference - Quick Start

1. Third-Party Model Support List

MindOCR supports inference of third-party models (PaddleOCR, MMOCR, etc.); this document lists the adapted models. Performance was tested on Ascend 310P; some models have no test dataset yet.

1.1 Text Detection

| name | model | backbone | dataset | F-score(%) | FPS | source | config | download | reference |
|------|-------|----------|---------|------------|-----|--------|--------|----------|-----------|
| ch_pp_det_OCRv4 | DBNet | MobileNetV3 | / | / | / | PaddleOCR | yaml | infer model | ch_PP-OCRv4_det |
| ch_pp_server_det_v2.0 | DBNet | ResNet18_vd | MLT17 | 46.22 | 21.65 | PaddleOCR | yaml | infer model | ch_ppocr_server_v2.0_det |
| ch_pp_det_OCRv3 | DBNet | MobileNetV3 | MLT17 | 33.89 | 22.40 | PaddleOCR | yaml | infer model | ch_PP-OCRv3_det |
| ch_pp_det_OCRv2 | DBNet | MobileNetV3 | MLT17 | 42.99 | 21.90 | PaddleOCR | yaml | infer model | ch_PP-OCRv2_det |
| ch_pp_mobile_det_v2.0_slim | DBNet | MobileNetV3 | MLT17 | 31.66 | 19.88 | PaddleOCR | yaml | infer model | ch_ppocr_mobile_slim_v2.0_det |
| ch_pp_mobile_det_v2.0 | DBNet | MobileNetV3 | MLT17 | 31.56 | 21.96 | PaddleOCR | yaml | infer model | ch_ppocr_mobile_v2.0_det |
| en_pp_det_OCRv3 | DBNet | MobileNetV3 | IC15 | 42.14 | 55.55 | PaddleOCR | yaml | infer model | en_PP-OCRv3_det |
| ml_pp_det_OCRv3 | DBNet | MobileNetV3 | MLT17 | 66.01 | 22.48 | PaddleOCR | yaml | infer model | ml_PP-OCRv3_det |
| en_pp_det_dbnet_resnet50vd | DBNet | ResNet50_vd | IC15 | 79.89 | 21.17 | PaddleOCR | yaml | infer model | DBNet |
| en_pp_det_psenet_resnet50vd | PSE | ResNet50_vd | IC15 | 80.44 | 7.75 | PaddleOCR | yaml | train model | PSE |
| en_pp_det_east_resnet50vd | EAST | ResNet50_vd | IC15 | 85.58 | 20.70 | PaddleOCR | yaml | train model | EAST |
| en_pp_det_sast_resnet50vd | SAST | ResNet50_vd | IC15 | 81.77 | 22.14 | PaddleOCR | yaml | train model | SAST |
| en_mm_det_dbnetpp_resnet50 | DBNet++ | ResNet50 | IC15 | 81.36 | 10.66 | MMOCR | yaml | train model | DBNetpp |
| en_mm_det_fcenet_resnet50 | FCENet | ResNet50 | IC15 | 83.67 | 3.34 | MMOCR | yaml | train model | FCENet |

Notice: when using the en_pp_det_psenet_resnet50vd model for inference, you need to modify the ONNX file with the following command:

python deploy/models_utils/onnx_optim/insert_pse_postprocess.py \
      --model_path=./pse_r50vd.onnx \
      --binary_thresh=0.0 \
      --scale=1.0

1.2 Text Recognition

| name | model | backbone | dataset | Acc(%) | FPS | source | dict file | config | download | reference |
|------|-------|----------|---------|--------|-----|--------|-----------|--------|----------|-----------|
| ch_pp_rec_OCRv4 | CRNN | MobileNetV1Enhance | / | / | / | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_PP-OCRv4_rec |
| ch_pp_server_rec_v2.0 | CRNN | ResNet34 | MLT17 (ch) | 49.91 | 154.16 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_ppocr_server_v2.0_rec |
| ch_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | MLT17 (ch) | 49.91 | 408.38 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_PP-OCRv3_rec |
| ch_pp_rec_OCRv2 | CRNN | MobileNetV1Enhance | MLT17 (ch) | 44.59 | 203.34 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_PP-OCRv2_rec |
| ch_pp_mobile_rec_v2.0 | CRNN | MobileNetV3 | MLT17 (ch) | 24.59 | 167.67 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_ppocr_mobile_v2.0_rec |
| en_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | MLT17 (en) | 79.79 | 917.01 | PaddleOCR | en_dict.txt | yaml | infer model | en_PP-OCRv3_rec |
| en_pp_mobile_rec_number_v2.0_slim | CRNN | MobileNetV3 | / | / | / | PaddleOCR | en_dict.txt | yaml | infer model | en_number_mobile_slim_v2.0_rec |
| en_pp_mobile_rec_number_v2.0 | CRNN | MobileNetV3 | / | / | / | PaddleOCR | en_dict.txt | yaml | infer model | en_number_mobile_v2.0_rec |
| korean_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | korean_dict.txt | yaml | infer model | korean_PP-OCRv3_rec |
| japan_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | japan_dict.txt | yaml | infer model | japan_PP-OCRv3_rec |
| chinese_cht_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | chinese_cht_dict.txt | yaml | infer model | chinese_cht_PP-OCRv3_rec |
| te_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | te_dict.txt | yaml | infer model | te_PP-OCRv3_rec |
| ka_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | ka_dict.txt | yaml | infer model | ka_PP-OCRv3_rec |
| ta_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | ta_dict.txt | yaml | infer model | ta_PP-OCRv3_rec |
| latin_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | latin_dict.txt | yaml | infer model | latin_PP-OCRv3_rec |
| arabic_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | arabic_dict.txt | yaml | infer model | arabic_PP-OCRv3_rec |
| cyrillic_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | cyrillic_dict.txt | yaml | infer model | cyrillic_PP-OCRv3_rec |
| devanagari_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | devanagari_dict.txt | yaml | infer model | devanagari_PP-OCRv3_rec |
| en_pp_rec_crnn_resnet34vd | CRNN | ResNet34_vd | IC15 | 66.35 | 420.80 | PaddleOCR | ic15_dict.txt | yaml | infer model | CRNN |
| en_pp_rec_rosetta_resnet34vd | Rosetta | ResNet34_vd | IC15 | 64.28 | 552.40 | PaddleOCR | ic15_dict.txt | yaml | infer model | Rosetta |
| en_pp_rec_vitstr_vitstr | ViTSTR | ViTSTR | IC15 | 68.42 | 364.67 | PaddleOCR | EN_symbol_dict.txt | yaml | train model | ViTSTR |
| en_mm_rec_nrtr_resnet31 | NRTR | ResNet31 | IC15 | 67.26 | 32.63 | MMOCR | english_digits_symbols.txt | yaml | train model | NRTR |
| en_mm_rec_satrn_shallowcnn | SATRN | ShallowCNN | IC15 | 73.52 | 32.14 | MMOCR | english_digits_symbols.txt | yaml | train model | SATRN |

1.3 Text Angle Classification

| name | model | dataset | Acc(%) | FPS | source | config | download | reference |
|------|-------|---------|--------|-----|--------|--------|----------|-----------|
| ch_pp_mobile_cls_v2.0 | MobileNetV3 | / | / | / | PaddleOCR | yaml | infer model | ch_ppocr_mobile_v2.0_cls |

2. Overview of Third-Party Inference

graph LR;
    A[ThirdParty models] -- xx2onnx --> B[ONNX] -- converter_lite --> C[MindIR];
    C -- input --> D[infer.py] -- outputs --> E[eval_rec.py / eval_det.py];
    H[images] -- input --> D[infer.py];

3. Third-Party Model Inference Methods

For PP-OCRv4, we provide a quick conversion tool for converting Paddle models to MindIR models; see 3.5 Quick Conversion Tool.

3.1 Text Detection

Let's take ch_pp_det_OCRv4 from the Third-Party Model Support List as an example to introduce the inference method:

3.1.1 Download the third-party model file

  • In the Third-Party Model Support List, infer model denotes a model file ready for inference, while train model denotes a training checkpoint that must be converted to an inference model first.
  • If the model file is an infer model, like ch_pp_det_OCRv4, download and extract the infer model to get the following folder:

    ch_PP-OCRv4_det_infer/
    ├── inference.pdmodel
    ├── inference.pdiparams
    ├── inference.pdiparams.info

  • If the model file is a train model, like en_pp_det_psenet_resnet50vd, download and extract the train model to get the following folder:
    det_r50_vd_pse_v2.0_train/
    ├── train.log
    ├── best_accuracy.pdopt
    ├── best_accuracy.states
    ├── best_accuracy.pdparams
    
    It needs to be converted with the following commands:
    git clone https://github.com/PaddlePaddle/PaddleOCR.git
    cd PaddleOCR
    python tools/export_model.py \
        -c configs/det/det_r50_vd_pse.yml \
        -o Global.pretrained_model=./det_r50_vd_pse_v2.0_train/best_accuracy  \
        Global.save_inference_dir=./det_db
    
    and you will get the following folder:
    det_db/
    ├── inference.pdmodel
    ├── inference.pdiparams
    ├── inference.pdiparams.info
    

3.1.2 Convert the third-party model to an ONNX file

Download and use the paddle2onnx tool:

pip install paddle2onnx

and convert the inference model into an ONNX file:

paddle2onnx \
     --model_dir det_db \
     --model_filename inference.pdmodel \
     --params_filename inference.pdiparams \
     --save_file det_db.onnx \
     --opset_version 11 \
     --input_shape_dict="{'x':[-1,3,-1,-1]}" \
     --enable_onnx_checker True
A brief explanation of parameters for paddle2onnx is as follows:

| Parameter | Description |
|-----------|-------------|
| --model_dir | Configures the directory path containing the Paddle model. |
| --model_filename | [Optional] Configures the file name of the network-structure file located under --model_dir. |
| --params_filename | [Optional] Configures the file name of the model-parameters file located under --model_dir. |
| --save_file | Specifies the file path for saving the converted ONNX model. |
| --opset_version | [Optional] Configures the OpSet version for converting to ONNX; versions 7 to 16 are currently supported. Default: 9. |
| --input_shape_dict | Specifies the shape of the input tensor for generating a dynamic ONNX model, in the format "{'x': [N, C, H, W]}", where -1 denotes a dynamic dimension. |
| --enable_onnx_checker | [Optional] Configures whether to check the correctness of the exported ONNX model. Enabling this switch is recommended. Default: False. |

The value of --input_shape_dict can be determined by opening the inference model with the Netron tool.

Learn more about paddle2onnx

The det_db.onnx file will be generated after the above command is executed.
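
If you want to verify the exported file without opening Netron, the input names and shapes can also be read programmatically with the onnx Python package (a minimal sketch; assumes pip install onnx and the det_db.onnx generated above):

import onnx

# Load the exported model and run ONNX's structural checker
model = onnx.load("det_db.onnx")
onnx.checker.check_model(model)

# Print each input's name and shape; symbolic/unknown dims are shown as -1
for inp in model.graph.input:
    dims = [d.dim_value if d.dim_value > 0 else -1 for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)  # expect something like: x [-1, 3, -1, -1]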

3.1.3 Convert the ONNX file to a Lite MindIR file

Use the converter_lite tool on Ascend 310/310P to convert the ONNX file to MindIR:

Create config.txt and specify the model input shape:

  • If converting to a static shape model, e.g. a static shape of [1,3,736,1280], the config is as follows:
    [ascend_context]
    input_format=NCHW
    input_shape=x:[1,3,736,1280]
    
  • If converting to a dynamic shape (scaling) model, the config is as follows:
    [ascend_context]
    input_format=NCHW
    input_shape=x:[1,3,-1,-1]
    dynamic_dims=[736,1280],[768,1280],[896,1280],[1024,1280]
    
  • If converting to a dynamic shape model, the config is as follows:
    [acl_build_options]
    input_format=NCHW
    input_shape_range=x:[-1,3,-1,-1]
    

A brief explanation of the configuration file parameters is as follows:

| Parameter | Attribute | Function Description | Data Type | Value Description |
|-----------|-----------|----------------------|-----------|-------------------|
| input_format | Optional | Specify the format of the model input | String | Optional values are "NCHW", "NHWC", "ND" |
| input_shape | Optional | Specify the shape of the model input. The input_name must be the input name in the original model, arranged in input order and separated by ";" | String | For example: "input1:[1,64,64,3];input2:[1,256,256,3]" |
| dynamic_dims | Optional | Specify dynamic batch size and dynamic resolution parameters | String | For example: "dynamic_dims=[48,520],[48,320],[48,384]" |

Learn more about Configuration File Parameters
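
A model converted with dynamic_dims accepts only the listed (H, W) combinations, so preprocessing must resize or pad each image to one of the presets. The toy sketch below illustrates the idea of picking a fitting preset; it is illustrative only, since the deploy/py_infer pipeline handles this matching for you:

# Toy example: choose a preset (H, W) that can hold an image of shape (h, w)
presets = [(736, 1280), (768, 1280), (896, 1280), (1024, 1280)]

def pick_preset(h, w):
    # take the smallest preset that fits; fall back to the largest one
    fitting = [p for p in presets if p[0] >= h and p[1] >= w]
    return min(fitting) if fitting else max(presets)

print(pick_preset(700, 1000))  # (736, 1280)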

Run the following command:

converter_lite \
     --saveType=MINDIR \
     --fmk=ONNX \
     --optimize=ascend_oriented \
     --modelFile=det_db.onnx \
     --outputFile=det_db_lite \
     --configFile=config.txt
After the above command is executed, the det_db_lite.mindir file will be generated.

A brief explanation of the converter_lite parameters is as follows:

| Parameter | Required | Parameter Description | Value Range | Default | Remarks |
|-----------|----------|-----------------------|-------------|---------|---------|
| fmk | Yes | Input model format | MINDIR, CAFFE, TFLITE, TF, ONNX | - | - |
| saveType | No | Set the exported model to MINDIR or MS model format | MINDIR, MINDIR_LITE | MINDIR | The cloud-side inference version can only infer models converted to MINDIR format |
| modelFile | Yes | Input model path | - | - | - |
| outputFile | Yes | Output model path. Do not add a suffix; a ".mindir" suffix is generated automatically | - | - | - |
| configFile | No | 1) Path to the post-training quantization configuration file; 2) path to the configuration file for extended functions | - | - | - |
| optimize | No | Set the model optimization type for the target device | none, general, gpu_oriented, ascend_oriented | none | - |

Learn more about converter_lite

Learn more about Model Conversion Tutorial
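
Optionally, the converted file can be sanity-checked on an Ascend machine with the MindSpore Lite Python API before running the full pipeline (a minimal sketch, assuming the mindspore_lite wheel is installed; infer.py below does not require this step):

import mindspore_lite as mslite

# Build a context targeting Ascend, matching the ascend_oriented optimization above
context = mslite.Context()
context.target = ["ascend"]

# Load the converted MindIR and list its input tensors
model = mslite.Model()
model.build_from_file("det_db_lite.mindir", mslite.ModelType.MINDIR, context)
for tensor in model.get_inputs():
    print(tensor.name, tensor.shape)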

3.1.4 Inference with Lite MindIR

Perform inference using deploy/py_infer/infer.py and the det_db_lite.mindir model file:

python deploy/py_infer/infer.py \
    --input_images_dir=/path/to/ic15/ch4_test_images \
    --det_model_path=/path/to/mindir/det_db_lite.mindir \
    --det_model_name_or_config=ch_pp_det_OCRv4 \
    --res_save_dir=/path/to/ch_pp_det_OCRv4_results
After the execution is completed, the prediction file det_results.txt will be generated in the directory pointed to by the parameter --res_save_dir.

When doing inference, you can use the --vis_det_save_dir parameter to visualize the results.

Learn more about infer.py inference parameters
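
Before evaluating, you can spot-check the prediction file. The sketch below only assumes that each line of det_results.txt starts with an image name followed by a tab and the predicted boxes; verify the exact layout against your own output:

from pathlib import Path

# Hypothetical results path; replace with your --res_save_dir
res_file = Path("/path/to/ch_pp_det_OCRv4_results/det_results.txt")

for line in res_file.read_text(encoding="utf-8").splitlines()[:5]:
    name, _, preds = line.partition("\t")
    print(name, preds[:80])  # image name and a preview of its predicted boxes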

3.1.5 Evaluation

Evaluate the results using the following command:

python deploy/eval_utils/eval_det.py \
    --gt_path=/path/to/ic15/test_det_gt.txt \
    --pred_path=/path/to/ch_pp_det_OCRv4_results/det_results.txt

3.2 Text Recognition

Let's take ch_pp_rec_OCRv4 from the Third-Party Model Support List as an example to introduce the inference method:

3.2.1 Download the third-party model file

  • In the Third-Party Model Support List, infer model denotes a model file ready for inference, while train model denotes a training checkpoint that must be converted to an inference model first.
  • If the model file is an infer model, like ch_pp_rec_OCRv4, download and extract the infer model to get the following folder:

    ch_PP-OCRv4_rec_infer/
    ├── inference.pdmodel
    ├── inference.pdiparams
    ├── inference.pdiparams.info
    
  • If the model file is a train model, like en_pp_rec_vitstr_vitstr, download and extract the train model to get the following folder:

    rec_vitstr_none_ce_train/
    ├── train.log
    ├── best_accuracy.pdopt
    ├── best_accuracy.states
    ├── best_accuracy.pdparams
    
    It needs to be converted with the following commands:

    git clone https://github.com/PaddlePaddle/PaddleOCR.git
    cd PaddleOCR
    python tools/export_model.py \
        -c configs/rec/rec_vitstr_none_ce.yml \
        -o Global.pretrained_model=./rec_vitstr_none_ce_train/best_accuracy  \
        Global.save_inference_dir=./rec_vitstr
    

    and you will get the following folder:

    rec_vitstr/
    ├── inference.pdmodel
    ├── inference.pdiparams
    ├── inference.pdiparams.info
    

3.2.2 Convert the third-party model to an ONNX file

Download and use the paddle2onnx tool:

pip install paddle2onnx

and convert the inference model into an ONNX file:

paddle2onnx \
    --model_dir ch_PP-OCRv4_rec_infer \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --save_file rec_crnn.onnx \
    --opset_version 11 \
    --input_shape_dict="{'x':[-1,3,48,-1]}" \
    --enable_onnx_checker True

The rec_crnn.onnx file will be generated after the above command is executed.

Please refer to 3.1.2 Convert the third-party model to an ONNX file for details about paddle2onnx.

3.2.3 Convert the ONNX file to a Lite MindIR file

Use the converter_lite tool on Ascend 310/310P to convert the ONNX file to MindIR:

Create config.txt and specify the model input shape:

  • If converting to a static shape model, e.g. a static shape of [1,3,48,320], the config is as follows:
    [ascend_context]
    input_format=NCHW
    input_shape=x:[1,3,48,320]
    
  • If converting to a dynamic shape (scaling) model, the config is as follows:
    [ascend_context]
    input_format=NCHW
    input_shape=x:[1,3,-1,-1]
    dynamic_dims=[48,520],[48,320],[48,384],[48,360],[48,394],[48,321],[48,336],[48,368],[48,328],[48,685],[48,347]
    
  • If converting to a dynamic shape model, the config is as follows:
    [acl_build_options]
    input_format=NCHW
    input_shape_range=x:[-1,3,-1,-1]
    

For a brief description of the configuration parameters, please refer to 3.1.3 Convert the ONNX file to a Lite MindIR file.

Run the following command:

converter_lite \
    --saveType=MINDIR \
    --fmk=ONNX \
    --optimize=ascend_oriented \
    --modelFile=rec_crnn.onnx \
    --outputFile=rec_crnn_lite \
    --configFile=config.txt

After the above command is executed, the rec_crnn_lite.mindir file will be generated.

For a brief description of the converter_lite parameters, see the text detection example above.

Learn more about converter_lite

Learn more about Model Conversion Tutorial

3.2.4 Download the Dictionary File for Recognition

According to the Third-Party Model Support List, download ppocr_keys_v1.txt, which matches ch_pp_rec_OCRv4.
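
PaddleOCR dictionary files store one character per line, and the recognition output indices are decoded through this file, so the dictionary must match the model exactly. A quick way to inspect it (a minimal sketch; blank/space tokens are added by the decoder and vary by model):

# Load the dictionary (one character per line, PaddleOCR convention)
with open("ppocr_keys_v1.txt", encoding="utf-8") as f:
    charset = [line.rstrip("\n") for line in f]

print(len(charset))   # vocabulary size
print(charset[:10])   # first few characters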

3.2.5 Inference with Lite MindIR

Perform inference using deploy/py_infer/infer.py and the rec_crnn_lite.mindir model file:

python deploy/py_infer/infer.py \
    --input_images_dir=/path/to/mlt17_ch \
    --rec_model_path=/path/to/mindir/rec_crnn_lite.mindir \
    --rec_model_name_or_config=ch_pp_rec_OCRv4 \
    --character_dict_path=/path/to/ppocr_keys_v1.txt \
    --res_save_dir=/path/to/ch_rec_infer_results

After the execution is completed, the prediction file rec_results.txt will be generated in the directory pointed to by the parameter --res_save_dir.

Learn more about infer.py inference parameters

3.2.6 Evaluation

Evaluate the results using the following command:

python deploy/eval_utils/eval_rec.py \
    --gt_path=/path/to/mlt17_ch/chinese_gt.txt \
    --pred_path=/path/to/ch_rec_infer_results/rec_results.txt

Refer to Dataset converters for dataset preparation.

3.3 Text Direction Classification

Let's take ch_pp_mobile_cls_v2.0 from the Third-Party Model Support List as an example to introduce the inference method:

3.3.1 Download the third-party model file

In the Third-Party Model Support List, ch_pp_mobile_cls_v2.0 is an infer model, so conversion is not needed. Download and extract it to get the following folder:

ch_ppocr_mobile_v2.0_cls_infer/
├── inference.pdmodel
├── inference.pdiparams
├── inference.pdiparams.info

3.3.2 Convert the third-party model to an ONNX file

Convert the inference model into an ONNX file:

paddle2onnx \
    --model_dir ch_ppocr_mobile_v2.0_cls_infer \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --save_file cls_mv3.onnx \
    --opset_version 11 \
    --input_shape_dict="{'x':[-1,3,-1,-1]}" \
    --enable_onnx_checker True

The cls_mv3.onnx file will be generated after the above command is executed.

Please refer to 3.1.2 Convert the third-party model to an ONNX file for details about paddle2onnx.

3.3.3 Convert the ONNX file to a Lite MindIR file

Refer to 3.1.3 Convert the ONNX file to a Lite MindIR file and create config.txt; here we take the dynamic shape config as an example:

[acl_build_options]
input_format=NCHW
input_shape_range=x:[-1,3,-1,-1]
Then run the following command:

converter_lite \
    --saveType=MINDIR \
    --fmk=ONNX \
    --optimize=ascend_oriented \
    --modelFile=cls_mv3.onnx \
    --outputFile=cls_mv3_lite \
    --configFile=config.txt

After the above command is executed, the cls_mv3_lite.mindir file will be generated.

3.4 End-to-End Inference

Prepare the MindIR files according to Text Detection, Text Recognition, and Text Direction Classification, then run the following command to do end-to-end inference:

python deploy/py_infer/infer.py \
    --input_images_dir=/path/to/ic15/ch4_test_images \
    --det_model_path=/path/to/mindir/det_db_lite.mindir \
    --det_model_name_or_config=ch_pp_det_OCRv4 \
    --cls_model_path=/path/to/mindir/cls_mv3_lite.mindir \
    --cls_model_name_or_config=ch_pp_mobile_cls_v2.0 \
    --rec_model_path=/path/to/mindir/rec_crnn_lite.mindir \
    --rec_model_name_or_config=ch_pp_rec_OCRv4 \
    --character_dict_path=/path/to/ppocr_keys_v1.txt \
    --res_save_dir=/path/to/infer_results

3.5 Quick Conversion Tool

For PP-OCRv4, we provide a tool for converting Paddle models to MindIR models. The guidance is as follows:

  • Make sure MindSpore Lite has been downloaded and configured successfully (please refer to MindSpore Lite), and that converter_lite has been added to the environment variables.
  • Run the following command:

cd tools
bash paddle2mindir.sh -m=${ppocr_model_name} -p=${save_dir}
  • $ppocr_model_name: the ppocr model to be converted; ch_PP-OCRv4 and ch_PP-OCRv4_server are supported.
  • $save_dir: folder in which to save the downloaded ppocr models and the converted MindIR files. Default: ppocr_models.

For example, bash paddle2mindir.sh -m=ch_PP-OCRv4 downloads ch_PP-OCRv4 and saves the converted MindIR files under ppocr_models/.

The conversion may take several minutes; please wait. You will get the following MindIR models after conversion:

ppocr_models
├── ${PPOCR_MODEL_NAME}_det_db_dynamic_output.mindir
├── ${PPOCR_MODEL_NAME}_rec_crnn_static_output.mindir
├── ${PPOCR_MODEL_NAME}_cls_mv4_static_output.mindir
├── ...

4. FAQ about converting and inference

For problems about converting model and inference, please refer to FAQ for solutions.