MindOCR Models Offline Inference - Quick Start
1. MindOCR Model Support List
1.1 Text Detection
Model | Backbone | Language | Dataset | F-score(%) | FPS | data shape (NCHW) | Config | Download |
---|---|---|---|---|---|---|---|---|
DBNet | MobileNetV3 | en | IC15 | 76.96 | 26.19 | (1,3,736,1280) | yaml | ckpt, mindir |
DBNet | ResNet-18 | en | IC15 | 81.73 | 24.04 | (1,3,736,1280) | yaml | ckpt, mindir |
DBNet | ResNet-50 | en | IC15 | 85.00 | 21.69 | (1,3,736,1280) | yaml | ckpt, mindir |
DBNet | ResNet-50 | ch + en | 12 Datasets | 83.41 | 21.69 | (1,3,736,1280) | yaml | ckpt, mindir |
DBNet++ | ResNet-50 | en | IC15 | 86.79 | 8.46 | (1,3,1152,2048) | yaml | ckpt, mindir |
DBNet++ | ResNet-50 | ch + en | 12 Datasets | 84.30 | 8.46 | (1,3,1152,2048) | yaml | ckpt, mindir |
EAST | ResNet-50 | en | IC15 | 86.86 | 6.72 | (1,3,720,1280) | yaml | ckpt, mindir |
EAST | MobileNetV3 | en | IC15 | 75.32 | 26.77 | (1,3,720,1280) | yaml | ckpt, mindir |
PSENet | ResNet-152 | en | IC15 | 82.50 | 2.52 | (1,3,1472,2624) | yaml | ckpt, mindir |
PSENet | ResNet-50 | en | IC15 | 81.37 | 10.16 | (1,3,736,1312) | yaml | ckpt, mindir |
PSENet | MobileNetV3 | en | IC15 | 70.56 | 10.38 | (1,3,736,1312) | yaml | ckpt, mindir |
FCENet | ResNet50 | en | IC15 | 78.94 | 14.59 | (1,3,736,1280) | yaml | ckpt, mindir |
1.2 Text Recognition
Model | Backbone | Dict File | Dataset | Acc(%) | FPS | data shape (NCHW) | Config | Download |
---|---|---|---|---|---|---|---|---|
CRNN | VGG7 | Default | IC15 | 66.01 | 465.64 | (1,3,32,100) | yaml | ckpt, mindir |
CRNN | ResNet34_vd | Default | IC15 | 69.67 | 397.29 | (1,3,32,100) | yaml | ckpt, mindir |
CRNN | ResNet34_vd | ch_dict.txt | / | / | / | (1,3,32,320) | yaml | ckpt, mindir |
SVTR | Tiny | Default | IC15 | 79.92 | 338.04 | (1,3,64,256) | yaml | ckpt, mindir |
Rare | ResNet34_vd | Default | IC15 | 69.47 | 273.23 | (1,3,32,100) | yaml | ckpt, mindir |
Rare | ResNet34_vd | ch_dict.txt | / | / | / | (1,3,32,320) | yaml | ckpt, mindir |
RobustScanner | ResNet-31 | en_dict90.txt | IC15 | 73.71 | 22.30 | (1,3,48,160) | yaml | ckpt, mindir |
VisionLAN | ResNet-45 | Default | IC15 | 80.07 | 321.37 | (1,3,64,256) | yaml(LA) | ckpt(LA), mindir(LA) |
1.3 Text Direction Classification
Model | Backbone | Dataset | F-score(%) | FPS | data shape (NCHW) | Config | Download |
---|---|---|---|---|---|---|---|
MobileNetV3 | MobileNetV3 | / | / | / | (1,3,48,192) | yaml | ckpt |
2. Overview of MindOCR Inference
graph LR;
subgraph Step 1
A[ckpt] -- export.py --> B[MindIR]
end
subgraph Step 2
B -- converter_lite --> C[MindSpore Lite MindIR];
end
subgraph Step 3
C -- input --> D[infer.py];
end
subgraph Step 4
D -- outputs --> E[eval_rec.py/eval_det.py];
end
F[images] -- input --> D;
As shown in the figure above, the inference process is divided into the following steps:
- Use `tools/export.py` to export the ckpt model to a MindIR model;
- Download and configure the model converter (i.e. converter_lite), and use it to convert the MindIR to a MindSpore Lite MindIR;
- After preparing the MindSpore Lite MindIR and the input images, use `deploy/py_infer/infer.py` to perform inference;
- Depending on the type of model, use `deploy/eval_utils/eval_det.py` to evaluate the inference results of text detection models, or `deploy/eval_utils/eval_rec.py` for text recognition models.
Note: Step 1 runs on Ascend910, GPU or CPU. Steps 2, 3 and 4 run on Ascend310 or 310P.
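The steps and the hardware note above can be summarized in a small lookup. This is only an illustrative sketch: the `Step` class, the `PIPELINE` tuple, the `steps_for` helper, and the device label strings are hypothetical names chosen here, not part of MindOCR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    number: int
    tool: str
    produces: str
    devices: tuple


# Device labels are illustrative strings for the hardware named in the note.
LITE_DEVICES = ("Ascend310", "Ascend310P")

# Steps, tools and hardware exactly as listed in the overview.
PIPELINE = (
    Step(1, "tools/export.py", "MindIR", ("Ascend910", "GPU", "CPU")),
    Step(2, "converter_lite", "MindSpore Lite MindIR", LITE_DEVICES),
    Step(3, "deploy/py_infer/infer.py", "inference results", LITE_DEVICES),
    Step(4, "eval_det.py / eval_rec.py", "evaluation metrics", LITE_DEVICES),
)


def steps_for(device):
    """Return the step numbers that the overview assigns to `device`."""
    return [step.number for step in PIPELINE if device in step.devices]


print(steps_for("Ascend910"))   # [1]
print(steps_for("Ascend310P"))  # [2, 3, 4]
```

In practice this means the exported MindIR file has to be copied from the export machine to the Ascend310 or 310P device before conversion, unless it is downloaded directly from the model support list.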
3. MindOCR Inference Methods
3.1 Text Detection
Let's take `DBNet ResNet-50 en` in the model support list as an example to introduce the inference method:
- Download the ckpt file in the model support list and use the following command to export to MindIR, or directly download the exported mindir file from the model support list:
``` shell
# Use the local ckpt file to export the MindIR of the `DBNet ResNet-50 en` model
# For more parameter usage details, please execute `python tools/export.py -h`
python tools/export.py --model_name_or_config dbnet_resnet50 --data_shape 736 1280 --local_ckpt_path /path/to/dbnet.ckpt
```
In the above command, `--model_name_or_config` is the model name in MindOCR; alternatively, you can pass the path to a yaml config file (for example `--model_name_or_config configs/rec/crnn/crnn_resnet34.yaml`);
The `--data_shape 736 1280` parameter indicates that the model input image size is [736, 1280], i.e. the height and width of the NCHW shape. Each MindOCR model has a fixed export data shape; for details, see **data shape** in the model support list;
The `--local_ckpt_path /path/to/dbnet.ckpt` parameter indicates that the checkpoint file to be exported is `/path/to/dbnet.ckpt`.
- Use the converter_lite tool on Ascend310 or 310P to convert the MindIR to MindSpore Lite MindIR:

  Run the following command:

      converter_lite \
          --saveType=MINDIR \
          --fmk=MINDIR \
          --optimize=ascend_oriented \
          --modelFile=dbnet_resnet50-c3a4aa24-fbf95c82.mindir \
          --outputFile=dbnet_resnet50_lite

  In the above command:

  - `--fmk=MINDIR` indicates that the original format of the input model is MindIR; the `--fmk` parameter also supports ONNX, etc.;
  - `--saveType=MINDIR` indicates that the output model format is MindIR;
  - `--optimize=ascend_oriented` indicates that the model is optimized for Ascend devices;
  - `--modelFile=dbnet_resnet50-c3a4aa24-fbf95c82.mindir` indicates that the path of the model to be converted is `dbnet_resnet50-c3a4aa24-fbf95c82.mindir`;
  - `--outputFile=dbnet_resnet50_lite` indicates that the path of the output model is `dbnet_resnet50_lite`; the `.mindir` suffix is added automatically and does not need to be specified.

  After the above command is executed, the `dbnet_resnet50_lite.mindir` model file will be generated.

  Learn more about converter_lite

  Learn more about Model Conversion Tutorial
- Perform inference using `deploy/py_infer/infer.py` and the `dbnet_resnet50_lite.mindir` file:

      python deploy/py_infer/infer.py \
          --input_images_dir=/path/to/ic15/ch4_test_images \
          --det_model_path=/path/to/mindir/dbnet_resnet50_lite.mindir \
          --det_model_name_or_config=en_ms_det_dbnet_resnet50 \
          --res_save_dir=/path/to/dbnet_resnet50_results

  After the execution is completed, the prediction file `det_results.txt` will be generated in the directory pointed to by the `--res_save_dir` parameter.

  When doing inference, you can use the `--vis_det_save_dir` parameter to visualize the results:

  Visualization of text detection results

  Learn more about infer.py inference parameters
- Evaluate the results with the following command:

      python deploy/eval_utils/eval_det.py \
          --gt_path=/path/to/ic15/test_det_gt.txt \
          --pred_path=/path/to/dbnet_resnet50_results/det_results.txt

  The result is:

      {'recall': 0.8348579682233991, 'precision': 0.8657014478282576, 'f-score': 0.85}
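As a sanity check on the reported metrics, the F-score can be recomputed from the recall and precision values. The sketch below assumes that `eval_det.py` reports the standard F-score (the harmonic mean of precision and recall); that assumption is not confirmed against the script itself, and `f_score` is a hypothetical helper name.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation result shown above.
result = {"recall": 0.8348579682233991, "precision": 0.8657014478282576}

f = f_score(result["precision"], result["recall"])
print(round(f, 2))  # 0.85
```

Under that assumption the recomputed value agrees with the reported `'f-score': 0.85` and with the 85.00 F-score listed for DBNet ResNet-50 (en) in the model support list.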
3.2 Text Recognition
Let's take `CRNN ResNet34_vd en` in the model support list as an example to introduce the inference method:
- Download the MindIR file in the model support list;
- Use the converter_lite tool on Ascend310 or 310P to convert the MindIR to MindSpore Lite MindIR:

  Run the following command:

      converter_lite \
          --saveType=MINDIR \
          --fmk=MINDIR \
          --optimize=ascend_oriented \
          --modelFile=crnn_resnet34-83f37f07-eb10a0c9.mindir \
          --outputFile=crnn_resnet34vd_lite

  After the above command is executed, the `crnn_resnet34vd_lite.mindir` model file will be generated.

  For a brief description of the converter_lite parameters, see the text detection example above.

  Learn more about converter_lite

  Learn more about Model Conversion Tutorial
- Perform inference using `deploy/py_infer/infer.py` and the `crnn_resnet34vd_lite.mindir` file:

      python deploy/py_infer/infer.py \
          --input_images_dir=/path/to/ic15/ch4_test_word_images \
          --rec_model_path=/path/to/mindir/crnn_resnet34vd_lite.mindir \
          --rec_model_name_or_config=../../configs/rec/crnn/crnn_resnet34.yaml \
          --res_save_dir=/path/to/rec_infer_results

  After the execution is completed, the prediction file `rec_results.txt` will be generated in the directory pointed to by the `--res_save_dir` parameter.

  Learn more about infer.py inference parameters
- Evaluate the results with the following command:

      python deploy/eval_utils/eval_rec.py \
          --gt_path=/path/to/ic15/rec_gt.txt \
          --pred_path=/path/to/rec_infer_results/rec_results.txt
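The converter_lite commands in sections 3.1 and 3.2 share every flag except the input model and the output name. A minimal sketch of a helper that assembles the argument list is shown below; `converter_lite_args` is a hypothetical name, and only flags that appear in this guide are used. The second return value reflects the guide's statement that the `.mindir` suffix is appended to `--outputFile` automatically.

```python
import shlex


def converter_lite_args(model_file, output_stem):
    """Build the converter_lite argument list used in this guide.

    Returns the argument list and the name of the file that the guide
    says will be generated (the output stem plus the .mindir suffix).
    """
    args = [
        "converter_lite",
        "--saveType=MINDIR",            # output format
        "--fmk=MINDIR",                 # input format
        "--optimize=ascend_oriented",   # optimize for Ascend devices
        f"--modelFile={model_file}",
        f"--outputFile={output_stem}",
    ]
    return args, f"{output_stem}.mindir"


args, generated = converter_lite_args(
    "crnn_resnet34-83f37f07-eb10a0c9.mindir", "crnn_resnet34vd_lite"
)
print(shlex.join(args))  # shell-ready command line
print(generated)         # crnn_resnet34vd_lite.mindir
```

On a machine where converter_lite is installed and configured, the list could be passed to `subprocess.run(args, check=True)`; the sketch itself only builds the command and does not execute it.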
3.3 Text Direction Classification
Let's take `MobileNetV3` in the model support list as an example to introduce the inference method:
- Download the ckpt file;
- Use `tools/export.py` to convert the ckpt to MindIR:
  - To dynamic MindIR:

        python tools/export.py \
            --model_name_or_config configs/cls/mobilenetv3/cls_mv3.yaml \
            --save_dir /path/to/save/cls_mv3 \
            --is_dynamic_shape True \
            --model_type cls

  - To static MindIR:

        python tools/export.py \
            --model_name_or_config configs/cls/mobilenetv3/cls_mv3.yaml \
            --save_dir /path/to/save/cls_mv3 \
            --is_dynamic_shape False \
            --data_shape 48 192
- Use the converter_lite tool on Ascend310 or 310P to convert the MindIR to MindSpore Lite MindIR:

  Run the following command:

      converter_lite \
          --saveType=MINDIR \
          --fmk=MINDIR \
          --optimize=ascend_oriented \
          --modelFile=/path/to/save/cls_mv3.mindir \
          --outputFile=cls_mv3_lite

  After the above command is executed, the `cls_mv3_lite.mindir` model file will be generated.

  Learn more about converter_lite

  Learn more about Model Conversion Tutorial
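The dynamic and static export commands in the second step above differ only in `--is_dynamic_shape` and in whether `--data_shape` or `--model_type` is passed. The sketch below builds either variant from an NCHW tuple as listed in the model support list; `export_args` and its parameters are hypothetical names, and the flags are copied from the two commands rather than from the `export.py` source.

```python
def export_args(config, save_dir, nchw=None, model_type="cls"):
    """Build tools/export.py arguments for a dynamic or static MindIR.

    nchw: shape tuple from the model support list, e.g. (1, 3, 48, 192).
          None requests a dynamic-shape export.
    """
    args = [
        "python", "tools/export.py",
        "--model_name_or_config", config,
        "--save_dir", save_dir,
    ]
    if nchw is None:
        # Dynamic export: mirrors the flags of the dynamic command above.
        args += ["--is_dynamic_shape", "True", "--model_type", model_type]
    else:
        # Static export: --data_shape takes only height and width.
        _, _, height, width = nchw
        args += ["--is_dynamic_shape", "False",
                 "--data_shape", str(height), str(width)]
    return args


config = "configs/cls/mobilenetv3/cls_mv3.yaml"
print(" ".join(export_args(config, "/path/to/save/cls_mv3")))
print(" ".join(export_args(config, "/path/to/save/cls_mv3", (1, 3, 48, 192))))
```

The same height and width extraction applies to the detection example in 3.1, where the table entry (1,3,736,1280) corresponds to `--data_shape 736 1280`.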
3.4 End to End Inference
Prepare the MindIR files as described in Text Detection, Text Recognition and Text Direction Classification, then run the following command to do end-to-end inference:
python deploy/py_infer/infer.py \
--input_images_dir=/path/to/ic15/ch4_test_images \
--det_model_path=/path/to/mindir/dbnet_resnet50_lite.mindir \
--det_model_name_or_config=en_ms_det_dbnet_resnet50 \
--cls_model_path=/path/to/mindir/cls_mv3_lite.mindir \
--cls_model_name_or_config=configs/cls/mobilenetv3/cls_mv3.yaml \
--rec_model_path=/path/to/mindir/crnn_resnet34vd_lite.mindir \
--rec_model_name_or_config=configs/rec/crnn/crnn_resnet34.yaml \
--res_save_dir=/path/to/infer_results
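Detection-only (3.1), recognition-only (3.2) and end-to-end inference all call the same script and differ only in which `--*_model_path` and `--*_model_name_or_config` pairs are passed. A minimal sketch of that pattern follows; `infer_args` and its parameter names are hypothetical, and only flags shown in this guide are used.

```python
def infer_args(input_images_dir, res_save_dir,
               det_model=None, cls_model=None, rec_model=None):
    """Build deploy/py_infer/infer.py arguments.

    Each *_model argument is an optional (model_path, name_or_config) pair;
    only the stages that are given are added to the command.
    """
    args = ["python", "deploy/py_infer/infer.py",
            f"--input_images_dir={input_images_dir}"]
    stages = (("det", det_model), ("cls", cls_model), ("rec", rec_model))
    for prefix, model in stages:
        if model is not None:
            path, name_or_config = model
            args.append(f"--{prefix}_model_path={path}")
            args.append(f"--{prefix}_model_name_or_config={name_or_config}")
    args.append(f"--res_save_dir={res_save_dir}")
    return args


# End-to-end command from this section, built from the three stage pairs.
end_to_end = infer_args(
    "/path/to/ic15/ch4_test_images",
    "/path/to/infer_results",
    det_model=("/path/to/mindir/dbnet_resnet50_lite.mindir",
               "en_ms_det_dbnet_resnet50"),
    cls_model=("/path/to/mindir/cls_mv3_lite.mindir",
               "configs/cls/mobilenetv3/cls_mv3.yaml"),
    rec_model=("/path/to/mindir/crnn_resnet34vd_lite.mindir",
               "configs/rec/crnn/crnn_resnet34.yaml"),
)
print(" \\\n    ".join(end_to_end))
```

Dropping the `cls_model` and `rec_model` pairs yields the detection-only command from section 3.1.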
4. FAQ about Conversion and Inference
For problems with model conversion and inference, please refer to the FAQ for solutions.