ParseQ

1. Introduction
2. Environment
3. Model Training, Evaluation, and Prediction
4. Inference and Deployment
5. FAQ

1. Introduction

Paper:

Scene Text Recognition with Permuted Autoregressive Sequence Models. Darwin Bautista, Rowel Atienza. ECCV, 2022.

The original paper trains on a real text recognition dataset collection (Real) and on a synthetic one (Synth), and evaluates on the IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE datasets, where:

Real consists of the COCO-Text, RCTW17, Uber-Text, ArT, LSVT, MLT19, ReCTS, TextOCR, and OpenVINO datasets.
Synth consists of the MJSynth and SynthText datasets.

The accuracy reproduced with each training set is as follows:

Training dataset	Model	Backbone	Config file	Acc	Download link
Synth	ParseQ	VIT	rec_vit_parseq.yml	91.24%	trained model
Real	ParseQ	VIT	rec_vit_parseq.yml	94.74%	trained model

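2. Environment

First set up the PaddleOCR runtime environment by following "Environment Preparation", then clone the project code by following "Project Clone".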

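3. Model Training, Evaluation, and Prediction

Please refer to the text recognition tutorial. PaddleOCR's code is modular, so training a different recognition model only requires switching the configuration file.

Training

Once the data has been prepared, training can be started with the following commands:

# Single-GPU training (takes a long time, not recommended)
python3 tools/train.py -c configs/rec/rec_vit_parseq.yml

# Multi-GPU training; select the GPU IDs with the --gpus argument
python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_vit_parseq.yml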
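
The two training commands differ only in whether the script is wrapped by paddle.distributed.launch. As a minimal sketch, the choice could be scripted as below; `build_train_command` is a hypothetical helper written for this illustration and is not part of PaddleOCR.

```python
def build_train_command(config, gpu_ids):
    """Return the argv list for a training run on the given GPU ids.

    Hypothetical helper: it only assembles the two command forms shown
    in this document and does not start any process itself.
    """
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    train_args = ["tools/train.py", "-c", config]
    if len(gpu_ids) == 1:
        # Single-GPU form: plain python3 call, no launcher and no --gpus.
        return ["python3", *train_args]
    # Multi-GPU form: the launcher receives the ids as one comma-separated value.
    gpus = ",".join(str(i) for i in gpu_ids)
    return ["python3", "-m", "paddle.distributed.launch", "--gpus", gpus, *train_args]


if __name__ == "__main__":
    cmd = build_train_command("configs/rec/rec_vit_parseq.yml", [0, 1, 2, 3])
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run` from the repository root, which avoids shell quoting issues around the `--gpus` value.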

Evaluation

# GPU evaluation; Global.pretrained_model is the set of weights to evaluate
python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_vit_parseq.yml -o Global.pretrained_model={path/to/weights}/best_accuracy
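
Note that Global.pretrained_model is given as a path prefix (best_accuracy) rather than a file name. A small pre-flight check can catch a wrong path before the evaluation is launched. The sketch below assumes the weights are stored next to the prefix with Paddle's usual `.pdparams` suffix; `resolve_checkpoint` is a hypothetical helper, not a PaddleOCR function.

```python
import os


def resolve_checkpoint(prefix):
    """Return the parameter file that belongs to a checkpoint prefix.

    Assumption: weights live in "<prefix>.pdparams". A prefix that already
    carries the suffix is accepted as well.
    """
    suffix = ".pdparams"
    if prefix.endswith(suffix):
        prefix = prefix[: -len(suffix)]
    params = prefix + suffix
    if not os.path.isfile(params):
        raise FileNotFoundError(f"no weights found at {params}")
    return params


# Usage (the directory is a placeholder, as in the command above):
#   resolve_checkpoint("{path/to/weights}/best_accuracy")
```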

Prediction

# The configuration file used for prediction must be the same one used for training
python3 tools/infer_rec.py -c configs/rec/rec_vit_parseq.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/en/word_1.png
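
The -o flag in the commands above takes one or more space-separated Key.subkey=value pairs that override entries of the YAML configuration. As a rough sketch, such pairs can be generated from a dictionary; `format_overrides` is a hypothetical helper, and the weights directory is left as the same placeholder used above.

```python
def format_overrides(overrides):
    """Turn {"Global.key": value} into the "Global.key=value" strings that follow -o."""
    return [f"{key}={value}" for key, value in overrides.items()]


overrides = {
    # Placeholder path, to be replaced with the real weights directory.
    "Global.pretrained_model": "{path/to/weights}/best_accuracy",
    "Global.infer_img": "doc/imgs_words/en/word_1.png",
}
cmd = [
    "python3", "tools/infer_rec.py",
    "-c", "configs/rec/rec_vit_parseq.yml",
    "-o", *format_overrides(overrides),
]
print(" ".join(cmd))
```

Keeping the overrides in a dictionary makes it easy to reuse the same configuration file for training, evaluation, and prediction while changing only the weights and the input image.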

4. Inference and Deployment

4.1 Python Inference

First, convert the model saved during ParseQ text recognition training into an inference model (model download link). The conversion can be done with the following command:

python3 tools/export_model.py -c configs/rec/rec_vit_parseq.yml -o Global.pretrained_model=./rec_vit_parseq_real/best_accuracy Global.save_inference_dir=./inference/rec_parseq

To run inference with the ParseQ text recognition model, execute the following command:

python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/en/word_1.png" --rec_model_dir="./inference/rec_parseq/" --rec_image_shape="3, 32, 128" --rec_algorithm="ParseQ" --rec_char_dict_path="ppocr/utils/dict/parseq_dict.txt" --max_text_length=25 --use_space_char=False
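
To make the roles of --rec_image_shape, --rec_char_dict_path, --use_space_char, and --max_text_length more concrete, here is a simplified sketch of how such flags are typically interpreted by a recognizer. It is not PaddleOCR's actual post-processing code: the helper names, the `[E]` end-of-sequence token, and its position at index 0 are all assumptions made for illustration.

```python
def parse_image_shape(text):
    """Parse a "C, H, W" string such as "3, 32, 128" into a tuple of ints."""
    parts = [int(p) for p in text.split(",")]  # int() tolerates surrounding spaces
    if len(parts) != 3:
        raise ValueError(f"expected three values, got {text!r}")
    return tuple(parts)


def build_tokens(dict_lines, use_space_char):
    """Build the token list from dictionary lines (one character per line).

    Assumption: a hypothetical end-of-sequence token "[E]" sits at index 0.
    """
    chars = [line.rstrip("\n") for line in dict_lines if line.rstrip("\n")]
    if use_space_char:
        chars.append(" ")
    return ["[E]"] + chars


def greedy_decode(indices, tokens, max_text_length, eos_index=0):
    """Map per-step argmax indices to text, stopping at the end token."""
    out = []
    for idx in indices[:max_text_length]:
        if idx == eos_index:
            break
        out.append(tokens[idx])
    return "".join(out)


if __name__ == "__main__":
    print(parse_image_shape("3, 32, 128"))  # (3, 32, 128)
    tokens = build_tokens(["a\n", "b\n", "c\n"], use_space_char=False)
    print(greedy_decode([1, 2, 3, 0, 1], tokens, max_text_length=25))  # abc
```

In this sketch, max_text_length only caps the number of decoding steps, and use_space_char decides whether a space can appear in the output at all.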

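4.2 C++ Inference

Not supported yet, because the C++ pre-processing and post-processing do not yet support ParseQ.

4.3 Serving Deployment

Not supported yet.

4.4 More Inference Deployment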