OCR Model List(V3, updated on 2022.4.28)
Note
- Compared with the v2 models, the 3rd version of the detection model improves accuracy, and the 2.1 version of the recognition model is optimized for accuracy and CPU speed.
- Compared with the 1.1 models, which are trained with the static graph programming paradigm, models 2.0 or higher are trained with dynamic graphs and achieve comparable performance.
- All models in this tutorial are from the PaddleOCR series. For an introduction to more algorithms and models based on public datasets, refer to the algorithm overview tutorial.
The downloadable models provided by PaddleOCR include the inference model, trained model, pre-trained model and nb model. The differences between the models are as follows:
model type | model format | description |
---|---|---|
inference model | inference.pdmodel, inference.pdiparams | Used for inference based on the Paddle inference engine, detail |
trained model, pre-trained model | *.pdparams, *.pdopt, *.states | Checkpoints saved during training, which store the parameters of the model; mostly used for model evaluation and continued training. |
nb model | *.nb | Model optimized by Paddle-Lite, which is suitable for mobile-side deployment scenarios (Paddle-Lite is needed for nb model deployment). |
The relationship of the above models is as follows.
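As a rough illustration of the table above, the following sketch guesses the model type from the files found in a downloaded model directory. The helper name `guess_model_type` and the example file names `best_accuracy.*` and `model_opt.nb` are assumptions made for illustration; only the file names and extensions in the constants come from the table.

```python
from pathlib import Path

# File names and extensions taken from the model type table above.
INFERENCE_FILES = {"inference.pdmodel", "inference.pdiparams"}
CHECKPOINT_SUFFIXES = {".pdparams", ".pdopt", ".states"}


def guess_model_type(file_names):
    """Guess the model type from a collection of file names (hypothetical helper)."""
    names = {Path(name).name for name in file_names}
    # An inference model ships both the graph file and the parameter file.
    if INFERENCE_FILES <= names:
        return "inference model"
    suffixes = {Path(name).suffix for name in names}
    if ".nb" in suffixes:
        return "nb model"
    # The table does not distinguish trained and pre-trained models by format.
    if suffixes & CHECKPOINT_SUFFIXES:
        return "trained model or pre-trained model"
    return "unknown"


if __name__ == "__main__":
    print(guess_model_type(["inference.pdmodel", "inference.pdiparams"]))
    print(guess_model_type(["best_accuracy.pdparams", "best_accuracy.pdopt"]))
    print(guess_model_type(["model_opt.nb"]))
```

Because trained and pre-trained models share the same file formats, the sketch cannot tell them apart; that distinction has to come from the download link you used.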
1. Text Detection Model
1.1 Chinese Detection Model
model name | description | config | model size | download |
---|---|---|---|---|
ch_PP-OCRv3_det_slim | [New] slim quantization with distillation lightweight model, supporting Chinese, English, multilingual text detection | ch_PP-OCRv3_det_cml.yml | 1.1M | inference model / trained model / nb model |
ch_PP-OCRv3_det | [New] Original lightweight model, supporting Chinese, English, multilingual text detection | ch_PP-OCRv3_det_cml.yml | 3.8M | inference model / trained model |
ch_PP-OCRv2_det_slim | [New] slim quantization with distillation lightweight model, supporting Chinese, English, multilingual text detection | ch_PP-OCRv2_det_cml.yml | 3.0M | inference model |
ch_PP-OCRv2_det | [New] Original lightweight model, supporting Chinese, English, multilingual text detection | ch_PP-OCRv2_det_cml.yml | 3.0M | inference model / trained model |
ch_ppocr_mobile_slim_v2.0_det | Slim pruned lightweight model, supporting Chinese, English, multilingual text detection | ch_det_mv3_db_v2.0.yml | 2.6M | inference model |
ch_ppocr_mobile_v2.0_det | Original lightweight model, supporting Chinese, English, multilingual text detection | ch_det_mv3_db_v2.0.yml | 3.0M | inference model / trained model |
ch_ppocr_server_v2.0_det | General model, which is larger than the lightweight model but achieves better performance | ch_det_res18_db_v2.0.yml | 47.0M | inference model / trained model |
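When deployment size is the main constraint, the table above can be filtered by model size. The sketch below copies the names and sizes from the table and lists the models that fit a given budget. The helper names `parse_size` and `models_within` are hypothetical, and reading the `M` suffix as megabytes is an assumption, since the table does not state the unit explicitly.

```python
# (model name, model size) pairs copied from the Chinese detection table above.
CH_DET_MODELS = [
    ("ch_PP-OCRv3_det_slim", "1.1M"),
    ("ch_PP-OCRv3_det", "3.8M"),
    ("ch_PP-OCRv2_det_slim", "3.0M"),
    ("ch_PP-OCRv2_det", "3.0M"),
    ("ch_ppocr_mobile_slim_v2.0_det", "2.6M"),
    ("ch_ppocr_mobile_v2.0_det", "3.0M"),
    ("ch_ppocr_server_v2.0_det", "47.0M"),
]


def parse_size(text):
    """Convert a size such as '3.8M' to a float ('M' is assumed to mean megabytes)."""
    return float(text.rstrip("M"))


def models_within(budget, models=CH_DET_MODELS):
    """Return the names of all models whose size does not exceed the budget."""
    return [name for name, size in models if parse_size(size) <= budget]


if __name__ == "__main__":
    for name in models_within(3.0):
        print(name)
```

Size is only one criterion; the description column should still guide the final choice between the slim, original and server variants.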
1.2 English Detection Model
model name | description | config | model size | download |
---|---|---|---|---|
en_PP-OCRv3_det_slim | [New] Slim quantization with distillation lightweight detection model, supporting English | ch_PP-OCRv3_det_cml.yml | 1.1M | inference model / trained model / nb model |
en_PP-OCRv3_det | [New] Original lightweight detection model, supporting English | ch_PP-OCRv3_det_cml.yml | 3.8M | inference model / trained model |
- Note: The English configuration file is the same as the Chinese one except for the training data, so only one configuration file is provided.
1.3 Multilingual Detection Model
model name | description | config | model size | download |
---|---|---|---|---|
ml_PP-OCRv3_det_slim | [New] Slim quantization with distillation lightweight detection model, supporting multilingual text detection | ch_PP-OCRv3_det_cml.yml | 1.1M | inference model / trained model / nb model |
ml_PP-OCRv3_det | [New] Original lightweight detection model, supporting multilingual text detection | ch_PP-OCRv3_det_cml.yml | 3.8M | inference model / trained model |
- Note: The multilingual configuration file is the same as the Chinese one except for the training data, so only one configuration file is provided.
2. Text Recognition Model
2.1 Chinese Recognition Model
model name | description | config | model size | download |
---|---|---|---|---|
ch_PP-OCRv3_rec_slim | [New] Slim quantization with distillation lightweight model, supporting Chinese, English text recognition | ch_PP-OCRv3_rec_distillation.yml | 4.9M | inference model / trained model / nb model |
ch_PP-OCRv3_rec | [New] Original lightweight model, supporting Chinese, English, multilingual text recognition | ch_PP-OCRv3_rec_distillation.yml | 12.4M | inference model / trained model |
ch_PP-OCRv2_rec_slim | Slim quantization with distillation lightweight model, supporting Chinese, English text recognition | ch_PP-OCRv2_rec.yml | 9.0M | inference model / trained model |
ch_PP-OCRv2_rec | Original lightweight model, supporting Chinese, English, and multilingual text recognition | ch_PP-OCRv2_rec_distillation.yml | 8.5M | inference model / trained model |
ch_ppocr_mobile_slim_v2.0_rec | Slim pruned and quantized lightweight model, supporting Chinese, English and number recognition | rec_chinese_lite_train_v2.0.yml | 6.0M | inference model / trained model |
ch_ppocr_mobile_v2.0_rec | Original lightweight model, supporting Chinese, English and number recognition | rec_chinese_lite_train_v2.0.yml | 5.2M | inference model / trained model / pre-trained model |
ch_ppocr_server_v2.0_rec | General model, supporting Chinese, English and number recognition | rec_chinese_common_train_v2.0.yml | 94.8M | inference model / trained model / pre-trained model |
Note: The trained model is fine-tuned on the pre-trained model with real data and synthesized vertical text data, which achieves better performance in real scenes. The pre-trained model is trained directly on the full amount of real data and synthesized data, which makes it more suitable for fine-tuning on your own dataset.
2.2 English Recognition Model
model name | description | config | model size | download |
---|---|---|---|---|
en_PP-OCRv3_rec_slim | [New] Slim quantization with distillation lightweight model, supporting English text recognition | en_PP-OCRv3_rec.yml | 3.2M | inference model / trained model / nb model |
en_PP-OCRv3_rec | [New] Original lightweight model, supporting English, multilingual text recognition | en_PP-OCRv3_rec.yml | 9.6M | inference model / trained model |
en_number_mobile_slim_v2.0_rec | Slim pruned and quantized lightweight model, supporting English and number recognition | rec_en_number_lite_train.yml | 2.7M | inference model / trained model |
en_number_mobile_v2.0_rec | Original lightweight model, supporting English and number recognition | rec_en_number_lite_train.yml | 2.6M | inference model / trained model |
Note: The dictionary file of all English recognition models is ppocr/utils/en_dict.txt.
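A recognition dictionary maps each supported character to a class index. The sketch below shows one way to load such a file, assuming it lists one character per line; that layout is an assumption and is not stated in this document. The helper name `load_char_dict` is hypothetical, and the demo writes a tiny stand-in file rather than reading the real ppocr/utils/en_dict.txt.

```python
import os
import tempfile


def load_char_dict(path):
    """Map each character to its line index, assuming one character per line."""
    with open(path, encoding="utf-8") as f:
        # Strip only line endings so that a line holding a single space survives.
        chars = [line.rstrip("\r\n") for line in f]
    return {ch: idx for idx, ch in enumerate(chars) if ch != ""}


if __name__ == "__main__":
    # Stand-in dictionary with three entries, used only for this demonstration.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write("0\na\nb\n")
    try:
        print(load_char_dict(tmp.name))
    finally:
        os.remove(tmp.name)
```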
2.3 Multilingual Recognition Model(Updating...)
model name | dict file | description | config | model size | download |
---|---|---|---|---|---|
korean_PP-OCRv3_rec | ppocr/utils/dict/korean_dict.txt | Lightweight model for Korean recognition | korean_PP-OCRv3_rec.yml | 11.0M | inference model / trained model |
japan_PP-OCRv3_rec | ppocr/utils/dict/japan_dict.txt | Lightweight model for Japanese recognition | japan_PP-OCRv3_rec.yml | 11.0M | inference model / trained model |
chinese_cht_PP-OCRv3_rec | ppocr/utils/dict/chinese_cht_dict.txt | Lightweight model for Traditional Chinese recognition | chinese_cht_PP-OCRv3_rec.yml | 12.0M | inference model / trained model |
te_PP-OCRv3_rec | ppocr/utils/dict/te_dict.txt | Lightweight model for Telugu recognition | te_PP-OCRv3_rec.yml | 9.6M | inference model / trained model |
ka_PP-OCRv3_rec | ppocr/utils/dict/ka_dict.txt | Lightweight model for Kannada recognition | ka_PP-OCRv3_rec.yml | 9.9M | inference model / trained model |
ta_PP-OCRv3_rec | ppocr/utils/dict/ta_dict.txt | Lightweight model for Tamil recognition | ta_PP-OCRv3_rec.yml | 9.6M | inference model / trained model |
latin_PP-OCRv3_rec | ppocr/utils/dict/latin_dict.txt | Lightweight model for Latin recognition | latin_PP-OCRv3_rec.yml | 9.7M | inference model / trained model |
arabic_PP-OCRv3_rec | ppocr/utils/dict/arabic_dict.txt | Lightweight model for Arabic recognition | arabic_PP-OCRv3_rec.yml | 9.6M | inference model / trained model |
cyrillic_PP-OCRv3_rec | ppocr/utils/dict/cyrillic_dict.txt | Lightweight model for Cyrillic recognition | cyrillic_PP-OCRv3_rec.yml | 9.6M | inference model / trained model |
devanagari_PP-OCRv3_rec | ppocr/utils/dict/devanagari_dict.txt | Lightweight model for Devanagari recognition | devanagari_PP-OCRv3_rec.yml | 9.9M | inference model / trained model |
For a complete list of languages and tutorials, please refer to Multi-language model.
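The rows of the multilingual table follow one naming pattern: the language prefix determines the model name, the dictionary path and the config file name. The sketch below reconstructs a row from its prefix. The helper name `multilingual_rec_entry` is hypothetical, and the pattern is only known to hold for the ten prefixes listed in the table, so other prefixes are rejected rather than guessed.

```python
# Language prefixes that appear in the multilingual recognition table above.
LISTED_LANGS = (
    "korean", "japan", "chinese_cht", "te", "ka",
    "ta", "latin", "arabic", "cyrillic", "devanagari",
)


def multilingual_rec_entry(lang):
    """Build the model name, dict file and config name for a listed language prefix."""
    if lang not in LISTED_LANGS:
        raise ValueError(f"{lang!r} is not listed in the table")
    return {
        "model_name": f"{lang}_PP-OCRv3_rec",
        "dict_file": f"ppocr/utils/dict/{lang}_dict.txt",
        "config": f"{lang}_PP-OCRv3_rec.yml",
    }


if __name__ == "__main__":
    print(multilingual_rec_entry("korean"))
```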
3. Text Angle Classification Model
model name | description | config | model size | download |
---|---|---|---|---|
ch_ppocr_mobile_slim_v2.0_cls | Slim quantized model for text angle classification | cls_mv3.yml | 2.1M | inference model / trained model / nb model |
ch_ppocr_mobile_v2.0_cls | Original model for text angle classification | cls_mv3.yml | 1.38M | inference model / trained model |
4. Paddle-Lite Model
Paddle-Lite is an updated version of Paddle-Mobile, an open-source deep learning framework designed to make it easy to perform inference on mobile, embedded, and IoT devices. It can further optimize the inference model and generate the nb model used for edge devices. It is recommended to optimize the quantized model with Paddle-Lite, because the INT8 format is used for model storage and inference.
This chapter lists OCR nb models for PP-OCRv2 or earlier versions. You can access the latest nb models from the tables above.
Version | Introduction | Model size | Detection model | Text Direction model | Recognition model | Paddle-Lite branch |
---|---|---|---|---|---|---|
PP-OCRv2 | extra-lightweight Chinese OCR optimized model | 11.0M | download link | download link | download link | v2.10 |
PP-OCRv2(slim) | extra-lightweight Chinese OCR optimized model | 4.6M | download link | download link | download link | v2.10 |
PP-OCRv2 | extra-lightweight Chinese OCR optimized model | 11.0M | download link | download link | download link | v2.9 |
PP-OCRv2(slim) | extra-lightweight Chinese OCR optimized model | 4.9M | download link | download link | download link | v2.9 |
V2.0 | ppocr_v2.0 extra-lightweight Chinese OCR optimized model | 7.8M | download link | download link | download link | v2.9 |
V2.0(slim) | ppocr_v2.0 extra-lightweight Chinese OCR optimized model | 3.3M | download link | download link | download link | v2.9 |