From ac17abe989ff2c3a451b6871f519b416d3c20cf7 Mon Sep 17 00:00:00 2001
From: zhangyubo0722 <94225063+zhangyubo0722@users.noreply.github.com>
Date: Tue, 20 May 2025 18:08:13 +0800
Subject: [PATCH] add ocrv5 inference time (#15234)

Co-authored-by: zhangyubo0722
---
 .../module_usage/text_recognition.en.md | 30 +++---
 .../module_usage/text_recognition.md | 30 +++---
 docs/version3.x/pipeline_usage/OCR.en.md | 99 +++++++++--------
 docs/version3.x/pipeline_usage/OCR.md | 38 ++++---
 .../pipeline_usage/PP-ChatOCRv4.en.md | 62 ++++++++---
 .../version3.x/pipeline_usage/PP-ChatOCRv4.md | 39 ++++---
 .../pipeline_usage/PP-StructureV3.en.md | 102 ++++++++++++++----
 .../pipeline_usage/PP-StructureV3.md | 38 ++++---
 .../pipeline_usage/seal_recognition.en.md | 95 ++++++++++++----
 .../pipeline_usage/seal_recognition.md | 38 ++++---
 .../pipeline_usage/table_recognition_v2.en.md | 98 +++++++++--------
 .../pipeline_usage/table_recognition_v2.md | 38 ++++---
 12 files changed, 422 insertions(+), 285 deletions(-)

diff --git a/docs/version3.x/module_usage/text_recognition.en.md b/docs/version3.x/module_usage/text_recognition.en.md
index 680ad04e11..407351c927 100644
--- a/docs/version3.x/module_usage/text_recognition.en.md
+++ b/docs/version3.x/module_usage/text_recognition.en.md
@@ -24,19 +24,18 @@ The text recognition module is the core component of an OCR (Optical Character R
 PP-OCRv5_server_recInference Model/Pretrained Model 86.38 - - - - -205 M -PP-OCRv5_server_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages—Simplified Chinese, Traditional Chinese, English, and Japanese—as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios.
+ 8.45/2.36 + 122.69/122.69 +81 M +PP-OCRv5_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages—Simplified Chinese, Traditional Chinese, English, and Japanese—as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios. PP-OCRv5_mobile_recInference Model/Pretrained Model 81.29 - - - - -136 M -PP-OCRv5_mobile_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages—Simplified Chinese, Traditional Chinese, English, and Japanese—as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios. + 1.46/5.43 + 5.32/91.79 +16 M PP-OCRv4_server_rec_docInference Model/PP-OCRv5_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages—Simplified Chinese, Traditional Chinese, English, and Japanese—as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios. 
PP-OCRv5_mobile_recInference Model/推理模型/训练模型 86.38 - - - - -205M -PP-OCRv5_server_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 + 8.45/2.36 + 122.69/122.69 +81 M +PP-OCRv5_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 PP-OCRv5_mobile_rec推理模型/训练模型 81.29 - - - - -128 -PP-OCRv5_mobile_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 + 1.46/5.43 + 5.32/91.79 +16 M PP-OCRv4_server_rec_doc推理模型/PP-OCRv5_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 PP-OCRv5_mobile_rec推理模型/Inference Model/Training Model +PP-OCRv5_server_rec_infer.tar">Inference Model/Pretrained Model 86.38 - - - - -205 -PP-OCRv5_server_rec is a next-generation text recognition model designed to efficiently and accurately support Simplified Chinese, Traditional Chinese, English, and Japanese, as well as complex scenarios like handwriting, vertical text, pinyin, and rare characters. It balances recognition performance with inference speed and robustness, providing reliable support for document understanding across diverse scenarios. + 8.45/2.36 + 122.69/122.69 +81 M +PP-OCRv5_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages—Simplified Chinese, Traditional Chinese, English, and Japanese—as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios. 
PP-OCRv5_mobile_recInference Model/Training Model +PP-OCRv5_mobile_rec_infer.tar">Inference Model/Pretrained Model 81.29 - - - - -128 -PP-OCRv5_mobile_rec is a next-generation lightweight text recognition model optimized for efficiency and accuracy across Simplified Chinese, Traditional Chinese, English, and Japanese, including complex scenarios like handwriting and vertical text. It delivers robust performance while maintaining fast inference speeds. + 1.46/5.43 + 5.32/91.79 +16 M PP-OCRv4_server_rec_docInference Model/Training Model +PP-OCRv4_server_rec_doc_infer.tar">Inference Model/Pretrained Model 86.58 6.65 / 2.38 32.92 / 32.92 -181 -PP-OCRv4_server_rec_doc is trained on a hybrid dataset of Chinese document data and PP-OCR training data, enhancing recognition for Traditional Chinese, Japanese, and special characters. It supports 15,000+ characters and improves both document-specific and general text recognition. +91 M +PP-OCRv4_server_rec_doc is trained on a mixed dataset of more Chinese document data and PP-OCR training data, building upon PP-OCRv4_server_rec. It enhances the recognition capabilities for some Traditional Chinese characters, Japanese characters, and special symbols, supporting over 15,000 characters. In addition to improving document-related text recognition, it also enhances general text recognition capabilities. -PP-OCRv4_mobile_recInference Model/Training Model +PP-OCRv4_mobile_recInference Model/Pretrained Model 83.28 4.82 / 1.20 16.74 / 4.64 -88 -PP-OCRv4's lightweight recognition model, optimized for fast inference on edge devices and various hardware platforms. +11 M +A lightweight recognition model of PP-OCRv4 with high inference efficiency, suitable for deployment on various hardware devices, including edge devices. 
-PP-OCRv4_server_rec Inference Model/Training Model +PP-OCRv4_server_rec Inference Model/Pretrained Model 85.19 6.58 / 2.43 33.17 / 33.17 -151 -PP-OCRv4's server-side model, delivering high accuracy for deployment on various servers. +87 M +The server-side model of PP-OCRv4, offering high inference accuracy and deployable on various servers. en_PP-OCRv4_mobile_recInference Model/Training Model +en_PP-OCRv4_mobile_rec_infer.tar">Inference Model/Pretrained Model 70.39 4.81 / 0.75 16.10 / 5.31 -66 -An ultra-lightweight English recognition model based on PP-OCRv4, supporting English and numeric characters. +7.3 M +An ultra-lightweight English recognition model trained based on the PP-OCRv4 recognition model, supporting English and numeric character recognition. +> ❗ The above section lists the **6 core models** that are primarily supported by the text recognition module. In total, the module supports **20 comprehensive models**, including multiple multilingual text recognition models. Below is the complete list of models: -> ❗ The above table highlights 6 core models from the text recognition module, which includes 10 full models in total, covering multiple multilingual recognition models. For the complete list: +
👉Details of the Model List -
👉 Full Model Details - -* PP-OCRv5 Multi-Scene Models +* PP-OCRv5 Multi-Scenario Models - - - - - - - - - + + + + + + + + + +PP-OCRv5_server_rec_infer.tar">Inference Model/Pretrained Model - - - - + + + + +PP-OCRv5_mobile_rec_infer.tar">Inference Model/Pretrained Model - - - - + + +
ModelDownload LinksChinese Accuracy(%)English Accuracy(%)Traditional Chinese Accuracy(%)Japanese Accuracy(%)GPU Inference Time (ms)
[Standard / High-Performance]
CPU Inference Time (ms)
[Standard / High-Performance]
Model Size (MB)DescriptionModelModel Download LinksAvg Accuracy for Chinese Recognition (%)Avg Accuracy for English Recognition (%)Avg Accuracy for Traditional Chinese Recognition (%)Avg Accuracy for Japanese Recognition (%)GPU Inference Time (ms)
[Normal Mode / High-Performance Mode]
CPU Inference Time (ms)
[Normal Mode / High-Performance Mode]
Model Storage Size (M)Introduction
PP-OCRv5_server_recInference Model/Training Model 86.38 64.70 93.29 60.35 - - 205PP-OCRv5_server_rec is a next-generation text recognition model supporting Simplified Chinese, Traditional Chinese, English, and Japanese, including complex scenarios like handwriting and vertical text. 8.45/2.36 122.69/122.69 81 MPP-OCRv5_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages—Simplified Chinese, Traditional Chinese, English, and Japanese—as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios.
PP-OCRv5_mobile_recInference Model/Training Model 81.29 66.00 83.55 54.65 - - 128PP-OCRv5_mobile_rec is a lightweight version optimized for efficiency and accuracy across multiple languages and scenarios. 1.46/5.43 5.32/91.79 16 M
diff --git a/docs/version3.x/pipeline_usage/OCR.md b/docs/version3.x/pipeline_usage/OCR.md index 9dbd3e5c4e..2d88ba4026 100644 --- a/docs/version3.x/pipeline_usage/OCR.md +++ b/docs/version3.x/pipeline_usage/OCR.md @@ -135,19 +135,18 @@ OCR(光学字符识别,Optical Character Recognition)是一种将图像中 PP-OCRv5_server_rec推理模型/训练模型 86.38 - - - - -205 M -PP-OCRv5_server_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 + 8.45/2.36 + 122.69/122.69 +81 M +PP-OCRv5_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 PP-OCRv5_mobile_rec推理模型/训练模型 81.29 - - - - -136 M -PP-OCRv5_mobile_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 + 1.46/5.43 + 5.32/91.79 +16 M PP-OCRv4_server_rec_doc推理模型/推理模型/推理模型/推理模型/推理模型/PP-OCRv5_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 PP-OCRv5_mobile_rec推理模型/Inference Model/Training Model -78.20 -4.82 / 4.82 +PP-OCRv5_server_recInference Model/Pretrained Model +86.38 + 8.45/2.36 + 122.69/122.69 +81 M +PP-OCRv5_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages—Simplified Chinese, Traditional Chinese, English, and Japanese—as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios. 
+ + +PP-OCRv5_mobile_recInference Model/Pretrained Model +81.29 + 1.46/5.43 + 5.32/91.79 +16 M + + +PP-OCRv4_server_rec_docInference Model/Pretrained Model +86.58 +6.65 / 2.38 +32.92 / 32.92 +91 M +PP-OCRv4_server_rec_doc is trained on a mixed dataset of more Chinese document data and PP-OCR training data, building upon PP-OCRv4_server_rec. It enhances the recognition capabilities for some Traditional Chinese characters, Japanese characters, and special symbols, supporting over 15,000 characters. In addition to improving document-related text recognition, it also enhances general text recognition capabilities. + + +PP-OCRv4_mobile_recInference Model/Pretrained Model +83.28 +4.82 / 1.20 16.74 / 4.64 -10.6 M -PP-OCRv4 is the next version of Baidu PaddlePaddle's self-developed text recognition model PP-OCRv3. By introducing data augmentation schemes and GTC-NRTR guidance branches, it further improves text recognition accuracy without compromising inference speed. The model offers both server (server) and mobile (mobile) versions to meet industrial needs in different scenarios. +11 M +A lightweight recognition model of PP-OCRv4 with high inference efficiency, suitable for deployment on various hardware devices, including edge devices. -PP-OCRv4_server_recInference Model/Training Model -79.20 -6.58 / 6.58 +PP-OCRv4_server_rec Inference Model/Pretrained Model +85.19 +6.58 / 2.43 33.17 / 33.17 -71.2 M +87 M +The server-side model of PP-OCRv4, offering high inference accuracy and deployable on various servers. + + +en_PP-OCRv4_mobile_recInference Model/Pretrained Model +70.39 +4.81 / 0.75 +16.10 / 5.31 +7.3 M +An ultra-lightweight English recognition model trained based on the PP-OCRv4 recognition model, supporting English and numeric character recognition. 
diff --git a/docs/version3.x/pipeline_usage/PP-ChatOCRv4.md b/docs/version3.x/pipeline_usage/PP-ChatOCRv4.md index 707f4a0beb..9528f542d7 100644 --- a/docs/version3.x/pipeline_usage/PP-ChatOCRv4.md +++ b/docs/version3.x/pipeline_usage/PP-ChatOCRv4.md @@ -275,27 +275,25 @@ PP-ChatOCRv4 产线中包含版面区域检测模块表格结构识 PP-OCRv5_server_rec推理模型/训练模型 86.38 - - - - -205 M -PP-OCRv5_server_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 + 8.45/2.36 + 122.69/122.69 +81 M +PP-OCRv5_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 PP-OCRv5_mobile_rec推理模型/训练模型 81.29 - - - - -136 M -PP-OCRv5_mobile_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 - + 1.46/5.43 + 5.32/91.79 +16 M PP-OCRv4_server_rec_doc推理模型/训练模型 86.58 6.65 / 2.38 32.92 / 32.92 -91 M +181 M PP-OCRv4_server_rec_doc是在PP-OCRv4_server_rec的基础上,在更多中文文档数据和PP-OCR训练数据的混合数据训练而成,增加了部分繁体字、日文、特殊字符的识别能力,可支持识别的字符为1.5万+,除文档相关的文字识别能力提升外,也同时提升了通用文字的识别能力 @@ -303,7 +301,7 @@ PP-OCRv4_server_rec_doc_infer.tar">推理模型/推理模型/推理模型/推理模型/PP-OCRv5_server_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 PP-OCRv5_mobile_rec推理模型/Inference Model/Training Model -81.53 +PP-OCRv5_server_recInference Model/Pretrained Model +86.38 + 8.45/2.36 + 122.69/122.69 +81 M +PP-OCRv5_server_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages—Simplified Chinese, Traditional Chinese, English, and Japanese—as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. 
While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios. + + +PP-OCRv5_mobile_recInference Model/Pretrained Model +81.29 + 1.46/5.43 + 5.32/91.79 +16 M +PP-OCRv5_mobile_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages—Simplified Chinese, Traditional Chinese, English, and Japanese—as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios. + + +PP-OCRv4_server_rec_docInference Model/Pretrained Model +86.58 6.65 / 2.38 32.92 / 32.92 -74.7 M -PP-OCRv4_server_rec_doc is trained on a mixed dataset of more Chinese document data and PP-OCR training data based on PP-OCRv4_server_rec. It has added the recognition capabilities for some traditional Chinese characters, Japanese, and special characters. The number of recognizable characters is over 15,000. In addition to the improvement in document-related text recognition, it also enhances the general text recognition capability. +91 M +PP-OCRv4_server_rec_doc is trained on a mixed dataset of more Chinese document data and PP-OCR training data, building upon PP-OCRv4_server_rec. It enhances the recognition capabilities for some Traditional Chinese characters, Japanese characters, and special symbols, supporting over 15,000 characters. In addition to improving document-related text recognition, it also enhances general text recognition capabilities. 
-PP-OCRv4_mobile_recInference Model/Training Model -78.74 +PP-OCRv4_mobile_recInference Model/Pretrained Model +83.28 4.82 / 1.20 16.74 / 4.64 -10.6 M -The lightweight recognition model of PP-OCRv4 has high inference efficiency and can be deployed on various hardware devices, including edge devices. +11 M +A lightweight recognition model of PP-OCRv4 with high inference efficiency, suitable for deployment on various hardware devices, including edge devices. -PP-OCRv4_server_rec Inference Model/Trained Model -80.61 +PP-OCRv4_server_rec Inference Model/Pretrained Model +85.19 6.58 / 2.43 33.17 / 33.17 -71.2 M -The server-side model of PP-OCRv4 offers high inference accuracy and can be deployed on various types of servers. +87 M +The server-side model of PP-OCRv4, offering high inference accuracy and deployable on various servers. -PP-OCRv3_mobile_recInference Model/Training Model -72.96 -5.87 / 1.19 -9.07 / 4.28 -9.2 M -PP-OCRv3’s lightweight recognition model is designed for high inference efficiency and can be deployed on a variety of hardware devices, including edge devices. +en_PP-OCRv4_mobile_recInference Model/Pretrained Model +70.39 +4.81 / 0.75 +16.10 / 5.31 +7.3 M +An ultra-lightweight English recognition model trained based on the PP-OCRv4 recognition model, supporting English and numeric character recognition. + + + +> ❗ The above section lists the **6 core models** that are primarily supported by the text recognition module. In total, the module supports **20 comprehensive models**, including multiple multilingual text recognition models. Below is the complete list of models: + +
👉Details of the Model List + +* PP-OCRv5 Multi-Scenario Models + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ModelModel Download LinksAvg Accuracy for Chinese Recognition (%)Avg Accuracy for English Recognition (%)Avg Accuracy for Traditional Chinese Recognition (%)Avg Accuracy for Japanese Recognition (%)GPU Inference Time (ms)
[Normal Mode / High-Performance Mode]
CPU Inference Time (ms)
[Normal Mode / High-Performance Mode]
Model Storage Size (M)Introduction
PP-OCRv5_server_recInference Model/Pretrained Model86.3864.7093.2960.35 8.45/2.36 122.69/122.69 81 MPP-OCRv5_server_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages—Simplified Chinese, Traditional Chinese, English, and Japanese—as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios.
PP-OCRv5_mobile_recInference Model/Pretrained Model81.2966.0083.5554.65 1.46/5.43 5.32/91.79 16 MPP-OCRv5_mobile_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages—Simplified Chinese, Traditional Chinese, English, and Japanese—as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios.
diff --git a/docs/version3.x/pipeline_usage/PP-StructureV3.md b/docs/version3.x/pipeline_usage/PP-StructureV3.md index c56b8820c3..f7bc8c2464 100644 --- a/docs/version3.x/pipeline_usage/PP-StructureV3.md +++ b/docs/version3.x/pipeline_usage/PP-StructureV3.md @@ -242,19 +242,18 @@ comments: true PP-OCRv5_server_rec推理模型/训练模型 86.38 - - - - -205 M -PP-OCRv5_server_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 + 8.45/2.36 + 122.69/122.69 +81 M +PP-OCRv5_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 PP-OCRv5_mobile_rec推理模型/训练模型 81.29 - - - - -136 M -PP-OCRv5_mobile_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 + 1.46/5.43 + 5.32/91.79 +16 M PP-OCRv4_server_rec_doc推理模型/推理模型/推理模型/推理模型/推理模型/PP-OCRv5_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 PP-OCRv5_mobile_rec推理模型/Inference Model/Training Model -81.53 +PP-OCRv5_server_recInference Model/Pretrained Model +86.38 + 8.45/2.36 + 122.69/122.69 +81 M +PP-OCRv5_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages—Simplified Chinese, Traditional Chinese, English, and Japanese—as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios. 
+ + +PP-OCRv5_mobile_recInference Model/Pretrained Model +81.29 + 1.46/5.43 + 5.32/91.79 +16 M + + +PP-OCRv4_server_rec_docInference Model/Pretrained Model +86.58 6.65 / 2.38 32.92 / 32.92 -74.7 M -PP-OCRv4_server_rec_doc is trained on a mixed dataset of more Chinese document data and PP-OCR training data based on PP-OCRv4_server_rec. It has added the ability to recognize some traditional Chinese characters, Japanese, and special characters, and can support the recognition of more than 15,000 characters. In addition to improving the text recognition capability related to documents, it also enhances the general text recognition capability. +91 M +PP-OCRv4_server_rec_doc is trained on a mixed dataset of more Chinese document data and PP-OCR training data, building upon PP-OCRv4_server_rec. It enhances the recognition capabilities for some Traditional Chinese characters, Japanese characters, and special symbols, supporting over 15,000 characters. In addition to improving document-related text recognition, it also enhances general text recognition capabilities. -PP-OCRv4_mobile_recInference Model/Training Model -78.74 +PP-OCRv4_mobile_recInference Model/Pretrained Model +83.28 4.82 / 1.20 16.74 / 4.64 -10.6 M - -The lightweight recognition model of PP-OCRv4 has high inference efficiency and can be deployed on various hardware devices, including edge devices. +11 M +A lightweight recognition model of PP-OCRv4 with high inference efficiency, suitable for deployment on various hardware devices, including edge devices. -PP-OCRv4_server_recInference Model/Training Model -80.61 +PP-OCRv4_server_rec Inference Model/Pretrained Model +85.19 6.58 / 2.43 33.17 / 33.17 -71.2 M -The server-side model of PP-OCRv4 offers high inference accuracy and can be deployed on various types of servers. +87 M +The server-side model of PP-OCRv4, offering high inference accuracy and deployable on various servers. 
-en_PP-OCRv4_mobile_recInference Model/Training Model +en_PP-OCRv4_mobile_recInference Model/Pretrained Model 70.39 4.81 / 0.75 16.10 / 5.31 -6.8 M -The ultra-lightweight English recognition model, trained based on the PP-OCRv4 recognition model, supports the recognition of English letters and numbers. +7.3 M +An ultra-lightweight English recognition model trained based on the PP-OCRv4 recognition model, supporting English and numeric character recognition. -> ❗ The above list features the 4 core models that the text recognition module primarily supports. In total, this module supports 18 models. The complete list of models is as follows: +> ❗ The above section lists the **6 core models** that are primarily supported by the text recognition module. In total, the module supports **20 comprehensive models**, including multiple multilingual text recognition models. Below is the complete list of models: -
👉Model List Details +
👉Details of the Model List + +* PP-OCRv5 Multi-Scenario Models + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ModelModel Download LinksAvg Accuracy for Chinese Recognition (%)Avg Accuracy for English Recognition (%)Avg Accuracy for Traditional Chinese Recognition (%)Avg Accuracy for Japanese Recognition (%)GPU Inference Time (ms)
[Normal Mode / High-Performance Mode]
CPU Inference Time (ms)
[Normal Mode / High-Performance Mode]
Model Storage Size (M)Introduction
PP-OCRv5_server_recInference Model/Pretrained Model86.3864.7093.2960.35 8.45/2.36 122.69/122.69 81 MPP-OCRv5_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages—Simplified Chinese, Traditional Chinese, English, and Japanese—as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios.
PP-OCRv5_mobile_recInference Model/Pretrained Model81.2966.0083.5554.65 1.46/5.43 5.32/91.79 16 M
* Chinese Recognition Model diff --git a/docs/version3.x/pipeline_usage/seal_recognition.md b/docs/version3.x/pipeline_usage/seal_recognition.md index 739a4f3a25..923f84dc33 100644 --- a/docs/version3.x/pipeline_usage/seal_recognition.md +++ b/docs/version3.x/pipeline_usage/seal_recognition.md @@ -273,19 +273,18 @@ comments: true - - - - + + + + - - - - + + + +PP-OCRv5_server_rec_infer.tar">Inference Model/Pretrained Model - - - - + + + + +PP-OCRv5_mobile_rec_infer.tar">Inference Model/Pretrained Model - - - - + + + +PP-OCRv4_server_rec_doc_infer.tar">Inference Model/Pretrained Model - - + + - + - - + + - + - - + + +en_PP-OCRv4_mobile_rec_infer.tar">Inference Model/Pretrained Model - - + +
PP-OCRv5_server_rec推理模型/训练模型 86.38 - - 205 MPP-OCRv5_server_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 8.45/2.36 122.69/122.69 81 MPP-OCRv5_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。
PP-OCRv5_mobile_rec推理模型/训练模型 81.29 - - 136 MPP-OCRv5_mobile_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。 1.46/5.43 5.32/91.79 16 M
PP-OCRv4_server_rec_doc推理模型/推理模型/推理模型/推理模型/推理模型/PP-OCRv5_rec 是新一代文本识别模型。该模型致力于以单一模型高效、精准地支持简体中文、繁体中文、英文、日文四种主要语言,以及手写、竖版、拼音、生僻字等复杂文本场景的识别。在保持识别效果的同时,兼顾推理速度和模型鲁棒性,为各种场景下的文档理解提供高效、精准的技术支撑。
PP-OCRv5_mobile_rec推理模型/Inference Model/Training Model 86.38 - - 205MPP-OCRv5_server_rec is a new generation text recognition model. This model aims to efficiently and accurately support four major languages: Simplified Chinese, Traditional Chinese, English, and Japanese, as well as complex text scenarios like handwriting, vertical text, pinyin, and rare characters. While maintaining recognition effectiveness, it also considers inference speed and model robustness, providing efficient and accurate technical support for document understanding across various scenarios. 8.45/2.36 122.69/122.69 81 MPP-OCRv5_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages—Simplified Chinese, Traditional Chinese, English, and Japanese—as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios.
PP-OCRv5_mobile_recInference Model/Training Model 81.29 - - 128PP-OCRv5_mobile_rec is a new generation text recognition model. This model aims to efficiently and accurately support four major languages: Simplified Chinese, Traditional Chinese, English, and Japanese, as well as complex text scenarios like handwriting, vertical text, pinyin, and rare characters. While maintaining recognition effectiveness, it also considers inference speed and model robustness, providing efficient and accurate technical support for document understanding across various scenarios. 1.46/5.43 5.32/91.79 16 M
PP-OCRv4_server_rec_docInference Model/Training Model 86.58 6.65 / 2.38 32.92 / 32.92181 MPP-OCRv4_server_rec_doc is based on PP-OCRv4_server_rec, trained with a mix of more Chinese document data and PP-OCR training data, increasing the recognition capabilities for some Traditional Chinese, Japanese, and special characters, supporting recognition of over 15,000 characters. In addition to improving the document-related text recognition capabilities, it also enhances general text recognition capabilities.91 MPP-OCRv4_server_rec_doc is trained on a mixed dataset of more Chinese document data and PP-OCR training data, building upon PP-OCRv4_server_rec. It enhances the recognition capabilities for some Traditional Chinese characters, Japanese characters, and special symbols, supporting over 15,000 characters. In addition to improving document-related text recognition, it also enhances general text recognition capabilities.
PP-OCRv4_mobile_recInference Model/Training ModelPP-OCRv4_mobile_recInference Model/Pretrained Model 83.28 4.82 / 1.20 16.74 / 4.6488 MPP-OCRv4's lightweight recognition model has high inference efficiency and can be deployed on various hardware, including edge devices.11 MA lightweight recognition model of PP-OCRv4 with high inference efficiency, suitable for deployment on various hardware devices, including edge devices.
PP-OCRv4_server_rec Inference Model/Training ModelPP-OCRv4_server_rec Inference Model/Pretrained Model 85.19 6.58 / 2.43 33.17 / 33.17151 MPP-OCRv4's server-side model has high inference accuracy and can be deployed on various servers.87 MThe server-side model of PP-OCRv4, offering high inference accuracy and deployable on various servers.
en_PP-OCRv4_mobile_recInference Model/Training Model 70.39 4.81 / 0.75 16.10 / 5.3166 MBased on the PP-OCRv4 recognition model, this ultra-lightweight English recognition model supports English and digit recognition.7.3 MAn ultra-lightweight English recognition model trained based on the PP-OCRv4 recognition model, supporting English and numeric character recognition.
-> ❗ The above lists the 6 core models that are key to the text recognition module. The module supports a total of 10 complete models, including multiple multilingual text recognition models. The complete model list is as follows:
+> ❗ The above section lists the **6 core models** that are primarily supported by the text recognition module. In total, the module supports **20 comprehensive models**, including multiple multilingual text recognition models. Below is the complete list of models:
 
- 👉 Model List Details
+ 👉Details of the Model List
 
-* PP-OCRv5 Multi-Scene Model
+* PP-OCRv5 Multi-Scenario Models
 
+PP-OCRv5_server_rec_infer.tar">Inference Model/Pretrained Model
+PP-OCRv5_mobile_rec_infer.tar">Inference Model/Pretrained Model
-Model | Model Download Link | Chinese Recognition Avg Accuracy (%) | English Recognition Avg Accuracy (%) | Traditional Chinese Recognition Avg Accuracy (%) | Japanese Recognition Avg Accuracy (%) | GPU Inference Time (ms) [Regular Mode / High-Performance Mode] | CPU Inference Time (ms) [Regular Mode / High-Performance Mode] | Model Size (M) | Description
+Model | Model Download Links | Avg Accuracy for Chinese Recognition (%) | Avg Accuracy for English Recognition (%) | Avg Accuracy for Traditional Chinese Recognition (%) | Avg Accuracy for Japanese Recognition (%) | GPU Inference Time (ms) [Normal Mode / High-Performance Mode] | CPU Inference Time (ms) [Normal Mode / High-Performance Mode] | Model Storage Size (M) | Introduction
 PP-OCRv5_server_rec | Inference Model/Training Model | 86.38 | 64.70 | 93.29 | 60.35
- - | - | 205M | PP-OCRv5_server_rec is a new generation text recognition model. This model aims to efficiently and accurately support four major languages: Simplified Chinese, Traditional Chinese, English, and Japanese, as well as complex text scenarios like handwriting, vertical text, pinyin, and rare characters. While maintaining recognition effectiveness, it also considers inference speed and model robustness, providing efficient and accurate technical support for document understanding across various scenarios.
+ 8.45/2.36 | 122.69/122.69 | 81 M | PP-OCRv5_rec is a next-generation text recognition model. It aims to efficiently and accurately support the recognition of four major languages (Simplified Chinese, Traditional Chinese, English, and Japanese) as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model. While maintaining recognition performance, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios.
 PP-OCRv5_mobile_rec | Inference Model/Training Model | 81.29 | 66.00 | 83.55 | 54.65
- - | - | 128 | PP-OCRv5_mobile_rec is a new generation text recognition model. This model aims to efficiently and accurately support four major languages: Simplified Chinese, Traditional Chinese, English, and Japanese, as well as complex text scenarios like handwriting, vertical text, pinyin, and rare characters. While maintaining recognition effectiveness, it also considers inference speed and model robustness, providing efficient and accurate technical support for document understanding across various scenarios.
+ 1.46/5.43 | 5.32/91.79 | 16 M
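diff --git a/docs/version3.x/pipeline_usage/table_recognition_v2.md b/docs/version3.x/pipeline_usage/table_recognition_v2.md
index c6dd682bd3..5bd83b7a8a 100644
--- a/docs/version3.x/pipeline_usage/table_recognition_v2.md
+++ b/docs/version3.x/pipeline_usage/table_recognition_v2.md
@@ -181,19 +181,18 @@ comments: true
 PP-OCRv5_server_rec | Inference Model/Training Model | 86.38
- - | - | 205 M | PP-OCRv5_server_rec is a next-generation text recognition model. It aims to support, efficiently and accurately with a single model, the recognition of four major languages (Simplified Chinese, Traditional Chinese, English, and Japanese) as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters. While maintaining recognition quality, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios.
+ 8.45/2.36 | 122.69/122.69 | 81 M | PP-OCRv5_rec is a next-generation text recognition model. It aims to support, efficiently and accurately with a single model, the recognition of four major languages (Simplified Chinese, Traditional Chinese, English, and Japanese) as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters. While maintaining recognition quality, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios.
 PP-OCRv5_mobile_rec | Inference Model/Training Model | 81.29
- - | - | 136 M | PP-OCRv5_mobile_rec is a next-generation text recognition model. It aims to support, efficiently and accurately with a single model, the recognition of four major languages (Simplified Chinese, Traditional Chinese, English, and Japanese) as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters. While maintaining recognition quality, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios.
+ 1.46/5.43 | 5.32/91.79 | 16 M
 PP-OCRv4_server_rec_doc | Inference Model/Inference Model/Inference Model/Inference Model/Inference Model/
 PP-OCRv5_rec is a next-generation text recognition model. It aims to support, efficiently and accurately with a single model, the recognition of four major languages (Simplified Chinese, Traditional Chinese, English, and Japanese) as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters. While maintaining recognition quality, it balances inference speed and model robustness, providing efficient and accurate technical support for document understanding in various scenarios.
 PP-OCRv5_mobile_rec | Inference Model/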