add latexocr docs and fix some typos (#13532)
parent
cab3fcbcdf
commit
d3ed42241a
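@@ -8,7 +8,7 @@ hide:

PaddleOCR has collected the common questions raised in issues and user groups since it was open-sourced and answers them briefly here, to give OCR developers a reference and help them avoid detours.

Among them, [General Questions](#1) are the questions users typically ask when they first encounter OCR algorithms, and [1.5 Approaches for Vertical Scenarios](#15) summarizes how to choose a technical route and optimize it for specific scenarios. [PaddleOCR FAQ](#2) covers the problems developers may run into after they start using PaddleOCR and also serves as a pitfall guide for working with PaddleOCR.
Among them, [General Questions](#1) are the questions users typically ask when they first encounter OCR algorithms, and [1.5 Approaches for Vertical Scenarios](#15) summarizes how to choose a technical route and optimize it for specific scenarios. [PaddleOCR FAQ](#2-paddleocr) covers the problems developers may run into after they start using PaddleOCR and also serves as a pitfall guide for working with PaddleOCR.

While reviewing issues, PaddleOCR also adds the `good issue` and `good first issue` labels, but those issues may not be added to the FAQ document right away, so developers can check them as well. We also very much hope that developers will help us add this content to the FAQ.
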
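@@ -234,9 +234,7 @@ A: With training-set accuracy at 90 and test-set accuracy in the 70s, the model is probably overfitting; there are two

#### Q: How can a beginner get started quickly with a Chinese OCR project?

A: We suggest first learning the basics of OCR and getting a rough understanding of the fundamental detection and recognition algorithms. Then look at OCR-related repos on GitHub. At present, in terms of completeness, PaddleOCR's bilingual Chinese-English tutorial documentation has a clear advantage: the documentation on datasets, model training, and prediction and deployment is detailed, so you can get started quickly. There is also a WeChat user group for Q&A, which makes it well suited to learning by doing. Project address: PaddleOCR

AI Fast Track course: <https://aistudio.baidu.com/aistudio/course/introduce/1519>
A: We suggest first learning the basics of OCR and getting a rough understanding of the fundamental detection and recognition algorithms. Then look at OCR-related repos on GitHub. At present, in terms of completeness, PaddleOCR's bilingual Chinese-English tutorial documentation has a clear advantage: the documentation on datasets, model training, and prediction and deployment is detailed, so you can get started quickly. There is also a WeChat user group for Q&A, which makes it well suited to learning by doing. Project address: PaddleOCR AI Fast Track course: <https://aistudio.baidu.com/aistudio/course/introduce/1519>

## 2. PaddleOCR in Practice
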
@@ -1,20 +1,5 @@
# LaTeX-OCR

- [1. Introduction](#1)
- [2. Environment](#2)
- [3. Model Training / Evaluation / Prediction](#3)
    - [3.1 Pickle File Generation](#3-1)
    - [3.2 Training](#3-2)
    - [3.3 Evaluation](#3-3)
    - [3.4 Prediction](#3-4)
- [4. Inference and Deployment](#4)
    - [4.1 Python Inference](#4-1)
    - [4.2 C++ Inference](#4-2)
    - [4.3 Serving](#4-3)
    - [4.4 More](#4-4)
- [5. FAQ](#5)

<a name="1"></a>
## 1. Introduction

Original Project:
@@ -25,21 +10,19 @@ Using LaTeX-OCR printed mathematical expression recognition datasets for trainin

| Model | Backbone |config| BLEU score | normed edit distance | ExpRate |Download link|
|-----------|----------| ---- |:-----------:|:---------------------:|:---------:| ----- |
| LaTeX-OCR | Hybrid ViT |[rec_latex_ocr.yml](../../configs/rec/rec_latex_ocr.yml)| 0.8821 | 0.0823 | 40.01% |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_latex_ocr_train.tar)|
| LaTeX-OCR | Hybrid ViT |[rec_latex_ocr.yml](https://github.com/PaddlePaddle/PaddleOCR/blob/main/configs/rec/rec_latex_ocr.yml)| 0.8821 | 0.0823 | 40.01% |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_latex_ocr_train.tar)|
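
The table reports BLEU score, normed edit distance, and ExpRate. As a rough illustration of the last two, the sketch below computes a length-normalized Levenshtein distance and an exact-match rate over token sequences. The function names and the choice to normalize by the longer sequence are assumptions made for this example; the evaluation code in PaddleOCR may tokenize and normalize differently.

```python
from typing import Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Classic dynamic-programming edit distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normed_edit_distance(pred: Sequence, ref: Sequence) -> float:
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(pred), len(ref))
    return levenshtein(pred, ref) / longest if longest else 0.0


def exp_rate(preds: Sequence[Sequence], refs: Sequence[Sequence]) -> float:
    """Fraction of predictions that match their reference exactly."""
    if len(preds) != len(refs):
        raise ValueError("preds and refs must have the same length")
    if not preds:
        return 0.0
    return sum(list(p) == list(r) for p, r in zip(preds, refs)) / len(preds)
```

Under these definitions a lower normed edit distance is better, while a higher ExpRate is better.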

<a name="2"></a>
## 2. Environment
Please refer to ["Environment Preparation"](./environment_en.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone_en.md) to clone the project code.
Please refer to ["Environment Preparation"](../../ppocr/environment.en.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](../../ppocr/blog/clone.en.md) to clone the project code.

Furthermore, additional dependencies need to be installed:
```shell
pip install "tokenizers==0.19.1" "imagesize"
```
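
After installing, it can help to confirm that the pinned version is the one actually picked up by the interpreter. The following is a minimal sketch using only the standard library; the helper name `check_dependencies` and the `REQUIRED` mapping are hypothetical and not part of PaddleOCR.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution name -> pinned version (None means any version is accepted).
# Mirrors the pip command above; adjust if the pins change.
REQUIRED = {"tokenizers": "0.19.1", "imagesize": None}


def check_dependencies(required=REQUIRED):
    """Return a list of human-readable problems; empty if everything matches."""
    problems = []
    for name, pinned in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if pinned is not None and installed != pinned:
            problems.append(f"{name}=={installed} found, expected {pinned}")
    return problems


if __name__ == "__main__":
    for line in check_dependencies() or ["all extra dependencies look fine"]:
        print(line)
```

An empty result means both packages are importable by this interpreter and `tokenizers` matches the pin.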

<a name="3"></a>
## 3. Model Training / Evaluation / Prediction

Please refer to [Text Recognition Tutorial](./recognition_en.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
Please refer to [Text Recognition Tutorial](../../ppocr/model_train/recognition.en.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.

Pickle File Generation:

@@ -90,10 +73,8 @@ Prediction:
python3 tools/infer_rec.py -c configs/rec/rec_latex_ocr.yml -o Architecture.Backbone.is_predict=True Architecture.Backbone.is_export=True Architecture.Head.is_export=True Global.infer_img='./doc/datasets/pme_demo/0000013.png' Global.pretrained_model=./rec_latex_ocr_train/best_accuracy.pdparams
```

<a name="4"></a>
## 4. Inference and Deployment

<a name="4-1"></a>
### 4.1 Python Inference
First, convert the model saved during LaTeX-OCR printed mathematical expression recognition training into an inference model. You can use the following command to convert it:

@@ -109,23 +90,16 @@ For LaTeX-OCR printed mathematical expression recognition model inference, the f
python3 tools/infer/predict_rec.py --image_dir='./doc/datasets/pme_demo/0000295.png' --rec_algorithm="LaTeXOCR" --rec_batch_num=1 --rec_model_dir="./inference/rec_latex_ocr_infer/" --rec_char_dict_path="./ppocr/utils/dict/latex_ocr_tokenizer.json"
```

<a name="4-2"></a>
### 4.2 C++ Inference

Not supported

<a name="4-3"></a>
### 4.3 Serving

Not supported

<a name="4-4"></a>
### 4.4 More

Not supported

<a name="5"></a>
## 5. FAQ

```
@@ -1,48 +1,27 @@
# Printed Mathematical Formula Recognition Algorithm: LaTeX-OCR

- [1. Introduction](#1)
- [2. Environment](#2)
- [3. Model Training, Evaluation, and Prediction](#3)
    - [3.1 Pickle Label File Generation](#3-1)
    - [3.2 Training](#3-2)
    - [3.3 Evaluation](#3-3)
    - [3.4 Prediction](#3-4)
- [4. Inference and Deployment](#4)
    - [4.1 Python Inference](#4-1)
    - [4.2 C++ Inference](#4-2)
    - [4.3 Serving Deployment](#4-3)
    - [4.4 More Inference Deployment](#4-4)
- [5. FAQ](#5)

<a name="1"></a>
## 1. Introduction

Original project:
> [https://github.com/lukas-blecher/LaTeX-OCR](https://github.com/lukas-blecher/LaTeX-OCR)

<a name="model"></a>
`LaTeX-OCR` is trained on the [`LaTeX-OCR printed formula dataset`](https://drive.google.com/drive/folders/13CA4vAmOmD_I_dSbvLp-Lf0s6KiaNfuO); its accuracy on the corresponding test set is as follows:

| Model | Backbone |Config | BLEU score | normed edit distance | ExpRate |Download link|
|-----------|------------| ----- |:-----------:|:---------------------:|:---------:| ----- |
| LaTeX-OCR | Hybrid ViT |[rec_latex_ocr.yml](../../configs/rec/rec_latex_ocr.yml)| 0.8821 | 0.0823 | 40.01% |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_latex_ocr_train.tar)|
| LaTeX-OCR | Hybrid ViT |[rec_latex_ocr.yml](https://github.com/PaddlePaddle/PaddleOCR/blob/main/configs/rec/rec_latex_ocr.yml)| 0.8821 | 0.0823 | 40.01% |[trained model](https://paddleocr.bj.bcebos.com/contribution/rec_latex_ocr_train.tar)|

<a name="2"></a>
## 2. Environment
Please first refer to ["Environment Preparation"](./environment.md) to configure the PaddleOCR runtime environment, and to ["Project Clone"](./clone.md) to clone the project code.
Please first refer to ["Environment Preparation"](../../ppocr/environment.md) to configure the PaddleOCR runtime environment, and to ["Project Clone"](../../ppocr/blog/clone.md) to clone the project code.

In addition, extra dependencies need to be installed:
```shell
pip install "tokenizers==0.19.1" "imagesize"
```

<a name="3"></a>
## 3. Model Training, Evaluation, and Prediction

<a name="3-1"></a>

### 3.1 Pickle Label File Generation
Download formulae.zip and math.txt from [Google Drive](https://drive.google.com/drive/folders/13CA4vAmOmD_I_dSbvLp-Lf0s6KiaNfuO), then use the following command to generate the pickle label file.

@@ -63,7 +42,7 @@ python ppocr/utils/formula_utils/math_txt2pkl.py --image_dir=train_data/LaTeXOCR

### 3.2 Model Training

Please refer to the [text recognition training tutorial](./recognition.md). PaddleOCR modularizes its code, so to train the `LaTeX-OCR` recognition model you need to **change the configuration file** to the `LaTeX-OCR` [configuration file](../../configs/rec/rec_latex_ocr.yml).
Please refer to the [text recognition training tutorial](../../ppocr/model_train/recognition.md). PaddleOCR modularizes its code, so to train the `LaTeX-OCR` recognition model you need to **change the configuration file** to the `LaTeX-OCR` [configuration file](https://github.com/PaddlePaddle/PaddleOCR/blob/main/configs/rec/rec_latex_ocr.yml).

#### Start Training

@@ -83,7 +62,6 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs
python3 tools/train.py -c configs/rec/rec_latex_ocr.yml -o Global.eval_batch_step=[0,{length_of_dataset//batch_size*22}]
```
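
The placeholder `{length_of_dataset//batch_size*22}` in the command above presumably has to be replaced with an actual integer before the command is run. Here is a minimal sketch of that arithmetic, assuming the second element of `Global.eval_batch_step` is an interval counted in training iterations. The helper names and the sample numbers (1000 images, batch size 16) are hypothetical; only the factor 22 comes from the command itself.

```python
def eval_interval_steps(length_of_dataset: int, batch_size: int, factor: int = 22) -> int:
    """Evaluate length_of_dataset // batch_size * factor with integer arithmetic."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # // and * have equal precedence and associate left to right,
    # so this is (length_of_dataset // batch_size) * factor.
    return length_of_dataset // batch_size * factor


def eval_batch_step_override(length_of_dataset: int, batch_size: int, factor: int = 22) -> str:
    """Build the -o override string with the placeholder filled in."""
    steps = eval_interval_steps(length_of_dataset, batch_size, factor)
    return f"Global.eval_batch_step=[0,{steps}]"


# Hypothetical numbers: 1000 training images, batch size 16.
print(eval_batch_step_override(1000, 16))  # Global.eval_batch_step=[0,1364]
```

With these sample values, 1000 // 16 is 62 iterations per pass over the data, and 62 * 22 gives 1364.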

<a name="3-2"></a>
### 3.3 Evaluation

You can download the trained [model file](https://paddleocr.bj.bcebos.com/contribution/rec_latex_ocr_train.tar) and evaluate it with the following command:
@@ -96,7 +74,6 @@ python3 tools/eval.py -c configs/rec/rec_latex_ocr.yml -o Global.pretrained_mode
python3 tools/eval.py -c configs/rec/rec_latex_ocr.yml -o Global.pretrained_model=./rec_latex_ocr_train/best_accuracy.pdparams Metric.cal_blue_score=True Eval.dataset.data=./train_data/LaTeXOCR/latexocr_test.pkl
```

<a name="3-3"></a>
### 3.4 Prediction

Use the following command to predict a single image:
@@ -106,12 +83,10 @@ python3 tools/infer_rec.py -c configs/rec/rec_latex_ocr.yml -o Architecture.Ba
# To predict all images in a folder, set infer_img to the folder, e.g. Global.infer_img='./doc/datasets/pme_demo/'.
```

<a name="4"></a>
## 4. Inference and Deployment

<a name="4-1"></a>
### 4.1 Python Inference
First, convert the best model obtained from training into an inference model. Taking the trained model as an example ([model download link](https://paddleocr.bj.bcebos.com/contribution/rec_latex_ocr_train.tar) ), you can convert it with the following command:
First, convert the best model obtained from training into an inference model. Taking the trained model as an example ([model download link](https://paddleocr.bj.bcebos.com/contribution/rec_latex_ocr_train.tar)), you can convert it with the following command:

```shell
# Note: set the pretrained_model path to a local path.
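@@ -140,7 +115,7 @@ python3 tools/infer/predict_rec.py --image_dir='./doc/datasets/pme_demo/0000295.
```

![Sample test image](../../datasets/pme_demo/0000295.png)
![Sample test image](../../../datasets/pme_demo/0000295.png)

After the command runs, the prediction result (the recognized text) for the image above is printed to the screen, for example:
```shell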
@@ -155,22 +130,18 @@ Predicts of ./doc/datasets/pme_demo/0000295.png:\zeta_{0}(\nu)=-{\frac{\nu\varrh
- If you have modified the preprocessing method, change the LaTeX-OCR preprocessing in `tools/infer/predict_rec.py` to your own preprocessing method.

<a name="4-2"></a>
### 4.2 C++ Inference Deployment

Not supported yet, because C++ pre-processing and post-processing do not support LaTeX-OCR yet.

<a name="4-3"></a>
### 4.3 Serving Deployment

Not supported yet

<a name="4-4"></a>
### 4.4 More Inference Deployment

Not supported yet

<a name="5"></a>
## 5. FAQ

1. The LaTeX-OCR dataset comes from the [LaTeX-OCR source repo](https://github.com/lukas-blecher/LaTeX-OCR).
@@ -124,6 +124,7 @@ On the TextZoom public dataset, the effect of the algorithm is as follows:
Supported formula recognition algorithms (Click the link to get the tutorial):

- [x] [CAN](./formula_recognition/algorithm_rec_can.en.md)
- [x] [LaTeX-OCR](./formula_recognition/algorithm_rec_latex_ocr.en.md)

On the CROHME handwritten formula dataset, the effect of the algorithm is as follows:

@@ -126,6 +126,7 @@ PaddleOCR will **keep adding** support for cutting-edge OCR algorithms and models, and **welcomes
Supported formula recognition algorithms (click the link for the tutorial):

- [x] [CAN](./formula_recognition/algorithm_rec_can.md)
- [x] [LaTeX-OCR](./formula_recognition/algorithm_rec_latex_ocr.md)

On the CROHME handwritten formula dataset, the algorithm results are as follows:

@@ -114,28 +114,3 @@ PaddleOCR warmly welcomes community contributions of all kinds of PaddleOCR-based services, deployments
- After the code is merged, the information is updated in Section 1 of this document. The default link is your GitHub name and homepage; if the homepage needs to change, you can contact us.
- Important new features are announced in the user group, so contributors can enjoy their moment of recognition in the open-source community.
- **If you have a PaddleOCR-based project that does not appear in the list above, please contact us by following the steps in `4. Contact Us`.**

## Appendix: Community Regular Contest Leaderboard

| Developer | Total points | Developer | Total points |
| ---- | ------ | ----- | ------ |
| [RangeKing](https://github.com/RangeKing) | 220 | [WZMIAOMIAO](https://github.com/WZMIAOMIAO) | 36 |
| [hao6699](https://github.com/hao6699) | 145 | [v3fc](https://github.com/v3fc) | 35 |
| [mymagicpower](https://github.com/mymagicpower) | 140 | [imiyu](https://github.com/imiyu) | 30 |
| [raoyutian](https://github.com/raoyutian) | 90 | [haigang1975](https://github.com/haigang1975) | 29 |
| [sdcb](https://github.com/sdcb) | 80 | [daassh](https://github.com/daassh) | 23 |
| [zhiminzhang0830](https://github.com/zhiminzhang0830) | 70 | [xiaoyangyang2](https://github.com/xiaoyangyang2) | 20 |
| [Lovely-Pig](https://github.com/Lovely-Pig) | 70 | [prettyocean85](https://github.com/prettyocean85) | 20 |
| [livingbody](https://github.com/livingbody) | 70 | [nmusik](https://github.com/nmusik) | 20 |
| [fanruinet](https://github.com/fanruinet) | 70 | [kjf4096](https://github.com/kjf4096) | 20 |
| [bupt906](https://github.com/bupt906) | 60 | [chccc1994](https://github.com/chccc1994) | 20 |
| [edencfc](https://github.com/edencfc) | 57 | [BeyondYourself](https://github.com/BeyondYourself) | 20 |
| [zhangyingying520](https://github.com/zhangyingying520) | 57 | chenguoqi08161 | 18 |
| [ITerydh](https://github.com/ITerydh) | 55 | [weiwenlan](https://github.com/weiwenlan) | 10 |
| [telppa](https://github.com/telppa) | 40 | [shaoshenchen thinc](https://github.com/shaoshenchen) | 10 |
| sosojust1984 | 40 | [jordan2013](https://github.com/jordan2013) | 10 |
| [redearly123](https://github.com/redearly123) | 40 | [JimEverest](https://github.com/JimEverest) | 10 |
| [OneYearIsEnough](https://github.com/OneYearIsEnough) | 40 | [HustBestCat](https://github.com/HustBestCat) | 10 |
| [Huntersdeng](https://github.com/Huntersdeng) | 40 | | |
| [GreatV](https://github.com/GreatV) | 40 | | |
| CLXK294 | 40 | | |

@@ -124,6 +124,8 @@ python3 tools/train.py -c configs/det/det_mv3_db.yml \
Global.use_amp=True Global.scale_loss=1024.0 Global.use_dynamic_loss_scaling=True
```

**Note:** Text detection models may fail to converge when trained with AMP; see the temporary workaround described in [discussions](https://github.com/PaddlePaddle/PaddleOCR/discussions/12445).

### 2.5 Distributed Training

For multi-machine, multi-GPU training, use the `--ips` parameter to set the IP addresses of the machines and the `--gpus` parameter to set the GPU IDs:
