update doc
parent bdee9eba63
commit 2be35217d3
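@@ -3,7 +3,7 @@
This article provides an end-to-end guide to the PaddleOCR table recognition model, covering data preparation, model training, tuning, evaluation, and prediction, with a detailed description of each stage:

- [1. Data Preparation](#1-数据准备)
- [1.1. Dataset Preparation](#11-数据集格式)
- [1.1. Dataset Format](#11-数据集格式)
- [1.2. Data Download](#12-数据下载)
- [1.3. Dataset Generation](#13-数据集生成)
- [2. Start Training](#2-开始训练)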
@@ -19,6 +19,8 @@
- [3.1. Metric Evaluation](#31-指标评估)
- [3.2. Testing Table Structure Recognition Results](#32-测试表格结构识别效果)
- [4. Model Export and Prediction](#4-模型导出与预测)
- [4.1 Model Export](#41-模型导出)
- [4.2 Model Prediction](#42-模型预测)
- [5. FAQ](#5-faq)

# 1. Data Preparation
@@ -33,7 +35,7 @@ img_label
```

The JSON format of each line is:
```json
```txt
{
'filename': PMC5755158_010_01.png, # image name
'split': 'train', # whether the image belongs to the training set or the validation set
@@ -236,6 +238,12 @@ Running on DCU devices requires setting the environment variable `export HIP_VISIBLE_DEVICES=0,1,2,3`
python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/table/SLANet.yml -o Global.checkpoints={path/to/weights}/best_accuracy
```

When the evaluation finishes, the model's acc metric is printed. For example, evaluating the English table recognition model produces the following output.
```bash
[2022/08/16 07:59:55] ppocr INFO: acc:0.7622245132160782
[2022/08/16 07:59:55] ppocr INFO: fps:30.991640622573044
```

## 3.2. Testing Table Structure Recognition Results

Using a model trained with PaddleOCR, you can run a quick prediction with the following script.
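@@ -278,6 +286,8 @@ python3 tools/infer_table.py -c configs/table/SLANet.yml -o Global.pretrained_mo

# 4. Model Export and Prediction

## 4.1 Model Export

Inference model (a model saved by `paddle.jit.save`)
A frozen model, usually produced once training is complete, that stores both the model structure and the model parameters in files. It is mostly used for prediction and deployment.
The model saved during training is a checkpoints model. It stores only the model parameters and is mostly used for resuming training.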
@@ -303,6 +313,33 @@ inference/SLANet/
└── inference.pdmodel # program file of the inference model
```

## 4.2 Model Prediction

After the model is exported, run the following command to predict with the inference model:

```bash
python3.7 table/predict_structure.py \
    --table_model_dir={path/to/inference model} \
    --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \
    --image_dir=docs/table/table.jpg \
    --output=../output/table
```

Input image:


The prediction result for the input image:

```
['<html>', '<body>', '<table>', '<thead>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '</thead>', '<tbody>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '</tbody>', '</table>', '</body>', '</html>'],[[320.0562438964844, 197.83375549316406, 350.0928955078125, 214.4309539794922], ... , [318.959228515625, 271.0166931152344, 353.7394104003906, 286.4538269042969]]
```

The cell coordinates are visualized as:


# 5. FAQ

Q1: Why are the prediction results inconsistent after converting a trained model to an inference model?

@@ -3,9 +3,9 @@
This article provides an end-to-end guide to the PaddleOCR table recognition model, covering data preparation, model training, tuning, evaluation, and prediction, with a detailed description of each stage:

- [1. Data Preparation](#1-data-preparation)
- [1.1. DataSet Preparation](#11-dataset-preparation)
- [1.1. DataSet Format](#11-dataset-format)
- [1.2. Data Download](#12-data-download)
- [1.3. Dataset Generation](#13-dataset-format)
- [1.3. Dataset Generation](#13-dataset-generation)
- [2. Training](#2-training)
- [2.1. Start Training](#21-start-training)
- [2.2. Resume Training](#22-resume-training)
@@ -19,7 +19,9 @@ This article provides a full-process guide for the PaddleOCR table recognition m
- [3.1. Evaluation](#31-evaluation)
- [3.2. Test table structure recognition effect](#32-test-table-structure-recognition-effect)
- [4. Model export and prediction](#4-model-export-and-prediction)
- [5. FAQ](#5-faq)
- [4.1 Model export](#41-model-export)
- [4.2 Prediction](#42-prediction)
- [5. FAQ](#5-faq)

# 1. Data Preparation

@@ -243,6 +245,13 @@ The model parameters during training are saved in the `Global.save_model_dir` di
python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/table/SLANet.yml -o Global.checkpoints={path/to/weights}/best_accuracy
```

When the evaluation finishes, the model's acc metric is printed. For example, evaluating the English table recognition model produces the following output.

```bash
[2022/08/16 07:59:55] ppocr INFO: acc:0.7622245132160782
[2022/08/16 07:59:55] ppocr INFO: fps:30.991640622573044
```
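
To compare runs, the `acc` and `fps` values can be pulled out of the evaluation log programmatically. The following is a minimal sketch, assuming the log lines keep the `ppocr INFO: name:value` shape shown above; `parse_metrics` and `METRIC_LINE` are hypothetical helper names, not part of PaddleOCR.

```python
import re

# Matches lines such as
# "[2022/08/16 07:59:55] ppocr INFO: acc:0.7622245132160782".
# Assumption: each metric is printed as "<name>:<float>" at the end of the line.
METRIC_LINE = re.compile(r"ppocr INFO:\s*(?P<name>\w+):(?P<value>[-+0-9.eE]+)\s*$")


def parse_metrics(log_text):
    """Return a {metric_name: float} dict for every metric line in log_text."""
    metrics = {}
    for line in log_text.splitlines():
        match = METRIC_LINE.search(line)
        if match:
            metrics[match.group("name")] = float(match.group("value"))
    return metrics


sample_log = """\
[2022/08/16 07:59:55] ppocr INFO: acc:0.7622245132160782
[2022/08/16 07:59:55] ppocr INFO: fps:30.991640622573044
"""

metrics = parse_metrics(sample_log)
print(f"acc = {metrics['acc']:.4f}, fps = {metrics['fps']:.2f}")
```

Lines that do not match the pattern are skipped, so the same helper can be fed a full evaluation log rather than only the final two lines.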

## 3.2. Test table structure recognition effect

Using a model trained with PaddleOCR, you can run a quick prediction with the following script.
@@ -287,6 +296,8 @@ The cell coordinates are visualized as

# 4. Model export and prediction

## 4.1 Model export

Inference model (a model saved by `paddle.jit.save`)
A frozen model, usually produced once training is complete, that stores both the model structure and the model parameters in files. It is mostly used for prediction and deployment.
The model saved during training is a checkpoints model. It stores only the model parameters and is mostly used for resuming training.
@@ -313,7 +324,35 @@ inference/SLANet/
└── inference.pdmodel # The program file of the model
```

## 5. FAQ
## 4.2 Prediction

After the model is exported, run the following command to predict with the inference model:

```bash
python3.7 table/predict_structure.py \
    --table_model_dir={path/to/inference model} \
    --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \
    --image_dir=docs/table/table.jpg \
    --output=../output/table
```

Input image:


The prediction result for the input image:

```
['<html>', '<body>', '<table>', '<thead>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '</thead>', '<tbody>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '</tbody>', '</table>', '</body>', '</html>'],[[320.0562438964844, 197.83375549316406, 350.0928955078125, 214.4309539794922], ... , [318.959228515625, 271.0166931152344, 353.7394104003906, 286.4538269042969]]
```
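
The result above is a list of structure tokens followed by a list of cell boxes. As a rough sketch of how it might be consumed, the snippet below joins the tokens into one HTML string and derives row, cell, and box-size summaries. It assumes that every cell appears as a single `'<td></td>'` token, as in the sample, and that each box is `[x_min, y_min, x_max, y_max]`; `summarize_prediction` is a hypothetical helper, and the short token list is made up for illustration (only the two boxes are taken from the sample output).

```python
def summarize_prediction(structure_tokens, cell_boxes):
    """Join structure tokens into HTML and summarize rows, cells, and box sizes."""
    html = "".join(structure_tokens)
    num_rows = structure_tokens.count("<tr>")
    # Assumption: each cell is emitted as one "<td></td>" token.
    num_cells = structure_tokens.count("<td></td>")
    # Assumption: boxes are [x_min, y_min, x_max, y_max] in pixels.
    sizes = [(x2 - x1, y2 - y1) for x1, y1, x2, y2 in cell_boxes]
    return html, num_rows, num_cells, sizes


# Shortened, illustrative token list: one row with two cells.
tokens = ['<html>', '<body>', '<table>', '<tbody>', '<tr>',
          '<td></td>', '<td></td>', '</tr>', '</tbody>',
          '</table>', '</body>', '</html>']
boxes = [
    [320.0562438964844, 197.83375549316406, 350.0928955078125, 214.4309539794922],
    [318.959228515625, 271.0166931152344, 353.7394104003906, 286.4538269042969],
]

html, num_rows, num_cells, sizes = summarize_prediction(tokens, boxes)
print(html)
print(f"rows = {num_rows}, cells = {num_cells}")
```

If the model emits spanning cells as separate tokens, the cell count here would need to be adapted; the sketch only covers the token shapes visible in the sample.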

The cell coordinates are visualized as:


# 5. FAQ

Q1: Why are the prediction results inconsistent after converting a trained model to an inference model?

@@ -36,13 +36,12 @@ We evaluated the algorithm on the PubTabNet<sup>[1]</sup> eval dataset, and the
| EDD<sup>[2]</sup> |x| 88.3 |
| TableRec-RARE(ours) |73.8%| 93.32 |
| SLANet(ours) | 76.2%| 94.98 |SLANet |

## 3. Result

## 4. How to use

@@ -44,15 +44,9 @@

## 3. Demo

## 4. Usage
