add quickstart
parent
4eafe007fe
commit
1414fa0f17
|
@ -3,21 +3,22 @@ English | [简体中文](README_ch.md)
|
|||
# Layout analysis

- [1. Introduction](#1-Introduction)
- [2. Install](#2-Install)
- [2.1 Install PaddlePaddle](#21-Install-paddlepaddle)
- [2.2 Install PaddleDetection](#22-Install-paddledetection)
- [3. Data preparation](#3-Data-preparation)
- [3.1 English data set](#31-English-data-set)
- [3.2 More datasets](#32-More-datasets)
- [4. Start training](#4-Start-training)
- [4.1 Train](#41-Train)
- [4.2 FGD Distillation training](#42-FGD-Distillation-training)
- [5. Model evaluation and prediction](#5-Model-evaluation-and-prediction)
- [5.1 Indicator evaluation](#51-Indicator-evaluation)
- [5.2 Test layout analysis results](#52-Test-layout-analysis-results)
- [6 Model export and inference](#6-Model-export-and-inference)
- [6.1 Model export](#61-Model-export)
- [6.2 Model inference](#62-Model-inference)
- [2. Quick start](#2-Quick-start)
- [3. Install](#3-Install)
- [3.1 Install PaddlePaddle](#31-Install-paddlepaddle)
- [3.2 Install PaddleDetection](#32-Install-paddledetection)
- [4. Data preparation](#4-Data-preparation)
- [4.1 English data set](#41-English-data-set)
- [4.2 More datasets](#42-More-datasets)
- [5. Start training](#5-Start-training)
- [5.1 Train](#51-Train)
- [5.2 FGD Distillation training](#52-FGD-Distillation-training)
- [6. Model evaluation and prediction](#6-Model-evaluation-and-prediction)
- [6.1 Indicator evaluation](#61-Indicator-evaluation)
- [6.2 Test layout analysis results](#62-Test-layout-analysis-results)
- [7 Model export and inference](#7-Model-export-and-inference)
- [7.1 Model export](#71-Model-export)
- [7.2 Model inference](#72-Model-inference)

## 1. Introduction
|
||||
|
@ -28,11 +29,12 @@ Layout analysis refers to the regional division of documents in the form of pict
|
|||
<img src="../docs/layout/layout.png" width="800">
|
||||
</div>
|
||||
|
||||
## 2. Quick start
|
||||
PP-Structure currently provides layout analysis models for Chinese documents, English documents, and table documents. For model links, see [models_list](../docs/models_list_en.md). A whl package is also provided for quick use; see [quickstart](../docs/quickstart_en.md) for details.
|
||||
|
||||
## 3. Install
|
||||
|
||||
## 2. Install
|
||||
|
||||
### 2.1. Install PaddlePaddle
|
||||
### 3.1. Install PaddlePaddle
|
||||
|
||||
- **(1) Install PaddlePaddle**
|
||||
|
||||
|
@ -47,7 +49,7 @@ python3 -m pip install "paddlepaddle>=2.3" -i https://mirror.baidu.com/pypi/simp
|
|||
```
|
||||
For more requirements, please refer to the instructions in the [Install file](https://www.paddlepaddle.org.cn/install/quick).
|
||||
|
||||
### 2.2. Install PaddleDetection
|
||||
### 3.2. Install PaddleDetection
|
||||
|
||||
- **(1) Download PaddleDetection source code**
|
||||
|
||||
|
@ -62,11 +64,11 @@ cd PaddleDetection
|
|||
python3 -m pip install -r requirements.txt
|
||||
```
|
||||
|
||||
## 3. Data preparation
|
||||
## 4. Data preparation
|
||||
|
||||
If you want to try the prediction process directly, you can skip data preparation and download the pre-trained model.
|
||||
|
||||
### 3.1. English data set
|
||||
### 4.1. English data set
|
||||
|
||||
Download the document analysis dataset [PubLayNet](https://developer.ibm.com/exchanges/data/all/publaynet/) (96 GB), which contains 5 classes: `{0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}`
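
As a quick sanity check on the class mapping above, the following minimal sketch counts annotated regions per class. It assumes each annotation is a dict with a `category_id` key; the ids stored in an actual annotation file may differ from this mapping (for example, they may be 1-based), so check the file's own category list before relying on it. The helper name `count_regions_per_class` is hypothetical.

```python
from collections import Counter

# Class ids and names as listed above for PubLayNet.
PUBLAYNET_CLASSES = {0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"}


def count_regions_per_class(annotations, classes=PUBLAYNET_CLASSES):
    """Count annotated regions per class name.

    `annotations` is assumed to be a list of dicts with a `category_id` key
    (an assumption for this sketch, not taken from the dataset itself).
    """
    counts = Counter(classes[a["category_id"]] for a in annotations)
    # Report every class, including those with no regions.
    return {name: counts.get(name, 0) for name in classes.values()}


sample = [{"category_id": 0}, {"category_id": 0}, {"category_id": 3}]
print(count_regions_per_class(sample))
```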
|
||||
|
||||
|
@ -141,7 +143,7 @@ The JSON file contains the annotations of all images, and the data is stored in
|
|||
}
|
||||
```
|
||||
|
||||
### 3.2. More datasets
|
||||
### 4.2. More datasets
|
||||
|
||||
We provide download links for datasets such as CDLA (Chinese layout analysis) and TableBank (table layout analysis). After converting them to the JSON annotation format described above, you can train on them in the same way.
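
Converting another dataset usually comes down to building one nested dictionary and dumping it as JSON. The sketch below assumes a COCO-style layout with `images`, `annotations`, and `categories` keys and `[x, y, width, height]` boxes; the input tuple format and the function name `to_coco_style` are hypothetical, so adapt them to the dataset you are converting.

```python
import json


def to_coco_style(pages, class_names):
    """Build a COCO-style annotation dict.

    pages: list of (file_name, width, height, regions) tuples, where regions
    is a list of (class_name, x, y, w, h). This input shape is an assumption
    made for the sketch.
    """
    categories = [{"id": i, "name": n} for i, n in enumerate(class_names)]
    name_to_id = {n: i for i, n in enumerate(class_names)}
    images, annotations = [], []
    for image_id, (file_name, width, height, regions) in enumerate(pages):
        images.append(
            {"id": image_id, "file_name": file_name, "width": width, "height": height}
        )
        for class_name, x, y, w, h in regions:
            annotations.append(
                {
                    "id": len(annotations),  # unique across the whole file
                    "image_id": image_id,
                    "category_id": name_to_id[class_name],
                    "bbox": [x, y, w, h],
                    "area": w * h,
                    "iscrowd": 0,
                }
            )
    return {"images": images, "annotations": annotations, "categories": categories}


pages = [("page_0.jpg", 600, 800, [("Table", 50, 100, 400, 200)])]
coco = to_coco_style(pages, ["Text", "Table"])
print(json.dumps(coco, indent=2))
```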
|
||||
|
||||
|
@ -154,7 +156,7 @@ We provide CDLA(Chinese layout analysis), TableBank(Table layout analysis)etc. d
|
|||
| [DocBank](https://github.com/doc-analysis/DocBank) | Large-scale dataset (500K document pages) constructed using weakly supervised methods for document layout analysis, containing 12 categories: Author, Caption, Date, Equation, Figure, Footer, List, Paragraph, Reference, Section, Table, Title |
|
||||
|
||||
|
||||
## 4. Start training
|
||||
## 5. Start training
|
||||
|
||||
Training, evaluation, and prediction scripts are provided. This section uses the PubLayNet pre-trained model as an example.
|
||||
|
||||
|
@ -171,7 +173,7 @@ wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_
|
|||
|
||||
If the test image is Chinese, you can download the pre-trained model for the Chinese CDLA dataset, which identifies 10 types of document regions: Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference, and Equation. Download the training model and inference model of `picodet_lcnet_x1_0_fgd_layout_cdla` from [layout analysis model](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/docs/models_list.md). If you only need to detect table regions in the image, you can download the pre-trained model for the table dataset: download the training model and inference model of `picodet_lcnet_x1_0_fgd_layout_table` from [layout analysis model](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/docs/models_list.md).
|
||||
|
||||
### 4.1. Train
|
||||
### 5.1. Train
|
||||
|
||||
Train:
|
||||
|
||||
|
@ -247,7 +249,7 @@ After starting training normally, you will see the following log output:
|
|||
|
||||
**Note that the configuration file for prediction / evaluation must be consistent with the training.**
|
||||
|
||||
### 4.2. FGD Distillation Training
|
||||
### 5.2. FGD Distillation Training
|
||||
|
||||
PaddleDetection supports training object detection models with FGD ([Focal and Global Knowledge Distillation for Detectors](https://arxiv.org/abs/2111.11837v1)) distillation. FGD distillation consists of two parts, `Focal` and `Global`. `Focal` distillation separates the foreground and background of the image, so that the student model focuses on the key pixels of the teacher model's foreground and background features respectively. `Global` distillation reconstructs the relationships between different pixels and transfers them from the teacher to the student, compensating for the global information lost in `Focal` distillation.
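
To make the foreground/background split concrete, here is a deliberately simplified sketch in plain Python. It is not the FGD loss as implemented in PaddleDetection: it only shows the idea of masking a feature map by ground-truth boxes and weighting the teacher-student difference separately for foreground and background. The function names and the weights `fg_weight` and `bg_weight` are hypothetical, and the attention masks, scale masks, and `Global` relation term of the real method are omitted.

```python
def foreground_mask(height, width, boxes):
    """Return a height x width grid with 1 inside any box and 0 elsewhere.

    boxes: (x_min, y_min, x_max, y_max) in feature-map cells, max exclusive.
    """
    mask = [[0] * width for _ in range(height)]
    for x_min, y_min, x_max, y_max in boxes:
        for y in range(max(0, y_min), min(height, y_max)):
            for x in range(max(0, x_min), min(width, x_max)):
                mask[y][x] = 1
    return mask


def focal_style_loss(teacher, student, mask, fg_weight=1.0, bg_weight=0.5):
    """Sum squared teacher-student differences, weighted by region type."""
    fg = bg = 0.0
    for t_row, s_row, m_row in zip(teacher, student, mask):
        for t, s, m in zip(t_row, s_row, m_row):
            squared_diff = (t - s) ** 2
            if m:
                fg += squared_diff  # pixel lies inside a ground-truth box
            else:
                bg += squared_diff  # pixel lies in the background
    return fg_weight * fg + bg_weight * bg


teacher = [[1.0, 2.0], [3.0, 4.0]]
student = [[0.0, 2.0], [3.0, 2.0]]
mask = foreground_mask(2, 2, [(0, 0, 1, 1)])
print(mask, focal_style_loss(teacher, student, mask))
```

Keeping the two sums separate is what prevents the large background area from dominating the distillation signal.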
|
||||
|
||||
|
@ -265,9 +267,9 @@ python3 tools/train.py \
|
|||
- `-c`: Specify the model configuration file.
|
||||
- `--slim_config`: Specify the compression strategy configuration file.
|
||||
|
||||
## 5. Model evaluation and prediction
|
||||
## 6. Model evaluation and prediction
|
||||
|
||||
### 5.1. Indicator evaluation
|
||||
### 6.1. Indicator evaluation
|
||||
|
||||
Model parameters are saved by default in the `output/picodet_lcnet_x1_0_layout` directory during training. When evaluating, set `weights` to point to the saved parameter file. To change the evaluation dataset, modify the `image_dir`, `anno_path`, and `dataset_dir` settings under `EvalDataset` in `configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml`.
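
Before launching an evaluation it can help to print the paths the dataset settings resolve to. The sketch below assumes that the image directory and annotation path are given relative to `dataset_dir`; verify this against your own configuration file. The example values and the helper name `resolve_eval_dataset` are hypothetical.

```python
from pathlib import PurePosixPath


def resolve_eval_dataset(eval_dataset):
    """Join dataset_dir with the image directory and annotation path.

    Assumes both settings are relative to dataset_dir (an assumption of
    this sketch; check your configuration file).
    """
    root = PurePosixPath(eval_dataset["dataset_dir"])
    return {
        "image_dir": str(root / eval_dataset["image_dir"]),
        "anno_path": str(root / eval_dataset["anno_path"]),
    }


# Hypothetical values, for illustration only.
example = {"dataset_dir": "/data/publaynet", "image_dir": "val", "anno_path": "val.json"}
print(resolve_eval_dataset(example))
```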
|
||||
|
||||
|
@ -310,7 +312,7 @@ python3 tools/eval.py \
|
|||
- `--slim_config`: Specify the distillation strategy configuration file.
|
||||
- `-o weights`: Specify the path of the model trained with the distillation algorithm.
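
The options listed above can also be assembled programmatically, which makes it harder to evaluate with a different configuration file than the one used for training. This is a minimal sketch that only builds the argument list; it uses the `-c`, `--slim_config`, and `-o weights` options described above, while the helper name `build_eval_command` and the `best_model` weights file name are assumptions.

```python
import shlex


def build_eval_command(config, weights, slim_config=None):
    """Return the argument list for tools/eval.py."""
    cmd = ["python3", "tools/eval.py", "-c", config]
    if slim_config:
        # Only needed for models trained with distillation.
        cmd += ["--slim_config", slim_config]
    cmd += ["-o", f"weights={weights}"]
    return cmd


CONFIG = (
    "configs/picodet/legacy_model/application/layout_analysis/"
    "picodet_lcnet_x1_0_layout.yml"
)
# "best_model" is a hypothetical file name inside the default output directory.
cmd = build_eval_command(CONFIG, "output/picodet_lcnet_x1_0_layout/best_model")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the PaddleDetection directory.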
|
||||
|
||||
### 5.2. Test Layout Analysis Results
|
||||
### 6.2. Test Layout Analysis Results
|
||||
|
||||
|
||||
The configuration file used for prediction must be consistent with the one used for training, for example if you completed training with `python3 tools/train.py -c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml`.
|
||||
|
@ -343,10 +345,10 @@ python3 tools/infer.py \
|
|||
```
|
||||
|
||||
|
||||
## 6. Model Export and Inference
|
||||
## 7. Model Export and Inference
|
||||
|
||||
|
||||
### 6.1 Model Export
|
||||
### 7.1 Model Export
|
||||
|
||||
The inference model (the model saved by `paddle.jit.save`) is generally a frozen model saved after training is complete, and is mostly used for prediction in deployment.
|
||||
|
||||
|
@ -385,7 +387,7 @@ python3 tools/export_model.py \
|
|||
--output_dir=output_inference/
|
||||
```
|
||||
|
||||
### 6.2 Model inference
|
||||
### 7.2 Model inference
|
||||
|
||||
To run inference with the provided inference model or with a model trained by FGD distillation, set `model_dir` to the inference model path and execute the following commands:
|
||||
|
||||
|
|
|
@ -3,21 +3,22 @@
|
|||
# Layout analysis

- [1. Introduction](#1-简介)
- [2. Install](#2-安装)
- [2.1 Install PaddlePaddle](#21-安装paddlepaddle)
- [2.2 Install PaddleDetection](#22-安装paddledetection)
- [3. Data preparation](#3-数据准备)
- [3.1 English data set](#31-英文数据集)
- [3.2 More datasets](#32-更多数据集)
- [4. Start training](#4-开始训练)
- [4.1 Train](#41-启动训练)
- [4.2 FGD distillation training](#42-FGD蒸馏训练)
- [5. Model evaluation and prediction](#5-模型评估与预测)
- [5.1 Indicator evaluation](#51-指标评估)
- [5.2 Test layout analysis results](#52-测试版面分析结果)
- [6 Model export and inference](#6-模型导出与预测)
- [6.1 Model export](#61-模型导出)
- [6.2 Model inference](#62-模型推理)
- [2. Quick start](#2-快速开始)
- [3. Install](#3-安装)
- [3.1 Install PaddlePaddle](#31-安装paddlepaddle)
- [3.2 Install PaddleDetection](#32-安装paddledetection)
- [4. Data preparation](#4-数据准备)
- [4.1 English data set](#41-英文数据集)
- [4.2 More datasets](#42-更多数据集)
- [5. Start training](#5-开始训练)
- [5.1 Train](#51-启动训练)
- [5.2 FGD distillation training](#52-FGD蒸馏训练)
- [6. Model evaluation and prediction](#6-模型评估与预测)
- [6.1 Indicator evaluation](#61-指标评估)
- [6.2 Test layout analysis results](#62-测试版面分析结果)
- [7 Model export and inference](#7-模型导出与预测)
- [7.1 Model export](#71-模型导出)
- [7.2 Model inference](#72-模型推理)

## 1. Introduction
|
||||
|
||||
|
@ -26,12 +27,14 @@
|
|||
<div align="center">
|
||||
<img src="../docs/layout/layout.png" width="800">
|
||||
</div>
|
||||
## 2. Quick start

PP-Structure currently provides layout analysis models for Chinese documents, English documents, and table documents. For model links, see [models_list](../docs/models_list.md#1-版面分析模型). A whl package is also provided for quick use; see [quickstart](../docs/quickstart.md) for details.

## 3. Install dependencies

## 2. Install dependencies

### 2.1. Install PaddlePaddle

### 3.1. Install PaddlePaddle

- **(1) Install PaddlePaddle**
|
||||
|
||||
|
@ -46,7 +49,7 @@ python3 -m pip install "paddlepaddle>=2.3" -i https://mirror.baidu.com/pypi/simp
|
|||
```
|
||||
For more requirements, please refer to the instructions in the [installation documentation](https://www.paddlepaddle.org.cn/install/quick).
|
||||
|
||||
### 2.2. Install PaddleDetection

### 3.2. Install PaddleDetection

- **(1) Download PaddleDetection source code**
|
||||
|
||||
|
@ -61,11 +64,11 @@ cd PaddleDetection
|
|||
python3 -m pip install -r requirements.txt
|
||||
```
|
||||
|
||||
## 3. Data preparation

## 4. Data preparation

If you want to try the prediction process directly, you can skip data preparation and download the pre-trained model we provide.

### 3.1. English data set

### 4.1. English data set

Download the document analysis dataset [PubLayNet](https://developer.ibm.com/exchanges/data/all/publaynet/) (96 GB), which contains 5 classes: `{0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}`
|
||||
|
||||
|
@ -140,7 +143,7 @@ The JSON file contains the annotations of all images, and the data is stored in nested dictionaries,
|
|||
}
|
||||
```
|
||||
|
||||
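### 3.2. More datasets

### 4.2. More datasets

We provide download links for datasets such as CDLA (Chinese layout analysis) and TableBank (table layout analysis). After converting them to the JSON annotation format described above, you can train on them in the same way.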
|
||||
|
||||
|
@ -153,7 +156,7 @@ The JSON file contains the annotations of all images, and the data is stored in nested dictionaries,
|
|||
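| [DocBank](https://github.com/doc-analysis/DocBank) | Large-scale dataset (500K document pages) constructed using weakly supervised methods for document layout analysis, containing 12 categories: Author, Caption, Date, Equation, Figure, Footer, List, Paragraph, Reference, Section, Table, Title |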
|
||||
|
||||
|
||||
## 4. Start training

## 5. Start training

Training, evaluation, and prediction scripts are provided. This section uses the PubLayNet pre-trained model as an example.
|
||||
|
||||
|
@ -170,7 +173,7 @@ wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_
|
|||
|
||||
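If the test image is Chinese, you can download the pre-trained model for the Chinese CDLA dataset, which identifies 10 types of document regions: Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference, and Equation. Download the training model and inference model of `picodet_lcnet_x1_0_fgd_layout_cdla` from [layout analysis model](../docs/models_list.md). If you only need to detect table regions in the image, you can download the pre-trained model for the table dataset: download the training model and inference model of `picodet_lcnet_x1_0_fgd_layout_table` from [layout analysis model](../docs/models_list.md).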
|
||||
|
||||
### 4.1. Train

### 5.1. Train

Start training:
|
||||
|
||||
|
@ -246,7 +249,7 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py \
|
|||
|
||||
**Note that the configuration file used for prediction and evaluation must be consistent with the one used for training.**
|
||||
|
||||
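### 4.2. FGD distillation training

### 5.2. FGD distillation training

PaddleDetection supports training object detection models with FGD ([Focal and Global Knowledge Distillation for Detectors](https://arxiv.org/abs/2111.11837v1)) distillation. FGD distillation consists of two parts, `Focal` and `Global`. `Focal` distillation separates the foreground and background of the image, so that the student model focuses on the key pixels of the teacher model's foreground and background features respectively. `Global` distillation reconstructs the relationships between different pixels and transfers them from the teacher to the student, compensating for the global information lost in `Focal` distillation.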
|
||||
|
||||
|
@ -264,9 +267,9 @@ python3 tools/train.py \
|
|||
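- `-c`: Specify the model configuration file.
- `--slim_config`: Specify the compression strategy configuration file.

## 5. Model evaluation and prediction

## 6. Model evaluation and prediction

### 5.1. Indicator evaluation

### 6.1. Indicator evaluation

Model parameters are saved by default in the `output/picodet_lcnet_x1_0_layout` directory during training. When evaluating, set `weights` to point to the saved parameter file. To change the evaluation dataset, modify the `image_dir`, `anno_path`, and `dataset_dir` settings under `EvalDataset` in `configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml`.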
|
||||
|
||||
|
@ -309,7 +312,7 @@ python3 tools/eval.py \
|
|||
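- `--slim_config`: Specify the distillation strategy configuration file.
- `-o weights`: Specify the path of the model trained with the distillation algorithm.

### 5.2. Test layout analysis results

### 6.2 Test layout analysis results

The configuration file used for prediction must be consistent with the one used for training, for example if you completed training with `python3 tools/train.py -c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml`.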
|
||||
|
@ -342,10 +345,10 @@ python3 tools/infer.py \
|
|||
```
|
||||
|
||||
|
||||
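## 6. Model export and inference

## 7. Model export and inference

### 6.1 Model export

### 7.1 Model export

The inference model (the model saved by `paddle.jit.save`) is generally a frozen model that stores both the model structure and the model parameters in files after training, and is mostly used for prediction in deployment. The model saved during training is a checkpoints model, which stores only the model parameters and is mostly used to resume training. Compared with the checkpoints model, the inference model additionally stores the model structure, performs better for deployment and accelerated inference, is flexible and convenient, and is suitable for integration into real systems.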
|
||||
|
||||
|
@ -382,7 +385,7 @@ python3 tools/export_model.py \
|
|||
|
||||
|
||||
|
||||
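### 6.2 Model inference

### 7.2 Model inference

To run inference with the **provided inference model** or with a **model trained by FGD distillation**, set `model_dir` to the inference model path and execute the following command: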
|
||||
|
||||
|
|
|
@ -25,7 +25,6 @@ Layout recovery combines [layout analysis](../layout/README.md)、[table recogni
|
|||
<div align="center">
|
||||
<img src="../docs/recovery/recovery_ch.jpg" width = "800" />
|
||||
</div>
|
||||
|
||||
<a name="2"></a>
|
||||
|
||||
## 2. Install
|
||||
|
@ -44,7 +43,6 @@ python3 -m pip install "paddlepaddle-gpu" -i https://mirror.baidu.com/pypi/simpl
|
|||
|
||||
# CPU installation
|
||||
python3 -m pip install "paddlepaddle" -i https://mirror.baidu.com/pypi/simple
|
||||
|
||||
````
|
||||
|
||||
For more requirements, please refer to the instructions in [Installation Documentation](https://www.paddlepaddle.org.cn/en/install/quick?docurl=/documentation/docs/en/install/pip/macos-pip_en.html).
|
||||
|
@ -85,6 +83,8 @@ Through layout analysis, we divided the image/PDF documents into regions, locate
|
|||
|
||||
The test image can be restored from the layout information, the OCR detection and recognition results, the table information, and the saved images.
|
||||
|
||||
The whl package is also provided for quick use, see [quickstart](../docs/quickstart_en.md) for details.
|
||||
|
||||
|
||||
<a name="3.1"></a>
|
||||
### 3.1 Download models
|
||||
|
@ -151,10 +151,10 @@ Field:
|
|||
|
||||
## 4. More
|
||||
|
||||
For training, evaluation and inference tutorial for text detection models, please refer to [text detection doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/detection.md).
|
||||
For training, evaluation and inference tutorial for text detection models, please refer to [text detection doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_en/detection_en.md).
|
||||
|
||||
For training, evaluation and inference tutorial for text recognition models, please refer to [text recognition doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/recognition.md).
|
||||
For training, evaluation and inference tutorial for text recognition models, please refer to [text recognition doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_en/recognition_en.md).
|
||||
|
||||
For training, evaluation and inference tutorial for layout analysis models, please refer to [layout analysis doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/layout/README_ch.md)
|
||||
For training, evaluation and inference tutorial for layout analysis models, please refer to [layout analysis doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/layout/README.md)
|
||||
|
||||
For training, evaluation and inference tutorial for table recognition models, please refer to [table recognition doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/table/README_ch.md)
|
||||
For training, evaluation and inference tutorial for table recognition models, please refer to [table recognition doc](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/table/README.md)
|
||||
|
|
|
@ -6,7 +6,6 @@
|
|||
- [2. Install](#2)
- [2.1 Install dependencies](#2.1)
- [2.2 Install PaddleOCR](#2.2)

- [3. Usage](#3)
- [3.1 Download models](#3.1)
- [3.2 Layout recovery](#3.2)
|
||||
|
@ -27,7 +26,6 @@
|
|||
<div align="center">
|
||||
<img src="../docs/recovery/recovery_ch.jpg" width = "800" />
|
||||
</div>
|
||||
|
||||
<a name="2"></a>
|
||||
|
||||
## 2. Install
|
||||
|
@ -87,6 +85,8 @@ python3 -m pip install -r ppstructure/recovery/requirements.txt
|
|||
|
||||
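The test image can be restored from the layout information, the OCR detection and recognition results, the table information, and the saved images.

The following code implements layout recovery. A whl package is also provided for quick use; see [quickstart](../docs/quickstart.md) for details.

<a name="3.1"></a>

### 3.1 Download models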
|
||||
|
|