mirror of
https://github.com/PaddlePaddle/PaddleOCR.git
synced 2025-06-03 21:53:39 +08:00
commit
d048fe8d77
@ -10,13 +10,17 @@
|
||||
<a name="1"></a>
|
||||
## 1. 版面分析模型
|
||||
|
||||
|模型名称|模型简介|下载地址|label_map|
|
||||
| --- | --- | --- | --- |
|
||||
| ppyolov2_r50vd_dcn_365e_publaynet | PubLayNet 数据集训练的版面分析模型,可以划分**文字、标题、表格、图片以及列表**5类区域 | [推理模型](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) / [训练模型](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet_pretrained.pdparams) |{0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}|
|
||||
| ppyolov2_r50vd_dcn_365e_tableBank_word | TableBank Word 数据集训练的版面分析模型,只能检测表格 | [推理模型](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_word.tar) | {0:"Table"}|
|
||||
| ppyolov2_r50vd_dcn_365e_tableBank_latex | TableBank Latex 数据集训练的版面分析模型,只能检测表格 | [推理模型](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_latex.tar) | {0:"Table"}|
|
||||
|模型名称|模型简介|推理模型大小|下载地址|dict path|
|
||||
| --- | --- | --- | --- | --- |
|
||||
| picodet_lcnet_x1_0_fgd_layout | 基于PicoDet LCNet_x1_0和FGD蒸馏在PubLayNet 数据集训练的英文版面分析模型,可以划分**文字、标题、表格、图片以及列表**5类区域 | 9.7M | [推理模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout.pdparams) | [PubLayNet dict](../../ppocr/utils/dict/layout_dict/layout_publaynet_dict.txt) |
|
||||
| ppyolov2_r50vd_dcn_365e_publaynet | 基于PP-YOLOv2在PubLayNet数据集上训练的英文版面分析模型 | 221M | [推理模型](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) / [训练模型](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet_pretrained.pdparams) | 同上 |
|
||||
| picodet_lcnet_x1_0_fgd_layout_cdla | Chinese layout analysis model trained on the CDLA dataset. It can recognize 10 types of regions: **Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference and Equation** | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla.pdparams) | [CDLA dict](../../ppocr/utils/dict/layout_dict/layout_cdla_dict.txt) |
|
||||
| picodet_lcnet_x1_0_fgd_layout_table | 表格数据集训练的版面分析模型,支持中英文文档表格区域的检测 | 9.7M | [推理模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table.pdparams) | [Table dict](../../ppocr/utils/dict/layout_dict/layout_table_dict.txt) |
|
||||
| ppyolov2_r50vd_dcn_365e_tableBank_word | 基于PP-YOLOv2在TableBank Word 数据集训练的版面分析模型,支持英文文档表格区域的检测 | 221M | [推理模型](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_word.tar) | 同上 |
|
||||
| ppyolov2_r50vd_dcn_365e_tableBank_latex | 基于PP-YOLOv2在TableBank Latex数据集训练的版面分析模型,支持英文文档表格区域的检测 | 221M | [推理模型](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_latex.tar) | 同上 |
|
||||
|
||||
<a name="2"></a>
|
||||
|
||||
## 2. OCR和表格识别模型
|
||||
|
||||
<a name="21"></a>
|
||||
|
@ -6,15 +6,18 @@
|
||||
- [2.2 Table Recognition](#22-table-recognition)
|
||||
- [3. KIE](#3-kie)
|
||||
|
||||
|
||||
<a name="1"></a>
|
||||
|
||||
## 1. Layout Analysis
|
||||
|
||||
|model name| description |download|label_map|
|
||||
| --- |---------------------------------------------------------------------------------------------------------------------------------------------------------| --- | --- |
|
||||
| ppyolov2_r50vd_dcn_365e_publaynet | The layout analysis model trained on the PubLayNet dataset, the model can recognition 5 types of areas such as **text, title, table, picture and list** | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) / [trained model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet_pretrained.pdparams) |{0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}|
|
||||
| ppyolov2_r50vd_dcn_365e_tableBank_word | The layout analysis model trained on the TableBank Word dataset, the model can only detect tables | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_word.tar) | {0:"Table"}|
|
||||
| ppyolov2_r50vd_dcn_365e_tableBank_latex | The layout analysis model trained on the TableBank Latex dataset, the model can only detect tables | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_latex.tar) | {0:"Table"}|
|
||||
|model name| description | inference model size |download|dict path|
|
||||
| --- |---------------------------------------------------------------------------------------------------------------------------------------------------------| --- | --- | --- |
|
||||
| picodet_lcnet_x1_0_fgd_layout | English layout analysis model based on PicoDet LCNet_x1_0 with FGD distillation, trained on the PubLayNet dataset. It can recognize 5 types of regions: **Text, Title, Table, Picture and List** | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout.pdparams) | [PubLayNet dict](../../ppocr/utils/dict/layout_dict/layout_publaynet_dict.txt) |
|
||||
| ppyolov2_r50vd_dcn_365e_publaynet | English layout analysis model based on PP-YOLOv2, trained on the PubLayNet dataset | 221M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet.tar) / [trained model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_publaynet_pretrained.pdparams) | same as above |
|
||||
| picodet_lcnet_x1_0_fgd_layout_cdla | Chinese layout analysis model trained on the CDLA dataset. It can recognize 10 types of regions: **Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference and Equation** | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla.pdparams) | [CDLA dict](../../ppocr/utils/dict/layout_dict/layout_cdla_dict.txt) |
|
||||
| picodet_lcnet_x1_0_fgd_layout_table | Layout analysis model trained on the table dataset. It can detect tables in Chinese and English documents | 9.7M | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_table.pdparams) | [Table dict](../../ppocr/utils/dict/layout_dict/layout_table_dict.txt) |
|
||||
| ppyolov2_r50vd_dcn_365e_tableBank_word | Layout analysis model based on PP-YOLOv2, trained on the TableBank Word dataset. It can detect tables in English documents | 221M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_word.tar) | same as above |
|
||||
| ppyolov2_r50vd_dcn_365e_tableBank_latex | Layout analysis model based on PP-YOLOv2, trained on the TableBank Latex dataset. It can detect tables in English documents | 221M | [inference model](https://paddle-model-ecology.bj.bcebos.com/model/layout-parser/ppyolov2_r50vd_dcn_365e_tableBank_latex.tar) | same as above |
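
The `dict path` column replaces the inline `label_map` of the earlier table. As a minimal sketch, assuming each dictionary file holds one class name per line (the file format is an assumption here, and `build_label_map` is a hypothetical helper, not part of PaddleOCR), the index-to-label mapping could be rebuilt like this:

```python
from typing import Dict, Iterable


def build_label_map(lines: Iterable[str]) -> Dict[int, str]:
    """Map class indices to names, assuming one class name per non-empty line."""
    labels = [line.strip() for line in lines if line.strip()]
    return dict(enumerate(labels))


# Sample content mirroring the PubLayNet label_map order shown in the earlier
# table; the real dictionary file may differ in spelling or case.
publaynet_lines = ["text\n", "title\n", "list\n", "table\n", "figure\n"]
label_map = build_label_map(publaynet_lines)
print(label_map)
```

With a real file, the same helper can be fed an open file handle, for example `build_label_map(open(dict_path, encoding="utf-8"))`.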
|
||||
|
||||
<a name="2"></a>
|
||||
## 2. OCR and Table Recognition
|
||||
|
@ -8,15 +8,21 @@
|
||||
- [2.1.3 版面分析](#213-版面分析)
|
||||
- [2.1.4 表格识别](#214-表格识别)
|
||||
- [2.1.5 关键信息抽取](#215-关键信息抽取)
|
||||
- [2.1.6 版面恢复](#216-版面恢复)
|
||||
- [2.2 代码使用](#22-代码使用)
|
||||
- [2.2.1 图像方向分类版面分析表格识别](#221-图像方向分类版面分析表格识别)
|
||||
|
||||
- [2.2.1 图像方向分类+版面分析+表格识别](#221-图像方向分类版面分析表格识别)
|
||||
- [2.2.2 版面分析+表格识别](#222-版面分析表格识别)
|
||||
- [2.2.3 版面分析](#223-版面分析)
|
||||
- [2.2.4 表格识别](#224-表格识别)
|
||||
|
||||
- [2.2.5 关键信息抽取](#225-关键信息抽取)
|
||||
- [2.2.6 版面恢复](#226-版面恢复)
|
||||
|
||||
- [2.3 返回结果说明](#23-返回结果说明)
|
||||
- [2.3.1 版面分析+表格识别](#231-版面分析表格识别)
|
||||
- [2.3.2 关键信息抽取](#232-关键信息抽取)
|
||||
|
||||
- [2.4 参数说明](#24-参数说明)
|
||||
|
||||
|
||||
@ -24,11 +30,12 @@
|
||||
## 1. 安装依赖包
|
||||
|
||||
```bash
|
||||
# 安装 paddleocr,推荐使用2.5+版本
|
||||
pip3 install "paddleocr>=2.5"
|
||||
# 安装 paddleocr,推荐使用2.6版本
|
||||
pip3 install "paddleocr>=2.6"
|
||||
# 安装 关键信息抽取 依赖包(如不需要KIE功能,可跳过)
|
||||
pip install -r kie/requirements.txt
|
||||
|
||||
# 安装 图像方向分类依赖包paddleclas(如不需要图像方向分类功能,可跳过)
|
||||
pip3 install paddleclas
|
||||
```
|
||||
|
||||
<a name="2"></a>
|
||||
@ -62,15 +69,24 @@ paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/table.jpg --type=structur
|
||||
```
|
||||
|
||||
<a name="215"></a>
|
||||
#### 2.1.5 关键信息抽取
|
||||
|
||||
#### 2.1.5 关键信息抽取
|
||||
请参考:[关键信息抽取教程](../kie/README_ch.md)。
|
||||
|
||||
<a name="216"></a>
|
||||
|
||||
#### 2.1.6 版面恢复
|
||||
|
||||
```bash
|
||||
paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/1.png --type=structure --recovery=true
|
||||
```
|
||||
|
||||
<a name="22"></a>
|
||||
|
||||
### 2.2 代码使用
|
||||
|
||||
<a name="221"></a>
|
||||
#### 2.2.1 图像方向分类版面分析表格识别
|
||||
#### 2.2.1 图像方向分类+版面分析+表格识别
|
||||
|
||||
```python
|
||||
import os
|
||||
@ -149,6 +165,7 @@ for line in result:
|
||||
```
|
||||
|
||||
<a name="224"></a>
|
||||
|
||||
#### 2.2.4 表格识别
|
||||
|
||||
```python
|
||||
@ -174,6 +191,33 @@ for line in result:
|
||||
|
||||
请参考:[关键信息抽取教程](../kie/README_ch.md)。
|
||||
|
||||
<a name="226"></a>
|
||||
|
||||
#### 2.2.6 版面恢复
|
||||
|
||||
```python
|
||||
import os
import cv2
from paddleocr import PPStructure, save_structure_res
from paddleocr.ppstructure.recovery.recovery_to_doc import sorted_layout_boxes, convert_info_docx

table_engine = PPStructure(layout=False, show_log=True)

save_folder = './output'
img_path = 'PaddleOCR/ppstructure/docs/table/1.png'
img = cv2.imread(img_path)
result = table_engine(img)
save_structure_res(result, save_folder, os.path.basename(img_path).split('.')[0])

for line in result:
    line.pop('img')
    print(line)

# Sort the detected regions into reading order, then export them to docx.
h, w, _ = img.shape
res = sorted_layout_boxes(result, w)
convert_info_docx(img, res, save_folder, os.path.basename(img_path).split('.')[0])
|
||||
```
|
||||
|
||||
<a name="23"></a>
|
||||
### 2.3 返回结果说明
|
||||
PP-Structure的返回结果为一个dict组成的list,示例如下
|
||||
@ -235,6 +279,7 @@ dict 里各个字段说明如下
|
||||
| table | 前向中是否执行表格识别 | True |
|
||||
| ocr | 对于版面分析中的非表格区域,是否执行ocr。当layout为False时会被自动设置为False| True |
|
||||
| recovery | 前向中是否执行版面恢复| False |
|
||||
| save_pdf | 版面恢复导出docx文件的同时,是否导出pdf文件 | False |
|
||||
| structure_version | 模型版本,可选 PP-structure和PP-structurev2 | PP-structure |
|
||||
|
||||
大部分参数和PaddleOCR whl包保持一致,见 [whl包文档](../../doc/doc_ch/whl.md)
|
||||
|
@ -8,12 +8,15 @@
|
||||
- [2.1.3 layout analysis](#213-layout-analysis)
|
||||
- [2.1.4 table recognition](#214-table-recognition)
|
||||
- [2.1.5 Key Information Extraction](#215-key-information-extraction)
|
||||
- [2.1.6 layout recovery](#216-layout-recovery)
|
||||
- [2.2 Use by code](#22-use-by-code)
|
||||
- [2.2.1 image orientation + layout analysis + table recognition](#221-image-orientation--layout-analysis--table-recognition)
|
||||
- [2.2.2 layout analysis + table recognition](#222-layout-analysis--table-recognition)
|
||||
- [2.2.3 layout analysis](#223-layout-analysis)
|
||||
- [2.2.4 table recognition](#224-table-recognition)
|
||||
- [2.2.5 DocVQA](#225-dockie)
|
||||
- [2.2.5 Key Information Extraction](#225-key-information-extraction)
|
||||
- [2.2.6 layout recovery](#226-layout-recovery)
|
||||
- [2.3 Result description](#23-result-description)
|
||||
- [2.3.1 layout analysis + table recognition](#231-layout-analysis--table-recognition)
|
||||
- [2.3.2 Key Information Extraction](#232-key-information-extraction)
|
||||
@ -24,14 +27,16 @@
|
||||
## 1. Install package
|
||||
|
||||
```bash
|
||||
# Install paddleocr, version 2.5+ is recommended
|
||||
pip3 install "paddleocr>=2.5"
|
||||
# Install paddleocr, version 2.6 is recommended
|
||||
pip3 install "paddleocr>=2.6"
|
||||
# Install the KIE dependency packages (if you do not use the KIE, you can skip it)
|
||||
pip install -r kie/requirements.txt
|
||||
|
||||
# Install the image direction classification dependency package paddleclas (if you do not use the image direction classification, you can skip it)
|
||||
pip3 install paddleclas
|
||||
```
|
||||
|
||||
<a name="2"></a>
|
||||
|
||||
## 2. Use
|
||||
|
||||
<a name="21"></a>
|
||||
@ -66,6 +71,12 @@ paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/table.jpg --type=structur
|
||||
|
||||
Please refer to: [Key Information Extraction](../kie/README.md) .
|
||||
|
||||
<a name="216"></a>
|
||||
#### 2.1.6 layout recovery
|
||||
```bash
|
||||
paddleocr --image_dir=PaddleOCR/ppstructure/docs/table/1.png --type=structure --recovery=true
|
||||
```
|
||||
|
||||
<a name="22"></a>
|
||||
### 2.2 Use by code
|
||||
|
||||
@ -174,6 +185,32 @@ for line in result:
|
||||
|
||||
Please refer to: [Key Information Extraction](../kie/README.md) .
|
||||
|
||||
<a name="226"></a>
|
||||
#### 2.2.6 layout recovery
|
||||
|
||||
```python
|
||||
import os
import cv2
from paddleocr import PPStructure, save_structure_res
from paddleocr.ppstructure.recovery.recovery_to_doc import sorted_layout_boxes, convert_info_docx

table_engine = PPStructure(layout=False, show_log=True)

save_folder = './output'
img_path = 'PaddleOCR/ppstructure/docs/table/1.png'
img = cv2.imread(img_path)
result = table_engine(img)
save_structure_res(result, save_folder, os.path.basename(img_path).split('.')[0])

for line in result:
    line.pop('img')
    print(line)

# Sort the detected regions into reading order, then export them to docx.
h, w, _ = img.shape
res = sorted_layout_boxes(result, w)
convert_info_docx(img, res, save_folder, os.path.basename(img_path).split('.')[0])
|
||||
```
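
The example above passes the page width to `sorted_layout_boxes`, which suggests that reading order depends on where each region sits horizontally. The sketch below is an illustration of that idea only, not PaddleOCR's implementation: `sort_reading_order` is a hypothetical function, and it assumes each region carries a `bbox` of the form `[x1, y1, x2, y2]` on a page with at most two columns.

```python
from typing import Dict, List


def sort_reading_order(regions: List[Dict], page_width: int) -> List[Dict]:
    """Order regions top to bottom, reading the left column before the right one.

    A region that crosses the page midline is treated as full width and closes
    the two-column band collected so far.
    """
    mid = page_width / 2
    ordered: List[Dict] = []
    left: List[Dict] = []
    right: List[Dict] = []
    for region in sorted(regions, key=lambda r: (r["bbox"][1], r["bbox"][0])):
        x1, _, x2, _ = region["bbox"]
        if x2 <= mid:
            left.append(region)
        elif x1 >= mid:
            right.append(region)
        else:
            ordered += left + right + [region]
            left, right = [], []
    return ordered + left + right


# Hypothetical page: a full-width title above two columns.
page = [
    {"name": "right-1", "bbox": [550, 100, 950, 400]},
    {"name": "left-2", "bbox": [50, 420, 450, 700]},
    {"name": "title", "bbox": [50, 10, 950, 60]},
    {"name": "left-1", "bbox": [50, 100, 450, 400]},
]
print([r["name"] for r in sort_reading_order(page, page_width=1000)])
```

A plain top-to-bottom sort would interleave the two columns, which is why the width is needed to tell them apart.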
|
||||
|
||||
<a name="23"></a>
|
||||
### 2.3 Result description
|
||||
|
||||
@ -235,6 +272,7 @@ Please refer to: [Key Information Extraction](../kie/README.md) .
|
||||
| table | Whether to perform table recognition in forward | True |
|
||||
| ocr | Whether to perform ocr for non-table areas in layout analysis. When layout is False, it will be automatically set to False| True |
|
||||
| recovery | Whether to perform layout recovery in forward | False |
| save_pdf | Whether to also export a PDF file when layout recovery exports the docx file | False |
| structure_version | Model version; options are PP-structure and PP-structurev2 | PP-structure |
|
||||
|
||||
Most of the parameters are consistent with the PaddleOCR whl package, see [whl package documentation](../../doc/doc_en/whl.md)
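
The rule in the table that `ocr` is forced to False whenever `layout` is False can be sketched in plain Python. This is an illustration under stated assumptions, not library code: `DEFAULTS` and `resolve_params` are hypothetical names, and the default `layout=True` is assumed because that row is not shown in this excerpt.

```python
from typing import Any, Dict

# Defaults copied from the parameter table; layout=True is an assumption.
DEFAULTS: Dict[str, Any] = {
    "layout": True,
    "table": True,
    "ocr": True,
    "recovery": False,
    "save_pdf": False,
    "structure_version": "PP-structure",
}


def resolve_params(**overrides: Any) -> Dict[str, Any]:
    """Merge user overrides into the defaults and apply the layout/ocr rule."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    params = {**DEFAULTS, **overrides}
    if not params["layout"]:
        # Without layout analysis there are no non-table regions to run OCR on.
        params["ocr"] = False
    return params


print(resolve_params(layout=False, recovery=True))
```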
|
||||
|
BIN
ppstructure/docs/recovery/recovery.jpg
Normal file
Binary file not shown.
After Width: | Height: | Size: 385 KiB |
Binary file not shown.
Before Width: | Height: | Size: 762 KiB |
@ -63,7 +63,7 @@ python3 -m pip install "paddlepaddle>=2.2" -i https://mirror.baidu.com/pypi/simp
|
||||
git clone https://github.com/PaddlePaddle/PaddleDetection.git
|
||||
```
|
||||
|
||||
- **(2)安装其他依赖 **
|
||||
- **(2)安装其他依赖**
|
||||
|
||||
```bash
|
||||
cd PaddleDetection
|
||||
@ -166,15 +166,17 @@ json文件包含所有图像的标注,数据以字典嵌套的方式存放,
|
||||
|
||||
提供了训练脚本、评估脚本和预测脚本,本节将以PubLayNet预训练模型为例进行讲解。
|
||||
|
||||
如果不希望训练,直接体验后面的模型评估、预测、动转静、推理的流程,可以下载提供的预训练模型,并跳过本部分。
|
||||
如果不希望训练,直接体验后面的模型评估、预测、动转静、推理的流程,可以下载提供的预训练模型(PubLayNet数据集),并跳过本部分。
|
||||
|
||||
```
|
||||
mkdir pretrained_model
|
||||
cd pretrained_model
|
||||
# 下载并解压PubLayNet预训练模型
|
||||
# 下载PubLayNet预训练模型
|
||||
wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout.pdparams
|
||||
```
|
||||
|
||||
下载更多[版面分析模型](../docs/models_list.md)(中文CDLA数据集预训练模型、表格预训练模型)
|
||||
|
||||
### 4.1. 启动训练
|
||||
|
||||
开始训练:
|
||||
@ -184,7 +186,7 @@ wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_
|
||||
如果你希望训练自己的数据集,需要修改配置文件中的数据配置、类别数。
|
||||
|
||||
|
||||
以`configs/picodet/legacy_model/application/layout_detection/picodet_lcnet_x1_0_layout.yml` 为例,修改的内容如下所示。
|
||||
以`configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml` 为例,修改的内容如下所示。
|
||||
|
||||
```yaml
|
||||
metric: COCO
|
||||
@ -223,16 +225,20 @@ TestDataset:
|
||||
# 训练日志会自动保存到 log 目录中
|
||||
|
||||
# 单卡训练
|
||||
export CUDA_VISIBLE_DEVICES=0
|
||||
python3 tools/train.py \
|
||||
-c configs/picodet/legacy_model/application/layout_detection/picodet_lcnet_x1_0_layout.yml \
|
||||
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
|
||||
--eval
|
||||
|
||||
# 多卡训练,通过--gpus参数指定卡号
|
||||
export CUDA_VISIBLE_DEVICES=0,1,2,3
|
||||
python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py \
|
||||
-c configs/picodet/legacy_model/application/layout_detection/picodet_lcnet_x1_0_layout.yml \
|
||||
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
|
||||
--eval
|
||||
```
|
||||
|
||||
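**Note:** If training runs out of GPU memory, reduce `batch_size` in `TrainReader` and reduce `base_lr` in `LearningRate` by the same proportion. The released configs were trained on 8 GPUs; if you train on a single GPU, divide `base_lr` by 8.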
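
The proportional adjustment in the note above can be written as a small calculation. This is a minimal sketch of linear learning-rate scaling; `scaled_base_lr` is a hypothetical helper, and the example values (`base_lr=0.4`, `batch_size=24`) are placeholders rather than values taken from the released configs.

```python
def scaled_base_lr(
    base_lr: float,
    ref_gpus: int,
    ref_batch_size: int,
    gpus: int,
    batch_size: int,
) -> float:
    """Scale base_lr by the ratio of the new total batch size to the reference one."""
    ratio = (gpus * batch_size) / (ref_gpus * ref_batch_size)
    return base_lr * ratio


# Reference setup: 8 GPUs. Moving to 1 GPU with the same per-GPU batch size
# divides the learning rate by 8.
single_gpu_lr = scaled_base_lr(0.4, ref_gpus=8, ref_batch_size=24, gpus=1, batch_size=24)

# Halving the per-GPU batch size as well halves the learning rate again.
small_batch_lr = scaled_base_lr(0.4, ref_gpus=8, ref_batch_size=24, gpus=1, batch_size=12)

print(single_gpu_lr, small_batch_lr)
```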
|
||||
|
||||
正常启动训练后,会看到以下log输出:
|
||||
|
||||
```
|
||||
@ -254,9 +260,11 @@ PaddleDetection支持了基于FGD([Focal and Global Knowledge Distillation for D
|
||||
更换数据集,修改【TODO】配置中的数据配置、类别数,具体可以参考4.1。启动训练:
|
||||
|
||||
```bash
|
||||
python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py \
|
||||
-c configs/picodet/legacy_model/application/layout_detection/picodet_lcnet_x1_0_layout.yml \
|
||||
--slim_config configs/picodet/legacy_model/application/layout_detection/picodet_lcnet_x2_5_layout.yml \
|
||||
# 单卡训练
|
||||
export CUDA_VISIBLE_DEVICES=0
|
||||
python3 tools/train.py \
|
||||
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
|
||||
--slim_config configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml \
|
||||
--eval
|
||||
```
|
||||
|
||||
@ -267,13 +275,13 @@ python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py \
|
||||
|
||||
### 5.1. 指标评估
|
||||
|
||||
训练中模型参数默认保存在`output/picodet_lcnet_x1_0_layout`目录下。在评估指标时,需要设置`weights`指向保存的参数文件。评估数据集可以通过 `configs/picodet/legacy_model/application/layout_detection/picodet_lcnet_x1_0_layout.yml` 修改`EvalDataset`中的 `image_dir`、`anno_path`和`dataset_dir` 设置。
|
||||
训练中模型参数默认保存在`output/picodet_lcnet_x1_0_layout`目录下。在评估指标时,需要设置`weights`指向保存的参数文件。评估数据集可以通过 `configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml` 修改`EvalDataset`中的 `image_dir`、`anno_path`和`dataset_dir` 设置。
|
||||
|
||||
```bash
|
||||
# GPU 评估, weights 为待测权重
|
||||
python3 tools/eval.py \
|
||||
-c configs/picodet/legacy_model/application/layout_detection/picodet_lcnet_x1_0_layout.yml \
|
||||
-o weigths=./output/picodet_lcnet_x1_0_layout/best_model
|
||||
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
|
||||
-o weights=./output/picodet_lcnet_x1_0_layout/best_model
|
||||
```
|
||||
|
||||
会输出以下信息,打印出mAP、AP0.5等信息。
|
||||
@ -299,8 +307,8 @@ python3 tools/eval.py \
|
||||
|
||||
```
|
||||
python3 tools/eval.py \
|
||||
-c configs/picodet/legacy_model/application/layout_detection/picodet_lcnet_x1_0_layout.yml \
|
||||
--slim_config configs/picodet/legacy_model/application/layout_detection/picodet_lcnet_x2_5_layout.yml \
|
||||
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
|
||||
--slim_config configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml \
|
||||
-o weights=output/picodet_lcnet_x2_5_layout/best_model
|
||||
```
|
||||
|
||||
@ -311,18 +319,17 @@ python3 tools/eval.py \
|
||||
### 5.2. 测试版面分析结果
|
||||
|
||||
|
||||
预测使用的配置文件必须与训练一致,如您通过 `python3 tools/train.py -c configs/picodet/legacy_model/application/layout_detection/picodet_lcnet_x1_0_layout.yml` 完成了模型的训练过程。
|
||||
|
||||
使用 PaddleDetection 训练好的模型,您可以使用如下命令进行中文模型预测。
|
||||
预测使用的配置文件必须与训练一致,如您通过 `python3 tools/train.py -c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml` 完成了模型的训练过程。
|
||||
|
||||
使用 PaddleDetection 训练好的模型,您可以使用如下命令进行模型预测。
|
||||
|
||||
```bash
|
||||
python3 tools/infer.py \
|
||||
-c configs/picodet/legacy_model/application/layout_detection/picodet_lcnet_x1_0_layout.yml \
|
||||
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
|
||||
-o weights='output/picodet_lcnet_x1_0_layout/best_model.pdparams' \
|
||||
--infer_img='docs/images/layout.jpg' \
|
||||
--output_dir=output_dir/ \
|
||||
--draw_threshold=0.4
|
||||
--draw_threshold=0.5
|
||||
```
|
||||
|
||||
- `--infer_img`: 推理单张图片,也可以通过`--infer_dir`推理文件中的所有图片。
|
||||
@ -335,16 +342,15 @@ python3 tools/infer.py \
|
||||
|
||||
```
|
||||
python3 tools/infer.py \
|
||||
-c configs/picodet/legacy_model/application/layout_detection/picodet_lcnet_x1_0_layout.yml \
|
||||
--slim_config configs/picodet/legacy_model/application/layout_detection/picodet_lcnet_x2_5_layout.yml \
|
||||
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
|
||||
--slim_config configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml \
|
||||
-o weights='output/picodet_lcnet_x2_5_layout/best_model.pdparams' \
|
||||
--infer_img='docs/images/layout.jpg' \
|
||||
--output_dir=output_dir/ \
|
||||
--draw_threshold=0.4
|
||||
--draw_threshold=0.5
|
||||
```
|
||||
|
||||
|
||||
|
||||
## 6. 模型导出与预测
|
||||
|
||||
|
||||
@ -356,7 +362,7 @@ inference 模型(`paddle.jit.save`保存的模型) 一般是模型训练,
|
||||
|
||||
```bash
|
||||
python3 tools/export_model.py \
|
||||
-c configs/picodet/legacy_model/application/layout_detection/picodet_lcnet_x1_0_layout.yml \
|
||||
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
|
||||
-o weights=output/picodet_lcnet_x1_0_layout/best_model \
|
||||
--output_dir=output_inference/
|
||||
```
|
||||
@ -377,8 +383,8 @@ FGD蒸馏模型转inference模型步骤如下:
|
||||
|
||||
```bash
|
||||
python3 tools/export_model.py \
|
||||
-c configs/picodet/legacy_model/application/publayernet_lcnet_x1_5/picodet_student.yml \
|
||||
--slim_config configs/picodet/legacy_model/application/publayernet_lcnet_x1_5/picodet_teacher.yml \
|
||||
-c configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml \
|
||||
--slim_config configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml \
|
||||
-o weights=./output/picodet_lcnet_x2_5_layout/best_model \
|
||||
--output_dir=output_inference/
|
||||
```
|
||||
@ -466,4 +472,3 @@ preprocess_time(ms): 2172.50, inference_time(ms): 11.90, postprocess_time(ms): 1
|
||||
year={2022}
|
||||
}
|
||||
```
|
||||
|
||||
|
13
ppstructure/layout/__init__.py
Normal file
@ -0,0 +1,13 @@
|
||||
# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
@ -6,6 +6,8 @@ English | [简体中文](README_ch.md)
|
||||
- [2.1 Installation dependencies](#2.1)
|
||||
- [2.2 Install PaddleOCR](#2.2)
|
||||
- [3. Quick Start](#3)
|
||||
- [3.1 Download models](#3.1)
|
||||
- [3.2 Layout recovery](#3.2)
|
||||
|
||||
<a name="1"></a>
|
||||
|
||||
@ -17,8 +19,9 @@ Layout recovery combines [layout analysis](../layout/README.md)、[table recogni
|
||||
The following figure shows the result:
|
||||
|
||||
<div align="center">
|
||||
<img src="../docs/table/recovery.jpg" width = "700" />
|
||||
<img src="../docs/recovery/recovery.jpg" width = "700" />
|
||||
</div>
|
||||
|
||||
<a name="2"></a>
|
||||
|
||||
## 2. Install
|
||||
@ -33,14 +36,14 @@ The following figure shows the result:
|
||||
python3 -m pip install --upgrade pip
|
||||
|
||||
# GPU installation
|
||||
python3 -m pip install "paddlepaddle-gpu>=2.2" -i https://mirror.baidu.com/pypi/simple
|
||||
python3 -m pip install "paddlepaddle-gpu" -i https://mirror.baidu.com/pypi/simple
|
||||
|
||||
# CPU installation
|
||||
python3 -m pip install "paddlepaddle>=2.2" -i https://mirror.baidu.com/pypi/simple
|
||||
python3 -m pip install "paddlepaddle" -i https://mirror.baidu.com/pypi/simple
|
||||
|
||||
````
|
||||
|
||||
For more requirements, please refer to the instructions in [Installation Documentation](https://www.paddlepaddle.org.cn/install/quick).
|
||||
For more requirements, please refer to the instructions in [Installation Documentation](https://www.paddlepaddle.org.cn/en/install/quick?docurl=/documentation/docs/en/install/pip/macos-pip_en.html).
|
||||
|
||||
<a name="2.2"></a>
|
||||
|
||||
@ -67,38 +70,61 @@ python3 -m pip install -r ppstructure/recovery/requirements.txt
|
||||
|
||||
## 3. Quick Start
|
||||
|
||||
```python
|
||||
<a name="3.1"></a>
|
||||
### 3.1 Download models
|
||||
|
||||
If the input is an English document, download the English models:
|
||||
|
||||
```bash
|
||||
cd PaddleOCR/ppstructure
|
||||
|
||||
# download model
|
||||
mkdir inference && cd inference
|
||||
# Download the detection model of the ultra-lightweight English PP-OCRv3 model and unzip it
|
||||
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_det_infer.tar
|
||||
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_det_infer.tar && tar xf en_PP-OCRv3_det_infer.tar
|
||||
# Download the recognition model of the ultra-lightweight English PP-OCRv3 model and unzip it
|
||||
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar
|
||||
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_infer.tar && tar xf en_PP-OCRv3_rec_infer.tar
|
||||
# Download the ultra-lightweight English table recognition model and unzip it
|
||||
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar && tar xf en_ppocr_mobile_v2.0_table_structure_infer.tar
|
||||
wget https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/en_ppstructure_mobile_v2.0_SLANet_infer.tar && tar xf en_ppstructure_mobile_v2.0_SLANet_infer.tar
|
||||
# Download the layout model of publaynet dataset and unzip it
|
||||
wget
|
||||
https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout_infer.tar && tar picodet_lcnet_x1_0_layout_infer.tar
|
||||
wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar && tar xf picodet_lcnet_x1_0_fgd_layout_infer.tar
|
||||
cd ..
|
||||
# run
|
||||
```
|
||||
If the input is a Chinese document, download the Chinese models:

[Chinese and English ultra-lightweight PP-OCRv3 model](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/README.md#pp-ocr-series-model-listupdate-on-september-8th), [table recognition model](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/docs/models_list.md#22-表格识别模型), [layout analysis model](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/docs/models_list.md#1-版面分析模型)
|
||||
|
||||
<a name="3.2"></a>
|
||||
### 3.2 Layout recovery
|
||||
|
||||
|
||||
```bash
|
||||
python3 predict_system.py \
|
||||
--image_dir=./docs/table/1.png \
|
||||
--det_model_dir=inference/en_PP-OCRv3_det_infer \
|
||||
--rec_model_dir=inference/en_PP-OCRv3_rec_infe \
|
||||
--rec_model_dir=inference/en_PP-OCRv3_rec_infer \
|
||||
--rec_char_dict_path=../ppocr/utils/en_dict.txt \
|
||||
--output=../output/ \
|
||||
--table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer \
|
||||
--table_model_dir=inference/en_ppstructure_mobile_v2.0_SLANet_infer \
|
||||
--table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
|
||||
--table_max_len=488 \
|
||||
--layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer \
|
||||
--layout_model_dir=inference/picodet_lcnet_x1_0_fgd_layout_infer \
|
||||
--layout_dict_path=../ppocr/utils/dict/layout_dict/layout_publaynet_dict.txt \
|
||||
--vis_font_path=../doc/fonts/simfang.ttf \
|
||||
--recovery=True \
|
||||
--save_pdf=False
|
||||
--save_pdf=False \
|
||||
--output=../output/
|
||||
```
|
||||
|
||||
After running, the docx file for each image is saved in the directory specified by the `output` field.
|
||||
|
||||
Recovery table to Word code[table_process.py] reference:https://github.com/pqzx/html2docx.git
|
||||
Fields:

- image_dir: test input; can be an image, a directory of images, a PDF file, or a directory of PDF files
- det_model_dir: OCR detection model path
- rec_model_dir: OCR recognition model path
- rec_char_dict_path: OCR recognition dictionary path. If the Chinese model is used, change it to "../ppocr/utils/ppocr_keys_v1.txt". If you trained the model on your own dataset, change it to the dictionary used for training
- table_model_dir: table recognition model path
- table_char_dict_path: table recognition dictionary path. If the Chinese model is used, no change is needed
- layout_model_dir: layout analysis model path
- layout_dict_path: layout analysis dictionary path. If the Chinese model is used, change it to "../ppocr/utils/dict/layout_dict/layout_cdla_dict.txt"
- recovery: whether to enable layout recovery; default False
- save_pdf: whether to also save a PDF file when performing layout recovery; default False
- output: path where the recovery results are saved
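
The fields above map one-to-one onto `--name=value` flags of `predict_system.py`. As a minimal sketch, assuming the script is launched from the `ppstructure` directory as in the command shown earlier, the argument list can be assembled from a dictionary; `build_recovery_command` is a hypothetical helper, not part of PaddleOCR.

```python
import shlex
from typing import Dict, List


def build_recovery_command(fields: Dict[str, object], script: str = "predict_system.py") -> List[str]:
    """Turn a field dictionary into the argument list for the prediction script."""
    return ["python3", script] + [f"--{name}={value}" for name, value in fields.items()]


# Values copied from the English-model command in this section.
fields = {
    "image_dir": "./docs/table/1.png",
    "det_model_dir": "inference/en_PP-OCRv3_det_infer",
    "rec_model_dir": "inference/en_PP-OCRv3_rec_infer",
    "rec_char_dict_path": "../ppocr/utils/en_dict.txt",
    "table_model_dir": "inference/en_ppstructure_mobile_v2.0_SLANet_infer",
    "table_char_dict_path": "../ppocr/utils/dict/table_structure_dict.txt",
    "layout_model_dir": "inference/picodet_lcnet_x1_0_fgd_layout_infer",
    "layout_dict_path": "../ppocr/utils/dict/layout_dict/layout_publaynet_dict.txt",
    "recovery": True,
    "save_pdf": False,
    "output": "../output/",
}
command = build_recovery_command(fields)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)`. For a Chinese document, only the model and dictionary paths in `fields` need to change, as described in the list above.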
|
||||
|
@ -8,6 +8,8 @@
|
||||
- [2.2 安装PaddleOCR](#2.2)
|
||||
|
||||
- [3. 使用](#3)
|
||||
- [3.1 下载模型](#3.1)
|
||||
- [3.2 版面恢复](#3.2)
|
||||
|
||||
|
||||
<a name="1"></a>
|
||||
@ -16,11 +18,12 @@
|
||||
|
||||
版面恢复就是在OCR识别后,内容仍然像原文档图片那样排列着,段落不变、顺序不变的输出到word文档中等。
|
||||
|
||||
版面恢复结合了[版面分析](../layout/README_ch.md)、[表格识别](../table/README_ch.md)技术,从而更好地恢复图片、表格、标题等内容,下图展示了版面恢复的结果:
|
||||
版面恢复结合了[版面分析](../layout/README_ch.md)、[表格识别](../table/README_ch.md)技术,从而更好地恢复图片、表格、标题等内容,支持pdf文档、文档图片格式的输入文件,下图展示了版面恢复的结果:
|
||||
|
||||
<div align="center">
|
||||
<img src="../docs/table/recovery.jpg" width = "700" />
|
||||
<img src="../docs/recovery/recovery.jpg" width = "700" />
|
||||
</div>
|
||||
|
||||
<a name="2"></a>
|
||||
|
||||
## 2. 安装
|
||||
@ -35,10 +38,10 @@
|
||||
python3 -m pip install --upgrade pip
|
||||
|
||||
# GPU安装
|
||||
python3 -m pip install "paddlepaddle-gpu>=2.3" -i https://mirror.baidu.com/pypi/simple
|
||||
python3 -m pip install "paddlepaddle-gpu" -i https://mirror.baidu.com/pypi/simple
|
||||
|
||||
# CPU安装
|
||||
python3 -m pip install "paddlepaddle>=2.3" -i https://mirror.baidu.com/pypi/simple
|
||||
python3 -m pip install "paddlepaddle" -i https://mirror.baidu.com/pypi/simple
|
||||
|
||||
```
|
||||
|
||||
@ -69,40 +72,66 @@ python3 -m pip install -r ppstructure/recovery/requirements.txt
|
||||
|
||||
## 3. 使用
|
||||
|
||||
恢复给定文档的版面:
|
||||
<a name="3.1"></a>
|
||||
|
||||
```python
|
||||
### 3.1 下载模型
|
||||
|
||||
如果输入为英文文档类型,下载英文模型
|
||||
|
||||
```bash
cd PaddleOCR/ppstructure

# Download the models
mkdir inference && cd inference
# Download the English ultra-lightweight PP-OCRv3 detection model and extract it
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_det_infer.tar
# Download the English lightweight PP-OCRv3 recognition model and extract it
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar
# Download the ultra-lightweight English table recognition model and extract it
wget https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar && tar xf ch_ppstructure_mobile_v2.0_SLANet_infer.tar
# Download the English ultra-lightweight PP-OCRv3 detection model and extract it
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_det_infer.tar && tar xf en_PP-OCRv3_det_infer.tar
# Download the English ultra-lightweight PP-OCRv3 recognition model and extract it
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_infer.tar && tar xf en_PP-OCRv3_rec_infer.tar
# Download the English table recognition model and extract it
wget https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/en_ppstructure_mobile_v2.0_SLANet_infer.tar && tar xf en_ppstructure_mobile_v2.0_SLANet_infer.tar
# Download the English layout analysis model
wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout_infer.tar && tar picodet_lcnet_x1_0_layout_infer.tar
wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar && tar xf picodet_lcnet_x1_0_fgd_layout_infer.tar
cd ..
```

# Run prediction
If the input is a Chinese document, download the Chinese models from the links below:

[PP-OCRv3 Chinese and English ultra-lightweight text detection and recognition models](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/README_ch.md#pp-ocr%E7%B3%BB%E5%88%97%E6%A8%A1%E5%9E%8B%E5%88%97%E8%A1%A8%E6%9B%B4%E6%96%B0%E4%B8%AD), [table recognition models](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/docs/models_list.md#22-表格识别模型), [layout analysis models](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/docs/models_list.md#1-版面分析模型)

<a name="3.2"></a>

### 3.2 Layout recovery

Use the downloaded models to recover the layout of a given document. Taking the English models as an example, run the following command:

```bash
python3 predict_system.py \
    --image_dir=./docs/table/1.png \
    --det_model_dir=inference/en_PP-OCRv3_det_infer \
    --rec_model_dir=inference/en_PP-OCRv3_rec_infe \
    --rec_model_dir=inference/en_PP-OCRv3_rec_infer \
    --rec_char_dict_path=../ppocr/utils/en_dict.txt \
    --output=../output/ \
    --table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer \
    --table_model_dir=inference/en_ppstructure_mobile_v2.0_SLANet_infer \
    --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
    --table_max_len=488 \
    --layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer \
    --layout_model_dir=inference/picodet_lcnet_x1_0_fgd_layout_infer \
    --layout_dict_path=../ppocr/utils/dict/layout_dict/layout_publaynet_dict.txt \
    --vis_font_path=../doc/fonts/simfang.ttf \
    --recovery=True \
    --save_pdf=False
    --save_pdf=False \
    --output=../output/
```
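
After the run completes, the docx document for each image is saved in the directory given by the `output` field.
After the run completes, the layout-recovered docx document is saved in the directory given by the `output` field.

The code that restores tables to Word [table_process.py] comes from: https://github.com/pqzx/html2docx.git
Field descriptions:

- image_dir: test file; can be an image, a directory of images, a PDF file, or a directory of PDF files
- det_model_dir: path to the OCR detection model
- rec_model_dir: path to the OCR recognition model
- rec_char_dict_path: OCR recognition dictionary. If you switch to the Chinese model, change it to "../ppocr/utils/ppocr_keys_v1.txt"; if you use a model trained on your own dataset, change it to the dictionary file used for training
- table_model_dir: path to the table recognition model
- table_char_dict_path: table recognition dictionary; it does not need to change when switching to the Chinese model
- layout_model_dir: path to the layout analysis model
- layout_dict_path: layout analysis dictionary. If you switch to the Chinese model, change it to "../ppocr/utils/dict/layout_dict/layout_cdla_dict.txt"
- recovery: whether to perform layout recovery; defaults to False
- save_pdf: whether to also save a PDF file when the docx document is exported during layout recovery; defaults to False
- output: path where the layout recovery result is saved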
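
The fields above differ between the English and Chinese models in only a few paths. As a rough sketch, assuming the archives were extracted under `inference/` with the directory names implied by the download links, the command line could be assembled as follows. The helper `build_recovery_command` and the `MODEL_SETS` table are hypothetical and not part of PaddleOCR, and the Chinese directory names are assumptions based on the archive names in the model list.

```python
import shlex

# Hypothetical lookup table. Directory names assume each downloaded archive
# was extracted under ./inference and keeps its archive name.
MODEL_SETS = {
    "en": {
        "det_model_dir": "inference/en_PP-OCRv3_det_infer",
        "rec_model_dir": "inference/en_PP-OCRv3_rec_infer",
        "rec_char_dict_path": "../ppocr/utils/en_dict.txt",
        "table_model_dir": "inference/en_ppstructure_mobile_v2.0_SLANet_infer",
        "layout_model_dir": "inference/picodet_lcnet_x1_0_fgd_layout_infer",
        "layout_dict_path": "../ppocr/utils/dict/layout_dict/layout_publaynet_dict.txt",
    },
    "ch": {
        "det_model_dir": "inference/ch_PP-OCRv3_det_infer",
        "rec_model_dir": "inference/ch_PP-OCRv3_rec_infer",
        "rec_char_dict_path": "../ppocr/utils/ppocr_keys_v1.txt",
        "table_model_dir": "inference/ch_ppstructure_mobile_v2.0_SLANet_infer",
        "layout_model_dir": "inference/picodet_lcnet_x1_0_fgd_layout_cdla_infer",
        "layout_dict_path": "../ppocr/utils/dict/layout_dict/layout_cdla_dict.txt",
    },
}


def build_recovery_command(image_dir, lang="en", output="../output/", save_pdf=False):
    """Return the argv list for predict_system.py with layout recovery enabled."""
    fields = {"image_dir": image_dir}
    fields.update(MODEL_SETS[lang])
    fields.update({
        # Per the field list, the table dictionary is the same for both languages.
        "table_char_dict_path": "../ppocr/utils/dict/table_structure_dict.txt",
        "table_max_len": 488,
        "vis_font_path": "../doc/fonts/simfang.ttf",
        "recovery": True,
        "save_pdf": save_pdf,
        "output": output,
    })
    return ["python3", "predict_system.py"] + [f"--{k}={v}" for k, v in fields.items()]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(build_recovery_command("./docs/table/1.png", lang="ch")))
```

Keeping the language-specific paths in one table makes it harder to mix, for example, the Chinese layout model with the PubLayNet dictionary.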
13
ppstructure/recovery/__init__.py
Normal file
@ -0,0 +1,13 @@
# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
@ -24,7 +24,7 @@ from docx.enum.section import WD_SECTION
 from docx.oxml.ns import qn
 from docx.enum.table import WD_TABLE_ALIGNMENT

-from table_process import HtmlToDocx
+from ppstructure.recovery.table_process import HtmlToDocx

 from ppocr.utils.logging import get_logger
 logger = get_logger()
@ -86,10 +86,10 @@ def convert_info_docx(img, res, save_folder, img_name, save_pdf):

     # save to pdf
     if save_pdf:
-        pdf = os.path.join(save_folder, '{}.pdf'.format(img_name))
+        pdf_path = os.path.join(save_folder, '{}.pdf'.format(img_name))
         from docx2pdf import convert
         convert(docx_path, pdf_path)
-        logger.info('pdf save to {}'.format(pdf))
+        logger.info('pdf save to {}'.format(pdf_path))


 def sorted_layout_boxes(res, w):
@ -137,7 +137,7 @@ def sorted_layout_boxes(res, w):
             res_left = []
             res_right = []
             break
-        elif _boxes[i]['bbox'][0] < w / 4 and _boxes[i]['bbox'][2] < 3*w / 4:
+        elif _boxes[i]['bbox'][0] < w / 4 and _boxes[i]['bbox'][2] < 3 * w / 4:
             _boxes[i]['layout'] = 'double'
             res_left.append(_boxes[i])
             i += 1
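
The condition touched by this hunk marks a box as belonging to the left column of a two-column page. Below is a minimal standalone sketch of that single test, assuming `bbox` is `[x_min, y_min, x_max, y_max]` in pixels and `w` is the page width. The helper name `is_left_column` is hypothetical; it mirrors only the `elif` shown here, not the whole sorting routine.

```python
def is_left_column(bbox, w):
    """True if the box starts in the left quarter of the page and ends
    before the right quarter, i.e. the test used for 'double' layout boxes."""
    x_min, x_max = bbox[0], bbox[2]
    return x_min < w / 4 and x_max < 3 * w / 4


if __name__ == "__main__":
    page_width = 1000
    print(is_left_column([50, 100, 480, 300], page_width))   # box confined to the left half
    print(is_left_column([520, 100, 950, 300], page_width))  # box in the right half
    print(is_left_column([50, 100, 950, 300], page_width))   # box spanning the full width
```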
@ -84,13 +84,18 @@ def init_args():
         type=str2bool,
         default=True,
         help='In the forward, whether the non-table area is recognition by ocr')
+    # param for recovery
     parser.add_argument(
         "--recovery",
-        type=bool,
+        type=str2bool,
         default=False,
         help='Whether to enable layout of recovery')
     parser.add_argument(
-        "--save_pdf", type=bool, default=False, help='Whether to save pdf file')
+        "--save_pdf",
+        type=str2bool,
+        default=False,
+        help='Whether to save pdf file')

     return parser
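
The switch from `type=bool` to `type=str2bool` matters because argparse passes the raw command-line string to the `type` callable, and `bool("False")` is `True` for any non-empty string. The sketch below reproduces the difference with a stand-in `str2bool`. That function is an assumption for illustration; the project's own helper may accept a different set of spellings.

```python
import argparse


def str2bool(value):
    # Stand-in converter: a few common spellings count as true,
    # everything else is treated as false.
    return value.lower() in ("true", "t", "1", "yes")


def parse(flag_type, argv):
    """Parse --save_pdf with the given type callable and return its value."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--save_pdf", type=flag_type, default=False)
    return parser.parse_args(argv).save_pdf


if __name__ == "__main__":
    # With type=bool the non-empty string "False" is truthy.
    print(parse(bool, ["--save_pdf=False"]))
    # With the converter the string is interpreted as intended.
    print(parse(str2bool, ["--save_pdf=False"]))
```

With `type=bool`, a command such as the one in the README that passes `--save_pdf=False` would still enable PDF export, which is the behavior this hunk avoids.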