# Key Information Extraction (KIE)

This section provides a tutorial on how to quickly use, train, and evaluate a key information extraction (KIE) model, [SDMGR](https://arxiv.org/abs/2103.14470), in PaddleOCR.

[SDMGR (Spatial Dual-Modality Graph Reasoning)](https://arxiv.org/abs/2103.14470) is a KIE algorithm that classifies each detected text region into predefined categories, such as order ID, invoice number, and amount.

* [1. Quick Use](#1-----)
* [2. Model Training](#2-----)
* [3. Model Evaluation](#3-----)

<a name="1-----"></a>
## 1. Quick Use

The [Wildreceipt dataset](https://paperswithcode.com/dataset/wildreceipt) is used for this tutorial. It contains 1765 photos with 25 classes and 50000 text boxes, and can be downloaded with wget:
```shell
wget https://paddleocr.bj.bcebos.com/dygraph_v2.1/kie/wildreceipt.tar && tar xf wildreceipt.tar
```
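After extraction, a quick sanity check can confirm that the dataset is in place. The sketch below assumes the archive unpacks into a `wildreceipt/` folder that contains `1.txt`, the file passed to `Global.infer_img` in this tutorial. `check_wildreceipt` is an illustrative helper, not part of PaddleOCR:

```shell
# Hypothetical helper (not part of PaddleOCR): verify the extracted dataset.
check_wildreceipt() {
    wr_root="${1:-wildreceipt}"   # folder created by `tar xf wildreceipt.tar`
    if [ ! -d "$wr_root" ]; then
        echo "missing dataset folder: $wr_root" >&2
        return 1
    fi
    # 1.txt is the file this tutorial passes to Global.infer_img.
    if [ ! -f "$wr_root/1.txt" ]; then
        echo "missing file: $wr_root/1.txt" >&2
        return 1
    fi
    echo "dataset looks complete: $wr_root"
}

# Run from the directory where the archive was extracted.
check_wildreceipt wildreceipt || echo "re-run the download step above"
```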

Download the pretrained model and run prediction:

```shell
cd PaddleOCR/
wget https://paddleocr.bj.bcebos.com/dygraph_v2.1/kie/kie_vgg16.tar && tar xf kie_vgg16.tar
python3.7 tools/infer_kie.py -c configs/kie/kie_unet_sdmgr.yml -o Global.checkpoints=kie_vgg16/best_accuracy Global.infer_img=../wildreceipt/1.txt
```

The prediction result is saved as `./output/sdmgr_kie/predicts_kie.txt`, and the visualization results are saved in the folder `./output/sdmgr_kie/kie_results/`.

The visualization results are shown in the figure below:

<div align="center">
<img src="./imgs/0.png" width="800">
</div>

<a name="2-----"></a>
## 2. Model Training

Create a soft link to the dataset inside the folder `PaddleOCR/train_data`:
```shell
cd PaddleOCR/ && mkdir train_data && cd train_data
ln -s ../../wildreceipt ./
```
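A relative soft link is easy to get wrong, so it can be worth confirming that it resolves before training. The sketch below is meant to be run from the `PaddleOCR/` root (the commands above leave the shell in `train_data/`, so `cd ..` first). `check_dataset_link` is an illustrative helper, not part of PaddleOCR:

```shell
# Hypothetical helper (not part of PaddleOCR): confirm the dataset link resolves.
check_dataset_link() {
    ds_link="${1:-train_data/wildreceipt}"
    if [ ! -L "$ds_link" ]; then
        echo "not a symbolic link: $ds_link" >&2
        return 1
    fi
    # -d follows the link, so it fails when the target folder is missing.
    if [ ! -d "$ds_link" ]; then
        echo "link does not point to a folder: $ds_link" >&2
        return 1
    fi
    echo "link resolves to a folder: $ds_link"
}

check_dataset_link train_data/wildreceipt || echo "recreate the link with ln -s"
```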

The configuration file used for training is `configs/kie/kie_unet_sdmgr.yml`, and its default training data path is `train_data/wildreceipt`. After preparing the data, return to the `PaddleOCR/` root and start training with the following command:
```shell
python3.7 tools/train.py -c configs/kie/kie_unet_sdmgr.yml -o Global.save_model_dir=./output/kie/
```
<a name="3-----"></a>
## 3. Model Evaluation
After training, you can execute the model evaluation with the following command:
```shell
python3.7 tools/eval.py -c configs/kie/kie_unet_sdmgr.yml -o Global.checkpoints=./output/kie/best_accuracy
```
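If the evaluation command cannot load the model, the checkpoint prefix is a likely cause. The sketch below checks for a weights file before evaluating. It assumes that the weights are stored as the `Global.checkpoints` prefix plus a `.pdparams` suffix; that suffix and the `check_checkpoint` helper are assumptions for illustration, not something stated in this tutorial:

```shell
# Hypothetical helper (not part of PaddleOCR): check a checkpoint prefix.
check_checkpoint() {
    ckpt_prefix="${1:-./output/kie/best_accuracy}"
    # Assumption: the weights file is the prefix plus ".pdparams".
    if [ -f "$ckpt_prefix.pdparams" ]; then
        echo "found weights: $ckpt_prefix.pdparams"
    else
        echo "no weights at $ckpt_prefix.pdparams" >&2
        return 1
    fi
}

check_checkpoint ./output/kie/best_accuracy || echo "train first, or adjust Global.checkpoints"
```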
**Reference:**
<!-- [ALGORITHM] -->
```bibtex
@misc{sun2021spatial,
title={Spatial Dual-Modality Graph Reasoning for Key Information Extraction},
author={Hongbin Sun and Zhanghui Kuang and Xiaoyu Yue and Chenhao Lin and Wayne Zhang},
year={2021},
eprint={2103.14470},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```