# Python Inference

- [1. Layout Structured Analysis](#1-layout-structured-analysis)
  - [1.1 layout analysis + table recognition](#11-layout-analysis--table-recognition)
  - [1.2 layout analysis](#12-layout-analysis)
  - [1.3 table recognition](#13-table-recognition)
- [2. Key Information Extraction](#2-key-information-extraction)
  - [2.1 SER](#21-ser)
  - [2.2 RE+SER](#22-reser)

<a name="1"></a>
## 1. Layout Structured Analysis
Go to the `ppstructure` directory:

```bash
cd ppstructure
```

Download the models:

```bash
mkdir -p inference && cd inference
# Download the PP-StructureV2 layout analysis model and unzip it
wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout_infer.tar && tar xf picodet_lcnet_x1_0_layout_infer.tar
# Download the PP-OCRv3 text detection model and unzip it
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_det_infer.tar
# Download the PP-OCRv3 text recognition model and unzip it
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar
# Download the PP-StructureV2 table recognition model and unzip it
wget https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar && tar xf ch_ppstructure_mobile_v2.0_SLANet_infer.tar
cd ..
```
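
Before running inference, it can help to confirm that every archive was unpacked. The following is a minimal sketch, assuming each tar file extracts to a directory named after the archive (which matches the `--*_model_dir` paths used in the commands below). The helper name `missing_model_dirs` is hypothetical and not part of PaddleOCR.

```python
from pathlib import Path

# Directory names implied by the download commands above (assumed to match
# the archive names without the .tar suffix).
MODEL_DIRS = [
    "picodet_lcnet_x1_0_layout_infer",
    "ch_PP-OCRv3_det_infer",
    "ch_PP-OCRv3_rec_infer",
    "ch_ppstructure_mobile_v2.0_SLANet_infer",
]


def missing_model_dirs(inference_root, names=MODEL_DIRS):
    """Return the expected model directories that do not exist under inference_root."""
    root = Path(inference_root)
    return [name for name in names if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from the ppstructure directory.
    missing = missing_model_dirs("inference")
    if missing:
        print("Missing model directories:", ", ".join(missing))
    else:
        print("All model directories are present.")
```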
<a name="1.1"></a>
### 1.1 layout analysis + table recognition
```bash
python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv3_det_infer \
                          --rec_model_dir=inference/ch_PP-OCRv3_rec_infer \
                          --table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer \
                          --layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer \
                          --image_dir=./docs/table/1.png \
                          --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt \
                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \
                          --output=../output \
                          --vis_font_path=../doc/fonts/simfang.ttf
```
When the run finishes, each input image gets a directory of the same name inside the `structure` directory under the path given by `--output`. Each table in the image is saved as an Excel file, and each picture region is cropped and saved as an image. The Excel and image files are named after the region's coordinates in the input image. Detailed results are written to `res.txt`.
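
To inspect what was written for one image, the result directory can be walked and its files grouped by kind. This is a rough sketch rather than part of PaddleOCR: the function name `collect_structure_outputs` is hypothetical, and the file extensions it checks (`.xlsx`/`.xls` for tables, `.jpg`/`.jpeg`/`.png` for cropped pictures) are assumptions about how the files are saved.

```python
from pathlib import Path


def collect_structure_outputs(output_dir, image_name):
    """Group the files saved for one input image by kind.

    output_dir is the path passed to --output; image_name is the name of the
    per-image result directory inside <output_dir>/structure.
    """
    result_dir = Path(output_dir) / "structure" / image_name
    found = {"tables": [], "pictures": [], "results": []}
    for path in sorted(result_dir.iterdir()):
        suffix = path.suffix.lower()
        if suffix in (".xlsx", ".xls"):  # assumed Excel extensions
            found["tables"].append(path.name)
        elif suffix in (".jpg", ".jpeg", ".png"):  # assumed image extensions
            found["pictures"].append(path.name)
        elif path.name == "res.txt":
            found["results"].append(path.name)
    return found
```

For the command above, a call might look like `collect_structure_outputs("../output", "1")`, assuming the result directory for `1.png` is named `1`; adjust the name if your version keeps the file extension.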

<a name="1.2"></a>
### 1.2 layout analysis
```bash
python3 predict_system.py --layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer \
                          --image_dir=./docs/table/1.png \
                          --output=../output \
                          --table=false \
                          --ocr=false
```
When the run finishes, each input image gets a directory of the same name inside the `structure` directory under the path given by `--output`. Each picture region in the image is cropped and saved, with its coordinates in the input image as the filename. The layout analysis results are written to `res.txt`.

<a name="1.3"></a>
### 1.3 table recognition
```bash
python3 predict_system.py --det_model_dir=inference/ch_PP-OCRv3_det_infer \
                          --rec_model_dir=inference/ch_PP-OCRv3_rec_infer \
                          --table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer \
                          --image_dir=./docs/table/table.jpg \
                          --rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt \
                          --table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt \
                          --output=../output \
                          --vis_font_path=../doc/fonts/simfang.ttf \
                          --layout=false
```
When the run finishes, each input image gets a directory of the same name inside the `structure` directory under the path given by `--output`. Each table in the image is saved as an Excel file named after the table's coordinates in the input image.
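
The three commands in sections 1.1 to 1.3 differ only in which models they pass and which stages they switch off. The sketch below makes that explicit by assembling the argument list for each case. It uses only the flags and paths shown above; the function name `build_structure_command` and the mode labels `"layout+table"`, `"layout"` and `"table"` are hypothetical and exist only for this illustration.

```python
def build_structure_command(mode, image, output="../output"):
    """Return the predict_system.py argument list for one of the modes above.

    mode is "layout+table" (section 1.1), "layout" (1.2) or "table" (1.3).
    """
    # Arguments needed whenever OCR and table recognition are enabled.
    ocr_args = [
        "--det_model_dir=inference/ch_PP-OCRv3_det_infer",
        "--rec_model_dir=inference/ch_PP-OCRv3_rec_infer",
        "--table_model_dir=inference/ch_ppstructure_mobile_v2.0_SLANet_infer",
        "--rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt",
        "--table_char_dict_path=../ppocr/utils/dict/table_structure_dict_ch.txt",
        "--vis_font_path=../doc/fonts/simfang.ttf",
    ]
    layout_arg = "--layout_model_dir=inference/picodet_lcnet_x1_0_layout_infer"
    per_mode = {
        "layout+table": ocr_args + [layout_arg],
        "layout": [layout_arg, "--table=false", "--ocr=false"],
        "table": ocr_args + ["--layout=false"],
    }
    if mode not in per_mode:
        raise ValueError(f"unknown mode: {mode!r}")
    return [
        "python3",
        "predict_system.py",
        *per_mode[mode],
        f"--image_dir={image}",
        f"--output={output}",
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from inside the `ppstructure` directory, which avoids shell quoting issues with the file paths.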

<a name="2"></a>
## 2. Key Information Extraction

### 2.1 SER
```bash
cd ppstructure

mkdir -p inference && cd inference
# download model
wget https://paddleocr.bj.bcebos.com/ppstructure/models/vi_layoutxlm/ser_vi_layoutxlm_xfund_infer.tar && tar -xf ser_vi_layoutxlm_xfund_infer.tar
cd ..
python3 predict_system.py \
  --kie_algorithm=LayoutXLM \
  --ser_model_dir=./inference/ser_vi_layoutxlm_xfund_infer \
  --image_dir=./docs/kie/input/zh_val_42.jpg \
  --ser_dict_path=../ppocr/utils/dict/kie_dict/xfund_class_list.txt \
  --vis_font_path=../doc/fonts/simfang.ttf \
  --ocr_order_method="tb-yx" \
  --mode=kie
```

When the run finishes, the visualized result for each input image is saved in the `kie` directory under the output directory (set with `--output`), using the same filename as the input image.


### 2.2 RE+SER

```bash
cd ppstructure

mkdir -p inference && cd inference
# download model
wget https://paddleocr.bj.bcebos.com/ppstructure/models/vi_layoutxlm/ser_vi_layoutxlm_xfund_infer.tar && tar -xf ser_vi_layoutxlm_xfund_infer.tar
wget https://paddleocr.bj.bcebos.com/ppstructure/models/vi_layoutxlm/re_vi_layoutxlm_xfund_infer.tar && tar -xf re_vi_layoutxlm_xfund_infer.tar
cd ..

python3 predict_system.py \
  --kie_algorithm=LayoutXLM \
  --re_model_dir=./inference/re_vi_layoutxlm_xfund_infer \
  --ser_model_dir=./inference/ser_vi_layoutxlm_xfund_infer \
  --image_dir=./docs/kie/input/zh_val_42.jpg \
  --ser_dict_path=../ppocr/utils/dict/kie_dict/xfund_class_list.txt \
  --vis_font_path=../doc/fonts/simfang.ttf \
  --ocr_order_method="tb-yx" \
  --mode=kie
```

When the run finishes, each input image gets a directory of the same name inside the `kie` directory under the output directory (set with `--output`). That directory holds the visualization images and the prediction results.
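
The SER and RE+SER commands are identical except for the `--re_model_dir` argument. A small sketch of that relationship follows. It reuses only the flags shown in sections 2.1 and 2.2; the function name `build_kie_command` is hypothetical.

```python
def build_kie_command(image, ser_model_dir, re_model_dir=None):
    """Return the predict_system.py argument list for SER, or RE+SER when
    re_model_dir is given."""
    args = ["python3", "predict_system.py", "--kie_algorithm=LayoutXLM"]
    if re_model_dir is not None:
        # Passing an RE model is the only difference between 2.1 and 2.2.
        args.append(f"--re_model_dir={re_model_dir}")
    args += [
        f"--ser_model_dir={ser_model_dir}",
        f"--image_dir={image}",
        "--ser_dict_path=../ppocr/utils/dict/kie_dict/xfund_class_list.txt",
        "--vis_font_path=../doc/fonts/simfang.ttf",
        "--ocr_order_method=tb-yx",
        "--mode=kie",
    ]
    return args
```

For example, `build_kie_command("./docs/kie/input/zh_val_42.jpg", "./inference/ser_vi_layoutxlm_xfund_infer")` reproduces the arguments of the SER command, and adding `re_model_dir="./inference/re_vi_layoutxlm_xfund_infer"` reproduces those of the RE+SER command.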