update en doc
parent fcd3b0f005
commit 9b15c7f717
@@ -446,7 +446,7 @@ python3 tools/export_model.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Glo
 After the conversion succeeds, the directory contains three files:

 ```
-/inference/en_PP-OCRv3_rec/
+inference/en_PP-OCRv3_rec/
 ├── inference.pdiparams # Parameter file of the recognition inference model
 ├── inference.pdiparams.info # Parameter information of the recognition inference model, can be ignored
 └── inference.pdmodel # Program file of the recognition inference model
@@ -92,8 +92,6 @@ Similar to the training set, the test set also needs to be provided a folder con
 If you do not have a dataset locally, you can download it from the official website [icdar2015](http://rrc.cvc.uab.es/?ch=4&com=downloads).
 Also refer to [DTRB](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here) to download the lmdb-format datasets required for the benchmark.

-If you want to reproduce the paper SAR, you need to download extra dataset [SynthAdd](https://pan.baidu.com/share/init?surl=uV0LtoNmcxbO-0YA7Ch4dg), extraction code: 627x. Besides, icdar2013, icdar2015, cocotext, IIIT5k datasets are also used to train. For specific details, please refer to the paper SAR.
-
 PaddleOCR provides label files for training the icdar2015 dataset, which can be downloaded in the following ways:

 ```
@@ -194,11 +192,11 @@ First download the pretrain model, you can download the trained model to finetun

 ```
 cd PaddleOCR/
-# Download the pre-trained model of MobileNetV3
-wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar
+# Download the pre-trained model of en_PP-OCRv3
+wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_train.tar
 # Decompress model parameters
 cd pretrain_models
-tar -xf rec_mv3_none_bilstm_ctc_v2.0_train.tar && rm -rf rec_mv3_none_bilstm_ctc_v2.0_train.tar
+tar -xf en_PP-OCRv3_rec_train.tar && rm -rf en_PP-OCRv3_rec_train.tar
 ```

 Start training:
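The decompress step in the hunk above (`tar -xf ... && rm -rf ...`) can also be sketched in Python with the standard library only, which may help on systems without `tar`. This is an illustrative sketch, not part of the commit: the function name `extract_and_remove` is hypothetical, and the archive is assumed to have been downloaded already.

```python
import tarfile
from pathlib import Path


def extract_and_remove(archive: Path, dest: Path) -> list[str]:
    """Extract a tar archive into dest, delete the archive, return member names."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive) as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support extraction filters; "data" blocks unsafe members.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    archive.unlink()  # mirrors the `rm -rf <archive>` part of the shell command
    return names


if __name__ == "__main__":
    # Archive name taken from the wget command above; skipped if it is not present.
    archive = Path("pretrain_models/en_PP-OCRv3_rec_train.tar")
    if archive.exists():
        print(extract_and_remove(archive, archive.parent))
```

The archive is deleted only after extraction has finished without raising, matching the `&&` in the shell version.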
@@ -208,9 +206,10 @@ Start training:
 # Train on icdar15 English data. The training log is automatically saved as train.log under "{save_model_dir}"

 # Single-card training (long training time, not recommended)
-python3 tools/train.py -c configs/rec/rec_icdar15_train.yml
+python3 tools/train.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=en_PP-OCRv3_rec_train/best_accuracy
+
 # Multi-card training: specify the card numbers through --gpus
-python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_icdar15_train.yml
+python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=en_PP-OCRv3_rec_train/best_accuracy
 ```

@@ -218,31 +217,13 @@ PaddleOCR supports alternating training and evaluation. You can modify `eval_bat

 If the evaluation set is large, evaluation will be time-consuming. It is recommended to reduce the number of evaluations, or to evaluate after training.

-* Tip: You can use the `-c` parameter to select multiple model configurations under the `configs/rec/` path for training. The recognition algorithms supported by PaddleOCR are:
-
-
-| Configuration file | Algorithm | backbone | trans | seq | pred |
-| :--------: | :-------: | :-------: | :-------: | :-----: | :-----: |
-| [rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml) | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc |
-| [rec_chinese_common_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml) | CRNN | ResNet34_vd | None | BiLSTM | ctc |
-| rec_chinese_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc |
-| rec_chinese_common_train.yml | CRNN | ResNet34_vd | None | BiLSTM | ctc |
-| rec_icdar15_train.yml | CRNN | Mobilenet_v3 large 0.5 | None | BiLSTM | ctc |
-| rec_mv3_none_bilstm_ctc.yml | CRNN | Mobilenet_v3 large 0.5 | None | BiLSTM | ctc |
-| rec_mv3_none_none_ctc.yml | Rosetta | Mobilenet_v3 large 0.5 | None | None | ctc |
-| rec_r34_vd_none_bilstm_ctc.yml | CRNN | Resnet34_vd | None | BiLSTM | ctc |
-| rec_r34_vd_none_none_ctc.yml | Rosetta | Resnet34_vd | None | None | ctc |
-| rec_mv3_tps_bilstm_att.yml | CRNN | Mobilenet_v3 | TPS | BiLSTM | att |
-| rec_r34_vd_tps_bilstm_att.yml | CRNN | Resnet34_vd | TPS | BiLSTM | att |
-| rec_r50fpn_vd_none_srn.yml | SRN | Resnet50_fpn_vd | None | rnn | srn |
-| rec_mtb_nrtr.yml | NRTR | nrtr_mtb | None | transformer encoder | transformer decoder |
-| rec_r31_sar.yml | SAR | ResNet31 | None | LSTM encoder | LSTM decoder |
+* Tip: You can use the `-c` parameter to choose among the model configurations under the `configs/rec/` path for training. The supported recognition algorithms are listed in [rec_algorithm](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_en/algorithm_overview.md):


 For training Chinese data, it is recommended to use
-[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml). If you want to try the result of other algorithms on the Chinese data set, please refer to the following instructions to modify the configuration file:
-co
-Take `rec_chinese_lite_train_v2.0.yml` as an example:
+[ch_PP-OCRv3_rec.yml](../../configs/rec/PP-OCRv3/ch_PP-OCRv3_rec.yml). If you want to try other algorithms on the Chinese dataset, refer to the following instructions to modify the configuration file:
+
+Take `ch_PP-OCRv3_rec.yml` as an example:
 ```
 Global:
 ...
@@ -276,7 +257,7 @@ Train:
 ...
 - RecResizeImg:
 # Modify image_shape to fit long text
-image_shape: [3, 32, 320]
+image_shape: [3, 48, 320]
 ...
 loader:
 ...
@@ -296,7 +277,7 @@ Eval:
 ...
 - RecResizeImg:
 # Modify image_shape to fit long text
-image_shape: [3, 32, 320]
+image_shape: [3, 48, 320]
 ...
 loader:
 # Eval batch_size for Single card
@@ -372,11 +353,11 @@ Knowledge distillation is supported in PaddleOCR for text recognition training p

 ## 3. Evaluation

-The evaluation dataset can be set by modifying the `Eval.dataset.label_file_list` field in the `configs/rec/rec_icdar15_train.yml` file.
+The evaluation dataset can be set by modifying the `Eval.dataset.label_file_list` field in the `configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml` file.

 ```
 # GPU evaluation; Global.checkpoints is the path of the weights to be tested
-python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_icdar15_train.yml -o Global.checkpoints={path/to/weights}/best_accuracy
+python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.checkpoints={path/to/weights}/best_accuracy
 ```

 <a name="PREDICTION"></a>
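The commands in this diff pass settings as `-o Section.key=value` (for example `Global.checkpoints`), and the text refers to the nested field `Eval.dataset.label_file_list`. The sketch below illustrates how such a dotted key can map onto a nested configuration. It is a simplified illustration and not PaddleOCR's actual implementation; `apply_override` and the example paths are hypothetical.

```python
from typing import Any


def apply_override(config: dict, dotted_key: str, value: Any) -> dict:
    """Set config[a][b][c] = value for dotted_key 'a.b.c', creating levels as needed."""
    node = config
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        # Walk down one level, creating an empty mapping if the level is missing.
        node = node.setdefault(key, {})
    node[leaf] = value
    return config


# Hypothetical starting configuration and values, for illustration only.
config = {"Global": {"checkpoints": None}, "Eval": {"dataset": {"label_file_list": []}}}
apply_override(config, "Global.checkpoints", "output/best_accuracy")
apply_override(config, "Eval.dataset.label_file_list", ["train_data/val.txt"])
print(config)
```

Editing the YAML file and passing an override on the command line both end up changing the same nested field, which is why the diff can switch config files without changing the `-o` keys.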
@@ -409,7 +390,7 @@ Among them, best_accuracy.* is the best model on the evaluation set; iter_epoch_

 ```
 # Predict English results
-python3 tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.load_static_weights=false Global.infer_img=doc/imgs_words/en/word_1.jpg
+python3 tools/infer_rec.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/en/word_1.png
 ```

@@ -454,7 +435,7 @@ The recognition model is converted to the inference model in the same way as the
 # Global.pretrained_model: path of the trained model to convert, without the file suffix .pdmodel, .pdopt or .pdparams.
 # Global.save_inference_dir: directory where the converted model will be saved.

-python3 tools/export_model.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.pretrained_model=./ch_lite/ch_ppocr_mobile_v2.0_rec_train/best_accuracy Global.save_inference_dir=./inference/rec_crnn/
+python3 tools/export_model.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=en_PP-OCRv3_rec_train/best_accuracy Global.save_inference_dir=./inference/en_PP-OCRv3_rec/
 ```

 If you have a model trained on your own dataset with a different dictionary file, please make sure that you modify the `character_dict_path` in the configuration file to your dictionary file path.
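Before pointing `character_dict_path` at a custom dictionary, it can help to check that the dictionary covers every character in your labels. The sketch below assumes the dictionary file lists one character per line in UTF-8. That format is an assumption here and should be confirmed against your own file. The function names and `my_dict.txt` are hypothetical.

```python
from pathlib import Path


def load_charset(dict_path: Path) -> set[str]:
    """Read a dictionary file, assuming one character per line (UTF-8)."""
    text = dict_path.read_text(encoding="utf-8")
    return {line for line in text.splitlines() if line}


def missing_characters(labels: list[str], charset: set[str]) -> set[str]:
    """Return characters that occur in the labels but not in the dictionary."""
    return {ch for text in labels for ch in text} - charset


if __name__ == "__main__":
    dict_file = Path("my_dict.txt")  # hypothetical path to a custom dictionary
    if dict_file.exists():
        print(missing_characters(["hello", "world"], load_charset(dict_file)))
```

An empty result means every label character appears in the dictionary. Any character returned is in the labels but absent from the dictionary file.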
@@ -462,7 +443,7 @@ If you have a model trained on your own dataset with a different dictionary file
 After the conversion is successful, there are three files in the model save directory:

 ```
-inference/det_db/
+inference/en_PP-OCRv3_rec/
 ├── inference.pdiparams # The parameter file of recognition inference model
 ├── inference.pdiparams.info # The parameter information of recognition inference model, which can be ignored
 └── inference.pdmodel # The program file of recognition model