diff --git a/doc/doc_ch/recognition.md b/doc/doc_ch/recognition.md
index bb8e38d79..55562f20e 100644
--- a/doc/doc_ch/recognition.md
+++ b/doc/doc_ch/recognition.md
@@ -446,7 +446,7 @@ python3 tools/export_model.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Glo
 转换成功后,在目录下有三个文件:
 
 ```
-/inference/en_PP-OCRv3_rec/
+inference/en_PP-OCRv3_rec/
 ├── inference.pdiparams # 识别inference模型的参数文件
 ├── inference.pdiparams.info # 识别inference模型的参数信息,可忽略
 └── inference.pdmodel # 识别inference模型的program文件
diff --git a/doc/doc_en/recognition_en.md b/doc/doc_en/recognition_en.md
index c3700070b..8042ef8c1 100644
--- a/doc/doc_en/recognition_en.md
+++ b/doc/doc_en/recognition_en.md
@@ -92,8 +92,6 @@ Similar to the training set, the test set also needs to be provided a folder con
 If you do not have a dataset locally, you can download it on the official website [icdar2015](http://rrc.cvc.uab.es/?ch=4&com=downloads).
 Also refer to [DTRB](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here) ,download the lmdb format dataset required for benchmark
 
-If you want to reproduce the paper SAR, you need to download extra dataset [SynthAdd](https://pan.baidu.com/share/init?surl=uV0LtoNmcxbO-0YA7Ch4dg), extraction code: 627x. Besides, icdar2013, icdar2015, cocotext, IIIT5k datasets are also used to train. For specific details, please refer to the paper SAR.
-
 PaddleOCR provides label files for training the icdar2015 dataset, which can be downloaded in the following ways:
 
 ```
@@ -194,11 +192,11 @@ First download the pretrain model, you can download the trained model to finetun
 
 ```
 cd PaddleOCR/
-# Download the pre-trained model of MobileNetV3
-wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar
+# Download the pre-trained model of en_PP-OCRv3
+wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_train.tar
 # Decompress model parameters
 cd pretrain_models
-tar -xf rec_mv3_none_bilstm_ctc_v2.0_train.tar && rm -rf rec_mv3_none_bilstm_ctc_v2.0_train.tar
+tar -xf en_PP-OCRv3_rec_train.tar && rm -rf en_PP-OCRv3_rec_train.tar
 ```
 
 Start training:
@@ -208,9 +206,10 @@ Start training:
 # Training icdar15 English data and The training log will be automatically saved as train.log under "{save_model_dir}"
 
 #specify the single card training(Long training time, not recommended)
-python3 tools/train.py -c configs/rec/rec_icdar15_train.yml
+python3 tools/train.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=en_PP-OCRv3_rec_train/best_accuracy
+
 #specify the card number through --gpus
-python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_icdar15_train.yml
+python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=en_PP-OCRv3_rec_train/best_accuracy
 ```
 
 
@@ -218,31 +217,13 @@ PaddleOCR supports alternating training and evaluation. You can modify `eval_bat
 
 If the evaluation set is large, the test will be time-consuming. It is recommended to reduce the number of evaluations, or evaluate after training.
 
-* Tip: You can use the `-c` parameter to select multiple model configurations under the `configs/rec/` path for training. The recognition algorithms supported by PaddleOCR are:
-
-
-| Configuration file | Algorithm | backbone | trans | seq | pred |
-| :--------: | :-------: | :-------: | :-------: | :-----: | :-----: |
-| [rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml) | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc |
-| [rec_chinese_common_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml) | CRNN | ResNet34_vd | None | BiLSTM | ctc |
-| rec_chinese_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc |
-| rec_chinese_common_train.yml | CRNN | ResNet34_vd | None | BiLSTM | ctc |
-| rec_icdar15_train.yml | CRNN | Mobilenet_v3 large 0.5 | None | BiLSTM | ctc |
-| rec_mv3_none_bilstm_ctc.yml | CRNN | Mobilenet_v3 large 0.5 | None | BiLSTM | ctc |
-| rec_mv3_none_none_ctc.yml | Rosetta | Mobilenet_v3 large 0.5 | None | None | ctc |
-| rec_r34_vd_none_bilstm_ctc.yml | CRNN | Resnet34_vd | None | BiLSTM | ctc |
-| rec_r34_vd_none_none_ctc.yml | Rosetta | Resnet34_vd | None | None | ctc |
-| rec_mv3_tps_bilstm_att.yml | CRNN | Mobilenet_v3 | TPS | BiLSTM | att |
-| rec_r34_vd_tps_bilstm_att.yml | CRNN | Resnet34_vd | TPS | BiLSTM | att |
-| rec_r50fpn_vd_none_srn.yml | SRN | Resnet50_fpn_vd | None | rnn | srn |
-| rec_mtb_nrtr.yml | NRTR | nrtr_mtb | None | transformer encoder | transformer decoder |
-| rec_r31_sar.yml | SAR | ResNet31 | None | LSTM encoder | LSTM decoder |
+* Tip: You can use the `-c` parameter to select multiple model configurations under the `configs/rec/` path for training. The recognition algorithms supported at [rec_algorithm](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_en/algorithm_overview.md):
 
 
 For training Chinese data, it is recommended to use
-[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml). If you want to try the result of other algorithms on the Chinese data set, please refer to the following instructions to modify the configuration file:
-co
-Take `rec_chinese_lite_train_v2.0.yml` as an example:
+[ch_PP-OCRv3_rec.yml](../../configs/rec/PP-OCRv3/ch_PP-OCRv3_rec.yml). If you want to try the result of other algorithms on the Chinese data set, please refer to the following instructions to modify the configuration file:
+
+Take `ch_PP-OCRv3_rec.yml` as an example:
 ```
 Global:
   ...
@@ -276,7 +257,7 @@ Train:
       ...
       - RecResizeImg:
           # Modify image_shape to fit long text
-          image_shape: [3, 32, 320]
+          image_shape: [3, 48, 320]
       ...
   loader:
     ...
@@ -296,7 +277,7 @@ Eval:
       ...
       - RecResizeImg:
           # Modify image_shape to fit long text
-          image_shape: [3, 32, 320]
+          image_shape: [3, 48, 320]
       ...
   loader:
     # Eval batch_size for Single card
@@ -372,11 +353,11 @@ Knowledge distillation is supported in PaddleOCR for text recognition training p
 
 ## 3. Evalution
 
-The evaluation dataset can be set by modifying the `Eval.dataset.label_file_list` field in the `configs/rec/rec_icdar15_train.yml` file.
+The evaluation dataset can be set by modifying the `Eval.dataset.label_file_list` field in the `configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml` file.
 
 ```
 # GPU evaluation, Global.checkpoints is the weight to be tested
-python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_icdar15_train.yml -o Global.checkpoints={path/to/weights}/best_accuracy
+python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.checkpoints={path/to/weights}/best_accuracy
 ```
 
 
@@ -409,7 +390,7 @@ Among them, best_accuracy.* is the best model on the evaluation set; iter_epoch_
 
 ```
 # Predict English results
-python3 tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.load_static_weights=false Global.infer_img=doc/imgs_words/en/word_1.jpg
+python3 tools/infer_rec.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/en/word_1.png
 ```
 
 
@@ -454,7 +435,7 @@ The recognition model is converted to the inference model in the same way as the
 # Global.pretrained_model parameter Set the training model address to be converted without adding the file suffix .pdmodel, .pdopt or .pdparams.
 # Global.save_inference_dir Set the address where the converted model will be saved.
 
-python3 tools/export_model.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.pretrained_model=./ch_lite/ch_ppocr_mobile_v2.0_rec_train/best_accuracy Global.save_inference_dir=./inference/rec_crnn/
+python3 tools/export_model.py -c configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml -o Global.pretrained_model=en_PP-OCRv3_rec_train/best_accuracy Global.save_inference_dir=./inference/en_PP-OCRv3_rec/
 ```
 
 If you have a model trained on your own dataset with a different dictionary file, please make sure that you modify the `character_dict_path` in the configuration file to your dictionary file path.
@@ -462,7 +443,7 @@ If you have a model trained on your own dataset with a different dictionary file
 After the conversion is successful, there are three files in the model save directory:
 
 ```
-inference/det_db/
+inference/en_PP-OCRv3_rec/
 ├── inference.pdiparams # The parameter file of recognition inference model
 ├── inference.pdiparams.info # The parameter information of recognition inference model, which can be ignored
 └── inference.pdmodel # The program file of recognition model