diff --git a/docs/en/user_guides/inference.md b/docs/en/user_guides/inference.md
index 6f10d5c0..6660d0bd 100644
--- a/docs/en/user_guides/inference.md
+++ b/docs/en/user_guides/inference.md
@@ -147,27 +147,36 @@ means that `print_result` is set to `True`)

 **Text detection:**

-| Name | Reference |
-| ------------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------: |
-| DB_r18 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#real-time-scene-text-detection-with-differentiable-binarization) |
-| DB_r50 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#real-time-scene-text-detection-with-differentiable-binarization) |
-| DBPP_r50 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#dbnetpp) |
-| DRRG | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#drrg) |
-| FCE_IC15 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#fourier-contour-embedding-for-arbitrary-shaped-text-detection) |
-| FCE_CTW_DCNv2 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#fourier-contour-embedding-for-arbitrary-shaped-text-detection) |
-| MaskRCNN_CTW | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#mask-r-cnn) |
-| MaskRCNN_IC15 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#mask-r-cnn) |
-| PANet_CTW | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#efficient-and-accurate-arbitrary-shaped-text-detection-with-pixel-aggregation-network) |
-| PANet_IC15 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#efficient-and-accurate-arbitrary-shaped-text-detection-with-pixel-aggregation-network) |
-| PS_CTW | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#psenet) |
-| PS_IC15 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#psenet) |
-| TextSnake | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#textsnake) |
+| Name | Reference |
+| ------------- | :----------------------------------------------------------------------------: |
+| DB_r18 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#dbnet) |
+| DB_r50 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#dbnet) |
+| DBPP_r50 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#dbnetpp) |
+| DRRG | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#drrg) |
+| FCE_IC15 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#fcenet) |
+| FCE_CTW_DCNv2 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#fcenet) |
+| MaskRCNN_CTW | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#mask-r-cnn) |
+| MaskRCNN_IC15 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#mask-r-cnn) |
+| PANet_CTW | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#panet) |
+| PANet_IC15 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#panet) |
+| PS_CTW | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#psenet) |
+| PS_IC15 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#psenet) |
+| TextSnake | [link](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#textsnake) |

 **Text recognition:**

-| Name | Reference |
-| ---- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
-| CRNN | [link](https://mmocr.readthedocs.io/en/dev-1.x/textrecog_models.html#an-end-to-end-trainable-neural-network-for-image-based-sequence-recognition-and-its-application-to-scene-text-recognition) |
+| Name | Reference |
+| ------------- | :---------------------------------------------------------------------------------: |
+| ABINet | [link](https://mmocr.readthedocs.io/en/dev-1.x/textrecog_models.html#abinet) |
+| ABINet_Vision | [link](https://mmocr.readthedocs.io/en/dev-1.x/textrecog_models.html#abinet) |
+| CRNN | [link](https://mmocr.readthedocs.io/en/dev-1.x/textrecog_models.html#crnn) |
+| MASTER | [link](https://mmocr.readthedocs.io/en/dev-1.x/textrecog_models.html#master) |
+| NRTR_1/16-1/8 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textrecog_models.html#nrtr) |
+| NRTR_1/8-1/4 | [link](https://mmocr.readthedocs.io/en/dev-1.x/textrecog_models.html#nrtr) |
+| RobustScanner | [link](https://mmocr.readthedocs.io/en/dev-1.x/textrecog_models.html#robustscanner) |
+| SAR | [link](https://mmocr.readthedocs.io/en/dev-1.x/textrecog_models.html#sar) |
+| SATRN | [link](https://mmocr.readthedocs.io/en/dev-1.x/textrecog_models.html#satrn) |
+| SATRN_sm | [link](https://mmocr.readthedocs.io/en/dev-1.x/textrecog_models.html#satrn) |

 **Key information extraction:**

diff --git a/docs/zh_cn/user_guides/inference.md b/docs/zh_cn/user_guides/inference.md
index a8f4dab5..0b2ef694 100644
--- a/docs/zh_cn/user_guides/inference.md
+++ b/docs/zh_cn/user_guides/inference.md
@@ -145,33 +145,42 @@ mmocr 为了方便使用提供了预置的模型配置和对应的预训练权

 **文本检测:**

-| 名称 | 引用 |
-| ------------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------: |
-| DB_r18 | [链接](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#real-time-scene-text-detection-with-differentiable-binarization) |
-| DB_r50 | [链接](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#real-time-scene-text-detection-with-differentiable-binarization) |
-| DBPP_r50 | [链接](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#dbnetpp) |
-| DRRG | [链接](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#drrg) |
-| FCE_IC15 | [链接](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#fourier-contour-embedding-for-arbitrary-shaped-text-detection) |
-| FCE_CTW_DCNv2 | [链接](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#fourier-contour-embedding-for-arbitrary-shaped-text-detection) |
-| MaskRCNN_CTW | [链接](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#mask-r-cnn) |
-| MaskRCNN_IC15 | [链接](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#mask-r-cnn) |
-| PANet_CTW | [链接](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#efficient-and-accurate-arbitrary-shaped-text-detection-with-pixel-aggregation-network) |
-| PANet_IC15 | [链接](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#efficient-and-accurate-arbitrary-shaped-text-detection-with-pixel-aggregation-network) |
-| PS_CTW | [链接](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#psenet) |
-| PS_IC15 | [链接](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#psenet) |
-| TextSnake | [链接](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#textsnake) |
+| 名称 | 引用 |
+| ------------- | :----------------------------------------------------------------------------: |
+| DB_r18 | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textdet_models.html#dbnet) |
+| DB_r50 | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textdet_models.html#dbnet) |
+| DBPP_r50 | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textdet_models.html#dbnetpp) |
+| DRRG | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textdet_models.html#drrg) |
+| FCE_IC15 | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textdet_models.html#fcenet) |
+| FCE_CTW_DCNv2 | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textdet_models.html#fcenet) |
+| MaskRCNN_CTW | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textdet_models.html#mask-r-cnn) |
+| MaskRCNN_IC15 | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textdet_models.html#mask-r-cnn) |
+| PANet_CTW | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textdet_models.html#panet) |
+| PANet_IC15 | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textdet_models.html#panet) |
+| PS_CTW | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textdet_models.html#psenet) |
+| PS_IC15 | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textdet_models.html#psenet) |
+| TextSnake | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textdet_models.html#textsnake) |

 **文本识别:**

-| 名称 | 引用 |
-| ---- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
-| CRNN | [链接](https://mmocr.readthedocs.io/en/dev-1.x/textrecog_models.html#an-end-to-end-trainable-neural-network-for-image-based-sequence-recognition-and-its-application-to-scene-text-recognition) |
+| 名称 | 引用 |
+| ------------- | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| ABINet | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textrecog_models.html#abinet) |
+| ABINet_Vision | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textrecog_models.html#abinet) |
+| CRNN | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textrecog_models.html#crnn) |
+| MASTER | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textrecog_models.html#master) |
+| NRTR_1/16-1/8 | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textrecog_models.html#nrtr) |
+| NRTR_1/8-1/4 | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textrecog_models.html#nrtr) |
+| RobustScanner | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textrecog_models.html#robustscanner) |
+| SAR | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textrecog_models.html#sar) |
+| SATRN | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textrecog_models.html#satrn) |
+| SATRN_sm | [链接](https://mmocr.readthedocs.io/zh_CN/dev-1.x/textrecog_models.html#satrn) |

 **关键信息提取:**

-| 名称 |
-| ------------------------------------------------------------------------------------------------------------------------------------- |
-| [SDMGR](https://mmocr.readthedocs.io/en/dev-1.x/kie_models.html#spatial-dual-modality-graph-reasoning-for-key-information-extraction) |
+| 名称 |
+| ------------------------------------------------------------------- |
+| [SDMGR](https://mmocr.readthedocs.io/zh_CN/dev-1.x/kie_models.html) |

 ## 其他需要注意

diff --git a/mmocr/ocr.py b/mmocr/ocr.py
index a55022b2..616c20f8 100755
--- a/mmocr/ocr.py
+++ b/mmocr/ocr.py
@@ -379,71 +379,87 @@ class MMOCR:
                 'ckpt':
                 'textrecog/crnn/crnn_mini-vgg_5e_mj/crnn_mini-vgg_5e_mj_20220826_224120-8afbedbb.pth'  # noqa: E501
             },
-            # 'SAR': {
-            #     'config':
-            #     'textrecog/sar/'
-            #     'sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real.py',
-            #     'ckpt':
-            #     ''
-            # },
+            'SAR': {
+                'config':
+                'textrecog/sar/'
+                'sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real.py',
+                'ckpt':
+                'textrecog/sar/sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real/sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real_20220915_171910-04eb4e75.pth'  # noqa: E501
+            },
             # 'SAR_CN': {
             #     'config':
             #     'textrecog/'
             #     'sar/sar_r31_parallel_decoder_chinese.py',
             #     'ckpt':
-            #     'textrecog/'
+            #     'textrecog/'  # noqa: E501
             #     ''
             # },
-            # 'NRTR_1/16-1/8': {
-            #     'config':
-            #     'textrecog/'
-            #     'nrtr/nrtr_resnet31-1by16-1by8_6e_st_mj.py',
-            #     'ckpt':
-            #     'textrecog/'
-            #     ''
-            # },
-            # 'NRTR_1/8-1/4': {
-            #     'config':
-            #     'textrecog/'
-            #     'nrtr/nrtr_resnet31-1by8-1by4_6e_st_mj.py',
-            #     'ckpt':
-            #     'textrecog/'
-            #     ''
-            # },
-            # 'RobustScanner': {
-            #     'config':
-            #     'textrecog/robust_scanner/'
-            #     'robustscanner_resnet31_5e_st-sub_mj-sub_sa_real.py',
-            #     'ckpt':
-            #     'textrecog/'
-            #     ''
-            # },
-            # 'SATRN': {
-            #     'config': 'textrecog/satrn/satrn_shallow_5e_st_mj.py',
-            #     'ckpt': ''
-            # },
-            # 'SATRN_sm': {
-            #     'config': 'textrecog/satrn/satrn_shallow-small_5e_st_mj.py',
-            #     'ckpt': ''
-            # },
-            # 'ABINet': {
-            #     'config': 'textrecog/abinet/abinet_20e_st-an_mj.py',
-            #     'ckpt': ''
-            # },
-            # 'ABINet_Vision': {
-            #     'config': 'textrecog/abinet/abinet-vision_20e_st-an_mj.py',
-            #     'ckpt': ''
-            # },
+            'NRTR_1/16-1/8': {
+                'config':
+                'textrecog/'
+                'nrtr/nrtr_resnet31-1by16-1by8_6e_st_mj.py',
+                'ckpt':
+                'textrecog/'
+                'nrtr/nrtr_resnet31-1by16-1by8_6e_st_mj/nrtr_resnet31-1by16-1by8_6e_st_mj_20220920_143358-43767036.pth'  # noqa: E501
+            },
+            'NRTR_1/8-1/4': {
+                'config':
+                'textrecog/'
+                'nrtr/nrtr_resnet31-1by8-1by4_6e_st_mj.py',
+                'ckpt':
+                'textrecog/'
+                'nrtr/nrtr_resnet31-1by8-1by4_6e_st_mj/nrtr_resnet31-1by8-1by4_6e_st_mj_20220916_103322-a6a2a123.pth'  # noqa: E501
+            },
+            'RobustScanner': {
+                'config':
+                'textrecog/robust_scanner/'
+                'robustscanner_resnet31_5e_st-sub_mj-sub_sa_real.py',
+                'ckpt':
+                'textrecog/'
+                'robust_scanner/robustscanner_resnet31_5e_st-sub_mj-sub_sa_real/robustscanner_resnet31_5e_st-sub_mj-sub_sa_real_20220915_152447-7fc35929.pth'  # noqa: E501
+            },
+            'SATRN': {
+                'config':
+                'textrecog/satrn/satrn_shallow_5e_st_mj.py',
+                'ckpt':
+                'textrecog/'
+                'satrn/satrn_shallow_5e_st_mj/satrn_shallow_5e_st_mj_20220915_152443-5fd04a4c.pth'  # noqa: E501
+            },
+            'SATRN_sm': {
+                'config':
+                'textrecog/satrn/satrn_shallow-small_5e_st_mj.py',
+                'ckpt':
+                'textrecog/'
+                'satrn/satrn_shallow-small_5e_st_mj/satrn_shallow-small_5e_st_mj_20220915_152442-5591bf27.pth'  # noqa: E501
+            },
+            'ABINet': {
+                'config':
+                'textrecog/abinet/abinet_20e_st-an_mj.py',
+                'ckpt':
+                'textrecog/'
+                'abinet/abinet_20e_st-an_mj/abinet_20e_st-an_mj_20221005_012617-ead8c139.pth'  # noqa: E501
+            },
+            'ABINet_Vision': {
+                'config':
+                'textrecog/abinet/abinet-vision_20e_st-an_mj.py',
+                'ckpt':
+                'textrecog/'
+                'abinet/abinet-vision_20e_st-an_mj/abinet-vision_20e_st-an_mj_20220915_152445-85cfb03d.pth'  # noqa: E501
+            },
             # 'CRNN_TPS': {
             #     'config':
             #     'textrecog/tps/crnn_tps_academic_dataset.py',
             #     'ckpt':
+            #     'textrecog/'
             #     ''
             # },
-            # 'MASTER': {
-            #     'config': 'textrecog/master/master_resnet31_12e_st_mj_sa.py',
-            #     'ckpt': ''
-            # },
+            'MASTER': {
+                'config':
+                'textrecog/master/master_resnet31_12e_st_mj_sa.py',
+                'ckpt':
+                'textrecog/'
+                'master/master_resnet31_12e_st_mj_sa/master_resnet31_12e_st_mj_sa_20220915_152443-f4a5cabc.pth'  # noqa: E501
+            },
             # KIE models
             'SDMGR': {
                 'config':
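
The `ocr.py` hunk above registers each model name with a relative `config` path and a relative `ckpt` path. A minimal self-contained sketch of how such a registry can be resolved into a full checkpoint URL is shown below; the `BASE_URL` and the `resolve_ckpt_url` helper are illustrative assumptions, not MMOCR's exact internals, and the dict reproduces just two entries from the patch:

```python
# Sketch of the name -> {config, ckpt} registry enabled by this patch.
# BASE_URL and resolve_ckpt_url() are assumptions for illustration only.
TEXTRECOG_MODELS = {
    'SAR': {
        'config':
        'textrecog/sar/'
        'sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real.py',
        'ckpt':
        'textrecog/sar/sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real/'
        'sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real_20220915_171910-04eb4e75.pth',
    },
    'MASTER': {
        'config':
        'textrecog/master/master_resnet31_12e_st_mj_sa.py',
        'ckpt':
        'textrecog/master/master_resnet31_12e_st_mj_sa/'
        'master_resnet31_12e_st_mj_sa_20220915_152443-f4a5cabc.pth',
    },
}

# Assumed download root for the relative ckpt paths.
BASE_URL = 'https://download.openmmlab.com/mmocr/'


def resolve_ckpt_url(name: str) -> str:
    """Look up a recognizer by its table name and build its checkpoint URL."""
    entry = TEXTRECOG_MODELS[name]
    return BASE_URL + entry['ckpt']


# Example lookup for one of the newly enabled recognizers.
master_url = resolve_ckpt_url('MASTER')
```

With MMOCR installed, the names un-commented in this patch are exactly the strings the `MMOCR` class in `mmocr/ocr.py` accepts for its recognizer argument (e.g. `MMOCR(recog='SAR')`), matching the "Name" column of the docs tables changed in the same commit.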