mirror of https://github.com/open-mmlab/mmocr.git
[Docs] Update Recog Models (#1402)
* init
* update
* update abinet
* update abinet
* update abinet
* update abinet
* apply comments
  Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
* apply comments
  Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
* fix
  Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

pull/1434/head
parent 4fef7d1868
commit bf921661c6
configs/textrecog
abinet
master
robust_scanner
|
@ -34,13 +34,11 @@ Linguistic knowledge is of great benefit to scene text recognition. However, how
|
|||
|
||||
## Results and models
|
||||
|
||||
Coming Soon!
|
||||
|
||||
| methods | pretrained | | Regular Text | | | Irregular Text | | download |
| :----------------------------------------------------------------------: | :--------------: | :----: | :----------: | :--: | :--: | :------------: | :--: | :----------------------- |
| | | IIIT5K | SVT | IC13 | IC15 | SVTP | CT80 | |
| [ABINet-Vision](/configs/textrecog/abinet/abinet-vision_20e_st-an_mj.py) | - | | | | | | | [model](<>) \| [log](<>) |
| [ABINet](/configs/textrecog/abinet/abinet_20e_st-an_mj.py) | [Pretrained](<>) | | | | | | | [model](<>) \| [log](<>) |

| methods | pretrained | | Regular Text | | | Irregular Text | | download |
| :----------------------------------------------: | :--------------------------------------------------: | :----: | :----------: | :----: | :----: | :------------: | :----: | :------------------------------------------------- |
| | | IIIT5K | SVT | IC13 | IC15 | SVTP | CT80 | |
| [ABINet-Vision](/configs/textrecog/abinet/abinet-vision_20e_st-an_mj.py) | - | 0.9523 | 0.9057 | 0.9369 | 0.7886 | 0.8403 | 0.8437 | [model](https://download.openmmlab.com/mmocr/textrecog/abinet/abinet-vision_20e_st-an_mj/abinet-vision_20e_st-an_mj_20220915_152445-85cfb03d.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/abinet/abinet-vision_20e_st-an_mj/20220915_152445.log) |
| [ABINet](/configs/textrecog/abinet/abinet_20e_st-an_mj.py) | [Pretrained](https://download.openmmlab.com/mmocr/textrecog/abinet/abinet_pretrain-45deac15.pth) | 0.9603 | 0.9382 | 0.9547 | 0.8122 | 0.8868 | 0.8785 | [model](https://download.openmmlab.com/mmocr/textrecog/abinet/abinet_20e_st-an_mj/abinet_20e_st-an_mj_20221005_012617-ead8c139.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/abinet/abinet_20e_st-an_mj/20221005_012617.log) |
|
||||
|
||||
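The two filled-in ABINet rows can be compared directly. The following is a minimal sketch, not part of MMOCR, that computes the per-dataset gain of the full ABINet model over ABINet-Vision. The constant names and the `accuracy_gain` helper are hypothetical; the numbers are copied from the updated results table.

```python
# Word accuracies copied from the updated ABINet results table.
DATASETS = ["IIIT5K", "SVT", "IC13", "IC15", "SVTP", "CT80"]
ABINET_VISION = [0.9523, 0.9057, 0.9369, 0.7886, 0.8403, 0.8437]
ABINET = [0.9603, 0.9382, 0.9547, 0.8122, 0.8868, 0.8785]


def accuracy_gain(baseline, improved):
    """Return the per-dataset gain (improved - baseline), rounded to 4 decimals."""
    return {
        name: round(new - old, 4)
        for name, old, new in zip(DATASETS, baseline, improved)
    }


gains = accuracy_gain(ABINET_VISION, ABINET)
# Dataset on which the full model helps the most.
largest = max(gains, key=gains.get)

if __name__ == "__main__":
    for name, gain in gains.items():
        print(f"{name}: {gain:+.4f}")
    print(f"largest gain: {largest}")
```

With these numbers, every dataset shows a positive gain, and the largest one is on SVTP.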
```{note}
|
||||
1. ABINet's encoder can be trained and run without the decoder and fuser. It is designed to recognize text as a stand-alone model, so it can work as an independent text recognizer. We release it as ABINet-Vision.
|
||||
|
|
|
@ -1,4 +1,19 @@
|
|||
Collections:
|
||||
- Name: ABINet-vision
|
||||
Metadata:
|
||||
Training Data: OCRDataset
|
||||
Training Techniques:
|
||||
- Adam
|
||||
Epochs: 20
|
||||
Batch Size: 1536
|
||||
Training Resources: 2 x NVIDIA A100-SXM4-80GB
|
||||
Architecture:
|
||||
- ResNetABI
|
||||
- ABIVisionModel
|
||||
Paper:
|
||||
URL: https://arxiv.org/pdf/2103.06495.pdf
|
||||
Title: 'Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition'
|
||||
README: configs/textrecog/abinet/README.md
|
||||
- Name: ABINet
|
||||
Metadata:
|
||||
Training Data: OCRDataset
|
||||
|
@ -6,7 +21,7 @@ Collections:
|
|||
- Adam
|
||||
Epochs: 20
|
||||
Batch Size: 1536
|
||||
Training Resources: 8x Tesla V100
|
||||
Training Resources: 8 x NVIDIA A100-SXM4-80GB
|
||||
Architecture:
|
||||
- ResNetABI
|
||||
- ABIVisionModel
|
||||
|
@ -18,9 +33,9 @@ Collections:
|
|||
README: configs/textrecog/abinet/README.md
|
||||
|
||||
Models:
|
||||
- Name: abinet-vision_6e_st-an_mj
|
||||
In Collection: ABINet
|
||||
Config: configs/textrecog/abinet/abinet-vision_6e_st-an_mj.py
|
||||
- Name: abinet-vision_20e_st-an_mj
|
||||
In Collection: ABINet-vision
|
||||
Config: configs/textrecog/abinet/abinet-vision_20e_st-an_mj.py
|
||||
Metadata:
|
||||
Training Data:
|
||||
- SynthText
|
||||
|
@ -29,32 +44,31 @@ Models:
|
|||
- Task: Text Recognition
|
||||
Dataset: IIIT5K
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9523
|
||||
- Task: Text Recognition
|
||||
Dataset: SVT
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9057
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2013
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9369
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2015
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.7886
|
||||
- Task: Text Recognition
|
||||
Dataset: SVTP
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8403
|
||||
- Task: Text Recognition
|
||||
Dataset: CT80
|
||||
Metrics:
|
||||
word_acc:
|
||||
Weights:
|
||||
|
||||
- Name: abinet_6e_st-an_mj
|
||||
word_acc: 0.8437
|
||||
Weights: https://download.openmmlab.com/mmocr/textrecog/abinet/abinet-vision_20e_st-an_mj/abinet-vision_20e_st-an_mj_20220915_152445-85cfb03d.pth
|
||||
- Name: abinet_20e_st-an_mj
|
||||
In Collection: ABINet
|
||||
Config: configs/textrecog/abinet/abinet_6e_st-an_mj.py
|
||||
Config: configs/textrecog/abinet/abinet_20e_st-an_mj.py
|
||||
Metadata:
|
||||
Training Data:
|
||||
- SynthText
|
||||
|
@ -63,25 +77,25 @@ Models:
|
|||
- Task: Text Recognition
|
||||
Dataset: IIIT5K
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9603
|
||||
- Task: Text Recognition
|
||||
Dataset: SVT
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9382
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2013
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9547
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2015
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8122
|
||||
- Task: Text Recognition
|
||||
Dataset: SVTP
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8868
|
||||
- Task: Text Recognition
|
||||
Dataset: CT80
|
||||
Metrics:
|
||||
word_acc:
|
||||
Weights:
|
||||
word_acc: 0.8785
|
||||
Weights: https://download.openmmlab.com/mmocr/textrecog/abinet/abinet_20e_st-an_mj/abinet_20e_st-an_mj_20221005_012617-ead8c139.pth
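The `Weights` URLs added in this commit appear to share one file-name layout: the config name, a `YYYYMMDD_HHMMSS` timestamp, and an 8-character hex suffix. The sketch below splits a URL into those parts. The layout is inferred from the URLs in this commit rather than from a documented convention, and `parse_checkpoint_url` and `CKPT_PATTERN` are hypothetical names, not MMOCR APIs.

```python
import posixpath
import re
from datetime import datetime
from urllib.parse import urlparse

# Assumed layout: <config>_<YYYYMMDD>_<HHMMSS>-<8 hex chars>.pth
CKPT_PATTERN = re.compile(
    r"^(?P<config>.+)_(?P<stamp>\d{8}_\d{6})-(?P<hash>[0-9a-f]{8})\.pth$"
)


def parse_checkpoint_url(url):
    """Split a checkpoint URL into config name, timestamp, and hash suffix."""
    filename = posixpath.basename(urlparse(url).path)
    match = CKPT_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected checkpoint name: {filename}")
    return {
        "config": match["config"],
        "timestamp": datetime.strptime(match["stamp"], "%Y%m%d_%H%M%S"),
        "hash": match["hash"],
    }


URL = (
    "https://download.openmmlab.com/mmocr/textrecog/abinet/"
    "abinet_20e_st-an_mj/abinet_20e_st-an_mj_20221005_012617-ead8c139.pth"
)

if __name__ == "__main__":
    print(parse_checkpoint_url(URL))
```

The `Pretrained` checkpoint linked in the README table (`abinet_pretrain-45deac15.pth`) carries no timestamp, so this pattern rejects it with a `ValueError`.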
|
||||
|
|
|
@ -35,12 +35,10 @@ Attention-based scene text recognizers have gained huge success, which leverages
|
|||
|
||||
## Results and Models
|
||||
|
||||
Coming Soon!
|
||||
|
||||
| Methods | Backbone | | Regular Text | | | | Irregular Text | | download |
| :-----------------------------------------------------------------: | :-----------: | :----: | :----------: | :--: | :-: | :--: | :------------: | :--: | :----------------------: |
| | | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 | |
| [MASTER](/configs/textrecog/master/master_resnet31_12e_st_mj_sa.py) | R31-GCAModule | | | | | | | | [model](<>) \| [log](<>) |

| Methods | Backbone | | Regular Text | | | | Irregular Text | | download |
| :----------------------------------------------------------------: | :-----------: | :----: | :----------: | :----: | :-: | :----: | :------------: | :----: | :------------------------------------------------------------------: |
| | | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 | |
| [MASTER](/configs/textrecog/master/master_resnet31_12e_st_mj_sa.py) | R31-GCAModule | 0.9490 | 0.8967 | 0.9517 | | 0.7631 | 0.8465 | 0.8854 | [model](https://download.openmmlab.com/mmocr/textrecog/master/master_resnet31_12e_st_mj_sa/master_resnet31_12e_st_mj_sa_20220915_152443-f4a5cabc.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/master/master_resnet31_12e_st_mj_sa/20220915_152443.log) |
|
||||
|
||||
## Citation
|
||||
|
||||
|
|
|
@ -5,8 +5,8 @@ Collections:
|
|||
Training Techniques:
|
||||
- Adam
|
||||
Epochs: 12
|
||||
Batch Size: 512
|
||||
Training Resources: 4x Tesla A100
|
||||
Batch Size: 2048
|
||||
Training Resources: 4x NVIDIA A100-SXM4-80GB
|
||||
Architecture:
|
||||
- ResNet31-GCAModule
|
||||
- MASTERDecoder
|
||||
|
@ -28,25 +28,25 @@ Models:
|
|||
- Task: Text Recognition
|
||||
Dataset: IIIT5K
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9490
|
||||
- Task: Text Recognition
|
||||
Dataset: SVT
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8967
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2013
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9517
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2015
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.7631
|
||||
- Task: Text Recognition
|
||||
Dataset: SVTP
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8465
|
||||
- Task: Text Recognition
|
||||
Dataset: CT80
|
||||
Metrics:
|
||||
word_acc:
|
||||
Weights:
|
||||
word_acc: 0.8854
|
||||
Weights: https://download.openmmlab.com/mmocr/textrecog/master/master_resnet31_12e_st_mj_sa/master_resnet31_12e_st_mj_sa_20220915_152443-f4a5cabc.pth
|
||||
|
|
|
@ -34,13 +34,12 @@ Scene text recognition has attracted a great many researches due to its importan
|
|||
|
||||
## Results and Models
|
||||
|
||||
Coming Soon!
|
||||
|
||||
| Methods | Backbone | | Regular Text | | | | Irregular Text | | download |
| :------------------------------------------------------------------: | :----------: | :----: | :----------: | :--: | :-: | :--: | :------------: | :--: | :----------------------: |
| | | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 | |
| [NRTR](/configs/textrecog/nrtr/nrtr_resnet31-1by16-1by8_6e_st_mj.py) | R31-1/16-1/8 | | | | | | | | [model](<>) \| [log](<>) |
| [NRTR](/configs/textrecog/nrtr/nrtr_resnet31-1by8-1by4_6e_st_mj.py) | R31-1/8-1/4 | | | | | | | | [model](<>) \| [log](<>) |

| Methods | Backbone | | Regular Text | | | | Irregular Text | | download |
| :------------------------------------------------------------: | :-------------------: | :----: | :----------: | :----: | :-: | :----: | :------------: | :----: | :--------------------------------------------------------------: |
| | | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 | |
| [NRTR](/configs/textrecog/nrtr/nrtr_modality-transform_6e_st_mj.py) | NRTRModalityTransform | 0.9150 | 0.8825 | 0.9369 | | 0.7232 | 0.7783 | 0.7500 | [model](https://download.openmmlab.com/mmocr/textrecog/nrtr/nrtr_modality-transform_6e_st_mj/nrtr_modality-transform_6e_st_mj_20220916_103322-bd9425be.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/nrtr/nrtr_modality-transform_6e_st_mj/20220916_103322.log) |
| [NRTR](/configs/textrecog/nrtr/nrtr_resnet31-1by8-1by4_6e_st_mj.py) | R31-1/8-1/4 | 0.9483 | 0.8825 | 0.9507 | | 0.7559 | 0.8016 | 0.8889 | [model](https://download.openmmlab.com/mmocr/textrecog/nrtr/nrtr_resnet31-1by8-1by4_6e_st_mj/nrtr_resnet31-1by8-1by4_6e_st_mj_20220916_103322-a6a2a123.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/nrtr/nrtr_resnet31-1by8-1by4_6e_st_mj/20220916_103322.log) |
| [NRTR](/configs/textrecog/nrtr/nrtr_resnet31-1by16-1by8_6e_st_mj.py) | R31-1/16-1/8 | 0.9470 | 0.8964 | 0.9399 | | 0.7357 | 0.7969 | 0.8854 | [model](https://download.openmmlab.com/mmocr/textrecog/nrtr/nrtr_resnet31-1by16-1by8_6e_st_mj/nrtr_resnet31-1by16-1by8_6e_st_mj_20220920_143358-43767036.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/nrtr/nrtr_resnet31-1by16-1by8_6e_st_mj/20220920_143358.log) |
|
||||
|
||||
## Citation
|
||||
|
||||
|
|
|
@ -5,8 +5,8 @@ Collections:
|
|||
Training Techniques:
|
||||
- Adam
|
||||
Epochs: 6
|
||||
Batch Size: 6144
|
||||
Training Resources: 1x Tesla A100
|
||||
Batch Size: 384
|
||||
Training Resources: 1x NVIDIA A100-SXM4-80GB
|
||||
Architecture:
|
||||
- CNN
|
||||
- NRTREncoder
|
||||
|
@ -17,9 +17,9 @@ Collections:
|
|||
README: configs/textrecog/nrtr/README.md
|
||||
|
||||
Models:
|
||||
- Name: nrtr_resnet31-1by16-1by8_6e_st_mj
|
||||
- Name: nrtr_modality-transform_6e_st_mj
|
||||
In Collection: NRTR
|
||||
Config: configs/textrecog/nrtr/nrtr_resnet31-1by16-1by8_6e_st_mj.py
|
||||
Config: configs/textrecog/nrtr/nrtr_modality-transform_6e_st_mj.py
|
||||
Metadata:
|
||||
Training Data:
|
||||
- SynthText
|
||||
|
@ -28,29 +28,28 @@ Models:
|
|||
- Task: Text Recognition
|
||||
Dataset: IIIT5K
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9150
|
||||
- Task: Text Recognition
|
||||
Dataset: SVT
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8825
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2013
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9369
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2015
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.7232
|
||||
- Task: Text Recognition
|
||||
Dataset: SVTP
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.7783
|
||||
- Task: Text Recognition
|
||||
Dataset: CT80
|
||||
Metrics:
|
||||
word_acc:
|
||||
Weights:
|
||||
|
||||
word_acc: 0.7500
|
||||
Weights: https://download.openmmlab.com/mmocr/textrecog/nrtr/nrtr_modality-transform_6e_st_mj/nrtr_modality-transform_6e_st_mj_20220916_103322-bd9425be.pth
|
||||
- Name: nrtr_resnet31-1by8-1by4_6e_st_mj
|
||||
In Collection: NRTR
|
||||
Config: configs/textrecog/nrtr/nrtr_resnet31-1by8-1by4_6e_st_mj.py
|
||||
|
@ -62,25 +61,58 @@ Models:
|
|||
- Task: Text Recognition
|
||||
Dataset: IIIT5K
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9483
|
||||
- Task: Text Recognition
|
||||
Dataset: SVT
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8825
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2013
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9507
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2015
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.7559
|
||||
- Task: Text Recognition
|
||||
Dataset: SVTP
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8016
|
||||
- Task: Text Recognition
|
||||
Dataset: CT80
|
||||
Metrics:
|
||||
word_acc:
|
||||
Weights:
|
||||
word_acc: 0.8889
|
||||
Weights: https://download.openmmlab.com/mmocr/textrecog/nrtr/nrtr_resnet31-1by8-1by4_6e_st_mj/nrtr_resnet31-1by8-1by4_6e_st_mj_20220916_103322-a6a2a123.pth
|
||||
- Name: nrtr_resnet31-1by16-1by8_6e_st_mj
|
||||
In Collection: NRTR
|
||||
Config: configs/textrecog/nrtr/nrtr_resnet31-1by16-1by8_6e_st_mj.py
|
||||
Metadata:
|
||||
Training Data:
|
||||
- SynthText
|
||||
- Syn90k
|
||||
Results:
|
||||
- Task: Text Recognition
|
||||
Dataset: IIIT5K
|
||||
Metrics:
|
||||
word_acc: 0.9470
|
||||
- Task: Text Recognition
|
||||
Dataset: SVT
|
||||
Metrics:
|
||||
word_acc: 0.8964
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2013
|
||||
Metrics:
|
||||
word_acc: 0.9399
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2015
|
||||
Metrics:
|
||||
word_acc: 0.7357
|
||||
- Task: Text Recognition
|
||||
Dataset: SVTP
|
||||
Metrics:
|
||||
word_acc: 0.7969
|
||||
- Task: Text Recognition
|
||||
Dataset: CT80
|
||||
Metrics:
|
||||
word_acc: 0.8854
|
||||
Weights: https://download.openmmlab.com/mmocr/textrecog/nrtr/nrtr_resnet31-1by16-1by8_6e_st_mj/nrtr_resnet31-1by16-1by8_6e_st_mj_20220920_143358-43767036.pth
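Each model entry in these metafiles repeats the same `Task` / `Dataset` / `Metrics` triple once per benchmark. As an illustration only, the sketch below renders such a `Results` block from a mapping of dataset name to word accuracy. The `render_results` helper is hypothetical, and the indentation it emits is an assumption, because the page extraction has flattened the YAML nesting; the values are the ones this commit adds for `nrtr_resnet31-1by16-1by8_6e_st_mj`.

```python
# Word accuracies added by this commit for nrtr_resnet31-1by16-1by8_6e_st_mj.
NRTR_R31_1BY16_1BY8 = {
    "IIIT5K": 0.9470,
    "SVT": 0.8964,
    "ICDAR2013": 0.9399,
    "ICDAR2015": 0.7357,
    "SVTP": 0.7969,
    "CT80": 0.8854,
}


def render_results(word_acc, indent=4):
    """Render a metafile-style Results block as text.

    The nesting widths are assumed; adjust `indent` to match the real file.
    """
    pad = " " * indent
    lines = [f"{pad}Results:"]
    for dataset, acc in word_acc.items():
        lines += [
            f"{pad}  - Task: Text Recognition",
            f"{pad}    Dataset: {dataset}",
            f"{pad}    Metrics:",
            f"{pad}      word_acc: {acc:.4f}",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_results(NRTR_R31_1BY16_1BY8))
```

Formatting with `:.4f` keeps trailing zeros such as `0.9470`, matching how the values are written in this commit.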
|
||||
|
|
|
@ -40,12 +40,10 @@ The attention-based encoder-decoder framework has recently achieved impressive r
|
|||
|
||||
## Results and Models
|
||||
|
||||
Coming Soon!
|
||||
|
||||
| Methods | GPUs | | Regular Text | | | | Irregular Text | | download |
| :--------------------------------------------------------------------------------------------------: | :--: | :----: | :----------: | :--: | :-: | :--: | :------------: | :--: | :----------------------: |
| | | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 | |
| [RobustScanner](configs/textrecog/robust_scanner/robustscanner_resnet31_5e_st-sub_mj-sub_sa_real.py) | | | | | | | | | [model](<>) \| [log](<>) |

| Methods | GPUs | | Regular Text | | | | Irregular Text | | download |
| :---------------------------------------------------------------------: | :--: | :----: | :----------: | :----: | :-: | :----: | :------------: | :----: | :----------------------------------------------------------------------: |
| | | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 | |
| [RobustScanner](/configs/textrecog/robust_scanner/robustscanner_resnet31_5e_st-sub_mj-sub_sa_real.py) | 4 | 0.9510 | 0.8934 | 0.9320 | | 0.7559 | 0.8078 | 0.8715 | [model](https://download.openmmlab.com/mmocr/textrecog/robust_scanner/robustscanner_resnet31_5e_st-sub_mj-sub_sa_real/robustscanner_resnet31_5e_st-sub_mj-sub_sa_real_20220915_152447-7fc35929.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/robust_scanner/robustscanner_resnet31_5e_st-sub_mj-sub_sa_real/20220915_152447.log) |
|
||||
|
||||
## References
|
||||
|
||||
|
|
|
@ -6,7 +6,7 @@ Collections:
|
|||
- Adam
|
||||
Epochs: 5
|
||||
Batch Size: 1024
|
||||
Training Resources: 16x GeForce GTX 1080 Ti
|
||||
Training Resources: 4x NVIDIA A100-SXM4-80GB
|
||||
Architecture:
|
||||
- ResNet31OCR
|
||||
- ChannelReductionEncoder
|
||||
|
@ -34,25 +34,25 @@ Models:
|
|||
- Task: Text Recognition
|
||||
Dataset: IIIT5K
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9510
|
||||
- Task: Text Recognition
|
||||
Dataset: SVT
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8934
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2013
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9320
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2015
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.7559
|
||||
- Task: Text Recognition
|
||||
Dataset: SVTP
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8078
|
||||
- Task: Text Recognition
|
||||
Dataset: CT80
|
||||
Metrics:
|
||||
word_acc:
|
||||
Weights:
|
||||
word_acc: 0.8715
|
||||
Weights: https://download.openmmlab.com/mmocr/textrecog/robust_scanner/robustscanner_resnet31_5e_st-sub_mj-sub_sa_real/robustscanner_resnet31_5e_st-sub_mj-sub_sa_real_20220915_152447-7fc35929.pth
|
||||
|
|
|
@ -40,13 +40,11 @@ Recognizing irregular text in natural scene images is challenging due to the lar
|
|||
|
||||
## Results and Models
|
||||
|
||||
Coming Soon!
|
||||
|
||||
| Methods | Backbone | Decoder | | Regular Text | | | | Irregular Text | | download |
| :-----------------------------------------------------------------: | :---------: | :------------------: | :----: | :----------: | :--: | :-: | :--: | :------------: | :--: | :----------------------: |
| | | | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 | |
| [SAR](/configs/textrecog/sar/sar_r31_parallel_decoder_academic.py) | R31-1/8-1/4 | ParallelSARDecoder | | | | | | | | [model](<>) \| [log](<>) |
| [SAR](configs/textrecog/sar/sar_r31_sequential_decoder_academic.py) | R31-1/8-1/4 | SequentialSARDecoder | | | | | | | | [model](<>) \| [log](<>) |

| Methods | Backbone | Decoder | | Regular Text | | | | Irregular Text | | download |
| :-------------------------------------------------------: | :---------: | :------------------: | :----: | :----------: | :----: | :-: | :----: | :------------: | :----: | :---------------------------------------------------------: |
| | | | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 | |
| [SAR](/configs/textrecog/sar/sar_r31_parallel_decoder_academic.py) | R31-1/8-1/4 | ParallelSARDecoder | 0.9533 | 0.8841 | 0.9369 | | 0.7602 | 0.8326 | 0.9028 | [model](https://download.openmmlab.com/mmocr/textrecog/sar/sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real/sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real_20220915_171910-04eb4e75.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/sar/sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real/20220915_171910.log) |
| [SAR](/configs/textrecog/sar/sar_r31_sequential_decoder_academic.py) | R31-1/8-1/4 | SequentialSARDecoder | 0.9553 | 0.8717 | 0.9409 | | 0.7737 | 0.8093 | 0.8924 | [model](https://download.openmmlab.com/mmocr/textrecog/sar/sar_resnet31_sequential-decoder_5e_st-sub_mj-sub_sa_real/sar_resnet31_sequential-decoder_5e_st-sub_mj-sub_sa_real_20220915_185451-1fd6b1fc.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/sar/sar_resnet31_sequential-decoder_5e_st-sub_mj-sub_sa_real/20220915_185451.log) |
|
||||
|
||||
## Citation
|
||||
|
||||
|
|
|
@ -4,7 +4,7 @@ Collections:
|
|||
Training Data: OCRDataset
|
||||
Training Techniques:
|
||||
- Adam
|
||||
Training Resources: 48x GeForce GTX 1080 Ti
|
||||
Training Resources: 8x NVIDIA A100-SXM4-80GB
|
||||
Epochs: 5
|
||||
Batch Size: 3072
|
||||
Architecture:
|
||||
|
@ -34,28 +34,28 @@ Models:
|
|||
- Task: Text Recognition
|
||||
Dataset: IIIT5K
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9533
|
||||
- Task: Text Recognition
|
||||
Dataset: SVT
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8841
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2013
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9369
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2015
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.7602
|
||||
- Task: Text Recognition
|
||||
Dataset: SVTP
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8326
|
||||
- Task: Text Recognition
|
||||
Dataset: CT80
|
||||
Metrics:
|
||||
word_acc:
|
||||
Weights:
|
||||
word_acc: 0.9028
|
||||
Weights: https://download.openmmlab.com/mmocr/textrecog/sar/sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real/sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real_20220915_171910-04eb4e75.pth
|
||||
|
||||
- Name: sar_resnet31_sequential-decoder_5e_st-sub_mj-sub_sa_real
|
||||
In Collection: SAR
|
||||
|
@ -74,25 +74,25 @@ Models:
|
|||
- Task: Text Recognition
|
||||
Dataset: IIIT5K
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9553
|
||||
- Task: Text Recognition
|
||||
Dataset: SVT
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8717
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2013
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9409
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2015
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.7737
|
||||
- Task: Text Recognition
|
||||
Dataset: SVTP
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8093
|
||||
- Task: Text Recognition
|
||||
Dataset: CT80
|
||||
Metrics:
|
||||
word_acc:
|
||||
Weights:
|
||||
word_acc: 0.8924
|
||||
Weights: https://download.openmmlab.com/mmocr/textrecog/sar/sar_resnet31_sequential-decoder_5e_st-sub_mj-sub_sa_real/sar_resnet31_sequential-decoder_5e_st-sub_mj-sub_sa_real_20220915_185451-1fd6b1fc.pth
|
||||
|
|
|
@ -34,13 +34,11 @@ Scene text recognition (STR) is the task of recognizing character sequences in n
|
|||
|
||||
## Results and Models
|
||||
|
||||
Coming Soon!
|
||||
|
||||
| Methods | | Regular Text | | | | Irregular Text | | download |
| :---------------------------------------------------------------------: | :----: | :----------: | :--: | :-: | :--: | :------------: | :--: | :----------------------: |
| | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 | |
| [Satrn](/configs/textrecog/satrn/satrn_shallow_5e_st_mj.py) | | | | | | | | [model](<>) \| [log](<>) |
| [Satrn_small](/configs/textrecog/satrn/satrn_shallow-small_5e_st_mj.py) | | | | | | | | [model](<>) \| [log](<>) |

| Methods | | Regular Text | | | | Irregular Text | | download |
| :---------------------------------------------------------------------: | :----: | :----------: | :----: | :-: | :----: | :------------: | :----: | :--------------------------------------------------------------------------: |
| | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 | |
| [Satrn](/configs/textrecog/satrn/satrn_shallow_5e_st_mj.py) | 0.9600 | 0.9196 | 0.9606 | | 0.8031 | 0.8837 | 0.8993 | [model](https://download.openmmlab.com/mmocr/textrecog/satrn/satrn_shallow_5e_st_mj/satrn_shallow_5e_st_mj_20220915_152443-5fd04a4c.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/satrn/satrn_shallow_5e_st_mj/20220915_152443.log) |
| [Satrn_small](/configs/textrecog/satrn/satrn_shallow-small_5e_st_mj.py) | 0.9423 | 0.8995 | 0.9567 | | 0.7877 | 0.8574 | 0.8507 | [model](https://download.openmmlab.com/mmocr/textrecog/satrn/satrn_shallow-small_5e_st_mj/satrn_shallow-small_5e_st_mj_20220915_152442-5591bf27.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/satrn/satrn_shallow-small_5e_st_mj/20220915_152442.log) |
|
||||
|
||||
## Citation
|
||||
|
||||
|
|
|
@ -28,28 +28,28 @@ Models:
|
|||
- Task: Text Recognition
|
||||
Dataset: IIIT5K
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9600
|
||||
- Task: Text Recognition
|
||||
Dataset: SVT
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9196
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2013
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9606
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2015
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8031
|
||||
- Task: Text Recognition
|
||||
Dataset: SVTP
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8837
|
||||
- Task: Text Recognition
|
||||
Dataset: CT80
|
||||
Metrics:
|
||||
word_acc:
|
||||
Weights:
|
||||
word_acc: 0.8993
|
||||
Weights: https://download.openmmlab.com/mmocr/textrecog/satrn/satrn_shallow_5e_st_mj/satrn_shallow_5e_st_mj_20220915_152443-5fd04a4c.pth
|
||||
|
||||
- Name: satrn_shallow-small_5e_st_mj
|
||||
In Collection: SATRN
|
||||
|
@ -62,25 +62,25 @@ Models:
|
|||
- Task: Text Recognition
|
||||
Dataset: IIIT5K
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9423
|
||||
- Task: Text Recognition
|
||||
Dataset: SVT
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8995
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2013
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.9567
|
||||
- Task: Text Recognition
|
||||
Dataset: ICDAR2015
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.7877
|
||||
- Task: Text Recognition
|
||||
Dataset: SVTP
|
||||
Metrics:
|
||||
word_acc:
|
||||
word_acc: 0.8574
|
||||
- Task: Text Recognition
|
||||
Dataset: CT80
|
||||
Metrics:
|
||||
word_acc:
|
||||
Weights:
|
||||
word_acc: 0.8507
|
||||
Weights: https://download.openmmlab.com/mmocr/textrecog/satrn/satrn_shallow-small_5e_st_mj/satrn_shallow-small_5e_st_mj_20220915_152442-5591bf27.pth
|
||||
|
|