From 9d62bdf84cc6dffd145881517d1598d0d1910586 Mon Sep 17 00:00:00 2001
From: lizz
Date: Mon, 5 Apr 2021 16:06:06 +0800
Subject: [PATCH] Format readme (#23)

* Format readme

Signed-off-by: lizz

* try

Signed-off-by: lizz

* Remove redudant config link

Signed-off-by: lizz
---
 configs/textdet/dbnet/README.md         |  6 ++---
 configs/textdet/psenet/README.md        | 14 +++++------
 configs/textdet/textsnake/README.md     |  6 ++---
 configs/textrecog/crnn/README.md        |  8 +++----
 configs/textrecog/sar/README.md         | 32 +++++++++++++------------
 configs/textrecog/seg/README.md         |  8 +++----
 configs/textrecog/transformer/README.md |  8 ++++---
 docs/datasets.md                        |  2 +-
 8 files changed, 44 insertions(+), 40 deletions(-)

diff --git a/configs/textdet/dbnet/README.md b/configs/textdet/dbnet/README.md
index 8db7c5fc..c8b6a094 100644
--- a/configs/textdet/dbnet/README.md
+++ b/configs/textdet/dbnet/README.md
@@ -23,6 +23,6 @@

 ### ICDAR2015

-| Method | Pretrained Model | Training set | Test set | #epochs | Test size | Recall | Precision | Hmean | Download |
-| :--------------------------------------------------------------------: | :--------------: | :-------------: | :------------: | :-----: | :-------: | :----: | :-------: | :---: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
-| [DBNet](/configs/textdet/dbnet/dbnet_r50dcnv2_fpnc_1200e_icdar2015.py) | [Synthtext](https://download.openmmlab.com/mmocr/textdet/dbnet/dbnet_r50dcnv2_fpnc_sbn_2e_synthtext_20210325-aa96e477.pth) | ICDAR2015 Train | ICDAR2015 Test | 1200 | 1024 | 0.796 | 0.866 | 0.830 | [model](https://download.openmmlab.com/mmocr/textdet/dbnet/dbnet_r50dcnv2_fpnc_sbn_1200e_icdar2015_20210325-91cef9af.pth) \| [log](https://download.openmmlab.com/mmocr/textdet/dbnet/dbnet_r50dcnv2_fpnc_sbn_1200e_icdar2015_20210325-91cef9af.pth.log.json) |
+| Method | Pretrained Model | Training set | Test set | #epochs | Test size | Recall | Precision | Hmean | Download |
+| :--------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------: | :-------------: | :------------: | :-----: | :-------: | :----: | :-------: | :---: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [DBNet](/configs/textdet/dbnet/dbnet_r50dcnv2_fpnc_1200e_icdar2015.py) | [Synthtext](https://download.openmmlab.com/mmocr/textdet/dbnet/dbnet_r50dcnv2_fpnc_sbn_2e_synthtext_20210325-aa96e477.pth) | ICDAR2015 Train | ICDAR2015 Test | 1200 | 1024 | 0.796 | 0.866 | 0.830 | [model](https://download.openmmlab.com/mmocr/textdet/dbnet/dbnet_r50dcnv2_fpnc_sbn_1200e_icdar2015_20210325-91cef9af.pth) \| [log](https://download.openmmlab.com/mmocr/textdet/dbnet/dbnet_r50dcnv2_fpnc_sbn_1200e_icdar2015_20210325-91cef9af.pth.log.json) |
diff --git a/configs/textdet/psenet/README.md b/configs/textdet/psenet/README.md
index d7da8af2..f2460fab 100644
--- a/configs/textdet/psenet/README.md
+++ b/configs/textdet/psenet/README.md
@@ -17,13 +17,13 @@

 ### CTW1500

-|Method | Backbone|Extra Data | Training set | Test set | #epochs | Test size|Recall|Precision|Hmean|Download|
-|:------:| :------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|
-|[PSENet-4s](/configs/textdet/psenet/psenet_r50_fpnf_600e_ctw1500.py) |ResNet50 |-|CTW1500 Train|CTW1500 Test|600|1280|0.728|0.849|0.784|[model](https://download.openmmlab.com/mmocr/textdet/psenet/psenet_r50_fpnf_600e_ctw1500_20210401-216fed50.pth) | [config](https://download.openmmlab.com/mmocr/textdet/psenet/psenet_r50_fpnf_600e_ctw1500_20210401.py) | [log](https://download.openmmlab.com/mmocr/textdet/psenet/20210401_215421.log.json)|
+| Method | Backbone | Extra Data | Training set | Test set | #epochs | Test size | Recall | Precision | Hmean | Download |
+| :------------------------------------------------------------------: | :------: | :--------: | :-----------: | :----------: | :-----: | :-------: | :----: | :-------: | :---: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [PSENet-4s](/configs/textdet/psenet/psenet_r50_fpnf_600e_ctw1500.py) | ResNet50 | - | CTW1500 Train | CTW1500 Test | 600 | 1280 | 0.728 | 0.849 | 0.784 | [model](https://download.openmmlab.com/mmocr/textdet/psenet/psenet_r50_fpnf_600e_ctw1500_20210401-216fed50.pth) \| [log](https://download.openmmlab.com/mmocr/textdet/psenet/20210401_215421.log.json) |

 ### ICDAR2015

-|Method | Backbone| Extra Data | Training set | Test set | #epochs | Test size|Recall|Precision|Hmean|Download|
-|:------:| :------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|
-|[PSENet-4s](/configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py) |ResNet50 |-|IC15 Train|IC15 Test|600|2240|0.784|0.831|0.807|[model](https://download.openmmlab.com/mmocr/textdet/psenet/psenet_r50_fpnf_600e_icdar2015-c6131f0d.pth) | [config](https://download.openmmlab.com/mmocr/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py) | [log](https://download.openmmlab.com/mmocr/textdet/psenet/20210331_214145.log.json)|
-|[PSENet-4s](/configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py) |ResNet50 |pretrain on IC17 MLT [model](https://download.openmmlab.com/mmocr/textdet/psenet/psenet_r50_fpnf_600e_icdar2017_as_pretrain-0af6d62c.pth)|IC15 Train|IC15 Test|600|2240|0.834|0.861|0.847|[model](https://download.openmmlab.com/mmocr/textdet/psenet/psenet_r50_fpnf_600e_icdar2015_pretrain-ac477383.pth) | [config](https://download.openmmlab.com/mmocr/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py) |
+| Method | Backbone | Extra Data | Training set | Test set | #epochs | Test size | Recall | Precision | Hmean | Download |
+| :--------------------------------------------------------------------: | :------: | :---------------------------------------------------------------------------------------------------------------------------------------: | :----------: | :-------: | :-----: | :-------: | :----: | :-------: | :---: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [PSENet-4s](/configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py) | ResNet50 | - | IC15 Train | IC15 Test | 600 | 2240 | 0.784 | 0.831 | 0.807 | [model](https://download.openmmlab.com/mmocr/textdet/psenet/psenet_r50_fpnf_600e_icdar2015-c6131f0d.pth) \| [log](https://download.openmmlab.com/mmocr/textdet/psenet/20210331_214145.log.json) |
+| [PSENet-4s](/configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py) | ResNet50 | pretrain on IC17 MLT [model](https://download.openmmlab.com/mmocr/textdet/psenet/psenet_r50_fpnf_600e_icdar2017_as_pretrain-0af6d62c.pth) | IC15 Train | IC15 Test | 600 | 2240 | 0.834 | 0.861 | 0.847 | [model](https://download.openmmlab.com/mmocr/textdet/psenet/psenet_r50_fpnf_600e_icdar2015_pretrain-ac477383.pth) \| [log]() |
diff --git a/configs/textdet/textsnake/README.md b/configs/textdet/textsnake/README.md
index 50812e37..8e761f4c 100644
--- a/configs/textdet/textsnake/README.md
+++ b/configs/textdet/textsnake/README.md
@@ -18,6 +18,6 @@

 ### CTW1500

-| Method | Pretrained Model | Training set | Test set | #epochs | Test size | Recall | Precision | Hmean | Download |
-| :----------------------------------------------------------------------------: | :--------------: | :-----------: | :----------: | :-----: | :-------: | :----: | :-------: | :---: | :-------------------: |
-| [TextSnake](/configs/textdet/textsnake/textsnake_r50_fpn_unet_600e_ctw1500.py) | ImageNet | CTW1500 Train | CTW1500 Test | 1200 | 736 | 0.795 | 0.840 | 0.817 | [model](https://download.openmmlab.com/mmocr/textdet/textsnake/textsnake_r50_fpn_unet_1200e_ctw1500-27f65b64.pth) | [config](https://download.openmmlab.com/mmocr/textdet/textsnake/textsnake_r50_fpn_unet_1200e_ctw1500.py) |
+| Method | Pretrained Model | Training set | Test set | #epochs | Test size | Recall | Precision | Hmean | Download |
+| :----------------------------------------------------------------------------: | :--------------: | :-----------: | :----------: | :-----: | :-------: | :----: | :-------: | :---: | :--------------------------------------------------------------------------------------------------------------------------: |
+| [TextSnake](/configs/textdet/textsnake/textsnake_r50_fpn_unet_600e_ctw1500.py) | ImageNet | CTW1500 Train | CTW1500 Test | 1200 | 736 | 0.795 | 0.840 | 0.817 | [model](https://download.openmmlab.com/mmocr/textdet/textsnake/textsnake_r50_fpn_unet_1200e_ctw1500-27f65b64.pth) \| [log]() |
diff --git a/configs/textrecog/crnn/README.md b/configs/textrecog/crnn/README.md
index 6846217e..4e5de206 100644
--- a/configs/textrecog/crnn/README.md
+++ b/configs/textrecog/crnn/README.md
@@ -4,7 +4,7 @@

 [ALGORITHM]

-```latex
+```bibtex
 @article{shi2016end,
   title={An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition},
   author={Shi, Baoguang and Bai, Xiang and Yao, Cong},
@@ -31,7 +31,7 @@

 ## Results and models

-| methods | | Regular Text | | | | Irregular Text | | download |
-| :-----: | :----: | :----------: | :--: | :-: | :--: | :------------: | :--: | :---------------------------------------------------------: |
+| methods | | Regular Text | | | | Irregular Text | | download |
+| :-----: | :----: | :----------: | :--: | :-: | :--: | :------------: | :--: | :------------------: |
 | methods | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 |
-| CRNN | 80.5 | 81.5 | 86.5 | | - | - | - | [config](https://download.openmmlab.com/mmocr/textrecog/crnn/crnn_academic_dataset.py) [log]() [model](https) |
+| CRNN | 80.5 | 81.5 | 86.5 | | - | - | - | [model]() \| [log]() |
diff --git a/configs/textrecog/sar/README.md b/configs/textrecog/sar/README.md
index 517c6985..8854ae04 100644
--- a/configs/textrecog/sar/README.md
+++ b/configs/textrecog/sar/README.md
@@ -4,7 +4,7 @@

 [ALGORITHM]

-```
+```bibtex
 @inproceedings{li2019show,
   title={Show, attend and read: A simple and strong baseline for irregular text recognition},
   author={Li, Hui and Wang, Peng and Shen, Chunhua and Zhang, Guyu},
@@ -43,22 +43,24 @@
 | CT80 | 288 | irregular |

 ## Results and Models
-| Methods|Backbone|Decoder||Regular Text||||Irregular Text||download|
-| :-------------: | :-----: | :-----: | :-----: | :------: | :-----: | :----: | :-----: | :-----: | :-----: |:-----: |
-||||IIIT5K|SVT|IC13||IC15|SVTP|CT80|
-|[SAR](/configs/textrecog/sar/sar_r31_parallel_decoder_academic.py)|R31-1/8-1/4|ParallelSARDecoder|95.0|89.6|93.7||79.0|82.2|88.9|[model](https://download.openmmlab.com/mmocr/textrecog/sar/sar_r31_parallel_decoder_academic-dba3a4a3.pth) | [config](https://download.openmmlab.com/mmocr/textrecog/sar/sar_r31_parallel_decoder_academic.py) | [log](https://download.openmmlab.com/mmocr/textrecog/sar/20210327_154129.log.json) |
-|[SAR](configs/textrecog/sar/sar_r31_sequential_decoder_academic.py)|R31-1/8-1/4|SequentialSARDecoder|95.2|88.7|92.4||78.2|81.9|89.6|[model](https://download.openmmlab.com/mmocr/textrecog/sar/sar_r31_sequential_decoder_academic-d06c9a8e.pth) | [config](https://download.openmmlab.com/mmocr/textrecog/sar/sar_r31_sequential_decoder_academic.py) | [log](https://download.openmmlab.com/mmocr/textrecog/sar/20210330_105728.log.json)|
+
+| Methods | Backbone | Decoder | | Regular Text | | | | Irregular Text | | download |
+| :-----------------------------------------------------------------: | :---------: | :------------------: | :----: | :----------: | :--: | :-: | :--: | :------------: | :--: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| | | | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 |
+| [SAR](/configs/textrecog/sar/sar_r31_parallel_decoder_academic.py) | R31-1/8-1/4 | ParallelSARDecoder | 95.0 | 89.6 | 93.7 | | 79.0 | 82.2 | 88.9 | [model](https://download.openmmlab.com/mmocr/textrecog/sar/sar_r31_parallel_decoder_academic-dba3a4a3.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/sar/20210327_154129.log.json) |
+| [SAR](configs/textrecog/sar/sar_r31_sequential_decoder_academic.py) | R31-1/8-1/4 | SequentialSARDecoder | 95.2 | 88.7 | 92.4 | | 78.2 | 81.9 | 89.6 | [model](https://download.openmmlab.com/mmocr/textrecog/sar/sar_r31_sequential_decoder_academic-d06c9a8e.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/sar/20210330_105728.log.json) |

 **Notes:**
-- `R31-1/8-1/4` means the height of feature from backbone is 1/8 of input image, where 1/4 for width.
-- We did not use beam search during decoding.
-- We implemented two kinds of decoder. Namely, `ParallelSARDecoder` and `SequentialSARDecoder`.
-  - `ParallelSARDecoder`: Parallel decoding during training with `LSTM` layer. It would be faster.
-  - `SequentialSARDecoder`: Sequential Decoding during training with `LSTMCell`. It would be easier to understand.
-- For train dataset.
-  - We did not construct distinct data groups (20 groups in [[1]](#1)) to train the model group-by-group since it would render model training too complicated.
-  - Instead, we randomly selected `2.4m` patches from `Syn90k`, `2.4m` from `SynthText` and `1.2m` from `SynthAdd`, and grouped all data together. See [config](https://download.openmmlab.com/mmocr/textrecog/sar/sar_r31_academic.py) for details.
-- We used 48 GPUs with `total_batch_size = 64 * 48` in the experiment above to speedup training, while keeping the `initial lr = 1e-3` unchanged.
+
+- `R31-1/8-1/4` means the height of feature from backbone is 1/8 of input image, where 1/4 for width.
+- We did not use beam search during decoding.
+- We implemented two kinds of decoder. Namely, `ParallelSARDecoder` and `SequentialSARDecoder`.
+  - `ParallelSARDecoder`: Parallel decoding during training with `LSTM` layer. It would be faster.
+  - `SequentialSARDecoder`: Sequential Decoding during training with `LSTMCell`. It would be easier to understand.
+- For train dataset.
+  - We did not construct distinct data groups (20 groups in [[1]](#1)) to train the model group-by-group since it would render model training too complicated.
+  - Instead, we randomly selected `2.4m` patches from `Syn90k`, `2.4m` from `SynthText` and `1.2m` from `SynthAdd`, and grouped all data together. See [config](https://download.openmmlab.com/mmocr/textrecog/sar/sar_r31_academic.py) for details.
+- We used 48 GPUs with `total_batch_size = 64 * 48` in the experiment above to speedup training, while keeping the `initial lr = 1e-3` unchanged.

 ## References

diff --git a/configs/textrecog/seg/README.md b/configs/textrecog/seg/README.md
index 49ca2014..62c8a2e3 100644
--- a/configs/textrecog/seg/README.md
+++ b/configs/textrecog/seg/README.md
@@ -24,11 +24,11 @@ A Baseline Method for Segmentation based Text Recognition.
 | CT80 | 288 | irregular |

 ## Results and Models
-|Backbone|Neck|Head|||Regular Text|||Irregular Text|base_lr|batch_size/gpu|gpus|download
-| :-------------: | :-----: | :-----: | :------: | :-----: | :----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: | :-----: |
-|||||IIIT5K|SVT|IC13||CT80|
-|R31-1/16|FPNOCR|1x||90.9|81.8|90.7||80.9|1e-4|16|4|[model](https://download.openmmlab.com/mmocr/textrecog/seg/seg_r31_1by16_fpnocr_academic-0c50e163.pth) | [config](https://download.openmmlab.com/mmocr/textrecog/seg/seg_r31_1by16_fpnocr_academic.py) | [log](https://download.openmmlab.com/mmocr/textrecog/seg/20210325_112835.log.json) |
+| Backbone | Neck | Head | | | Regular Text | | | Irregular Text | base_lr | batch_size/gpu | gpus | download |
+| :------: | :----: | :--: | :-: | :----: | :----------: | :--: | :-: | :------------: | :-----: | :------------: | :--: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| | | | | IIIT5K | SVT | IC13 | | CT80 |
+| R31-1/16 | FPNOCR | 1x | | 90.9 | 81.8 | 90.7 | | 80.9 | 1e-4 | 16 | 4 | [model](https://download.openmmlab.com/mmocr/textrecog/seg/seg_r31_1by16_fpnocr_academic-0c50e163.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/seg/20210325_112835.log.json) |

 **Notes:**

diff --git a/configs/textrecog/transformer/README.md b/configs/textrecog/transformer/README.md
index 46a132ad..9f94328f 100644
--- a/configs/textrecog/transformer/README.md
+++ b/configs/textrecog/transformer/README.md
@@ -1,5 +1,7 @@
 ## Introduction

+[ALGORITHM]
+
 ### Train Dataset

 | trainset | instance_num | repeat_num | note |
@@ -24,7 +26,7 @@

 ## Results and models

-| methods | | Regular Text | | | | Irregular Text | | download |
-| :---------: | :----: | :----------: | :--: | :-: | :--: | :------------: | :--: | :------: |
+| methods | | Regular Text | | | | Irregular Text | | download |
+| :---------: | :----: | :----------: | :--: | :-: | :--: | :------------: | :--: | :------------------: |
 | | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 |
-| Transformer | 93.3 | 85.8 | 91.3 | | 73.2 | 76.6 | 87.8 | |
+| Transformer | 93.3 | 85.8 | 91.3 | | 73.2 | 76.6 | 87.8 | [model]() \| [log]() |
diff --git a/docs/datasets.md b/docs/datasets.md
index 73134172..b17407db 100644
--- a/docs/datasets.md
+++ b/docs/datasets.md
@@ -114,7 +114,7 @@ This page lists the datasets which are commonly used in text detection, text rec
 | svt | | [homepage](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset) | - | [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/svt/test_label.txt) | |
 | svtp | | - | - | [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/svtp/test_label.txt) | |
 | Synth90k | | [homepage](https://www.robots.ox.ac.uk/~vgg/data/text/) | [shuffle_labels.txt](https://download.openmmlab.com/mmocr/data/mixture/Synth90k/shuffle_labels.txt) | - | |
-| SynthText | | [homepage](https://www.robots.ox.ac.uk/~vgg/data/scenetext/) | [shuffle_labels.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthText/shuffle_labels.txt) | [instances_train.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthText/instances_train.txt) | - | |
+| SynthText | | [homepage](https://www.robots.ox.ac.uk/~vgg/data/scenetext/) | [shuffle_labels.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthText/shuffle_labels.txt) \| [instances_train.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthText/instances_train.txt) | - | |
 | SynthAdd | | [SynthText_Add.zip](https://download.openmmlab.com/mmocr/data/mixture/SynthAdd/SynthText_Add.zip) | [label.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthAdd/label.txt)|- | |

 - For `icdar_2013`: