# SegOCR Simple Baseline.

## Abstract

<!-- [ABSTRACT] -->

A simple segmentation-based baseline for text recognition tasks.

## Citation

<!-- [ALGORITHM] -->

```bibtex
@unpublished{key,
  title={SegOCR Simple Baseline.},
  author={},
  note={Unpublished Manuscript},
  year={2021}
}
```

## Dataset

### Train Dataset

| trainset  | instance_num | repeat_num | source |
| :-------: | :----------: | :--------: | :----: |
| SynthText |   7266686    |     1      | synth  |

### Test Dataset

| testset | instance_num |   type    |
| :-----: | :----------: | :-------: |
| IIIT5K  |     3000     |  regular  |
|   SVT   |     647      |  regular  |
|  IC13   |     1015     |  regular  |
|  CT80   |     288      | irregular |

## Results and Models

| Backbone | Neck | Head | | | Regular Text | | | Irregular Text | download |
| :------: | :----: | :---: | :---: | :----: | :----------: | :---: | :---: | :------------: | :------: |
| | | | | IIIT5K | SVT | IC13 | | CT80 | |
| R31-1/16 | FPNOCR | 1x | | 90.9 | 81.8 | 90.7 | | 80.9 | [model](https://download.openmmlab.com/mmocr/textrecog/seg/seg_r31_1by16_fpnocr_academic-72235b11.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/seg/20210325_112835.log.json) |

**Notes:**

- `R31-1/16` means the feature map from the backbone is 1/16 the size of the input image in both height and width.
- `1x` means the feature map from the head is the same size as the input image in both height and width.
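
The size notation in the notes can be illustrated with a little arithmetic. The sketch below is only an illustration: `feature_size` is a hypothetical helper, not part of the released code, and the 64x256 input size is an assumed example value.

```python
def feature_size(height, width, downsample):
    """Return (h, w) of a feature map downsampled by an integer factor.

    Hypothetical helper for illustration only; not part of the model code.
    """
    return height // downsample, width // downsample


# Assumed example input size (height, width), chosen only for illustration.
input_h, input_w = 64, 256

# `R31-1/16`: backbone features are 1/16 of the input in height and width.
backbone_hw = feature_size(input_h, input_w, 16)

# `1x`: head features have the same height and width as the input.
head_hw = feature_size(input_h, input_w, 1)

print(backbone_hw)  # (4, 16)
print(head_hw)      # (64, 256)
```

Under these assumptions, the head output has one location per input pixel, while the backbone output is 16 times smaller along each side.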