# Text Detection
## Overview
The structure of the text detection dataset directory is organized as follows.
```text
├── ctw1500
│   ├── annotations
│   ├── imgs
│   ├── instances_test.json
│   └── instances_training.json
├── icdar2015
│   ├── imgs
│   ├── instances_test.json
│   └── instances_training.json
├── icdar2017
│   ├── imgs
│   ├── instances_training.json
│   └── instances_val.json
├── synthtext
│   ├── imgs
│   └── instances_training.lmdb
│       ├── data.mdb
│       └── lock.mdb
├── textocr
│   ├── train
│   ├── instances_training.json
│   └── instances_val.json
├── totaltext
│   ├── imgs
│   ├── instances_test.json
│   └── instances_training.json
```

|  Dataset  | Images | Annotation Files (training) | Annotation Files (validation) | Annotation Files (testing) |
| :-------: | :----: | :-------------------------: | :---------------------------: | :------------------------: |
| CTW1500 | [homepage](https://github.com/Yuliang-Liu/Curve-Text-Detector) | - | - | - |
| ICDAR2015 | [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads) | [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_training.json) | - | [instances_test.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_test.json) |
| ICDAR2017 | [homepage](https://rrc.cvc.uab.es/?ch=8&com=downloads) | [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2017/instances_training.json) | [instances_val.json](https://download.openmmlab.com/mmocr/data/icdar2017/instances_val.json) | - |
| SynthText | [homepage](https://www.robots.ox.ac.uk/~vgg/data/scenetext/) | instances_training.lmdb ([data.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/data.mdb), [lock.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/lock.mdb)) | - | - |
| TextOCR | [homepage](https://textvqa.org/textocr/dataset) | - | - | - |
| Totaltext | [homepage](https://github.com/cs-chan/Total-Text-Dataset) | - | - | - |
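
With this layout in place, an MMOCR config can point at a dataset directly. Below is a minimal, hedged sketch for ICDAR2015 using the usual `IcdarDataset` config fields; `train_pipeline` and `test_pipeline` are assumed to be defined earlier in the same config file.

```python
# Sketch of dataset entries in an MMOCR config (not a complete config).
# `train_pipeline` / `test_pipeline` are assumed to be defined above.
dataset_type = 'IcdarDataset'
data_root = 'data/icdar2015'

train = dict(
    type=dataset_type,
    ann_file=f'{data_root}/instances_training.json',
    img_prefix=f'{data_root}/imgs',
    pipeline=train_pipeline)

test = dict(
    type=dataset_type,
    ann_file=f'{data_root}/instances_test.json',
    img_prefix=f'{data_root}/imgs',
    pipeline=test_pipeline)
```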
## Important Note
:::{note}
**For users who want to train models on the CTW1500, ICDAR 2015/2017, and Totaltext datasets,** some images carry orientation information in their EXIF data. The default OpenCV backend used in MMCV reads this information and rotates the images accordingly. However, the gold annotations are made on the raw pixels, and this inconsistency produces false examples in the training set. Therefore, users should use `dict(type='LoadImageFromFile', color_type='color_ignore_orientation')` in their pipelines to change MMCV's default loading behaviour (see [DBNet's config](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py) for an example).
:::
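
For reference, here is a minimal sketch of how the start of a training pipeline with this override could look; the remaining transforms are placeholders and depend on the model:

```python
# Minimal sketch: ignore EXIF orientation at load time so images stay
# aligned with annotations that were made on the raw pixels.
train_pipeline = [
    dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
    # ... annotation loading and model-specific transforms follow here;
    # see DBNet's config linked above for a complete pipeline
]
```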
## Preparation Steps
### ICDAR 2015
- Step0: Read [Important Note](#important-note)
- Step1: Download `ch4_training_images.zip`, `ch4_test_images.zip`, `ch4_training_localization_transcription_gt.zip`, `Challenge4_Test_Task1_GT.zip` from [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads)
- Step2:
```bash
mkdir icdar2015 && cd icdar2015
mkdir imgs && mkdir annotations
# Extract the four zips from Step1 into this directory first.
# For images,
mv ch4_training_images imgs/training
mv ch4_test_images imgs/test
# For annotations,
mv ch4_training_localization_transcription_gt annotations/training
mv Challenge4_Test_Task1_GT annotations/test
```
- Step3: Download [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_training.json) and [instances_test.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_test.json) and move them to `icdar2015`
- Or, generate `instances_training.json` and `instances_test.json` with the following command:

```bash
python tools/data/textdet/icdar_converter.py /path/to/icdar2015 -o /path/to/icdar2015 -d icdar2015 --split-list training test
```
### ICDAR 2017

- Follow steps similar to those for [ICDAR 2015](#icdar-2015).
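
For instance, assuming the same `imgs`/`annotations` layout as for ICDAR 2015, the annotation files could be generated as follows (note that ICDAR2017 ships a validation split instead of a test split):

```bash
# Hedged sketch mirroring the ICDAR 2015 converter command above.
python tools/data/textdet/icdar_converter.py /path/to/icdar2017 -o /path/to/icdar2017 -d icdar2017 --split-list training val
```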
### CTW1500
- Step0: Read [Important Note](#important-note)
- Step1: Download `train_images.zip`, `test_images.zip`, `train_labels.zip`, `test_labels.zip` from [github](https://github.com/Yuliang-Liu/Curve-Text-Detector)
```bash
mkdir ctw1500 && cd ctw1500
mkdir imgs && mkdir annotations

# For annotations
cd annotations
wget -O train_labels.zip https://universityofadelaide.box.com/shared/static/jikuazluzyj4lq6umzei7m2ppmt3afyw.zip
wget -O test_labels.zip https://cloudstor.aarnet.edu.au/plus/s/uoeFl0pCN9BOCN5/download
unzip train_labels.zip && mv ctw1500_train_labels training
unzip test_labels.zip -d test
cd ..
# For images
cd imgs
wget -O train_images.zip https://universityofadelaide.box.com/shared/static/py5uwlfyyytbb2pxzq9czvu6fuqbjdh8.zip
wget -O test_images.zip https://universityofadelaide.box.com/shared/static/t4w48ofnqkdw7jyc4t11nsukoeqk9c3d.zip
unzip train_images.zip && mv train_images training
unzip test_images.zip && mv test_images test
```
- Step2: Generate `instances_training.json` and `instances_test.json` with the following command:

```bash
python tools/data/textdet/ctw1500_converter.py /path/to/ctw1500 -o /path/to/ctw1500 --split-list training test
```
### SynthText
- Download [data.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/data.mdb) and [lock.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/lock.mdb) to `synthtext/instances_training.lmdb/`.
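
For example, the two files can be fetched with `wget`:

```bash
mkdir -p synthtext/instances_training.lmdb
cd synthtext/instances_training.lmdb
wget https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/data.mdb
wget https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/lock.mdb
```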
### TextOCR
- Step1: Download [train_val_images.zip](https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip), [TextOCR_0.1_train.json](https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_train.json) and [TextOCR_0.1_val.json](https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_val.json) to `textocr/`.
```bash
mkdir textocr && cd textocr

# Download TextOCR dataset
wget https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip
wget https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_train.json
wget https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_val.json

# For images
unzip -q train_val_images.zip
mv train_images train
```
- Step2: Generate `instances_training.json` and `instances_val.json` with the following command:
```bash
python tools/data/textdet/textocr_converter.py /path/to/textocr
```
### Totaltext
- Step0: Read [Important Note](#important-note)
- Step1: Download `totaltext.zip` from [github dataset](https://github.com/cs-chan/Total-Text-Dataset/tree/master/Dataset) and `groundtruth_text.zip` from [github Groundtruth](https://github.com/cs-chan/Total-Text-Dataset/tree/master/Groundtruth/Text) (`totaltext_converter.py` supports ground truth in both `.mat` and `.txt` formats).
```bash
mkdir totaltext && cd totaltext
mkdir imgs && mkdir annotations

# For images
# in ./totaltext
unzip totaltext.zip
mv Images/Train imgs/training
mv Images/Test imgs/test

# For annotations
unzip groundtruth_text.zip
cd Groundtruth
mv Polygon/Train ../annotations/training
mv Polygon/Test ../annotations/test
```
- Step2: Generate `instances_training.json` and `instances_test.json` with the following command:
```bash
python tools/data/textdet/totaltext_converter.py /path/to/totaltext -o /path/to/totaltext --split-list training test
```