mmocr/docs/datasets/det.md


# Text Detection

## Overview

The structure of the text detection dataset directory is organized as follows.

```text
├── ctw1500
│   ├── annotations
│   ├── imgs
│   ├── instances_test.json
│   └── instances_training.json
├── icdar2015
│   ├── imgs
│   ├── instances_test.json
│   └── instances_training.json
├── icdar2017
│   ├── imgs
│   ├── instances_training.json
│   └── instances_val.json
├── synthtext
│   ├── imgs
│   └── instances_training.lmdb
│       ├── data.mdb
│       └── lock.mdb
├── textocr
│   ├── train
│   ├── instances_training.json
│   └── instances_val.json
├── totaltext
│   ├── imgs
│   ├── instances_test.json
│   └── instances_training.json
```

|Dataset|Images|                                                                                      |  Annotation Files                                                                                                      |                         |                                                                                                |
| :-------: | :------------------------------------------------------------: | :----------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------: | :-------------------------------------: | :--------------------------------------------------------------------------------------------: |
|      |                                                                                      |                                                training                                                |               validation                |                                            testing                                             |       |
|  CTW1500  | [homepage](https://github.com/Yuliang-Liu/Curve-Text-Detector) |                    -                    |                    -                    |                    -                    |
| ICDAR2015 | [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads)     | [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_training.json) |                    -                    | [instances_test.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_test.json) |
| ICDAR2017 | [homepage](https://rrc.cvc.uab.es/?ch=8&com=downloads)     | [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2017/instances_training.json) | [instances_val.json](https://download.openmmlab.com/mmocr/data/icdar2017/instances_val.json) | - |       |       |
| Synthtext | [homepage](https://www.robots.ox.ac.uk/~vgg/data/scenetext/)  | instances_training.lmdb ([data.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/data.mdb), [lock.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/lock.mdb)) |                    -                    | - |
| TextOCR | [homepage](https://textvqa.org/textocr/dataset)  | - |                    -                    | -
| Totaltext | [homepage](https://github.com/cs-chan/Total-Text-Dataset)  | - |                    -                    | -

## Important Note

:::{note}
**For users who want to train models on CTW1500, ICDAR 2015/2017, and Totaltext dataset,** there might be some images containing orientation info in EXIF data. The default OpenCV
backend used in MMCV would read them and apply the rotation on the images.  However, their gold annotations are made on the raw pixels, and such
inconsistency results in false examples in the training set. Therefore, users should use `dict(type='LoadImageFromFile', color_type='color_ignore_orientation')` in pipelines to change MMCV's default loading behaviour. (see [DBNet's config](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py) for example)
:::

## Preparation Steps
### ICDAR 2015
- Step0: Read [Important Note](#important-note)
- Step1: Download `ch4_training_images.zip`, `ch4_test_images.zip`, `ch4_training_localization_transcription_gt.zip`, `Challenge4_Test_Task1_GT.zip` from [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads)
- Step2:
```bash
mkdir icdar2015 && cd icdar2015
mkdir imgs && mkdir annotations
# For images,
mv ch4_training_images imgs/training
mv ch4_test_images imgs/test
# For annotations,
mv ch4_training_localization_transcription_gt annotations/training
mv Challenge4_Test_Task1_GT annotations/test
```
- Step3: Download [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_training.json) and [instances_test.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_test.json) and move them to `icdar2015`
- Or, generate `instances_training.json` and `instances_test.json` with following command:
```bash
python tools/data/textdet/icdar_converter.py /path/to/icdar2015 -o /path/to/icdar2015 -d icdar2015 --split-list training test
```

### ICDAR 2017
- Follow similar steps as [ICDAR 2015](#icdar-2015).

### CTW1500
- Step0: Read [Important Note](#important-note)
- Step1: Download `train_images.zip`, `test_images.zip`, `train_labels.zip`, `test_labels.zip` from [github](https://github.com/Yuliang-Liu/Curve-Text-Detector)
```bash
mkdir ctw1500 && cd ctw1500
mkdir imgs && mkdir annotations

# For annotations
cd annotations
wget -O train_labels.zip https://universityofadelaide.box.com/shared/static/jikuazluzyj4lq6umzei7m2ppmt3afyw.zip
wget -O test_labels.zip https://cloudstor.aarnet.edu.au/plus/s/uoeFl0pCN9BOCN5/download
unzip train_labels.zip && mv ctw1500_train_labels training
unzip test_labels.zip -d test
cd ..
# For images
cd imgs
wget -O train_images.zip https://universityofadelaide.box.com/shared/static/py5uwlfyyytbb2pxzq9czvu6fuqbjdh8.zip
wget -O test_images.zip https://universityofadelaide.box.com/shared/static/t4w48ofnqkdw7jyc4t11nsukoeqk9c3d.zip
unzip train_images.zip && mv train_images training
unzip test_images.zip && mv test_images test
```
- Step2: Generate `instances_training.json` and `instances_test.json` with following command:

```bash
python tools/data/textdet/ctw1500_converter.py /path/to/ctw1500 -o /path/to/ctw1500 --split-list training test
```

### SynthText

- Download [data.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/data.mdb) and [lock.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/lock.mdb) to `synthtext/instances_training.lmdb/`.

### TextOCR
- Step1: Download [train_val_images.zip](https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip), [TextOCR_0.1_train.json](https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_train.json) and [TextOCR_0.1_val.json](https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_val.json) to `textocr/`.
```bash
mkdir textocr && cd textocr

# Download TextOCR dataset
wget https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip
wget https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_train.json
wget https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_val.json

# For images
unzip -q train_val_images.zip
mv train_images train
```
- Step2: Generate `instances_training.json` and `instances_val.json` with the following command:
```bash
python tools/data/textdet/textocr_converter.py /path/to/textocr
```
### Totaltext
- Step0: Read [Important Note](#important-note)
- Step1: Download `totaltext.zip` from [github dataset](https://github.com/cs-chan/Total-Text-Dataset/tree/master/Dataset) and `groundtruth_text.zip` from [github Groundtruth](https://github.com/cs-chan/Total-Text-Dataset/tree/master/Groundtruth/Text) (Our totaltext_converter.py supports groundtruth with both .mat and .txt format).
```bash
mkdir totaltext && cd totaltext
mkdir imgs && mkdir annotations

# For images
# in ./totaltext
unzip totaltext.zip
mv Images/Train imgs/training
mv Images/Test imgs/test

# For annotations
unzip groundtruth_text.zip
cd Groundtruth
mv Polygon/Train ../annotations/training
mv Polygon/Test ../annotations/test

```
- Step2: Generate `instances_training.json` and `instances_test.json` with the following command:
```bash
python tools/data/textdet/totaltext_converter.py /path/to/totaltext -o /path/to/totaltext --split-list training test
```
[Docs] Refactor docs (#409) 2021-08-25 16:41:07 +08:00
			`# Text Detection`

			`## Overview`

			`The structure of the text detection dataset directory is organized as follows.`

			```text
			`├── ctw1500`
			`│ ├── annotations`
			`│ ├── imgs`
			`│ ├── instances_test.json`
			`│ └── instances_training.json`
			`├── icdar2015`
			`│ ├── imgs`
			`│ ├── instances_test.json`
			`│ └── instances_training.json`
			`├── icdar2017`
			`│ ├── imgs`
			`│ ├── instances_training.json`
			`│ └── instances_val.json`
			`├── synthtext`
			`│ ├── imgs`
			`│ └── instances_training.lmdb`
			`│ ├── data.mdb`
			`│ └── lock.mdb`
			`├── textocr`
			`│ ├── train`
			`│ ├── instances_training.json`
			`│ └── instances_val.json`
			`├── totaltext`
			`│ ├── imgs`
			`│ ├── instances_test.json`
			`│ └── instances_training.json`
			```

			`\|Dataset\|Images\| \| Annotation Files \| \| \|`
			`\| :-------: \| :------------------------------------------------------------: \| :----------------------------------------------------------------------------------: \| :----------------------------------------------------------------------------------------------------: \| :-------------------------------------: \| :--------------------------------------------------------------------------------------------: \|`
			`\| \| \| training \| validation \| testing \| \|`
			`\| CTW1500 \| [homepage](https://github.com/Yuliang-Liu/Curve-Text-Detector) \| - \| - \| - \|`
			`\| ICDAR2015 \| [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads) \| [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_training.json) \| - \| [instances_test.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_test.json) \|`
			`\| ICDAR2017 \| [homepage](https://rrc.cvc.uab.es/?ch=8&com=downloads) \| [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2017/instances_training.json) \| [instances_val.json](https://download.openmmlab.com/mmocr/data/icdar2017/instances_val.json) \| - \| \| \|`
			`\| Synthtext \| [homepage](https://www.robots.ox.ac.uk/~vgg/data/scenetext/) \| instances_training.lmdb ([data.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/data.mdb), [lock.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/lock.mdb)) \| - \| - \|`
			`\| TextOCR \| [homepage](https://textvqa.org/textocr/dataset) \| - \| - \| -`
			`\| Totaltext \| [homepage](https://github.com/cs-chan/Total-Text-Dataset) \| - \| - \| -`

			`## Important Note`

[Docs] Improve docs style (#474) * new theme * add theme * update zh_cn * improve docs style * use customized * fix * update req * docs * Update docs * update conf * update * update layout * disable logo url * free version limit * update conf * Fix api ref * fix version 2021-09-08 11:40:51 +08:00			`:::{note}`
			`For users who want to train models on CTW1500, ICDAR 2015/2017, and Totaltext dataset, there might be some images containing orientation info in EXIF data. The default OpenCV`
[Docs] Refactor docs (#409) 2021-08-25 16:41:07 +08:00			`backend used in MMCV would read them and apply the rotation on the images. However, their gold annotations are made on the raw pixels, and such`
			inconsistency results in false examples in the training set. Therefore, users should use `dict(type='LoadImageFromFile', color_type='color_ignore_orientation')` in pipelines to change MMCV's default loading behaviour. (see [DBNet's config](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py) for example)
[Docs] Improve docs style (#474) * new theme * add theme * update zh_cn * improve docs style * use customized * fix * update req * docs * Update docs * update conf * update * update layout * disable logo url * free version limit * update conf * Fix api ref * fix version 2021-09-08 11:40:51 +08:00			`:::`
[Docs] Refactor docs (#409) 2021-08-25 16:41:07 +08:00
			`## Preparation Steps`
			`### ICDAR 2015`
			`- Step0: Read [Important Note](#important-note)`
			- Step1: Download `ch4_training_images.zip`, `ch4_test_images.zip`, `ch4_training_localization_transcription_gt.zip`, `Challenge4_Test_Task1_GT.zip` from [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads)
			`- Step2:`
			```bash
			`mkdir icdar2015 && cd icdar2015`
			`mkdir imgs && mkdir annotations`
			`# For images,`
			`mv ch4_training_images imgs/training`
			`mv ch4_test_images imgs/test`
			`# For annotations,`
			`mv ch4_training_localization_transcription_gt annotations/training`
			`mv Challenge4_Test_Task1_GT annotations/test`
			```
			- Step3: Download [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_training.json) and [instances_test.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_test.json) and move them to `icdar2015`
			- Or, generate `instances_training.json` and `instances_test.json` with following command:
			```bash
			`python tools/data/textdet/icdar_converter.py /path/to/icdar2015 -o /path/to/icdar2015 -d icdar2015 --split-list training test`
			```

			`### ICDAR 2017`
			`- Follow similar steps as [ICDAR 2015](#icdar-2015).`

			`### CTW1500`
			`- Step0: Read [Important Note](#important-note)`
			- Step1: Download `train_images.zip`, `test_images.zip`, `train_labels.zip`, `test_labels.zip` from [github](https://github.com/Yuliang-Liu/Curve-Text-Detector)
			```bash
			`mkdir ctw1500 && cd ctw1500`
			`mkdir imgs && mkdir annotations`

			`# For annotations`
			`cd annotations`
			`wget -O train_labels.zip https://universityofadelaide.box.com/shared/static/jikuazluzyj4lq6umzei7m2ppmt3afyw.zip`
			`wget -O test_labels.zip https://cloudstor.aarnet.edu.au/plus/s/uoeFl0pCN9BOCN5/download`
			`unzip train_labels.zip && mv ctw1500_train_labels training`
			`unzip test_labels.zip -d test`
			`cd ..`
			`# For images`
			`cd imgs`
			`wget -O train_images.zip https://universityofadelaide.box.com/shared/static/py5uwlfyyytbb2pxzq9czvu6fuqbjdh8.zip`
			`wget -O test_images.zip https://universityofadelaide.box.com/shared/static/t4w48ofnqkdw7jyc4t11nsukoeqk9c3d.zip`
			`unzip train_images.zip && mv train_images training`
			`unzip test_images.zip && mv test_images test`
			```
			- Step2: Generate `instances_training.json` and `instances_test.json` with following command:

			```bash
			`python tools/data/textdet/ctw1500_converter.py /path/to/ctw1500 -o /path/to/ctw1500 --split-list training test`
			```

			`### SynthText`

			- Download [data.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/data.mdb) and [lock.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/lock.mdb) to `synthtext/instances_training.lmdb/`.

			`### TextOCR`
			- Step1: Download [train_val_images.zip](https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip), [TextOCR_0.1_train.json](https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_train.json) and [TextOCR_0.1_val.json](https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_val.json) to `textocr/`.
			```bash
			`mkdir textocr && cd textocr`

			`# Download TextOCR dataset`
			`wget https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip`
			`wget https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_train.json`
			`wget https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_val.json`

			`# For images`
			`unzip -q train_val_images.zip`
			`mv train_images train`
			```
			- Step2: Generate `instances_training.json` and `instances_val.json` with the following command:
			```bash
			`python tools/data/textdet/textocr_converter.py /path/to/textocr`
			```
			`### Totaltext`
			`- Step0: Read [Important Note](#important-note)`
			- Step1: Download `totaltext.zip` from [github dataset](https://github.com/cs-chan/Total-Text-Dataset/tree/master/Dataset) and `groundtruth_text.zip` from [github Groundtruth](https://github.com/cs-chan/Total-Text-Dataset/tree/master/Groundtruth/Text) (Our totaltext_converter.py supports groundtruth with both .mat and .txt format).
			```bash
			`mkdir totaltext && cd totaltext`
			`mkdir imgs && mkdir annotations`

			`# For images`
			`# in ./totaltext`
			`unzip totaltext.zip`
			`mv Images/Train imgs/training`
			`mv Images/Test imgs/test`

			`# For annotations`
			`unzip groundtruth_text.zip`
			`cd Groundtruth`
			`mv Polygon/Train ../annotations/training`
			`mv Polygon/Test ../annotations/test`

			```
			- Step2: Generate `instances_training.json` and `instances_test.json` with the following command:
			```bash
			`python tools/data/textdet/totaltext_converter.py /path/to/totaltext -o /path/to/totaltext --split-list training test`
			```