# Getting Started

In this guide we will show you some useful commands and familiarize you with MMOCR. We also provide [a notebook](https://github.com/open-mmlab/mmocr/blob/main/demo/MMOCR_Tutorial.ipynb) that can help you get the most out of MMOCR.

## Installation

Check out our [installation guide](install.md) for full steps.

## Dataset Preparation

MMOCR supports numerous datasets, grouped by the task they are used for. Their preparation steps are described in these sections: [Detection Datasets](datasets/det.md), [Recognition Datasets](datasets/recog.md), [KIE Datasets](datasets/kie.md) and [NER Datasets](datasets/ner.md).

## Inference with Pretrained Models

You can perform end-to-end OCR on our demo image with a single command:

```shell
python mmocr/utils/ocr.py demo/demo_text_ocr.jpg --print-result --imshow
```

The result will be printed out, and a new window will pop up with its visualization. More demos and full instructions can be found in [Demo](demo.md).

## Training

### Training with Toy Dataset

We provide a toy dataset under `tests/data` so that you can get a feel for training before preparing an academic dataset.

For example, to train a text recognition model with the `seg` method on the toy dataset:
```shell
python tools/train.py configs/textrecog/seg/seg_r31_1by16_fpnocr_toy_dataset.py --work-dir seg
```

To train a text recognition model with the `sar` method on the toy dataset:
```shell
python tools/train.py configs/textrecog/sar/sar_r31_parallel_decoder_toy_dataset.py --work-dir sar
```

### Training with Academic Dataset

Once you have prepared the required academic dataset following our instructions, the last thing to check is whether the model's config points MMOCR to the correct dataset path. Suppose we want to train DBNet on ICDAR 2015, and part of `configs/_base_/det_datasets/icdar2015.py` looks like the following:
```python
dataset_type = 'IcdarDataset'
data_root = 'data/icdar2015'
train = dict(
    type=dataset_type,
    ann_file=f'{data_root}/instances_training.json',
    img_prefix=f'{data_root}/imgs',
    pipeline=None)
test = dict(
    type=dataset_type,
    ann_file=f'{data_root}/instances_test.json',
    img_prefix=f'{data_root}/imgs',
    pipeline=None)
train_list = [train]
test_list = [test]
```
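Check that `data_root` (here `data/icdar2015`) matches the location of your dataset, since the annotation files and the image directory are both derived from it. Then you can start training with the command: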
```shell
python tools/train.py configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py --work-dir dbnet
```
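The path check described above can also be scripted. The following is a minimal sketch, not part of MMOCR: the helper name `missing_dataset_paths` is hypothetical, and the three paths it looks for are simply the ones that appear in the config excerpt above.

```python
from pathlib import Path


def missing_dataset_paths(data_root):
    """Return the expected dataset paths that do not exist under data_root.

    Hypothetical helper: the relative paths below mirror the config excerpt
    (two annotation files and one image directory).
    """
    root = Path(data_root)
    expected = [
        root / 'instances_training.json',
        root / 'instances_test.json',
        root / 'imgs',
    ]
    return [str(path) for path in expected if not path.exists()]


if __name__ == '__main__':
    missing = missing_dataset_paths('data/icdar2015')
    if missing:
        print('Missing paths:')
        for path in missing:
            print(f'  {path}')
    else:
        print('All dataset paths found.')
```

An empty list means every path referenced by the config excerpt exists. If your dataset lives somewhere else, pass that directory instead and update `data_root` in the config to match.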

You can find full training instructions, explanations and useful training configs in [Training](training.md).

## Testing

Suppose you have finished training DBNet and the latest model has been saved in `dbnet/latest.pth`. You can evaluate its performance on the test set using the `hmean-iou` metric with the following command:
```shell
python tools/test.py configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py dbnet/latest.pth --eval hmean-iou
```
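To give a rough idea of what an hmean score expresses, here is a minimal sketch. It assumes, as the name suggests, that hmean is the harmonic mean of precision and recall, and that a predicted text region counts as matched when its IoU with a ground-truth region is high enough. The function names and the example counts are illustrative only and are not MMOCR's implementation.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def detection_scores(num_matched, num_pred, num_gt):
    """Compute (precision, recall, hmean) from match counts.

    num_matched: predictions matched to a ground-truth region (assumed to be
                 decided by an IoU threshold)
    num_pred:    total number of predicted regions
    num_gt:      total number of ground-truth regions
    """
    precision = num_matched / num_pred if num_pred else 0.0
    recall = num_matched / num_gt if num_gt else 0.0
    return precision, recall, hmean(precision, recall)


if __name__ == '__main__':
    # Illustrative counts: 8 matches out of 10 predictions and 16 ground truths.
    p, r, h = detection_scores(num_matched=8, num_pred=10, num_gt=16)
    print(f'precision={p:.3f} recall={r:.3f} hmean={h:.3f}')
```

With these illustrative counts, precision is 0.8 and recall is 0.5, so the harmonic mean is about 0.615. It sits closer to the lower of the two values, which is why the score penalizes a detector that trades recall for precision or the reverse.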

You can also evaluate a pretrained model hosted online by passing its URL in place of the local checkpoint path:
```shell
python tools/test.py configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py https://download.openmmlab.com/mmocr/textdet/dbnet/dbnet_r18_fpnc_sbn_1200e_icdar2015_20210329-ba3ab597.pth --eval hmean-iou
```

More instructions on testing are available in [Testing](testing.md).