# Useful Tools

We provide some useful tools under the `mmocr/tools` directory.

## Publish a Model

Before you upload a model to AWS, you may want to
(1) convert the model weights to CPU tensors, (2) delete the optimizer states, and
(3) compute the hash of the checkpoint file and append the hash id to the filename. `tools/publish_model.py` performs all three steps.

  ```shell
  python tools/publish_model.py ${INPUT_FILENAME} ${OUTPUT_FILENAME}
  ```

For example,

  ```shell
  python tools/publish_model.py work_dirs/psenet/latest.pth psenet_r50_fpnf_sbn_1x_20190801.pth
  ```

The final output filename will be `psenet_r50_fpnf_sbn_1x_20190801-{hash id}.pth`.
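
Steps (1) and (2) require PyTorch, but step (3) can be illustrated with the standard library alone. The following is a minimal sketch, not the code of `tools/publish_model.py`: the helper name `append_hash_to_filename`, the use of SHA-256, and the 8-character hash prefix are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def append_hash_to_filename(path: str, digits: int = 8) -> str:
    """Rename `path` to `<stem>-<hash prefix><ext>` and return the new path.

    Hypothetical helper: SHA-256 and the 8-character prefix are assumptions,
    not confirmed details of tools/publish_model.py.
    """
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so a large checkpoint never has to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha.update(chunk)
    stem, ext = os.path.splitext(path)
    new_path = f"{stem}-{sha.hexdigest()[:digits]}{ext}"
    os.rename(path, new_path)
    return new_path


# Usage on a dummy file standing in for a real checkpoint.
with tempfile.TemporaryDirectory() as tmp_dir:
    ckpt = os.path.join(tmp_dir, "psenet_r50_fpnf_sbn_1x_20190801.pth")
    with open(ckpt, "wb") as f:
        f.write(b"dummy checkpoint bytes")
    print(os.path.basename(append_hash_to_filename(ckpt)))
```

Hashing the file content (rather than, say, a timestamp) means that two files with the same name suffix are guaranteed to hold identical bytes.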


## Convert text recognition dataset to lmdb format

Reading images or labels from individual files can be slow for large datasets, e.g. those with millions of samples. In addition, most scene text recognition datasets used in academia are stored in lmdb format, including both images and labels. To follow this mainstream practice and improve data storage efficiency, MMOCR provides `tools/data/utils/lmdb_converter.py` to convert text recognition datasets to lmdb format.

| Arguments         | Type | Description                                                            |
| ----------------- | ---- | ---------------------------------------------------------------------- |
| `label_path`      | str  | Path to label file.                                                    |
| `output`          | str  | Output lmdb path.                                                      |
| `--img-root`      | str  | Root path of the input images.                                         |
| `--label-only`    | bool | Only convert labels to lmdb.                                           |
| `--label-format`  | str  | The format of the label file, either txt or jsonl.                     |
| `--batch-size`    | int  | Processing batch size. Defaults to 1000.                               |
| `--encoding`      | str  | Byte encoding scheme. Defaults to utf8.                                |
| `--lmdb-map-size` | int  | Maximum size the database may grow to. Defaults to 109951162776 bytes. |

### Examples

Generate a mixed lmdb file with label.txt and images in `imgs/`:

```bash
python tools/data/utils/lmdb_converter.py label.txt imgs.lmdb -i imgs
```

Generate a mixed lmdb file with label.jsonl and images in `imgs/`:

```bash
python tools/data/utils/lmdb_converter.py label.jsonl imgs.lmdb -i imgs -f jsonl
```

Generate a label-only lmdb file with label.txt:

```bash
python tools/data/utils/lmdb_converter.py label.txt label.lmdb --label-only
```

Generate a label-only lmdb file with label.jsonl:

```bash
python tools/data/utils/lmdb_converter.py label.jsonl label.lmdb --label-only -f jsonl
```
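
To make the difference between the two `--label-format` values concrete, here is a small sketch of how one label line might be parsed in each case. It is an illustration only: it assumes that a txt line holds a filename and its text separated by the first space, and that a jsonl line is a JSON object with `filename` and `text` keys. Both layouts and the helper name `parse_label_line` are assumptions, not taken from `lmdb_converter.py`.

```python
import json
from typing import Tuple


def parse_label_line(line: str, label_format: str = "txt") -> Tuple[str, str]:
    """Return (filename, text) from one label line (hypothetical helper)."""
    line = line.strip()
    if label_format == "txt":
        # Split on the first space only, so the text itself may contain spaces.
        filename, text = line.split(" ", 1)
    elif label_format == "jsonl":
        # Assumed layout: one JSON object per line.
        record = json.loads(line)
        filename, text = record["filename"], record["text"]
    else:
        raise ValueError(f"unsupported label format: {label_format}")
    return filename, text


print(parse_label_line("1223731.jpg GRAND"))
# ('1223731.jpg', 'GRAND')
print(parse_label_line('{"filename": "1223731.jpg", "text": "GRAND"}', "jsonl"))
# ('1223731.jpg', 'GRAND')
```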


## Convert annotations from Labelme

[Labelme](https://github.com/wkentaro/labelme) is a popular graphical image annotation tool. You can convert the labels generated by Labelme to the MMOCR data format using `tools/data/common/labelme_converter.py`. Both detection and recognition tasks are supported.

  ```bash
  # <tasks> can be "det" alone, or both "det" and "recog"
  python tools/data/common/labelme_converter.py <json_dir> <image_dir> <out_dir> --tasks <tasks>
  ```

For example, the following command converts the Labelme annotations in `tests/data/toy_dataset/labelme` to the MMOCR detection label file `instances_training.txt`, and crops the image patches for the recognition task into `tests/data/toy_dataset/crops` with the label file `train_label.jsonl`:

  ```bash
  python tools/data/common/labelme_converter.py tests/data/toy_dataset/labelme tests/data/toy_dataset/imgs tests/data/toy_dataset --tasks det recog
  ```
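
As a rough picture of what such a conversion involves, the sketch below reads the standard fields of a Labelme annotation (`shapes`, `label`, `points`, `shape_type`, `imagePath`, `imageHeight`, `imageWidth`) and flattens each shape into a polygon with its transcription. The helper name `labelme_to_instances` and the layout of the returned dictionary are hypothetical; the actual output format of `labelme_converter.py` may differ.

```python
from typing import Any, Dict, List


def labelme_to_instances(anno: Dict[str, Any]) -> Dict[str, Any]:
    """Collect polygons and texts from one Labelme annotation (hypothetical)."""
    instances: List[Dict[str, Any]] = []
    for shape in anno["shapes"]:
        points = shape["points"]
        if shape.get("shape_type") == "rectangle":
            # Labelme stores a rectangle as two opposite corners.
            (x1, y1), (x2, y2) = points
            points = [[x1, y1], [x2, y1], [x2, y2], [x1, y2]]
        # Flatten [[x, y], ...] into [x1, y1, x2, y2, ...].
        polygon = [float(v) for point in points for v in point]
        instances.append({"polygon": polygon, "text": shape["label"]})
    return {
        "file_name": anno["imagePath"],
        "height": anno["imageHeight"],
        "width": anno["imageWidth"],
        "instances": instances,
    }


demo_anno = {
    "imagePath": "img1.jpg",
    "imageHeight": 720,
    "imageWidth": 1280,
    "shapes": [
        {"label": "HELLO", "shape_type": "rectangle", "points": [[10, 20], [110, 60]]},
    ],
}
print(labelme_to_instances(demo_anno)["instances"])
```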


## Log Analysis

You can use `tools/analyze_logs.py` to plot loss/hmean curves given a training log file. Run `pip install seaborn` first to install the dependency.

![](../../demo/resources/log_analysis_demo.png)

```shell
python tools/analyze_logs.py plot_curve ${JSON_LOGS} [--keys ${KEYS}] [--title ${TITLE}] [--legend ${LEGEND}] [--backend ${BACKEND}] [--style ${STYLE}] [--out ${OUT_FILE}]
```

| Arguments   | Type | Description                                                                                                     |
| ----------- | ---- | --------------------------------------------------------------------------------------------------------------- |
| `--keys`    | str  | The metric that you want to plot. Defaults to `loss`.                                                           |
| `--title`   | str  | Title of figure.                                                                                                |
| `--legend`  | str  | Legend of each plot.                                                                                            |
| `--backend` | str  | Backend of the plot. [more info](https://matplotlib.org/stable/users/explain/backends.html)                     |
| `--style`   | str  | Style of the plot. Defaults to `dark`. [more info](https://seaborn.pydata.org/generated/seaborn.set_style.html) |
| `--out`     | str  | Path of output figure.                                                                                          |
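
To show what selecting a metric with `--keys` amounts to, the sketch below pulls one metric series out of a log. It assumes that each line of the `.log.json` file is a standalone JSON object carrying an `epoch` field plus metric fields such as `loss`; this layout and the helper name `collect_metric` are assumptions for illustration and do not reproduce `analyze_logs.py`.

```python
import json
from typing import Iterable, List, Tuple


def collect_metric(lines: Iterable[str], key: str) -> List[Tuple[int, float]]:
    """Return (epoch, value) pairs for every record that logs `key`."""
    series = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Skip records (e.g. an environment header) without the metric.
        if key not in record or "epoch" not in record:
            continue
        series.append((record["epoch"], float(record[key])))
    return series


demo_log = [
    '{"env_info": "placeholder", "seed": 0}',
    '{"mode": "train", "epoch": 1, "iter": 10, "loss": 2.5, "time": 0.3}',
    '{"mode": "train", "epoch": 1, "iter": 20, "loss": 2.0, "time": 0.3}',
    '{"mode": "val", "epoch": 1, "iter": 5, "hmean-iou:hmean": 0.61}',
]
print(collect_metric(demo_log, "loss"))
# [(1, 2.5), (1, 2.0)]
```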

**Examples:**

Download the following DBNet and CRNN training logs to run the demos.

```shell
wget https://download.openmmlab.com/mmocr/textdet/dbnet/dbnet_r18_fpnc_sbn_1200e_icdar2015_20210329-ba3ab597.log.json -O DBNet_log.json

wget https://download.openmmlab.com/mmocr/textrecog/crnn/20210326_111035.log.json -O CRNN_log.json
```

Please specify an output path with `--out` if you are running the code on a system without a GUI.

- Plot the `loss` metric.

  ```shell
  python tools/analyze_logs.py plot_curve DBNet_log.json --keys loss --legend loss
  ```

- Plot the `hmean-iou:hmean` metric of text detection.

  ```shell
  python tools/analyze_logs.py plot_curve DBNet_log.json --keys hmean-iou:hmean --legend hmean-iou:hmean
  ```

- Plot the `0_1-N.E.D` metric of text recognition.

  ```shell
  python tools/analyze_logs.py plot_curve CRNN_log.json --keys 0_1-N.E.D --legend 0_1-N.E.D
  ```

- Compute the average training speed.

  ```shell
  python tools/analyze_logs.py cal_train_time CRNN_log.json --include-outliers
  ```

  The output should look like the following.

  ```text
  -----Analyze train time of CRNN_log.json-----
  slowest epoch 4, average time is 0.3464
  fastest epoch 5, average time is 0.2365
  time std over epochs is 0.0356
  average iter time: 0.2906 s/iter
  ```
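
The statistics in this report can be approximated with a short script. The sketch below is not the implementation of `cal_train_time`: it assumes each log line is a JSON object with `mode`, `epoch`, and `time` fields, and it assumes that the "outlier" excluded by default is the first logged iteration of each epoch. The helper name `summarize_train_time` is hypothetical.

```python
import json
import statistics
from collections import defaultdict
from typing import Any, Dict, Iterable


def summarize_train_time(lines: Iterable[str],
                         include_outliers: bool = False) -> Dict[str, Any]:
    """Summarize per-iteration training time from JSON log lines."""
    times = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("mode") == "train" and "time" in record:
            times[record["epoch"]].append(float(record["time"]))
    if not include_outliers:
        # Assumption: the first logged iteration of an epoch is the outlier.
        times = {epoch: ts[1:] for epoch, ts in times.items()}
    # Ignore epochs left without any samples after outlier removal.
    epoch_means = {e: statistics.fmean(ts) for e, ts in times.items() if ts}
    all_times = [t for ts in times.values() for t in ts]
    return {
        "slowest_epoch": max(epoch_means, key=epoch_means.get),
        "fastest_epoch": min(epoch_means, key=epoch_means.get),
        # Population standard deviation of the per-epoch averages.
        "std_over_epochs": statistics.pstdev(list(epoch_means.values())),
        "average_iter_time": statistics.fmean(all_times),
    }


demo_times = {1: [1.0, 0.2, 0.4], 2: [1.0, 0.4, 0.6]}
demo_lines = [
    json.dumps({"mode": "train", "epoch": epoch, "time": t})
    for epoch, ts in demo_times.items()
    for t in ts
]
print(summarize_train_time(demo_lines))
```

Dropping the first iteration of each epoch is a common way to keep data-loader warm-up from skewing the averages; passing `include_outliers=True` keeps every sample.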