[Docs] Quick run (#1352)

* [Docs] Quick run

* fix link

Co-authored-by: liukuikun <liukuikun@sensetime.com>
Tong Gao 2022-08-31 16:21:50 +08:00 committed by GitHub
parent e72edd6dcb
commit f788bfdbb9
2 changed files with 281 additions and 113 deletions


@ -1,84 +1,168 @@
# Quick Run # Quick Run
In this guide we will show you some useful commands and familiarize you with MMOCR. We also provide [a notebook](https://github.com/open-mmlab/mmocr/blob/main/demo/MMOCR_Tutorial.ipynb) that can help you get the most out of MMOCR. ## Inference
## Installation In addition to using the pre-trained models we provide, you can also train models on your own datasets. In the next sections, we will take you through the basic functions of MMOCR by training DBNet on the mini [ICDAR 2015](https://rrc.cvc.uab.es/?ch=4&com=downloads) dataset as an example.
Check out our [installation guide](install.md) for full steps. The next sections assume that you installed the MMOCR codebase in [editable mode](install.md).
## Dataset Preparation ## Prepare a Dataset
MMOCR supports numerous datasets which are classified by the type of their corresponding tasks. You may find their preparation steps in these sections: [Detection Datasets](datasets/det.md), [Recognition Datasets](datasets/recog.md), [KIE Datasets](datasets/kie.md) and [NER Datasets](datasets/ner.md). Since the variety of OCR dataset formats makes it hard to switch between datasets or train on several of them jointly, MMOCR proposes a uniform [data format](../user_guides/dataset_prepare.md), and provides conversion scripts and [tutorials](../user_guides/dataset_prepare.md) for all commonly used OCR datasets. Usually, to use those datasets in MMOCR, you just need to follow the corresponding steps.
## Inference with Pretrained Models ```{note}
But here, efficiency means everything.
You can perform end-to-end OCR on our demo image with one simple line of command:
```shell
python mmocr/utils/ocr.py demo/demo_text_ocr.jpg --print-result --imshow
``` ```
Its detection result will be printed out and a new window will pop up with result visualization. More demo and full instructions can be found in [Demo](demo.md). Here, we have prepared a lite version of the ICDAR 2015 dataset for demonstration purposes. Download our pre-prepared [archive](https://download.openmmlab.com/mmocr/data/icdar2015/mini_icdar2015.tar.gz) and extract it to the `data/det/` directory under mmocr to get the prepared images and annotation file.
```Bash
wget https://download.openmmlab.com/mmocr/data/icdar2015/mini_icdar2015.tar.gz
mkdir -p data/det/
tar xzvf mini_icdar2015.tar.gz -C data/det/
```
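Where `wget` or `tar` is not available, the same download-and-extract step can be sketched with Python's standard library. This is only an illustrative equivalent of the shell commands above; the helper names `extract_archive` and `download_and_extract` are hypothetical and not part of MMOCR.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the guide above.
MINI_IC15_URL = (
    "https://download.openmmlab.com/mmocr/data/icdar2015/mini_icdar2015.tar.gz"
)


def extract_archive(archive_path, target_dir):
    """Extract a .tar.gz archive into target_dir (hypothetical helper)."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target)  # same effect as `tar xzf ... -C target_dir`
    # Return the top-level entries so the caller can check what was unpacked.
    return sorted(p.name for p in target.iterdir())


def download_and_extract(url=MINI_IC15_URL, target_dir="data/det"):
    """Download the archive into the current directory, then extract it."""
    archive_path = Path(url.rsplit("/", 1)[-1])
    urllib.request.urlretrieve(url, archive_path)
    return extract_archive(archive_path, target_dir)
```

Calling `download_and_extract()` from the root of the mmocr checkout should leave the dataset under `data/det/`, matching the path used in the config change that follows.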
## Modify the Config
Once the dataset is prepared, we will then specify the location of the training set and the training parameters by modifying the config file.
In this example, we will train a DBNet using resnet18 as its backbone. Since MMOCR already has a config file for the full ICDAR 2015 dataset (`configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py`), we just need to make some modifications on top of it.
We first need to modify the path to the dataset. In this config, most of the key config files are imported in `_base_`, such as the dataset configuration from `configs/_base_/det_datasets/icdar2015.py`. Open that file and replace the path pointed to by `ic15_det_data_root` in the first line with:
```Python
ic15_det_data_root = 'data/det/mini_icdar2015'
```
Also, because of the reduced dataset size, we reduce the number of training epochs to 400, shorten both the validation interval and the checkpoint-saving interval to 10 epochs, and drop the learning rate decay strategy. The following lines of configuration can be put directly into `configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py` to take effect.
```Python
# Save checkpoints every 10 epochs
default_hooks = dict(checkpoint=dict(type='CheckpointHook', interval=10), )
# Set the maximum number of epochs to 400, and validate the model every 10 epochs
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=400, val_interval=10)
# Fix learning rate as a constant
param_scheduler = [dict(type='ConstantLR', factor=1.0),]
```
Here, we have rewritten the corresponding parameters in the base configuration directly through the [inheritance](https://mmengine.readthedocs.io/en/latest/tutorials/config.html) mechanism of the configuration. The original fields are distributed in `configs/_base_/schedules/schedule_sgd_1200e.py` and `configs/_base_/textdet_default_runtime.py`. You may check them out if interested.
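To make the override behaviour concrete, here is a minimal sketch of how an inheriting config can be thought of as a recursive dictionary merge. It is a simplified illustration, not MMEngine's actual implementation: `merge_config` is a hypothetical helper, and the values in `base` are made-up placeholders rather than the real contents of the `_base_` files.

```python
import copy


def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`.

    Simplified illustration: dicts are merged key by key, while any other
    value (including lists) replaces the base value wholesale.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Placeholder base values standing in for the `_base_` files (assumed).
base = dict(
    default_hooks=dict(
        checkpoint=dict(type='CheckpointHook', interval=20),
        logger=dict(type='LoggerHook', interval=5),
    ),
    train_cfg=dict(type='EpochBasedTrainLoop', max_epochs=1200, val_interval=20),
    param_scheduler=[dict(type='PolyLR', power=0.9, end=1200)],
)

# The three overrides from the guide above.
override = dict(
    default_hooks=dict(checkpoint=dict(type='CheckpointHook', interval=10)),
    train_cfg=dict(type='EpochBasedTrainLoop', max_epochs=400, val_interval=10),
    param_scheduler=[dict(type='ConstantLR', factor=1.0)],
)

cfg = merge_config(base, override)
print(cfg['default_hooks'])    # sibling keys such as `logger` are kept
print(cfg['train_cfg'])
print(cfg['param_scheduler'])  # the list is replaced, not merged
```

The point of the sketch is that only the fields named in the child config change; everything else in the base config stays in effect.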
```{tip}
For a more detailed description of config, please refer to [here](../user_guides/config.md).
```
## Browse the Dataset
Before we start the training, we can also visualize the image processed by training-time [data transforms](../basic_concepts/transforms.md). It's quite simple: pass the config file we need to visualize into the [browse_dataset.py](/tools/analysis_tools/browse_dataset.py) script.
```Bash
python tools/analysis_tools/browse_dataset.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py
```
The transformed images and annotations will be displayed one by one in a pop-up window.
<center class="half">
<img src="https://user-images.githubusercontent.com/24622904/187611542-01e9aa94-fc12-4756-964b-a0e472522a3a.jpg" width="250"/><img src="https://user-images.githubusercontent.com/24622904/187611555-3f5ea616-863d-4538-884f-bccbebc2f7e7.jpg" width="250"/><img src="https://user-images.githubusercontent.com/24622904/187611581-88be3970-fbfe-4f62-8cdf-7a8a7786af29.jpg" width="250"/>
</center>
For details on the parameters and usage of this script, please refer to [here](../user_guides/useful_tools.md).
```{tip}
In addition to satisfying our curiosity, visualization can also help us check the parts that may affect the model's performance before training, such as problems in configs, datasets and data transforms.
```
## Training ## Training
### Training with Toy Dataset Start the training by running the following command:
We provide a toy dataset under `tests/data` on which you can get a sense of training before the academic dataset is prepared. ```Bash
python tools/train.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py
For example, to train a text recognition task with `seg` method and toy dataset,
```shell
python tools/train.py configs/textrecog/seg/seg_r31_1by16_fpnocr_toy_dataset.py --work-dir seg
``` ```
To train a text recognition task with `sar` method and toy dataset, Depending on the system environment, MMOCR will automatically use the best device for training. If a GPU is available, single-GPU training will be started by default. Once you see the loss values being printed, the training has started successfully.
```shell ```Bash
python tools/train.py configs/textrecog/sar/sar_r31_parallel_decoder_toy_dataset.py --work-dir sar 2022/08/22 18:42:22 - mmengine - INFO - Epoch(train) [1][5/7] lr: 7.0000e-03 memory: 7730 data_time: 0.4496 loss_prob: 14.6061 loss_thr: 2.2904 loss_db: 0.9879 loss: 17.8843 time: 1.8666
2022/08/22 18:42:24 - mmengine - INFO - Exp name: dbnet_resnet18_fpnc_1200e_icdar2015
2022/08/22 18:42:28 - mmengine - INFO - Epoch(train) [2][5/7] lr: 7.0000e-03 memory: 6695 data_time: 0.2052 loss_prob: 6.7840 loss_thr: 1.4114 loss_db: 0.9855 loss: 9.1809 time: 0.7506
2022/08/22 18:42:29 - mmengine - INFO - Exp name: dbnet_resnet18_fpnc_1200e_icdar2015
2022/08/22 18:42:33 - mmengine - INFO - Epoch(train) [3][5/7] lr: 7.0000e-03 memory: 6690 data_time: 0.2101 loss_prob: 3.0700 loss_thr: 1.1800 loss_db: 0.9967 loss: 5.2468 time: 0.6244
2022/08/22 18:42:33 - mmengine - INFO - Exp name: dbnet_resnet18_fpnc_1200e_icdar2015
``` ```
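If you want to track the loss values programmatically, the log lines shown above can be parsed with a small regular expression. The sketch below assumes the line layout seen in this sample output; `parse_train_line` is a hypothetical helper, not an MMEngine API, and other versions may format their logs differently.

```python
import re

# Matches the "Epoch(train) [epoch][iter/total]" prefix of a training line.
EPOCH_RE = re.compile(r"Epoch\(train\)\s+\[(\d+)\]\[(\d+)/(\d+)\]")
# Matches "name: number" pairs such as "loss: 17.8843" or "lr: 7.0000e-03".
FIELD_RE = re.compile(r"(\w+): ([0-9.eE+-]+)")


def parse_train_line(line):
    """Return a dict of numeric fields, or None for non-training lines."""
    match = EPOCH_RE.search(line)
    if match is None:
        return None
    record = {"epoch": int(match.group(1)), "iter": int(match.group(2))}
    # Only scan the text after the epoch marker to skip the timestamp.
    for name, value in FIELD_RE.findall(line[match.end():]):
        record[name] = float(value)
    return record


sample = [
    "2022/08/22 18:42:22 - mmengine - INFO - Epoch(train) [1][5/7] lr: 7.0000e-03 memory: 7730 data_time: 0.4496 loss_prob: 14.6061 loss_thr: 2.2904 loss_db: 0.9879 loss: 17.8843 time: 1.8666",
    "2022/08/22 18:42:24 - mmengine - INFO - Exp name: dbnet_resnet18_fpnc_1200e_icdar2015",
    "2022/08/22 18:42:28 - mmengine - INFO - Epoch(train) [2][5/7] lr: 7.0000e-03 memory: 6695 data_time: 0.2052 loss_prob: 6.7840 loss_thr: 1.4114 loss_db: 0.9855 loss: 9.1809 time: 0.7506",
]

records = [r for r in map(parse_train_line, sample) if r is not None]
for r in records:
    print(r["epoch"], r["loss"])
```

A steadily decreasing `loss` across epochs, as in the sample, is a quick sanity check that training is progressing.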
### Training with Academic Dataset Without extra configurations, model weights will be saved to `work_dirs/dbnet_resnet18_fpnc_1200e_icdar2015/`, while the logs will be stored in `work_dirs/dbnet_resnet18_fpnc_1200e_icdar2015/TIMESTAMP/`. Next, we just need to wait with some patience for training to finish.
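Since checkpoints accumulate in the work directory every few epochs, a small helper can pick the most recent one. This is a sketch under the assumption that checkpoints are named `epoch_N.pth`, as in the `epoch_400.pth` file used later in this guide; `latest_checkpoint` is a hypothetical helper, not part of MMOCR.

```python
import re
from pathlib import Path


def latest_checkpoint(work_dir):
    """Return the `epoch_N.pth` file with the largest N, or None."""
    pattern = re.compile(r"epoch_(\d+)\.pth$")
    candidates = []
    for path in Path(work_dir).glob("epoch_*.pth"):
        match = pattern.match(path.name)
        if match:
            # Compare epochs numerically so that epoch_400 beats epoch_90.
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


if __name__ == "__main__":
    print(latest_checkpoint("work_dirs/dbnet_resnet18_fpnc_1200e_icdar2015"))
```

Sorting by the parsed epoch number rather than by file name avoids the usual pitfall of lexicographic ordering.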
Once you have prepared the required academic dataset following our instructions, the last thing to check is whether the model's config points MMOCR to the correct dataset path. Suppose we want to train DBNet on ICDAR 2015, and part of `configs/_base_/det_datasets/icdar2015.py` looks like the following: ```{tip}
For advanced usage of training, such as CPU training, multi-GPU training, and cluster training, please refer to [Training and Testing](../user_guides/train_test.md).
```python
dataset_type = 'IcdarDataset'
data_root = 'data/icdar2015'
train = dict(
type=dataset_type,
ann_file=f'{data_root}/instances_training.json',
img_prefix=f'{data_root}/imgs',
pipeline=None)
test = dict(
type=dataset_type,
ann_file=f'{data_root}/instances_test.json',
img_prefix=f'{data_root}/imgs',
pipeline=None)
train_list = [train]
test_list = [test]
``` ```
You would need to check if `data/icdar2015` is right. Then you can start training with the command:
```shell
python tools/train.py configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py --work-dir dbnet
```
You can find full training instructions, explanations and useful training configs in [Training](training.md).
## Testing ## Testing
Suppose now you have finished the training of DBNet and the latest model has been saved in `dbnet/latest.pth`. You can evaluate its performance on the test set using the `hmean-iou` metric with the following command: After 400 epochs, we observe that DBNet performs best in the last epoch, with `hmean` reaching 0.6086:
```shell ```Bash
python tools/test.py configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py dbnet/latest.pth --eval hmean-iou 08/22 19:24:52 - mmengine - INFO - Epoch(val) [400][100/100] icdar/precision: 0.7285 icdar/recall: 0.5226 icdar/hmean: 0.6086
``` ```
Evaluating any pretrained model accessible online is also allowed: ```{note}
It may not have been trained to be optimal, but it is sufficient for a demo.
```shell
python tools/test.py configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py https://download.openmmlab.com/mmocr/textdet/dbnet/dbnet_r18_fpnc_sbn_1200e_icdar2015_20210329-ba3ab597.pth --eval hmean-iou
``` ```
More instructions on testing are available in [Testing](testing.md). However, this value only reflects the performance of DBNet on the mini ICDAR 2015 dataset. For a comprehensive evaluation, we also need to see how it performs on out-of-distribution datasets. For example, `tests/data/det_toy_dataset` is a very small real dataset that we can use to verify the actual performance of DBNet.
Before testing, we also need to make some changes to the location of the dataset. Open `configs/_base_/det_datasets/icdar2015.py` and change `data_root` of `ic15_det_test` to `tests/data/det_toy_dataset`:
```Python
# ...
ic15_det_test = dict(
type='OCRDataset',
data_root='tests/data/det_toy_dataset',
# ...
)
```
Start testing:
```Bash
python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py work_dirs/dbnet_resnet18_fpnc_1200e_icdar2015/epoch_400.pth
```
And get the outputs:
```Bash
08/21 21:45:59 - mmengine - INFO - Epoch(test) [5/10] memory: 8562
08/21 21:45:59 - mmengine - INFO - Epoch(test) [10/10] eta: 0:00:00 time: 0.4893 data_time: 0.0191 memory: 283
08/21 21:45:59 - mmengine - INFO - Evaluating hmean-iou...
08/21 21:45:59 - mmengine - INFO - prediction score threshold: 0.30, recall: 0.6190, precision: 0.4815, hmean: 0.5417
08/21 21:45:59 - mmengine - INFO - prediction score threshold: 0.40, recall: 0.6190, precision: 0.5909, hmean: 0.6047
08/21 21:45:59 - mmengine - INFO - prediction score threshold: 0.50, recall: 0.6190, precision: 0.6842, hmean: 0.6500
08/21 21:45:59 - mmengine - INFO - prediction score threshold: 0.60, recall: 0.6190, precision: 0.7222, hmean: 0.6667
08/21 21:45:59 - mmengine - INFO - prediction score threshold: 0.70, recall: 0.3810, precision: 0.8889, hmean: 0.5333
08/21 21:45:59 - mmengine - INFO - prediction score threshold: 0.80, recall: 0.0000, precision: 0.0000, hmean: 0.0000
08/21 21:45:59 - mmengine - INFO - prediction score threshold: 0.90, recall: 0.0000, precision: 0.0000, hmean: 0.0000
08/21 21:45:59 - mmengine - INFO - Epoch(test) [10/10] icdar/precision: 0.7222 icdar/recall: 0.6190 icdar/hmean: 0.6667
```
The model achieves an hmean of 0.6667 on this dataset.
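The `hmean` in the log is the harmonic mean of precision and recall. As a quick check, the sketch below recomputes it from the logged per-threshold values and picks the threshold with the highest score. Because the logged precision and recall are rounded to four decimals, the recomputed numbers may differ from the log in the last digit.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (score threshold, recall, precision) copied from the test log above.
rows = [
    (0.30, 0.6190, 0.4815),
    (0.40, 0.6190, 0.5909),
    (0.50, 0.6190, 0.6842),
    (0.60, 0.6190, 0.7222),
    (0.70, 0.3810, 0.8889),
    (0.80, 0.0000, 0.0000),
    (0.90, 0.0000, 0.0000),
]

# The final reported metric corresponds to the best-scoring threshold.
best = max(rows, key=lambda row: hmean(row[2], row[1]))
print(f"best threshold: {best[0]:.2f}, hmean: {hmean(best[2], best[1]):.4f}")
```

Raising the threshold trades recall for precision, which is why the score peaks at an intermediate threshold and collapses to zero once no prediction passes it.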
```{tip}
For advanced usage of testing, such as CPU testing, multi-GPU testing, and cluster testing, please refer to [Training and Testing](../user_guides/train_test.md).
```
## Visualize the Outputs
We can also visualize the model's prediction output with `test.py`. The `show` parameter opens a pop-up visualization window, and the `show-dir` parameter specifies the directory to which the prediction result images are exported.
```Bash
python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py work_dirs/dbnet_resnet18_fpnc_1200e_icdar2015/epoch_400.pth --show-dir imgs/
```
The true labels and predicted values are displayed in a tiled fashion in the visualization results. The green boxes in the left panel indicate the true labels and the red boxes in the right panel indicate the predicted values.
<div align="center">
<img src="https://user-images.githubusercontent.com/22607038/187423562-6a85e209-4b12-46ee-8a41-5c67b1ba83f9.png"/><br>
</div>
```{tip}
For a description of more visualization features, see [here](../user_guides/visualization.md).
```


@ -1,84 +1,168 @@
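# Getting Started # Quick Run
In this guide, we introduce some commonly used commands to help you get familiar with MMOCR. We also provide a [notebook](https://github.com/open-mmlab/mmocr/blob/main/demo/MMOCR_Tutorial.ipynb) version of the code to help you get started with MMOCR quickly. ## Inference
## Installation Besides using the pre-trained models we provide, you can also train popular models on your own datasets. Next, we take training DBNet on the mini [ICDAR 2015](https://rrc.cvc.uab.es/?ch=4&com=downloads) dataset as an example to walk you through the basic functions of MMOCR.
See the [installation guide](install.md) for the full steps. The following sections assume that you [installed the MMOCR codebase in editable mode](install.md).
## Dataset Preparation ## Prepare a Dataset
MMOCR supports many kinds of datasets, which are classified by the type of their corresponding tasks. You can find their preparation steps in the following sections: [Detection Datasets](datasets/det.md), [Recognition Datasets](datasets/recog.md), [KIE Datasets](datasets/kie.md) and [NER Datasets](datasets/ner.md). Because OCR datasets come in many varieties and inconsistent formats, which hinders switching between datasets and joint training on multiple datasets, MMOCR defines a [unified data format](../user_guides/dataset_prepare.md) and provides conversion scripts and [tutorials](../user_guides/dataset_prepare.md) for commonly used OCR datasets. Usually, to use a dataset in MMOCR, you only need to run the commands in the corresponding steps.
## Inference with Pretrained Models ```{note}
But we also know that efficiency is everything, especially for those who want to get started with MMOCR quickly.
The following simple command demonstrates end-to-end recognition: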
```shell
python mmocr/utils/ocr.py demo/demo_text_ocr.jpg --print-result --imshow
``` ```
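The detection result will be printed out, and a new window will pop up to display the result. More examples and full instructions can be found in [Demo](demo.md). Here, we have prepared a lite version of the ICDAR 2015 dataset for demonstration. Download our pre-prepared [archive](https://download.openmmlab.com/mmocr/data/icdar2015/mini_icdar2015.tar.gz) and extract it into the `data/det/` directory under mmocr to get the prepared images and annotation file.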
```Bash
wget https://download.openmmlab.com/mmocr/data/icdar2015/mini_icdar2015.tar.gz
mkdir -p data/det/
tar xzvf mini_icdar2015.tar.gz -C data/det/
```
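## Modify the Config
Once the dataset is ready, we need to specify the location of the training set and the training parameters by modifying the config.
In this example, we will train a DBNet with resnet18 as its backbone. Since MMOCR already has a config for the full ICDAR 2015 dataset (`configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py`), we only need to make a few changes on top of it.
We first need to change the dataset path. In this config, most of the key config files are imported in `_base_`; for example, the dataset configuration comes from `configs/_base_/det_datasets/icdar2015.py`. Open that file and replace the path that `ic15_det_data_root` points to in the first line: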
```Python
ic15_det_data_root = 'data/det/mini_icdar2015'
```
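In addition, because the dataset is smaller, we also reduce the number of training epochs to 400, shorten the validation and checkpoint-saving intervals to 10 epochs, and drop the learning rate decay strategy. Put the following lines directly into `configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py` for them to take effect: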
```Python
# Save checkpoints every 10 epochs
default_hooks = dict(checkpoint=dict(type='CheckpointHook', interval=10), )
# Set the maximum number of epochs to 400, and run validation every 10 epochs
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=400, val_interval=10)
# Keep the learning rate constant, i.e. no learning rate decay
param_scheduler = [dict(type='ConstantLR', factor=1.0),]
```
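Here, we directly override the corresponding parameters of the base config through the config [inheritance](https://mmengine.readthedocs.io/zh_CN/latest/tutorials/config.html) mechanism. The original fields are spread across `configs/_base_/schedules/schedule_sgd_1200e.py` and `configs/_base_/textdet_default_runtime.py`; interested readers can look them up.
```{tip}
For a more detailed description of config files, please refer to [here](../user_guides/config.md).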
```
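## Browse the Dataset
Before training starts, we can also visualize the images produced by the training-time [data transforms](../basic_concepts/transforms.md). It is simple: pass the config we want to visualize to the [browse_dataset.py](/tools/analysis_tools/browse_dataset.py) script: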
```Bash
python tools/analysis_tools/browse_dataset.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py
```
The transformed images and labels will be displayed one by one in a pop-up window.
<center class="half">
<img src="https://user-images.githubusercontent.com/24622904/187611542-01e9aa94-fc12-4756-964b-a0e472522a3a.jpg" width="250"/><img src="https://user-images.githubusercontent.com/24622904/187611555-3f5ea616-863d-4538-884f-bccbebc2f7e7.jpg" width="250"/><img src="https://user-images.githubusercontent.com/24622904/187611581-88be3970-fbfe-4f62-8cdf-7a8a7786af29.jpg" width="250"/>
</center>
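For a more detailed guide to this script, please refer to [here](../user_guides/useful_tools.md).
```{tip}
Besides satisfying our curiosity, visualization also helps us check, before training, the parts that may affect model performance, such as problems in the config, the dataset and the data transforms.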
```
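## Training ## Training
### Training with a Toy Dataset Everything is ready. Run the following command to start training:
A toy dataset for training demonstrations is provided under `tests/data`; it can be used for a preliminary training run before the academic dataset is prepared. ```Bash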
python tools/train.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py
For example, to train a text recognition task with the `seg` method and the toy dataset,
```shell
python tools/train.py configs/textrecog/seg/seg_r31_1by16_fpnocr_toy_dataset.py --work-dir seg
``` ```
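To train text recognition with the `sar` method and the toy dataset, Depending on the system environment, MMOCR automatically uses the best device for training. If a GPU is available, single-GPU training starts on the first GPU by default. Once you see the loss output, training has started successfully.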
```shell ```Bash
python tools/train.py configs/textrecog/sar/sar_r31_parallel_decoder_toy_dataset.py --work-dir sar 2022/08/22 18:42:22 - mmengine - INFO - Epoch(train) [1][5/7] lr: 7.0000e-03 memory: 7730 data_time: 0.4496 loss_prob: 14.6061 loss_thr: 2.2904 loss_db: 0.9879 loss: 17.8843 time: 1.8666
2022/08/22 18:42:24 - mmengine - INFO - Exp name: dbnet_resnet18_fpnc_1200e_icdar2015
2022/08/22 18:42:28 - mmengine - INFO - Epoch(train) [2][5/7] lr: 7.0000e-03 memory: 6695 data_time: 0.2052 loss_prob: 6.7840 loss_thr: 1.4114 loss_db: 0.9855 loss: 9.1809 time: 0.7506
2022/08/22 18:42:29 - mmengine - INFO - Exp name: dbnet_resnet18_fpnc_1200e_icdar2015
2022/08/22 18:42:33 - mmengine - INFO - Epoch(train) [3][5/7] lr: 7.0000e-03 memory: 6690 data_time: 0.2101 loss_prob: 3.0700 loss_thr: 1.1800 loss_db: 0.9967 loss: 5.2468 time: 0.6244
2022/08/22 18:42:33 - mmengine - INFO - Exp name: dbnet_resnet18_fpnc_1200e_icdar2015
``` ```
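### Training with an Academic Dataset Without extra arguments, the trained weights are saved to `work_dirs/dbnet_resnet18_fpnc_1200e_icdar2015/` by default, while the logs are saved in `work_dirs/dbnet_resnet18_fpnc_1200e_icdar2015/TIMESTAMP/`, where `TIMESTAMP` is the time training started. Next, we only need to wait patiently for training to finish.
After preparing the required academic dataset following the instructions, the last thing to check is whether the model's config points MMOCR to the correct dataset path. Suppose we train DBNet on the ICDAR 2015 dataset; part of the config in `configs/_base_/det_datasets/icdar2015.py` looks like this: ```{tip}
For advanced training usage, such as CPU training, multi-GPU training and cluster training, please refer to [Training and Testing](../user_guides/train_test.md).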
```python
dataset_type = 'IcdarDataset'
data_root = 'data/icdar2015'
train = dict(
type=dataset_type,
ann_file=f'{data_root}/instances_training.json',
img_prefix=f'{data_root}/imgs',
pipeline=None)
test = dict(
type=dataset_type,
ann_file=f'{data_root}/instances_test.json',
img_prefix=f'{data_root}/imgs',
pipeline=None)
train_list = [train]
test_list = [test]
``` ```
Check that the dataset path `data/icdar2015` is correct. Then start training with the command:
```shell
python tools/train.py configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py --work-dir dbnet
```
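For the full set of training options, see [Training](training.md).
## Testing ## Testing
Suppose we have finished training DBNet and saved the latest model to `dbnet/latest.pth`. We can then evaluate its performance on the test set with the `hmean-iou` metric using the following command: After a wait of tens of minutes, the model finishes its 400 epochs of training. From the console output, we observe that DBNet performs best in the last epoch, with `hmean` reaching 0.6086: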
```shell ```Bash
python tools/test.py configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py dbnet/latest.pth --eval hmean-iou 08/22 19:24:52 - mmengine - INFO - Epoch(val) [400][100/100] icdar/precision: 0.7285 icdar/recall: 0.5226 icdar/hmean: 0.6086
``` ```
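You can also evaluate a pretrained model available online with the following command: ```{note}
It may not have been trained to its optimum yet, but it is good enough for a demo.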
```shell
python tools/test.py configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py https://download.openmmlab.com/mmocr/textdet/dbnet/dbnet_r18_fpnc_sbn_1200e_icdar2015_20210329-ba3ab597.pth --eval hmean-iou
``` ```
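For more instructions on testing, see [Testing](testing.md). However, this value only reflects the performance of DBNet on the mini ICDAR 2015 dataset. To judge its detection ability more objectively, we also need to see how it performs on out-of-distribution datasets. For example, `tests/data/det_toy_dataset` is a very small real dataset that we can use to check DBNet's actual performance.
Before testing, we also need to change the dataset location. Open `configs/_base_/det_datasets/icdar2015.py` and change the `data_root` of `ic15_det_test` to `tests/data/det_toy_dataset`: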
```Python
# ...
ic15_det_test = dict(
type='OCRDataset',
data_root='tests/data/det_toy_dataset',
# ...
)
```
After the change, run the following command to start testing.
```Bash
python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py work_dirs/dbnet_resnet18_fpnc_1200e_icdar2015/epoch_400.pth
```
The output is:
```Bash
08/21 21:45:59 - mmengine - INFO - Epoch(test) [5/10] memory: 8562
08/21 21:45:59 - mmengine - INFO - Epoch(test) [10/10] eta: 0:00:00 time: 0.4893 data_time: 0.0191 memory: 283
08/21 21:45:59 - mmengine - INFO - Evaluating hmean-iou...
08/21 21:45:59 - mmengine - INFO - prediction score threshold: 0.30, recall: 0.6190, precision: 0.4815, hmean: 0.5417
08/21 21:45:59 - mmengine - INFO - prediction score threshold: 0.40, recall: 0.6190, precision: 0.5909, hmean: 0.6047
08/21 21:45:59 - mmengine - INFO - prediction score threshold: 0.50, recall: 0.6190, precision: 0.6842, hmean: 0.6500
08/21 21:45:59 - mmengine - INFO - prediction score threshold: 0.60, recall: 0.6190, precision: 0.7222, hmean: 0.6667
08/21 21:45:59 - mmengine - INFO - prediction score threshold: 0.70, recall: 0.3810, precision: 0.8889, hmean: 0.5333
08/21 21:45:59 - mmengine - INFO - prediction score threshold: 0.80, recall: 0.0000, precision: 0.0000, hmean: 0.0000
08/21 21:45:59 - mmengine - INFO - prediction score threshold: 0.90, recall: 0.0000, precision: 0.0000, hmean: 0.0000
08/21 21:45:59 - mmengine - INFO - Epoch(test) [10/10] icdar/precision: 0.7222 icdar/recall: 0.6190 icdar/hmean: 0.6667
```
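The model reaches an hmean of 0.6667 on this dataset, which is a decent result.
```{tip}
For advanced testing usage, such as CPU testing, multi-GPU testing and cluster testing, please refer to [Training and Testing](../user_guides/train_test.md).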
```
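## Visualize the Outputs
To get a more intuitive feel for the model's output, we can also visualize its predictions directly. In `test.py`, you can open a pop-up visualization window with the `show` parameter, or specify the directory to which the prediction images are exported with the `show-dir` parameter.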
```Bash
python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py work_dirs/dbnet_resnet18_fpnc_1200e_icdar2015/epoch_400.pth --show-dir imgs/
```
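The ground-truth labels and the predictions are displayed side by side in the visualization. The green boxes in the left panel are the ground-truth labels, and the red boxes in the right panel are the predictions.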
<div align="center">
<img src="https://user-images.githubusercontent.com/22607038/187423562-6a85e209-4b12-46ee-8a41-5c67b1ba83f9.png"/><br>
</div>
```{tip}
For more visualization features, see [here](../user_guides/visualization.md).
```