[Docs] Fix quick run (#1775)

* Fix quick run
* fix ut
* debug
* fix
* fix ut
2025-06-03 21:54:47 +08:00 · 2023-03-22 10:09:54 +08:00 · 2023-03-22 10:09:54 +08:00 · 6d9582b6c7
commit 6d9582b6c7
parent e0707bf5f2
6 changed files with 1777 additions and 28 deletions
--- a/docs/en/get_started/quick_run.md
+++ b/docs/en/get_started/quick_run.md
@@ -54,24 +54,31 @@ Once the dataset is prepared, we will then specify the location of the training

 In this example, we will train a DBNet using resnet18 as its backbone. Since MMOCR already has a config file for the full ICDAR 2015 dataset (`configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py`), we just need to make some modifications on top of it.

-We first need to modify the path to the dataset. In this config, most of the key config files are imported in `_base_`, such as the database configuration from `configs/_base_/det_datasets/icdar2015.py`. Open that file and replace the path pointed to by `ic15_det_data_root` in the first line with:
+We first need to modify the path to the dataset. In this config, most of the key config files are imported in `_base_`, such as the database configuration from `configs/textdet/_base_/datasets/icdar2015.py`. Open that file and replace the path pointed to by `icdar2015_textdet_data_root` in the first line with:

 ```Python
-ic15_det_data_root = 'data/det/mini_icdar2015'
+icdar2015_textdet_data_root = 'data/mini_icdar2015'
 ```

 Also, because of the reduced dataset size, we have to reduce the number of training epochs to 400 accordingly, shorten the validation interval as well as the weight storage interval to 10 rounds, and drop the learning rate decay strategy. The following lines of configuration can be directly put into `configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py` to take effect.

 ```Python
-# Save checkpoints every 10 epochs
-default_hooks = dict(checkpoint=dict(type='CheckpointHook', interval=10), )
+# Save checkpoints every 10 epochs, and only keep the latest checkpoint
+default_hooks = dict(
+    checkpoint=dict(
+        type='CheckpointHook',
+        interval=10,
+        max_keep_ckpts=1,
+    ))
 # Set the maximum number of epochs to 400, and validate the model every 10 epochs
 train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=400, val_interval=10)
 # Fix learning rate as a constant
-param_scheduler = [dict(type='ConstantLR', factor=1.0),]
+param_scheduler = [
+    dict(type='ConstantLR', factor=1.0),
+]
 ```

-Here, we have rewritten the corresponding parameters in the base configuration directly through the [inheritance](https://mmengine.readthedocs.io/en/latest/tutorials/config.html) mechanism of the configuration. The original fields are distributed in `configs/_base_/schedules/schedule_sgd_1200e.py` and `configs/_base_/textdet_default_runtime.py`. You may check them out if interested.
+Here, we have rewritten the corresponding parameters in the base configuration directly through the inheritance ({external+mmengine:doc}`MMEngine: Config <advanced_tutorials/config>`) mechanism of the config. The original fields are distributed in `configs/textdet/_base_/schedules/schedule_sgd_1200e.py` and `configs/textdet/_base_/default_runtime.py`.

 ```{note}
 For a more detailed description of config, please refer to [here](../user_guides/config.md).
@@ -126,7 +133,7 @@ For advanced usage of training, such as CPU training, multi-GPU training, and cl

 ## Testing

-After 400 epochs, we observe that DBNet performs best in the last epoch, with `hmean` reaching 60.86:
+After 400 epochs, we observe that DBNet performs best in the last epoch, with `hmean` reaching 60.86 (you may see a different result):

 ```Bash
 08/22 19:24:52 - mmengine - INFO - Epoch(val) [400][100/100]  icdar/precision: 0.7285  icdar/recall: 0.5226  icdar/hmean: 0.6086
@@ -138,7 +145,7 @@ It may not have been trained to be optimal, but it is sufficient for a demo.

 However, this value only reflects the performance of DBNet on the mini ICDAR 2015 dataset. For a comprehensive evaluation, we also need to see how it performs on out-of-distribution datasets. For example, `tests/data/det_toy_dataset` is a very small real dataset that we can use to verify the actual performance of DBNet.

-Before testing, we also need to make some changes to the location of the dataset. Open `configs/_base_/det_datasets/icdar2015.py` and change `data_root` of `icdar2015_textdet_test` to `tests/data/det_toy_dataset`:
+Before testing, we also need to make some changes to the location of the dataset. Open `configs/textdet/_base_/datasets/icdar2015.py` and change `data_root` of `icdar2015_textdet_test` to `tests/data/det_toy_dataset`:

 ```Python
 # ...
@@ -155,7 +162,7 @@ Start testing:
 python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py work_dirs/dbnet_resnet18_fpnc_1200e_icdar2015/epoch_400.pth
 ```

-And get the outputs:
+And get outputs like:

 ```Bash
 08/21 21:45:59 - mmengine - INFO - Epoch(test) [5/10]    memory: 8562
@@ -182,7 +189,7 @@ For advanced usage of testing, such as CPU testing, multi-GPU testing, and clust
 We can also visualize its prediction output in `test.py`. You can open a pop-up visualization window with the `show` parameter; and can also specify the directory where the prediction result images are exported with the `show-dir` parameter.

 ```Bash
-python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py work_dirs/dbnet_r18_fpnc_1200e_icdar2015/epoch_400.pth --show-dir imgs/
+python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py work_dirs/dbnet_resnet18_fpnc_1200e_icdar2015/epoch_400.pth --show-dir imgs/
 ```

 The true labels and predicted values are displayed in a tiled fashion in the visualization results. The green boxes in the left panel indicate the true labels and the red boxes in the right panel indicate the predicted values.
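The override behaviour that the quick-run text describes (fields in the child config replacing matching fields inherited from `_base_`) can be pictured with a plain-Python sketch. This is not MMEngine's implementation: `merge_config` is a hypothetical helper, and the `base` values below are placeholders chosen for illustration rather than copied from the real `_base_` files.

```python
from copy import deepcopy


def merge_config(base: dict, override: dict) -> dict:
    """Overlay `override` on a copy of `base` (hypothetical helper).

    Nested dicts are merged key by key; any other value, including a
    list, replaces the inherited value wholesale.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Placeholder base values (assumed for illustration only).
base = dict(
    default_hooks=dict(
        checkpoint=dict(type='CheckpointHook', interval=20),
        logger=dict(type='LoggerHook', interval=5),
    ),
    train_cfg=dict(type='EpochBasedTrainLoop', max_epochs=1200, val_interval=20),
    param_scheduler=[dict(type='PolyLR', power=0.9)],
)

# The overrides added to the quick-run config in this change.
override = dict(
    default_hooks=dict(
        checkpoint=dict(
            type='CheckpointHook',
            interval=10,
            max_keep_ckpts=1,
        )),
    train_cfg=dict(type='EpochBasedTrainLoop', max_epochs=400, val_interval=10),
    param_scheduler=[
        dict(type='ConstantLR', factor=1.0),
    ],
)

merged = merge_config(base, override)
print(merged['default_hooks']['checkpoint'])
print(merged['train_cfg'])
```

In this sketch the untouched `logger` hook survives the merge, while `checkpoint`, `train_cfg`, and `param_scheduler` take the quick-run values. The MMEngine config documentation remains the authority on the exact merge rules.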
--- a/docs/zh_cn/get_started/quick_run.md
+++ b/docs/zh_cn/get_started/quick_run.md
@@ -54,24 +54,29 @@ tar xzvf mini_icdar2015.tar.gz -C data/

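 In this example, we will train a DBNet with resnet18 as its backbone. Since MMOCR already has a config for the full ICDAR 2015 dataset (`configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py`), we only need to make a few modifications on top of it.

-We first need to modify the path to the dataset. In this config, most of the key config files are imported in `_base_`; for example, the dataset config comes from `configs/_base_/det_datasets/icdar2015.py`. Open that file and replace the path that `ic15_det_data_root` points to in the first line:
+We first need to modify the path to the dataset. In this config, most of the key config files are imported in `_base_`; for example, the dataset config comes from `configs/textdet/_base_/datasets/icdar2015.py`. Open that file and replace the path that `icdar2015_textdet_data_root` points to in the first line: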

 ```Python
-ic15_det_data_root = 'data/det/mini_icdar2015'
+icdar2015_textdet_data_root = 'data/mini_icdar2015'
 ```

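 In addition, because the dataset is smaller, we also reduce the number of training epochs to 400, shorten the validation and checkpoint-saving intervals to 10 epochs, and drop the learning rate decay strategy. Put the following lines directly into `configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py` for them to take effect: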

 ```Python
-# Save checkpoints every 10 epochs
-default_hooks = dict(checkpoint=dict(type='CheckpointHook', interval=10), )
+# Save checkpoints every 10 epochs, and keep only the latest checkpoint
+default_hooks = dict(
+    checkpoint=dict(
+        type='CheckpointHook',
+        interval=10,
+        max_keep_ckpts=1,
+    ))
 # Set the maximum number of epochs to 400, and run validation every 10 epochs
 train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=400, val_interval=10)
 # Keep the learning rate constant, i.e. do not apply learning rate decay
 param_scheduler = [dict(type='ConstantLR', factor=1.0),]
 ```

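-Here, we have directly overridden the corresponding parameters of the base config through the config [inheritance](https://mmengine.readthedocs.io/zh_CN/latest/tutorials/config.html) mechanism. The original fields are distributed in `configs/_base_/schedules/schedule_sgd_1200e.py` and `configs/_base_/textdet_default_runtime.py`; interested readers may check them out.
+Here, we have directly overridden the corresponding parameters of the base config through the config inheritance ({external+mmengine:doc}`MMEngine: Config <advanced_tutorials/config>`) mechanism. The original fields are distributed in `configs/textdet/_base_/schedules/schedule_sgd_1200e.py` and `configs/textdet/_base_/default_runtime.py`; interested readers may check them out.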

 ```{note}
 For a more detailed description of config files, please refer to [here](../user_guides/config.md).
@@ -126,7 +131,7 @@ python tools/train.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.

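 ## Testing

-After waiting for tens of minutes, the model finished its 400 epochs of training. From the console output, we observe that DBNet performs best in the last epoch, with `hmean` reaching 60.86:
+After waiting for tens of minutes, the model finished its 400 epochs of training. From the console output, we observe that DBNet performs best in the last epoch, with `hmean` reaching 60.86 (you may get a slightly different result):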

 ```Bash
 08/22 19:24:52 - mmengine - INFO - Epoch(val) [400][100/100]  icdar/precision: 0.7285  icdar/recall: 0.5226  icdar/hmean: 0.6086
@@ -138,7 +143,7 @@ python tools/train.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.

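 However, this value only reflects DBNet's performance on the mini ICDAR 2015 dataset. To judge its detection ability more objectively, we also need to see how it performs on out-of-distribution datasets. For example, `tests/data/det_toy_dataset` is a very small real dataset that we can use to verify the actual performance of DBNet.

-Before testing, we likewise need to change the location of the dataset. Open `configs/_base_/det_datasets/icdar2015.py` and change the `data_root` of `icdar2015_textdet_test` to `tests/data/det_toy_dataset`:
+Before testing, we likewise need to change the location of the dataset. Open `configs/textdet/_base_/datasets/icdar2015.py` and change the `data_root` of `icdar2015_textdet_test` to `tests/data/det_toy_dataset`: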

 ```Python
 # ...
@@ -182,7 +187,7 @@ python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.p
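 To get a more intuitive sense of the model's output, we can also visualize its predictions directly. In `test.py`, users can open a pop-up visualization window with the `show` parameter, or specify the directory to which the prediction result images are exported with the `show-dir` parameter.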

 ```Bash
-python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py work_dirs/dbnet_r18_fpnc_1200e_icdar2015/epoch_400.pth --show-dir imgs/
+python tools/test.py configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py work_dirs/dbnet_resnet18_fpnc_1200e_icdar2015/epoch_400.pth --show-dir imgs/
 ```

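 The ground-truth labels and predictions are shown side by side in the visualization results. The green boxes in the left panel indicate the ground-truth labels, and the red boxes in the right panel indicate the predictions.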
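The validation log quoted in both language versions of the doc reports `icdar/precision: 0.7285`, `icdar/recall: 0.5226`, and `icdar/hmean: 0.6086`. Assuming `hmean` is the usual harmonic mean of precision and recall (which the logged numbers are consistent with), the relationship can be checked with a short sketch. The `hmean` function below is a hypothetical helper, not MMOCR's metric code.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        # Avoid dividing by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the validation log line in the docs.
value = hmean(0.7285, 0.5226)
print(f"{value:.4f}")  # prints 0.6086
```

The docs quote this as 60.86, which is the same value expressed as a percentage.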
--- a/tests/data/det_toy_dataset/instances_test.json
+++ b/tests/data/det_toy_dataset/instances_test.json
--- a/tests/data/det_toy_dataset/textdet_test.json
+++ b/tests/data/det_toy_dataset/textdet_test.json
--- a/tests/test_datasets/test_dataset_wrapper.py
+++ b/tests/test_datasets/test_dataset_wrapper.py
@@ -34,13 +34,13 @@ class TestConcatDataset(TestCase):
            data_root=osp.join(
                osp.dirname(__file__), '../data/det_toy_dataset'),
            data_prefix=dict(img_path='imgs'),
-            ann_file='instances_test.json')
+            ann_file='textdet_test.json')

        self.dataset_a_with_pipeline = dataset(
            data_root=osp.join(
                osp.dirname(__file__), '../data/det_toy_dataset'),
            data_prefix=dict(img_path='imgs'),
-            ann_file='instances_test.json',
+            ann_file='textdet_test.json',
            pipeline=[dict(type='MockTransform', return_value=1)])

        # create dataset_b
@@ -50,12 +50,12 @@ class TestConcatDataset(TestCase):
            data_root=osp.join(
                osp.dirname(__file__), '../data/det_toy_dataset'),
            data_prefix=dict(img_path='imgs'),
-            ann_file='instances_test.json')
+            ann_file='textdet_test.json')
        self.dataset_b_with_pipeline = dataset(
            data_root=osp.join(
                osp.dirname(__file__), '../data/det_toy_dataset'),
            data_prefix=dict(img_path='imgs'),
-            ann_file='instances_test.json',
+            ann_file='textdet_test.json',
            pipeline=[dict(type='MockTransform', return_value=2)])

    def test_init(self):
@@ -83,7 +83,7 @@ class TestConcatDataset(TestCase):
                data_root=osp.join(
                    osp.dirname(__file__), '../data/det_toy_dataset'),
                data_prefix=dict(img_path='imgs'),
-                ann_file='instances_test.json')
+                ann_file='textdet_test.json')
            ConcatDataset(datasets=[dataset_a, dataset_b])
        # test lazy init
        ConcatDataset(
--- a/tests/test_utils/test_fileio.py
+++ b/tests/test_utils/test_fileio.py
@@ -132,8 +132,10 @@ class TestIsArchive(unittest.TestCase):
 class TestCheckIntegrity(unittest.TestCase):

    def setUp(self) -> None:
-        self.file1 = ('tests/data/det_toy_dataset/instances_test.json',
-                      '77b17b0125996af519ef82aaacc8d96b')
+        # Do not use text files for tests, because the md5 value of text files
+        # differs across platforms (LF vs. CRLF line endings)
+        self.file1 = ('tests/data/det_toy_dataset/imgs/test/img_2.jpg',
+                      '52b28b5dfc92d9027e70ec3ff95d8702')
        self.file2 = ('tests/data/det_toy_dataset/imgs/test/img_1.jpg',
                      'abc123')
        self.file3 = ('abc/abc.jpg', 'abc123')
@@ -151,8 +153,10 @@ class TextGetMD5(unittest.TestCase):
 class TextGetMD5(unittest.TestCase):

    def setUp(self) -> None:
-        self.file1 = ('tests/data/det_toy_dataset/instances_test.json',
-                      '77b17b0125996af519ef82aaacc8d96b')
+        # Do not use text files for tests, because the md5 value of text files
+        # differs across platforms (LF vs. CRLF line endings)
+        self.file1 = ('tests/data/det_toy_dataset/imgs/test/img_2.jpg',
+                      '52b28b5dfc92d9027e70ec3ff95d8702')
        self.file2 = ('tests/data/det_toy_dataset/imgs/test/img_1.jpg',
                      'abc123')
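The reason given in the new test comments (text-file checksums vary by platform) can be reproduced with the standard library alone. The sketch below assumes the variation comes from line-ending conversion at checkout, for example LF rewritten as CRLF. The sample bytes and the `md5_of` helper are made up for illustration and are unrelated to the repository's fixtures.

```python
import hashlib


def md5_of(data: bytes) -> str:
    """Return the hex MD5 digest of a byte string (hypothetical helper)."""
    return hashlib.md5(data).hexdigest()


# The same JSON text with Unix (LF) and Windows (CRLF) line endings.
unix_text = b'{\n  "name": "toy"\n}\n'
windows_text = unix_text.replace(b"\n", b"\r\n")

# The digests differ even though the visible content is identical.
print(md5_of(unix_text) == md5_of(windows_text))  # prints False
```

A binary file such as a JPEG is normally not subject to line-ending conversion, which is presumably why the tests switch to `img_2.jpg` for the checksum fixture.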