96 Commits (1398e4200e19386894829a4418c03a491b434a9f)

Author SHA1 Message Date
Shu Liqiang b8a31671a4
[Feature] Runner supports setting the number of iterations for per epoch (#1292) 2023-10-08 16:43:44 +08:00
6V e9e08dbb65
[Enhance] Add unit tests for autocast with Ascend device (#1363) 2023-09-27 10:20:13 +08:00
Zeyuan ccd17571ce
[Feature] Implement gradient checkpointing (#1319) 2023-09-04 23:29:24 +08:00
i-aki-y 6df9621a06
[Feature] Add support for full wandb's define_metric arguments (#1099) 2023-06-01 21:50:29 +08:00
Mashiro 298a4b1e49
[Fix] Fix build unnecessary loop during train/test/val (#1107)
* [Fix] Fix build unnecessary loop during train/test/val

* move unit test to runner

* Update unit test

* Fix unit test

* check train_loop is None

* update comment

* replace(type(None)) with is not None
2023-04-27 19:20:35 +08:00
Ma Zerun 49b27dd83f
[Improve] Support `_load_state_dict_post_hooks` in `load_state_dict`. (#1103)
* [Improve] Support `_load_state_dict_post_hooks` in `load_state_dict`.

* Update

* Add unit test
2023-04-26 16:48:57 +08:00
Zaida Zhou cdec4cbd4a
[Fix] collate_fn does not support passing a function object (#1093) 2023-04-24 20:42:54 +08:00
shufan wu 2aef53d3fa
[Fix] No training log when the num of iterations is smaller than the interval (#1046) 2023-04-24 12:29:20 +08:00
Mashiro 17c5414d16
[Fix] Fix the resuming error caused by HistoryBuffer (#1078) 2023-04-21 17:23:38 +08:00
Mashiro f1aca8e307
[Fix] Failed to remove the previous best checkpoints (#1086)
* [Fix] Only reserve one best checkpoint

* [Fix] Only reserve one best checkpoint

* Fix unit test

* shutdown logging

* clean the save_checkpoint logic
2023-04-20 21:28:56 +08:00
Mashiro be347df770
[Fix] KeyError is thrown in _collect_scalars when log_with_hierarchy is True (#1085)
* Fix log processor

* Fix custom key
2023-04-20 10:52:32 +08:00
黄启元 60b4c199fc
[Feature] Support MLU backend (#1075)
* support mlu device

* support mlu device

* fix lint error

* fix lint error builder.py

* fix lint error in amp.py

* fix lint errors

* fix data type in instance_data.py
2023-04-14 19:06:19 +08:00
shufan wu 5e1ed7aaf0
[Enhance] Allow users to customize worker_init_fn of Dataloader (#1038)
* customize worker init fn function

* add assert

* narrow worker_init_fn type
2023-04-10 17:32:36 +08:00
zhouhui 093068e4ff
[Enhancement] Align the evaluation result in log (#1034)
* align the evaluation result in log

* align the evaluation result in log

* align the evaluation result in log

* align the evaluation result in log

* fix test log_processor
2023-04-04 00:17:42 +08:00
Mashiro 83c4f3e643
[Enhance] Make sure the FileHandler still alive after `torch.compile` (#1021)
* [Enhance] Make sure the FileHandler still alive after

* Resume filter

* avoid bc

* Fix unit test

* clean the code

* revert changes and set mode from 'm' to 'a'

* mode to file_mode

* add comments

* refine comments

* Fix duplicated the
2023-03-30 17:41:26 +08:00
Mashiro eb79d64af1
[Fix] Add PyTorch 2.0 CI and fix unit tests (#1026)
* [Enhance] Make sure the FileHandler still alive after

* minor refine

* minor refine

* refine unit test

* update CI

* update CI

* Fix CI

* fix build_windows

* fix build_windows

* fix build_windows

* fix build_windows

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* test windows CI

* Fix

* Debug

* Fix unit test

* Fix unit test

* Fix CI

* update image version

* update action/checkout and action/setup-python

* add condition to skip test compile

* [Fix] Update CI and fix unit test

* check compiling by attempting compilation

* check compiling by attempting compilation

* check compiling by attempting compilation

* use windows-2022 in runs on

* Apply suggestions from code review

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* update yml

* remove unnecessary assert

* assert grad is None according to the PyTorch version

* Fix code

---------

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2023-03-29 13:09:23 +08:00
Mashiro d174952a3e
[Fix] Fix there is no space between data_time and metric (#1025) 2023-03-28 19:20:56 +08:00
Mashiro 8063d2cce7
[Enhancement] Support writing data to `vis_backend` with prefix (#972)
* Log with prefix

* Fix test of loggerhook

* minor refine

* minor refine

* Fix unit test

* clean the code

* deepcopy in method

* replace regex

* Fix as comment

* Enhance readable

* rename reserve_prefix to remove_prefix

* Fix as comment

* Refine unit test

* Adjust sequence

* clean the code

* clean the code

* revert renaming reserve prefix

* Count the dataloader length in _get_dataloader_size
2023-03-13 13:07:37 +08:00
Qian Zhao 0d25625ba2
[Feature] Support torch.compile since PyTorch2.0 (#976)
* enable compile configurations to support torch.compile in Runner

* enable compilation in train, val and test

* fix as comments

* add docstring to illustrate usage

* minor refine error message

* add unittests

* fix ut skip

* add logging message to inform users

* compile `train_step`, `val_step`, `test_step` instead

* fix as comments

* revert to compile `train_step` only due to pt2 issue

* add documentation about torch.compile
2023-03-12 18:26:43 +08:00
Mashiro 44f30f649e
[Enhancement] Add `FUNCTIONS` root Registry (#983)
* [Enhancement] Add FUNCTIONS Registry

* Refine as comment

* clean the code
2023-03-08 12:53:24 +08:00
Mashiro dbae83c52f
[Enhancement] Replace warnings.warn with print_log (#961)
* Replace warning with print_log

* Add comments for testing warning
2023-03-06 17:25:28 +08:00
Zaida Zhou c94e7518e5
[Enhancement] Clear UT warning caused by pytest (#947)
* [Enhancement] Clear UT warning caused by pytest

* revert some changes for unittest

* revert

* update

* clear a numpy warning

* Update tests/test_visualizer/test_vis_backend.py

* fix a warning
2023-02-22 12:17:56 +08:00
wxDai 1d97c07068
[Docs] Fix typo shedule (#936) 2023-02-19 20:44:24 +08:00
Ma Zerun fcd783fcb2
[Enhance] Support non-scalar type metric value. (#827)
* [Enhance] Support non-scalar type metric value.

* Refactor support.

* Fix non-scalar tags problem during validation

* Update tag processor.
2023-01-12 20:28:55 +08:00
Mashiro d876d4e0f8
[Enhance] Add support of TorchVision's Model Registration API (#793)
* enhance get_torchvision_model

* remove mmcv
2022-12-11 22:08:51 +08:00
songyuc 6636f07cfe
[Feature] Add get_hooks_info() to print hooks messages (#672)
* Add test of get_hooks_info()

* Change to use original Runner for get_hook_info() test

* Change to test after_train_iter hooks for get_hook_info()

* Complement the stages list

* Add logging hooks information in Runner.__init__()

* Rearrange the stages list

* Restore the stages to tuple type

* Clean the unnecessary changes

* Replace  statement with TestCase's methods

* add test stages in method_stages_map

* change the hooks info into a f-string

* return list(trigger_stages) directly

* change keys of method_stages_map

* Fix previous changes to method_stages_map.keys
2022-11-22 20:02:29 +08:00
Mashiro b06234cfcd
[Enhance] Right align the log (#436)
* right align the log

* fix as comment

* Add comments for magic number 3

* remove max_len_str

* Update mmengine/runner/log_processor.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2022-11-21 11:55:18 +08:00
Mashiro c478bdca27
[Enhance] enhance runner test case (#631)
* Add runner test case

* Fix unit test

* fix unit test

* pop None if key does not exist

* Fix is_model_wrapper and force register class in test_runner

* [Fix] Fix is_model_wrapper

* destroy group after ut

* register module in testcase

* fix as comment

* minor refine

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* fix lint

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2022-11-21 11:54:05 +08:00
Mashiro abe56651db
[Fix] Make autocast compatible with mps (#587)
* [Fix] Make autocast compatible with mps

* Enhance unit test

* fix unit test

* clean the code

* fix unit test
2022-10-18 18:03:17 +08:00
Mashiro 62456217f9
[Feature] Add test time augmentation base model. (#538)
* First commit

* add BaseTestTimeAugModel

* Add unit test

* move loop logic to test_step

* fix ddp test

* rename model to module

* optim __init__

* Fix as comment

* Fix as comment

* make val_step should not be called

* make tta do not inherit base model

* Fix unit test

* Enhance docstring

* Fix as comment

* Fix as comment

* minor refine

* minor refine

* minor refine

* fix unit test

* minor refine

* minor refine

* minor refine

* minor refine

* minor refine

* minor refine

* fix unit test
2022-10-10 14:29:33 +08:00
Mashiro 8ee31dbc3b
[Feature] Support convert `BN` to `SyncBN` by config (#506)
* [Feature] Support convert BN to SyncBN by config

* make unit test compatible with cpu

* Fix as comment

* fix unit test

* change signature of convert_sync_batchnorm: rename sync_bn to implemention

* fix unit test

* fix unit test
2022-09-15 18:08:36 +08:00
Mashiro 6b1b8a3751
[Fix] Fix unit test in windows (#515) 2022-09-13 11:46:21 +08:00
Mashiro 8770c6c7fc
[Refactor] Refactor data flow to make the interface more natural (#468)
* [Refactor]: modify interface of Visualizer.add_datasample (#365)

* [Refactor] Refactor data flow: refine `data_preprocessor`. (#359)

* refine data_preprocessor

* remove unused BATCH_DATA alias

* Fix type hints

* rename move_data to cast_data

* [Refactor] Refactor data flow: collate data in `collate_fn` of `DataLoader`  (#323)

* collate data in dataloader

* fix docstring

* refine comment

* fix as comment

* refactor default collate and pseudo collate

* format test file

* fix docstring

* fix as comment

* rename elem to data_item

* minor fix

* fix as comment

* [Refactor] Refactor data flow: `data_batch` argument of `Evaluator.process` is a `dict` (#360)

* refine evaluator and metric

* compatible with new default collate

* replace default collate with pseudo

* Handle data_batch in metric

* fix unit test

* fix unit test

* fix unit test

* minor refine

* make data_batch optional

make data_batch optional

* rename outputs to predictions

* fix ut

* rename predictions to outputs

* fix docstring

* fix docstring

* fix unit test

* make outputs and data_batch to kwargs

* fix unit test

* keep signature of metric

* fix ut

* rename pred_sample arguments to data_sample(Visualizer)

* fix loop and ut

* [refactor]: Refactor model dataflow (#398)

* [Refactor] Refactor data flow: refine `data_preprocessor`. (#359)

* refine data_preprocessor

* remove unused BATCH_DATA alias

* Fix type hints

* rename move_data to cast_data

* refactor model data flow

tmp_commt

tmp commit

* make val_cfg and test_cfg optional

* roll back runner

* pass test mmdet

* fix as comment

fix as comment

fix ci in DataPreprocessor

* fix ut

* fix ut

* fix rebase main

* [Fix]: Fix test val ddp (#462)

* [Fix] Fix docstring and type hint of data flow (#463)

* Fix docstring of data flow

* change signature of hook

* fix unit test

* resolve conflicts

* fix lint
2022-08-24 22:04:55 +08:00
Zaida Zhou 7e1d7af2d9
[Refactor] Refactor code structure (#395)
* Rename data to structure

* adjust the way to import module

* adjust the way to import module

* rename Structure to Data Structures in docs api

* rename structure to structures

* support using some modules of mmengine without torch

* fix circleci config

* fix circleci config

* fix registry ut

* minor fix

* move init method from model/utils to model/weight_init.py

* move init method from model/utils to model/weight_init.py

* move sync_bn to model

* move functions depending on torch to dl_utils

* format import

* fix logging ut

* add weight init in model/__init__.py

* move get_config and get_model to mmengine/hub

* move log_processor.py to mmengine/runner

* fix ut

* Add TimeCounter in dl_utils/__init__.py
2022-08-24 19:14:07 +08:00
Mashiro e907931fb8
Fix unit tests (#449) 2022-08-21 14:54:24 +08:00
Mashiro 4abf1a0454
[Enhance] Support build evaluator from list of built metric (#423)
* Support build evaluator from list of built metric

* register evaluator

* fix as comment

* add unit test
2022-08-19 10:56:51 +08:00
Mashiro d6ad01a4cf
[Fix]: fix ci (#441) 2022-08-18 14:04:19 +08:00
Mashiro e08b9031fc
[Enhance] Support building optimizer wrapper from built Optimizer instance (#422)
* support build optimizer wrapper from built Optimizer instance

* refine comments
2022-08-17 19:17:00 +08:00
Zaida Zhou f98ba60629
[Enhancement] Improve unit tests of mmengine/runner (#182)
* [Enhancement] Add unit test for get_priority

* fix priority ut

* fix typo

Co-authored-by: Wenwei Zhang <40779233+ZwwWayne@users.noreply.github.com>
2022-08-15 10:57:58 +08:00
Mashiro 2708b7ed48
fix ci (#424) 2022-08-13 09:15:08 +08:00
Mashiro ee56f151f6
[Fix] Support training with data without `metainfo`. (#417)
* support training with data without metainfo

* clean the code

* clean the code
2022-08-11 14:51:11 +08:00
Ma Zerun 9b2a0e02da
[Enhance] Add `data_preprocessor` config as an argument of runner. (#343)
* [Enhance] Add `preprocess_cfg` as an argument of runner.

* Rename `preprocess_cfg` to `data_preprocessor`

* Fix docstring
2022-08-09 11:25:29 +08:00
Mashiro a07a063306
[Enhance] Add build function for scheduler. (#372)
* add build function for scheduler

* add unit test

add unit test

* handle convert_to_iter in build_scheduler_from_cfg

* restore deleted code

* format import

* fix lint
2022-08-08 20:34:16 +08:00
Mashiro 5580542666
[Fix] Fix build multiple list of scheduler for multiple optimizers (#383)
* fix build multiple scheduler

* add new unit test

* fix comment and error message

* fix comment and error message

* extract _parse_scheduler_cfg

* always call build_param_scheduler during train and resume. If there is only one optimizer, the default value of the scheduler will be a list; if there are multiple optimizers, the default value of the scheduler will be a dict

* minor refine

* rename runner test exp name

* fix as comment

* minor refine

* fix ut

* only check parameter scheduler

* minor refine
2022-08-08 17:05:27 +08:00
Mashiro 1a8f013937
[Refine] Make scheduler default to None (#396)
* make scheduler default to None

* fix bc breaking

* refine warning message

* fix as comment

* fix as comment

* fix lint
2022-08-04 20:13:13 +08:00
RangiLyu 1241c21296
[Fix] Fix weight initializing in test and refine registry logging. (#367)
* [Fix] Fix weight initializing and registry logging.

* sync params

* resolve comments
2022-07-19 18:28:57 +08:00
Ma Zerun 3da66d1f87
[Enhance] Auto set the `end` of param schedulers. (#361)
* [Enhance] Auto set the `end` of param schedulers.

* Add log output and unit test

* Update docstring

* Update unit tests of `CosineAnnealingParamScheduler`.
2022-07-15 19:53:28 +08:00
Mashiro 78fad67d0d
[Fix] fix resume message_hub (#353)
* fix resume message_hub

* add unit test

* support resume from messagehub

* minor refine

* add comment

* fix typo

* update docstring
2022-07-14 20:13:22 +08:00
Mashiro 2853045e96
[Fix] Fix build multiple runners error (#348)
* fix build multiple runner error

* fix comments

* fix cpu ci
2022-07-05 20:35:06 +08:00
Cedric Luo 9c55b4300c
[Enhance] Support dynamic interval (#342)
* support dynamic interval in iterbasedtrainloop

* update typehint

* update typehint

* add dynamic interval in epochbasedtrainloop

* update

* fix

Co-authored-by: luochunhua.vendor <luochunhua@pjlab.org.cn>
2022-06-30 15:08:56 +08:00