338 Commits (b638d3b1fe931bed233dae868c7960078ed34899)

Author SHA1 Message Date
Mashiro b638d3b1fe
[Fix] Fix new config (#1227) 2023-07-01 22:35:11 +08:00
Mashiro 1480261e8f
[Enhance] Config adds copy method (#1224) 2023-06-30 17:11:16 +08:00
Mashiro f930b9fe53
Fix docstring format (#1223) 2023-06-30 10:39:19 +08:00
Mashiro 399f76ffa8
[Experimental] Add support for FSDP (#1213) 2023-06-29 15:19:33 +08:00
Mashiro 478c952a6d
[Refactor] Replace 'if base' with 'with read_base' context manager (#1207) 2023-06-25 13:53:19 +08:00
Mashiro 04b0ffee76
[Fix] Fix ut error in docker (#1204) 2023-06-16 22:05:08 +08:00
Mashiro 6ece63ed35
[Feature] Support Pure Python style Configuration File (#1071) 2023-06-16 12:52:07 +08:00
Akide Liu 94e7a3bb57
[Enhance] Learning rate in log can show the base learning rate of optimizer (#1019) 2023-06-08 19:51:15 +08:00
Zaida Zhou 19aa1eb780
[Fix] Save checkpoint again to update best_ckpt of ckpt (#1168) 2023-06-02 14:42:56 +08:00
i-aki-y 6df9621a06
[Feature] Add support for full wandb's define_metric arguments (#1099) 2023-06-01 21:50:29 +08:00
vugia truong 68414516aa
[Feature] Add vis backend for clearml (#878) (#1091) 2023-06-01 17:41:34 +08:00
Zaida Zhou 193b7fdfcc
[Refactor] Let unit tests not affect each other (#1169) 2023-05-27 22:36:04 +08:00
Mashiro 5d4e72144a
[Fix] Fix `ProfileHook` can not profile performance in ddp-training (#1140) 2023-05-26 10:55:15 +08:00
Mashiro 2085046d22
[Fix] The ann_file and data_root of BaseDataset can be None (#850) 2023-05-04 22:22:52 +08:00
Mashiro 3715fea15b
[Refactor] Refactor the unit tests of SyncBuffersHook (#813) 2023-04-28 17:32:30 +08:00
Mashiro 298a4b1e49
[Fix] Fix build unnecessary loop during train/test/val (#1107)
* [Fix] Fix build unnecessary loop during train/test/val

* move unit test to runner

* Update unit test

* Fix unit test

* check train_loop is None

* update comment

* replace(type(None)) with is not None
2023-04-27 19:20:35 +08:00
Ma Zerun 49b27dd83f
[Improve] Support `_load_state_dict_post_hooks` in `load_state_dict`. (#1103)
* [Improve] Support `_load_state_dict_post_hooks` in `load_state_dict`.

* Update

* Add unit test
2023-04-26 16:48:57 +08:00
Mashiro 6ba667c8cf
[Fix] Save optimizer.state_dict() in cpu by default (#966) 2023-04-26 16:47:47 +08:00
Mashiro 9868131c98
[Enhance] Enhance error message during custom import (#1102) 2023-04-26 11:08:58 +08:00
Zaida Zhou cdec4cbd4a
[Fix] collate_fn does not support passing a function object (#1093) 2023-04-24 20:42:54 +08:00
shufan wu 2aef53d3fa
[Fix] No training log when the num of iterations is smaller than the interval (#1046) 2023-04-24 12:29:20 +08:00
Mashiro 4afed1332b
[Enhance] Visualizer.show supports calling opencv to show images (#1015)
* [Enhance] Enhance the efficiency of Visualizer.show

* Update unit test

* Simplify the logic of creating opencv window

* Update docstring

* Update unit test

* Update mmengine/visualization/visualizer.py

---------

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2023-04-23 20:30:29 +08:00
cyberslack_lee 0687b377b2
[Enhancement] MessageHub.get_info() supports returning a default value (#991) 2023-04-23 17:35:35 +08:00
sjiang95 fafb476e58
[Feature] get_model_complexity_info() supports multiple inputs (#1065) 2023-04-23 16:11:31 +08:00
Mashiro 17c5414d16
[Fix] Fix the resuming error caused by HistoryBuffer (#1078) 2023-04-21 17:23:38 +08:00
Mashiro f1aca8e307
[Fix] Failed to remove the previous best checkpoints (#1086)
* [Fix] Only reserve one best checkpoint

* [Fix] Only reserve one best checkpoint

* Fix unit test

* shutdown logging

* clean the save_checkpoint logic
2023-04-20 21:28:56 +08:00
Mashiro be347df770
[Fix] KeyError is thrown in _collect_scalars when log_with_hierarchy is True (#1085)
* Fix log processor

* Fix custom key
2023-04-20 10:52:32 +08:00
Mashiro a7d4b7c742
[Enhance] Support configuring directory used to synchronize results in BaseMetric (#1074)
* [Enhance] Support configuring synchronize directory for BaseMetric

* Raise error if tmpdir is not a shared directory for all ranks

* Raise error if tmpdir is not a shared directory for all ranks

* Update mmengine/evaluator/metric.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* refine

* Update mmengine/evaluator/metric.py

---------

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2023-04-14 19:15:04 +08:00
黄启元 60b4c199fc
[Feature] Support MLU backend (#1075)
* support mlu device

* support mlu device

* fix lint error

* fix lint error builder.py

* fix lint error in amp.py

* fix lint errors

* fix data type in instance_data.py
2023-04-14 19:06:19 +08:00
Qian Zhao b2ad2210b5
[Feature] Support registering partial functions and more (#595)
* support registering partial functions

* Update mmengine/registry/build_functions.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/registry/registry.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Revert unit test and refine

* add current logger and set log level

---------

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
Co-authored-by: HAOCHENYE <21724054@zju.edu.cn>
2023-04-10 19:42:04 +08:00
shufan wu 5e1ed7aaf0
[Enhance] Allow users to customize worker_init_fn of Dataloader (#1038)
* customize worker init fn function

* add assert

* narrow worker_init_fn type
2023-04-10 17:32:36 +08:00
sung-hwa kim 8bf1ecad38
[Feature] Add vis backend for MLflow. (#878)
* add vis mlflow backend
2023-04-07 16:35:41 +08:00
Mashiro 5762b28847
[Refactor] Refactor logger hook unit tests (#797)
* Enhance config

* add unit test data

* refactor unit tests of loggerhook

* fix rebase error

* Fix permission error in windows

* Fix CI

* Fix windows ci

* Fix windows ci

* Fix windows ci

* Fix windows CI

* Apply suggestions from code review

Co-authored-by: Qian Zhao <112053249+C1rN09@users.noreply.github.com>

* clean the code

* Refine as comment

* Refine error raising

* Update mmengine/hooks/logger_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* replace assert_called_with with assert_has_calls

* Fix as comment

* Do not remove filehandler and fix unit test

---------

Co-authored-by: Qian Zhao <112053249+C1rN09@users.noreply.github.com>
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2023-04-07 16:20:38 +08:00
Mashiro 2dbc8ed253
[Refactor] Refactor checkpointhook unit tests (#789)
* Enhance config

* add unit test data

* Refactor unit tests of checkpointhook

* add comments

* Fix unit test

* remove _get_metric_scope

* tmp save

* Revert "remove _get_metric_scope"

This reverts commit eeb7a8c5ed.

* Revert "Revert "remove _get_metric_scope""

This reverts commit 5398255f6f.

* Revert "tmp save"

This reverts commit cdc9919be8.

* clean the code

* Fix ut

* minor fix

* use str.replace
2023-04-06 10:55:16 +08:00
Mashiro dc931fd2c0
[Fix] Initialize nested modules in ddp which define 'init_weights' method (#1045) 2023-04-05 10:33:24 +08:00
zhouhui 093068e4ff
[Enhancement] Align the evaluation result in log (#1034)
* align the evaluation result in log

* align the evaluation result in log

* align the evaluation result in log

* align the evaluation result in log

* fix test log_processor
2023-04-04 00:17:42 +08:00
Mashiro 83c4f3e643
[Enhance] Make sure the FileHandler still alive after `torch.compile` (#1021)
* [Enhance] Make sure the FileHandler still alive after

* Resume filter

* avoid bc

* Fix unit test

* clean the code

* revert changes and set mode from 'm' to 'a'

* mode to file_mode

* add comments

* refine comments

* Fix duplicated the
2023-03-30 17:41:26 +08:00
Mashiro eb79d64af1
[Fix] Add PyTorch 2.0 CI and fix unit tests (#1026)
* [Enhance] Make sure the FileHandler still alive after

* minor refine

* minor refine

* refine unit test

* update CI

* update CI

* Fix CI

* fix build_windows

* fix build_windows

* fix build_windows

* fix build_windows

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* test windows CI

* Fix

* Debug

* Fix unit test

* Fix unit test

* Fix CI

* update image version

* update action/checkout and action/setup-python

* add condition to skip test compile

* [Fix] Update CI and fix unit test

* check compiling by attempting compilation

* check compiling by attempting compilation

* check compiling by attempting compilation

* use windows-2022 in runs on

* Apply suggestions from code review

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* update yml

* remove unnecessary assert

* assert grad is None according to the PyTorch version

* Fix code

---------

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2023-03-29 13:09:23 +08:00
KerwinKai 5b35c5b6ad
[Feature] Publish models after training if published_keys is set in CheckpointHook (#987)
* add publish keys in checkpointhook and update hook.md file

* Update checkpoint_hook.py

To avoid `mypy` warning `mmengine/hooks/checkpoint_hook.py:358: error: Unsupported right operand type for in ("Optional[List[str]]") Found 1 error in 1 file (checked 224 source files)`

* Update hook.md

Try to avoid the trim-trailing-whitespace warning in hook.md

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update checkpoint_hook.py

* Update docs/en/tutorials/hook.md

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update hook.md

add "automatically publish the best and the last weights"

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update checkpoint_hook.py

add condition when the best checkpoints more than 1.

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update checkpoint_hook.py

delete re judge

* Update checkpoint_hook.py

* Update checkpoint_hook.py

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update checkpoint_hook.py

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update checkpoint_hook.py

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Add Test for publish model

* Update checkpoint_hook.py

* Update test_checkpoint_hook.py

* Fix file to pass pre-commit check

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Fix mypy warning

* rm not necessary line in checkpoint_hook.py

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* rm unnecessary messages add to message_hub

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update docs/zh_cn/tutorials/hook.md

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update docs/zh_cn/tutorials/hook.md

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* update checkpoint hook and hook.md file

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

---------

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2023-03-29 10:25:14 +08:00
Mashiro d174952a3e
[Fix] Fix there is no space between data_time and metric (#1025) 2023-03-28 19:20:56 +08:00
Mashiro ad33a7d0e5
[Fix] Fix inferencer gets wrong configs path (#996)
* [Fix] Fix inferencer get wrong configs path

* Update CI

* Fix indent

* Fix CI arguments

* gpu test in CI

gpu test in CI

* require lint

* Adjust pytorch version and cuda version

* Fix docker

* Fix docker syntax

* Use bash -c

* Use bash -c

* Replace is_installed with is_imported

* Fix

* Fix PYTHONPATH
2023-03-14 18:28:33 +08:00
Mashiro 395ebf8d82
[Enhancement] Support dumping logs of different ranks (#968)
* Add hostname

* Update mmengine/logging/logger.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Fix lint

* support record device id

* Fix unit test

* Clean the code

* Fix as comment

* Fix as comment

* Fix unit test

* Update doc

* Fix unit test

* Adjust sequence

* Replace \ with ()

* remove unnecessary ()

* does not change filename in single gpu training

* Fix ci

* fix docs

* Fix as comment

---------

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2023-03-13 14:35:31 +08:00
Mashiro 8063d2cce7
[Enhancement] Support writing data to `vis_backend` with prefix (#972)
* Log with prefix

* Fix test of loggerhook

* minor refine

* minor refine

* Fix unit test

* clean the code

* deepcopy in method

* replace regex

* Fix as comment

* Enhance readable

* rename reserve_prefix to remove_prefix

* Fix as comment

* Refine unit test

* Adjust sequence

* clean the code

* clean the code

* revert renaming reserve prefix

* Count the dataloader length in _get_dataloader_size
2023-03-13 13:07:37 +08:00
Qian Zhao 0d25625ba2
[Feature] Support torch.compile since PyTorch2.0 (#976)
* enable compile configurations to support torch.compile in Runner

* enable compilation in train, val and test

* fix as comments

* add docstring to illustrate usage

* minor refine error message

* add unittests

* fix ut skip

* add logging message to inform users

* compile `train_step`, `val_step`, `test_step` instead

* fix as comments

* revert to compile `train_step` only due to pt2 issue

* add documentation about torch.compile
2023-03-12 18:26:43 +08:00
Mashiro 6ea23a2f71
[Fix] Fix duplicated warning (#992)
* [Fix] Fix repeated warning

* Add type hint

* Fix unit test

* Rename recorder_dict to seen

* Fix as comment
2023-03-10 19:27:36 +08:00
Mashiro 7a074fa478
[Enhancement] Silence error when `ManagerMixin` built instance with duplicate name. (#990)
* [Fix]Silence error when ManagerMixin built duplicate name instance

* [Fix]Silence error when ManagerMixin built duplicate name instance

* Update mmengine/utils/manager.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

---------

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2023-03-10 11:07:31 +08:00
Mashiro 8beacd3b58
[Fix] Support calculate the flops of `matmul` with single dimension matrix (#970)
* Support calculate the flops of matmul

* Remove unnecessary type ignore

* Update mmengine/analysis/jit_handles.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

---------

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2023-03-09 17:29:26 +08:00
Mashiro 44f30f649e
[Enhancement] Add `FUNCTIONS` root Registry (#983)
* [Enhancement] Add FUNCTIONS Registry

* Refine as comment

* clean the code
2023-03-08 12:53:24 +08:00
Mashiro dbae83c52f
[Enhancement] Replace warnings.warn with print_log (#961)
* Replace warning with print_log

* Add comments for testing warning
2023-03-06 17:25:28 +08:00
Hakjin Lee b3430e4257
[Feature] Support EarlyStoppingHook (#739)
* [Feature] EarlyStoppingHook

* delete redundant line

* Assert stop_training and rename tests

* Fix UT

* rename `metric` to `monitor`

* Fix UT

* Fix UT

* edit docstring on patience

* Draft for new code

* fix ut

* add test case

* add test case

* fix ut

* Apply suggestions from code review

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Append hook

* Append hook

* Apply suggestions

* Update suggestions

* Update mmengine/hooks/__init__.py

* fix min_delta

* Apply suggestions from code review

* lint

* Apply suggestions from code review

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* delete save_last

* infer rule more robust

* refine unit test

* Update mmengine/hooks/early_stopping_hook.py

---------

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
Co-authored-by: zhouzaida <zhouzaida@163.com>
Co-authored-by: HAOCHENYE <21724054@zju.edu.cn>
2023-03-06 13:18:42 +08:00