Commit Graph

171 Commits (5b648c119fc0f8789b6cb285d00e8c7fe8cba2ba)

Author SHA1 Message Date
RangiLyu 1241c21296
[Fix] Fix weight initialization in test and refine registry logging. (#367)
* [Fix] Fix weight initialization and registry logging.

* sync params

* resolve comments
2022-07-19 18:28:57 +08:00
Ma Zerun 3da66d1f87
[Enhance] Auto set the `end` of param schedulers. (#361)
* [Enhance] Auto set the `end` of param schedulers.

* Add log output and unit test

* Update docstring

* Update unit tests of `CosineAnnealingParamScheduler`.
2022-07-15 19:53:28 +08:00
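
For context on the entry above: MMEngine parameter schedulers are declared in a config's `param_scheduler` list, where `begin`/`end` bound the iterations or epochs a scheduler is active in. A minimal, illustrative sketch (values are made up, not taken from the PR):

```python
# Each scheduler is active over [begin, end); with the enhancement above,
# a missing `end` can be filled in automatically (e.g. from the training length).
param_scheduler = [
    # linear warm-up during the first 500 iterations
    dict(type='LinearLR', start_factor=0.001, by_epoch=False, begin=0, end=500),
    # step decay afterwards, counted in epochs; `end` may now be omitted
    dict(type='MultiStepLR', by_epoch=True, milestones=[8, 11], gamma=0.1),
]
```
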
Mashiro 78fad67d0d
[Fix] fix resume message_hub (#353)
* fix resume message_hub

* add unit test

* support resume from messagehub

* minor refine

* add comment

* fix typo

* update docstring
2022-07-14 20:13:22 +08:00
Mashiro 45001a1f6f
[Enhance] Support using variables in base config directly as normal variables. (#329)
* first commit

* Support modify base config and add unit test

* remove import mmengine in config

* add unit test

* fix lint

* add unit test

* move RemoveAssignFromAST to config utils

* git add utils

* fix format issue in test file

* refine unit test

* refine unit test
2022-07-14 13:05:55 +08:00
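
The entry above concerns reusing values defined in a `_base_` config. As a rough illustration only (the exact syntax enabled by #329 may differ), MMEngine configs can reference base values with the `{{_base_.xxx}}` notation; the file name and fields below are hypothetical:

```python
# hypothetical child config; assumes ./base_config.py defines `num_classes`
_base_ = ['./base_config.py']

model = dict(
    type='ImageClassifier',                         # illustrative model type
    head=dict(num_classes={{_base_.num_classes}}),  # reuse the value from the base config
)
```
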
Mashiro 6b608b4ef1
[Enhance] Add `build_model_from_cfg` (#328)
* clean code

* fix as comment

* fix as comment

* add get_registry_by_scope method

* add unit test and docstring example

* rename get_registry_by_scope to switch_scope_and_registry

* move build function to registry/builder

* fix docstring

* rename builder->registry_builder, move build_from_cfg to registry_builder

* rename registry_builder to build_function

* fix docstring and type hint

* rename build_function to build_functions
2022-07-13 19:01:59 +08:00
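
A hedged sketch of the `build_model_from_cfg` added above, assuming it behaves like its mmcv counterpart: attached to a registry as the build function, it builds a single module from a dict and an `nn.Sequential` from a list of dicts.

```python
import torch.nn as nn
from mmengine.registry import Registry, build_model_from_cfg

# Toy registry wired up with build_model_from_cfg as its build function.
TOY_MODELS = Registry('toy_model', build_func=build_model_from_cfg)

@TOY_MODELS.register_module()
class ToyConv(nn.Module):
    def __init__(self, channels: int = 8):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        return self.conv(x)

block = TOY_MODELS.build(dict(type='ToyConv'))                          # single module
stack = TOY_MODELS.build([dict(type='ToyConv'), dict(type='ToyConv')])  # nn.Sequential
```
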
Mashiro b2ee9f8b11
[Fix] Fix loss could be nan in optimizer wrapper (#345)
* fix optimizer wrapper counts

* fix ut
2022-07-06 16:42:49 +08:00
RangiLyu a3d2916790
[Enhance] Support scheduling betas with MomentumScheduler. (#346)
* [Enhance] Support scheduling betas with MomentumScheduler.

* enhance ut

* test adam betas

* enhance ut

* enhance ut
2022-07-05 20:37:23 +08:00
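
Roughly, the change above lets momentum schedulers act on the first element of `betas` for Adam-style optimizers, which expose `betas` instead of `momentum`. A hedged config sketch (values are illustrative):

```python
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(type='Adam', lr=1e-3, betas=(0.9, 0.999)),
)
param_scheduler = [
    # with this enhancement, the momentum scheduler drives betas[0] of Adam
    dict(type='CosineAnnealingMomentum', T_max=100, by_epoch=True),
]
```
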
Mashiro 2853045e96
[Fix] Fix build multiple runners error (#348)
* fix build multiple runner error

* fix comments

* fix cpu ci
2022-07-05 20:35:06 +08:00
Mashiro 38e78d5549
[Fix] Fix ema hook and add unit test (#327)
* Fix ema hook and add unit test

* save state_dict of ema.module

* replace warning.warn with MMLogger.warn

* fix as comment

* fix bug

* fix bug
2022-07-04 14:23:23 +08:00
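
For reference, the EMA hook touched by the fix above is enabled through `custom_hooks` in a runner config; a minimal sketch:

```python
# After the fix, checkpoints store the state dict of ema.module.
custom_hooks = [dict(type='EMAHook')]
```
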
Cedric Luo 9c55b4300c
[Enhance] Support dynamic interval (#342)
* support dynamic interval in iterbasedtrainloop

* update typehint

* update typehint

* add dynamic interval in epochbasedtrainloop

* update

* fix

Co-authored-by: luochunhua.vendor <luochunhua@pjlab.org.cn>
2022-06-30 15:08:56 +08:00
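
A hedged sketch of the dynamic-interval feature above: the train loop is assumed to accept a `dynamic_intervals` list of (milestone, interval) pairs alongside `val_interval` (treat the parameter name and values as assumptions, not confirmed by the PR text):

```python
train_cfg = dict(
    by_epoch=False,
    max_iters=90000,
    val_interval=10000,                  # default validation interval
    dynamic_intervals=[(80000, 2000)],   # assumed: validate every 2000 iters after 80k
)
```
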
LeoXing1996 d65350a9da
[Fix] Fix bug of not saving the best checkpoint in iteration-based training (#341)
* fix bug of not saving the best checkpoint in iteration-based training

* revise the unit test
2022-06-30 14:51:31 +08:00
Mashiro 59b0ccfe6f
[Fix] Fix pytorch version compatibility of autocast (#339)
* fix unit test of autocast

* fix compatibility of unit test of optimizer wrapper

* clean code

* fix as comment

* fix docstring
2022-06-29 20:30:53 +08:00
Mashiro 5ac3c23338
[Fix]: fix MMSeparateDistributedDataParallel (#338) 2022-06-28 22:20:20 +08:00
Mashiro d624fa9191
[Enhance] assert image shape before forward (#300)
* assert image shape before forward

* add unit test

* enhance error message

* allow gray image input

* fix as comment

* fix unit test

* fix unit test
2022-06-28 11:46:12 +08:00
Mashiro 2fd6beb972
[Fix] Fix UT of optimizer wrapper failed in pytorch1.6 (#340) 2022-06-28 10:31:14 +08:00
Jiazhen Wang 3af3d40541
[Enhance] Refine BaseDataset (#303)
* refine data_root and data_prefix params

* modify unittest
2022-06-27 14:59:56 +08:00
Yuan Liu 03d5c17ba6
[Feature]: Set different seed to different rank (#298)
* [Feature]: Set different seed for diff rank

* [Feature]: Add log

* [Fix]: Fix lint

* [Fix]: Fix docstring

* [Fix]: Fix sampler seed

* [Fix]: Fix log bug

* [Fix]: Change diff_seed to diff_rank_seed

* [Fix]: Fix lint
2022-06-24 14:28:16 +08:00
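
The feature above is exposed through the runner's `randomness` options; a minimal sketch using the flag name from the last commit in the list (`diff_rank_seed`):

```python
# With diff_rank_seed=True each rank derives its own seed from the base seed,
# so random operations differ across processes.
randomness = dict(seed=42, diff_rank_seed=True)
```
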
Alex Yang 2994195be2
[Feat] Support training on MPS (#331)
* [Feat] Support mps

* fix docstring
2022-06-23 16:53:19 +08:00
Haian Huang(深度眸) 2b8a32eca0
[Fix]: fix RuntimeError of SyncBuffersHook (#309)
* fix RuntimeError of SyncBuffersHook

* add UT
2022-06-22 20:00:46 +08:00
Alex Yang e18832f046
[Feat] Support revert syncbn (#326)
* [Feat] Support revert syncbn

* use logger.info but not warning

* fix info string
2022-06-22 19:50:54 +08:00
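
The revert-SyncBN helper above converts `SyncBatchNorm` layers back to regular batch norm so that a model configured for distributed training can run on a single device; a small sketch:

```python
import torch.nn as nn
from mmengine.model import revert_sync_batchnorm

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.SyncBatchNorm(8), nn.ReLU())
model = revert_sync_batchnorm(model)   # SyncBatchNorm layers become regular BN
```
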
Mashiro 312f264ecd
[Feature] Add autocast wrapper (#307)
* add autocast wrapper

* fix docstring

* fix docstring

* fix compare version

* fix unit test

* fix incompatible arguments

* fix as comment

* fix unit test

* rename auto_cast to autocast
2022-06-22 19:49:20 +08:00
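
The autocast wrapper added above (`mmengine.runner.autocast`) papers over PyTorch-version and device differences in native AMP; a minimal usage sketch:

```python
import torch
from mmengine.runner import autocast

x, w = torch.randn(2, 4), torch.randn(4, 4)
# in this sketch, enable mixed precision only when CUDA is available
with autocast(enabled=torch.cuda.is_available()):
    y = x @ w
```
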
Alex Yang 216521a936
[Feat] Support save best ckpt (#310)
* [Feat] Support save best ckpt

* reformat code

* rename function and reformat code

* fix logging info
2022-06-22 19:48:46 +08:00
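
Best-checkpoint saving is configured on the checkpoint hook; a minimal sketch ('auto' tracks the first reported metric, and an explicit metric plus `rule` can be given instead):

```python
default_hooks = dict(
    checkpoint=dict(type='CheckpointHook', interval=1, save_best='auto'),
    # e.g. save_best='accuracy', rule='greater' to track a specific metric
)
```
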
Mashiro 7154df2618
[Enhance] LogProcessor support custom significant digit (#311)
* LogProcessor support custom significant digit

* rename to num_digits
2022-06-22 19:35:52 +08:00
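
A minimal sketch of the log-processor option above; `num_digits` controls how many significant digits logged scalars are formatted with (the other values shown are common defaults):

```python
log_processor = dict(type='LogProcessor', window_size=10, by_epoch=True, num_digits=4)
```
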
Mashiro afeac1c098
[Feature]: support dumping results in LoggerHook.after_test_epoch (#321) 2022-06-22 19:10:58 +08:00
Alex Yang dceef1f66f
[Refactor] Refactor `after_val_epoch` to make it output metrics by epoch (#278)
* [Refactor]: Refactor `after_val_epoch` to make it output metrics by epoch

* add an option for users to choose the way of outputting metrics

* rename variable

* reformat docstring

* add type alias

* reformat code

* add test function

* add comment and test code

* add comment and test code
2022-06-21 15:39:59 +08:00
Alex Yang ef946404e6
[Feat] Support FSDP Training (#304)
* [Feat] Support FSDP Training

* fix version comparison

* change param format and move `FSDP_WRAP_POLICY` to wrapper file

* add docstring and type hint,reformat code

* fix type hint

* fix typo, reformat code
2022-06-21 15:32:56 +08:00
Mashiro 4a4d6b1ab2
[Enhance] dump messagehub in runner.resume (#237)
* [Enhance] dump messagehub in runner.resume

* delete unnecessary code

* delete debugging code

Co-authored-by: imabackstabber <312276423@qq.com>
2022-06-17 11:10:37 +08:00
Mashiro 7129a98e36
[Fix]: fix log processor to log average time and grad norm (#292) 2022-06-17 10:54:20 +08:00
Jiazhen Wang 7b55c5bdbf
[Feature] Support resume from Ceph (#294)
* support resume from ceph

* move func and refine

* delete symlink

* fix unittest

* preserve _allow_symlink and symlink
2022-06-17 10:37:19 +08:00
Jiazhen Wang d0d7174274
[Feature] Support MLU Devices (#288)
* support mlu

* add ut and refine docstring
2022-06-16 20:28:09 +08:00
Mashiro 7d3224bf46
[Fix] Fix setLevel of MMLogger (#297)
* Fix setLevel of MMLogger

* add docstring and comment
2022-06-14 14:54:25 +08:00
RangiLyu 1c18f30854
[Enhance] Support infinite dataloader iterator wrapper for IterBasedTrainLoop. (#289) 2022-06-14 14:52:59 +08:00
Alex Yang 5016332588
[Feat] support registering function (#302) 2022-06-14 14:50:24 +08:00
RangiLyu 4cd91ffe15
[Feature] Dump predictions to a pickle file for offline evaluation. (#293)
* [Feature] Dump predictions to pickle file for offline evaluation.

* print_log
2022-06-14 14:48:21 +08:00
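
A hedged sketch of the offline-evaluation dump above, following the pattern used in OpenMMLab test scripts; `runner` is assumed to be an already-built `mmengine` Runner:

```python
from mmengine.evaluator import DumpResults

# append the metric that writes raw predictions to a pickle file
runner.test_evaluator.metrics.append(DumpResults(out_file_path='results.pkl'))
runner.test()
```
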
Mashiro b7866021c4
[Refactor] Refactor the gradient accumulation implementation of OptimWrapper (#284)
* merge context

* update unit test

* add docstring

* fix bug in AmpOptimWrapper

* add docstring for backward

* add warning and docstring for gradient accumulation

* fix docstring

* fix docstring

* add params_group method

* fix as comment

* fix as comment

* make default_value of loss_scale to dynamic

* Fix docstring

* decouple should update and should no sync

* rename attribute in OptimWrapper

* fix docstring

* fix comment

* fix comment

* fix as comment

* fix as comment and add unit test
2022-06-13 23:20:53 +08:00
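
For context on the refactor above, `OptimWrapper` folds loss scaling, backward, optimizer step and gradient accumulation into `update_params`; a small self-contained sketch:

```python
import torch
import torch.nn as nn
from mmengine.optim import OptimWrapper

model = nn.Linear(4, 2)
optim_wrapper = OptimWrapper(
    optimizer=torch.optim.SGD(model.parameters(), lr=0.01),
    accumulative_counts=2,   # gradients are accumulated over 2 iterations per step
)

for _ in range(4):
    x, y = torch.randn(8, 4), torch.randn(8, 2)
    loss = nn.functional.mse_loss(model(x), y)
    # scales the loss, runs backward, and steps/zeroes the optimizer
    # once every `accumulative_counts` calls
    optim_wrapper.update_params(loss)
```
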
Miao Zheng fd295741ca
[Features] Add OneCycleLR (#296)
* [Features] Add OneCycleLR

* [Features] Add OneCycleLR

* yapf disable

* build_iter_from_epoch

* add epoch

* fix args

* fix according to comments;

* lr-param

* fix according to comments

* defaults -> default to

* remove epoch and steps per step

* variable names
2022-06-13 21:23:59 +08:00
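
A hedged config sketch for the scheduler added above; the parameter names (`eta_max`, `total_steps`) follow MMEngine's usual naming as I understand it and should be treated as assumptions rather than the PR's exact interface:

```python
param_scheduler = dict(
    type='OneCycleLR',
    eta_max=0.01,        # assumed name for the peak learning rate
    total_steps=90000,   # length of the one-cycle schedule in iterations
    by_epoch=False,
)
```
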
Mashiro 8b0c9c5f6f
[Fix] fix build train_loop during test (#295)
* fix build train_loop during test

* fix build train_loop during test

* fix build train_loop during test

* fix build train_loop during test

* Fix as comment
2022-06-13 21:23:46 +08:00
RangiLyu 819e10c24c
[Fix] Fix image dtype when enable_normalize=False. (#301)
* [Fix] Fix image dtype when enable_normalize=False.

* update ut

* move to collate

* update ut
2022-06-13 21:21:19 +08:00
Mashiro bcab813242
[Feature] Add ModuleList Sequential and ModuleDict (#299)
* add module list

* add module list

* fix docstring
2022-06-13 13:51:07 +08:00
Alex Yang df0c510444
[Feat]: support customizing evaluator (#287)
* [Feat]: support customizing evaluator

* fix key name for determining whether to use the default evaluator

* add assertion

* fix typo
2022-06-10 15:34:10 +08:00
liukuikun c90b95a44b
[Fix]: fix label data and support empty tensor in label_to_onehot (#291) 2022-06-10 15:12:41 +08:00
RangiLyu 2f16ec69fb
[Feature] Support overwrite default scope with "_scope_". (#275)
* [Feature] Support overwrite default scope with "_scope_".

* add ut

* add ut
2022-06-09 20:16:31 +08:00
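
A minimal sketch of the `_scope_` override above: placing `_scope_` in a module's config makes it be built from that registry scope even if the runner's `default_scope` is different (the model type below is illustrative):

```python
# built from the 'mmdet' registries regardless of the current default scope
model = dict(type='RetinaNet', _scope_='mmdet')
```
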
jbwang1997 7a5d3c83ea
[Fix] Replace auto_scale_lr_cfg with auto_scale_lr (#286)
* Replace auto_scale_lr_cfg with auto_scale_lr

* Update
2022-06-09 20:15:36 +08:00
Mashiro 931db99005
[Enhance] Enhance img data preprocessor (#290)
* fix BaseDataPreprocessor

* fix BaseDataPreprocessor

* change device type to torch.device

* change device type to torch.device

* fix cpu method of base model

* Allow ImgDataPreprocessor to skip normalization

* remove unnecessary type ignore

* make mean and std optional

* refine docstring
2022-06-09 20:12:15 +08:00
Mashiro a9afdad7a8
[Fix] Fix BaseDataPreprocessor and BaseModel (#285)
* fix BaseDataPreprocessor

* fix BaseDataPreprocessor

* change device type to torch.device

* change device type to torch.device

* fix cpu method of base model
2022-06-09 11:45:19 +08:00
Mashiro 6ee675430f
[Refactor]: change order of BaseModel arguments (#282) 2022-06-08 13:28:00 +08:00
Mashiro f04fec736d
[Feature]: add base model, ddp model wrapper and unit test (#268)
* add base model, ddp model and unit test

* add unit test

* fix unit test

* fix docstring

* fix cpu unit test

* refine base data preprocessor

* refine base data preprocessor

* refine interface of ddp module

* remove optimizer hook

* add forward

* fix as comment

* fix unit test

* fix as comment

* fix build optimizer wrapper

* rebase main and fix unit test

* stack_batch support stacking ndim tensor, add docstring for merge dict

* fix lint

* fix test loop

* make precision_context effective to data_preprocessor

* fix as comment

* fix as comment

* refine docstring

* change collate_data output typehints

* rename to_rgb to bgr_to_rgb and rgb_to_bgr

* support build basemodel with built DataPreprocessor

* fix as comment

* fix docstring
2022-06-07 22:13:53 +08:00
RangiLyu ad965a5309
[Enhance] Enhance checkpoint meta info. (#279) 2022-06-07 18:48:50 +08:00
Mashiro 538ff48aec
[Fix] Rename data_list and support loading from ceph in dataset (#240)
* rename datalist and support load ceph

* rename datalist and support load ceph

* remove check disk file path in _load_metainfo

* fix rename error

* fix rename error

* unit test error

* fix rename error

* remove unnecessary code

* fix lint
2022-06-07 17:09:33 +08:00
jbwang1997 bd3c53b385
[Fix] Fix CI after merging support auto scale lr and support custom runner (#280) 2022-06-07 16:03:51 +08:00