147 Commits (dceef1f66fd5afd6346f9941b6595b81ee6b364e)

Author SHA1 Message Date
Alex Yang dceef1f66f
[Refactor] Refactor `after_val_epoch` to make it output metric by epoch (#278)
* [Refactor]:Refactor `after_val_epoch` to make it output metric by epoch

* add an option for user to choose the way of outputting metric

* rename variable

* reformat docstring

* add type alias

* reformat code

* add test function

* add comment and test code

* add comment and test code
2022-06-21 15:39:59 +08:00
Alex Yang ef946404e6
[Feat] Support FSDP Training (#304)
* [Feat] Support FSDP Training

* fix version comparison

* change param format and move `FSDP_WRAP_POLICY` to wrapper file

* add docstring and type hint, reformat code

* fix type hint

* fix typo, reformat code
2022-06-21 15:32:56 +08:00
Mashiro 4a4d6b1ab2
[Enhance] dump messagehub in runner.resume (#237)
* [Enhance] dump messagehub in runner.resume

* delete unnecessary code

* delete debugging code

Co-authored-by: imabackstabber <312276423@qq.com>
2022-06-17 11:10:37 +08:00
Mashiro 7129a98e36
[Fix]: fix log processor to log average time and grad norm (#292) 2022-06-17 10:54:20 +08:00
Jiazhen Wang 7b55c5bdbf
[Feature] Support resume from Ceph (#294)
* support resume from ceph

* move func and refine

* delete symlink

* fix unittest

* preserve _allow_symlink and symlink
2022-06-17 10:37:19 +08:00
Jiazhen Wang d0d7174274
[Feature] Support MLU Devices (#288)
* support mlu

* add ut and refine docstring
2022-06-16 20:28:09 +08:00
Mashiro 7d3224bf46
[Fix] Fix setLevel of MMLogger (#297)
* Fix setLevel of MMLogger

Fix setLevel of MMLogger

* add docstring and comment
2022-06-14 14:54:25 +08:00
RangiLyu 1c18f30854
[Enhance] Support infinite dataloader iterator wrapper for IterBasedTrainLoop. (#289) 2022-06-14 14:52:59 +08:00
Alex Yang 5016332588
[Feat] support registering function (#302) 2022-06-14 14:50:24 +08:00
RangiLyu 4cd91ffe15
[Feature] Dump predictions to a pickle file for offline evaluation. (#293)
* [Feature] Dump predictions to pickle file for offline evaluation.

* print_log
2022-06-14 14:48:21 +08:00
Mashiro b7866021c4
[Refactor] Refactor the accumulate gradient implementation of OptimWrapper (#284)
* merge context

* update unit test

* add docstring

* fix bug in AmpOptimWrapper

* add docstring for backward

* add warning and docstring for accumulate gradient

* fix docstring

* fix docstring

* add params_group method

* fix as comment

* fix as comment

* make default_value of loss_scale to dynamic

* Fix docstring

* decouple should update and should no sync

* rename attribute in OptimWrapper

* fix docstring

* fix comment

* fix comment

* fix as comment

* fix as comment and add unit test
2022-06-13 23:20:53 +08:00
Miao Zheng fd295741ca
[Features]Add OneCycleLR (#296)
* [Features]Add OnecycleLR

* [Features]Add OnecycleLR

* yapf disable

* build_iter_from_epoch

* add epoch

* fix args

* fix according to comments;

* lr-param

* fix according to comments

* defaults -> default to

* remove epoch and steps per step

* variable names
2022-06-13 21:23:59 +08:00
Mashiro 8b0c9c5f6f
[Fix] fix build train_loop during test (#295)
* fix build train_loop during test

* fix build train_loop during test

* fix build train_loop during test

* fix build train_loop during test

* Fix as comment
2022-06-13 21:23:46 +08:00
RangiLyu 819e10c24c
[Fix] Fix image dtype when enable_normalize=False. (#301)
* [Fix] Fix image dtype when enable_normalize=False.

* update ut

* move to collate

* update ut
2022-06-13 21:21:19 +08:00
Mashiro bcab813242
[Feature] Add ModuleList Sequential and ModuleDict (#299)
* add module list

* add module list

* fix docstring
2022-06-13 13:51:07 +08:00
Alex Yang df0c510444
[Feat]:support customizing evaluator (#287)
* [Feat]:support customizing evaluator

* fix keyname of determining using default evaluator or not

* add assertion

* fix typo
2022-06-10 15:34:10 +08:00
liukuikun c90b95a44b
[Fix]: fix label data and support empty tensor in label_to_onehot (#291) 2022-06-10 15:12:41 +08:00
RangiLyu 2f16ec69fb
[Feature] Support overwrite default scope with "_scope_". (#275)
* [Feature] Support overwrite default scope with "_scope_".

* add ut

* add ut
2022-06-09 20:16:31 +08:00
jbwang1997 7a5d3c83ea
[Fix] Replace auto_scale_lr_cfg to auto_scale_lr (#286)
* Replace auto_scale_lr_cfg to auto_scale_lr

* Update
2022-06-09 20:15:36 +08:00
Mashiro 931db99005
[Enhance] Enhance img data preprocessor (#290)
* fix BaseDataPreprocessor

* fix BaseDataPreprocessor

* change device type to torch.device

* change device type to torch.device

* fix cpu method of base model

* Allow ImgDataPreprocessor do not normalize

* remove unnecessary type ignore

* make mean and std optional

* refine docstring
2022-06-09 20:12:15 +08:00
Mashiro a9afdad7a8
[Fix] Fix BaseDataPreprocessor and BaseModel (#285)
* fix BaseDataPreprocessor

* fix BaseDataPreprocessor

* change device type to torch.device

* change device type to torch.device

* fix cpu method of base model
2022-06-09 11:45:19 +08:00
Mashiro 6ee675430f
[Refactor]: change order of BaseModel arguments (#282) 2022-06-08 13:28:00 +08:00
Mashiro f04fec736d
[Feature]: add base model, ddp model wrapper and unit test (#268)
* add base model, ddp model and unit test

* add unit test

* fix unit test

* fix docstring

* fix cpu unit test

* refine base data preprocessor

* refine base data preprocessor

* refine interface of ddp module

* remove optimizer hook

* add forward

* fix as comment

* fix unit test

* fix as comment

* fix build optimizer wrapper

* rebase main and fix unit test

* stack_batch support stacking ndim tensor, add docstring for merge dict

* fix lint

* fix test loop

* make precision_context effective to data_preprocessor

* fix as comment

* fix as comment

* refine docstring

* change collate_data output typehints

* rename to_rgb to bgr_to_rgb and rgb_to_bgr

* support build basemodel with built DataPreprocessor

* fix as comment

* fix docstring
2022-06-07 22:13:53 +08:00
RangiLyu ad965a5309
[Enhance] Enhance checkpoint meta info. (#279) 2022-06-07 18:48:50 +08:00
Mashiro 538ff48aec
[Fix] Rename data_list and support loading from ceph in dataset (#240)
* rename datalist and support load ceph

* rename datalist and support load ceph

* remove check disk file path in _load_metainfo

* fix rename error

* fix rename error

* unit test error

* fix rename error

* remove unnecessary code

* fix lint
2022-06-07 17:09:33 +08:00
jbwang1997 bd3c53b385
[Fix] Fix CI after merging support auto scale lr and support custom runner (#280) 2022-06-07 16:03:51 +08:00
jbwang1997 8f3fcee301
[Feature] Add auto scale lr function (#270)
* Add auto scale lr function

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

Co-authored-by: wangjiabao1.vendor <wangjiabao@pjlab.org.cn>
2022-06-06 22:27:15 +08:00
Jiazhen Wang 65bc95036c
[Enhance] Support Custom Runner (#258)
* support custom runner

* change build_runner_from_cfg

* refine docstring

* refine docstring
2022-06-06 14:33:32 +08:00
RangiLyu 70c4ea191f
[Refactor]: Modify val_interval and val_begin to be the attributes of TrainLoop. (#274)
* Modify val_interval and val_begin to be the attributes of TrainLoop.

* update doc

* fix lint

* type hint
2022-06-06 11:13:25 +08:00
Alex Yang 13606040ac
[Feat]:Add base module (#277) 2022-06-06 10:51:23 +08:00
Mashiro 80a46c4848
[Fix] fix build optimizer wrapper without type (#272)
* fix build optimizer wrapper without type

* refine logic

* fix as comment

* fix optim_wrapper config error in docstring and unit test

* refine docstring of build_optim_wrapper
2022-06-05 22:35:16 +08:00
Mashiro 3e3866c1b9
[Feature] Add optimizer wrapper (#265)
* Support multiple optimizers

* minor refinement

* improve unit tests

* minor fix

* Update unit tests for resuming or saving ckpt for multiple optimizers

* refine docstring

* refine docstring

* fix typo

* update docstring

* refactor the logic to build multiple optimizers

* resolve comments

* ParamSchedulers supports multiple optimizers

* add optimizer_wrapper

* fix comment and docstring

* fix unit test

* add unit test

* refine docstring

* RuntimeInfoHook supports printing multi learning rates

* resolve comments

* add optimizer_wrapper

* fix mypy

* fix lint

* fix OptimizerWrapperDict docstring and add unit test

* rename OptimizerWrapper to OptimWrapper, OptimWrapperDict inherit OptimWrapper, and fix as comment

* Fix AmpOptimizerWrapper

* rename build_optmizer_wrapper to build_optim_wrapper

* refine optimizer wrapper

* fix AmpOptimWrapper.step, docstring

* resolve conflict

* rename DefaultOptimConstructor

* fix as comment

* rename clip grad arguments

* refactor optim_wrapper config

* fix docstring of DefaultOptimWrapperConstructor

fix docstring of DefaultOptimWrapperConstructor

* add get_lr method to OptimWrapper and OptimWrapperDict

* skip some amp unit test

* fix unit test

* fix get_lr, get_momentum docstring

* refactor get_lr, get_momentum, fix as comment

* fix error message

Co-authored-by: zhouzaida <zhouzaida@163.com>
2022-06-01 18:04:38 +08:00
Zaida Zhou f1da9a1d7f
[Feature] Support multiple optimizers (#235)
* Support multiple optimizers

* minor refinement

* improve unit tests

* minor fix

* Update unit tests for resuming or saving ckpt for multiple optimizers

* refine docstring

* refine docstring

* fix typo

* update docstring

* refactor the logic to build multiple optimizers

* resolve comments

* ParamSchedulers supports multiple optimizers

* refine docstring

* RuntimeInfoHook supports printing multi learning rates

* resolve comments

* fix typo
2022-05-31 16:54:39 +08:00
Jiazhen Wang f2190de787
[Enhance] Improve Exception in call_hook (#247)
* improve exception in call_hook

* refine unit test

* add test_call_hook

* refine

* update docstring and ut
2022-05-31 11:34:30 +08:00
jbwang1997 38b22d9e68
[Enhance] Enhance error report when a module has been registered in registry. (#264)
* Update

* Add unittest
2022-05-31 11:31:04 +08:00
RangiLyu 172b9ded4a
[Fix] Fix ema state dict swapping in EMAHook and torch1.5 ut. (#266)
* [Fix] Fix ema state dict swapping in EMAHook.

* fix pt1.5 ut

* add more comments
2022-05-30 16:51:06 +08:00
Jingwei Zhang 40daf46a45
Support validation only after some epoch/iteration in ValLoop (#257)
* add the epoch/iter that begins validating

* fix lint

* add property and fix unit test

* minor changes

* fix typos and add unit test

* add unit test about begin

* fix docstring
2022-05-27 15:10:12 +08:00
RangiLyu 4705e1fe3d
[Enhance] Add RuntimeInfoHook to update runtime information. (#254)
* [Enhance] Add RuntimeInfoHook to update runtime information.

* move lr to runtime info

* docstring

* resolve comments

* update ut and doc
2022-05-26 14:35:37 +08:00
Jiazhen Wang 4cbbbc0c31
[Enhance] Refine sync random seed (#256)
* refine sync random seed

* cancel seed param in batch-sampler
2022-05-25 19:18:03 +08:00
Haian Huang(深度眸) c197bdf359
[Feature] Profiling tools (#241)
* Add profiling tools

* fix docstr

* fix docstr

* update

* fix bug

* update

* update

* fix error

* fix mypy

* update

* merge main

* fix UT
2022-05-25 10:55:07 +08:00
Haian Huang(深度眸) 8d3bd4dfef
Move get_max_cuda_memory and set_multi_processing to public function (#250)
* move get_max_cuda_memory and set_multi_processing to a public function

* fix lint

* fix lint

* fix lint

* delete _set_multi_processing

* fix error

* rename
2022-05-24 19:36:55 +08:00
Jiazhen Wang a976257ca9
[Enhance] Support Custom LogProcessor (#251)
* support custom log processor

* supplementary docs

* format code
2022-05-24 17:17:35 +08:00
RangiLyu 11688507ba
[Fix] Fix some bugs in hooks and runner. (#242)
* [Fix] Fix some bugs in hooks and runner.

* fix markdown

* fix latex formula

* resolve comments
2022-05-20 17:18:24 +08:00
RangiLyu 0279ac2e8d
[Feature] Support EMA and SWA. (#239)
* [Feature] Support EMA and SWA.

* add ema hook

* add avg model ut

* add more unit tests

* resolve comments

* fix warmup ema

* rename

* fix comments

* add assert

* fix typehint

* add comments
2022-05-19 18:53:04 +08:00
Zaida Zhou 86ffc19c9c
Add pyupgrade pre-commit hook (#232)
* Add pyupgrade pre-commit hook

* fix ut

* remove comments
2022-05-19 17:56:31 +08:00
Zaida Zhou 98c85529b1
[Refactor] Replace torch distributed with mmengine dist module (#196)
* [Fix] Replace torch distributed with mmengine dist module

* minor refinement

* move all_reduce_params to dist.py

* add unit tests

* update unit tests

* fix test_logger.py

* add examples
2022-05-19 17:40:01 +08:00
RangiLyu e37f1f905b
[Refactor] Make loop-related attributes to be runner's properties. (#236)
* [Enhance] Make loop related attributes to be runner's properties.

* move iter and epoch to loop

* resolve comments
2022-05-18 22:35:10 +08:00
Mashiro cc8a6b86e1
[Fix] Fix BaseDataset: join prefix in parse_data_info (#226)
* implement parse_data_info

* add unit test

* fix join prefix of ann_file

* fix docstring
2022-05-17 20:53:13 +08:00
Mashiro fd962437e9
[Fix] Support Runner dump cfg without filename (#228)
* fix runner dump cfg

* convert dict cfg to Config
2022-05-17 17:32:10 +08:00
liukuikun 689837d2b8
[Enhancement] add pixel data and label data (#224) 2022-05-13 18:23:25 +08:00