Mashiro
b2ee9f8b11
[Fix] Fix loss could be nan in optimizer wrapper ( #345 )
...
* fix optimizer wrapper counts
* fix ut
2022-07-06 16:42:49 +08:00
Mashiro
96378fa748
[Fix] make `autocast` compatible with GTX1660 and make it more robust. ( #344 )
...
* fix amp
* fix amp
* make logic consistent with torch.autocast
* support multiple device
* fix as comment
* fix as comment
* avoid circular import
2022-07-05 20:37:56 +08:00
RangiLyu
a3d2916790
[Enhance] Support scheduling betas with MomentumScheduler. ( #346 )
...
* [Enhance] Support scheduling betas with MomentumScheduler.
* enhance ut
* test adam betas
* enhance ut
* enhance ut
2022-07-05 20:37:23 +08:00
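The entry above extends MomentumScheduler to Adam-style optimizers, which store momentum as `betas[0]` rather than under a `momentum` key. A minimal stdlib-only sketch of that dispatch (an illustration, not mmengine's actual implementation; the function name `set_momentum` is hypothetical):

```python
# Sketch (assumption, not mmengine's real code): write a scheduled momentum
# value into whichever slot the optimizer's param group actually uses.
def set_momentum(param_group, value):
    """Set the momentum of one param group, SGD-style or Adam-style."""
    if "momentum" in param_group:            # SGD-style optimizers
        param_group["momentum"] = value
    elif "betas" in param_group:             # Adam-style: (beta1, beta2)
        _, beta2 = param_group["betas"]
        param_group["betas"] = (value, beta2)  # schedule only beta1
    else:
        raise KeyError("param group has neither 'momentum' nor 'betas'")

sgd_group = {"momentum": 0.9}
adam_group = {"betas": (0.9, 0.999)}
set_momentum(sgd_group, 0.5)
set_momentum(adam_group, 0.5)
print(sgd_group["momentum"], adam_group["betas"])  # 0.5 (0.5, 0.999)
```

The point of the change is that one scheduler class can now drive both optimizer families without the caller knowing which keys exist.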
Mashiro
2853045e96
[Fix] Fix build multiple runners error ( #348 )
...
* fix build multiple runner error
* fix comments
* fix cpu ci
2022-07-05 20:35:06 +08:00
Mashiro
38e78d5549
[Fix] Fix ema hook and add unit test ( #327 )
...
* Fix ema hook and add unit test
* save state_dict of ema.module
* replace warning.warn with MMLogger.warn
* fix as comment
* fix bug
* fix bug
2022-07-04 14:23:23 +08:00
Cedric Luo
9c55b4300c
[Enhance] Support dynamic interval ( #342 )
...
* support dynamic interval in iterbasedtrainloop
* update typehint
* update typehint
* add dynamic interval in epochbasedtrainloop
* update
* fix
Co-authored-by: luochunhua.vendor <luochunhua@pjlab.org.cn>
2022-06-30 15:08:56 +08:00
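The dynamic-interval change above lets the validation interval vary over training (e.g. validate rarely early on, every iteration near the end). A hedged sketch of how such a schedule can be resolved from (milestone, interval) pairs; the helper name and the pair format are assumptions for illustration:

```python
import bisect

# Sketch (assumption): pick the active validation interval for the current
# iteration from sorted (milestone, interval) pairs -- intervals[i] applies
# once cur_iter >= milestones[i].
def current_interval(cur_iter, dynamic_intervals, default=1):
    milestones = [m for m, _ in dynamic_intervals]
    intervals = [i for _, i in dynamic_intervals]
    idx = bisect.bisect_right(milestones, cur_iter)  # milestones passed so far
    return intervals[idx - 1] if idx > 0 else default

schedule = [(0, 10), (1000, 5), (9000, 1)]
print(current_interval(500, schedule))   # 10
print(current_interval(5000, schedule))  # 5
print(current_interval(9500, schedule))  # 1
```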
LeoXing1996
d65350a9da
[Fix] Fix bug of not save-best in iteration-based training ( #341 )
...
* fix bug of not save-best in iteration-based training
* revise the unit test
2022-06-30 14:51:31 +08:00
Mashiro
59b0ccfe6f
[Fix] Fix pytorch version compatibility of autocast ( #339 )
...
* fix unit test of autocast
* fix compatibility of unit test of optimizerwrapper
* clean code
* fix as comment
* fix docstring
2022-06-29 20:30:53 +08:00
Mashiro
5ac3c23338
[Fix]: fix MMSeparateDistributedDataParallel ( #338 )
2022-06-28 22:20:20 +08:00
Mashiro
d624fa9191
[Enhance] assert image shape before forward ( #300 )
...
* assert image shape before forward
* add unit test
* enhance error message
* allow gray image input
* fix as comment
* fix unit test
* fix unit test
2022-06-28 11:46:12 +08:00
Zaida Zhou
6015fd35e5
Fix docstring format ( #337 )
2022-06-28 11:04:55 +08:00
Mashiro
2fd6beb972
[Fix] Fix UT of optimizer wrapper failed in pytorch1.6 ( #340 )
2022-06-28 10:31:14 +08:00
Jiazhen Wang
bbe00274c8
[Enhance] LR and Momentum Visualizer ( #267 )
...
* impl lr and momentum visualizer
* provide fakerun
2022-06-27 15:00:11 +08:00
Jiazhen Wang
3af3d40541
[Enhance] Refine BaseDataset ( #303 )
...
* refine data_root and data_prefix params
* modify unittest
2022-06-27 14:59:56 +08:00
Yuan Liu
03d5c17ba6
[Feature]: Set different seed to different rank ( #298 )
...
* [Feature]: Set different seed for diff rank
* [Feature]: Add log
* [Fix]: Fix lint
* [Fix]: Fix docstring
* [Fix]: Fix sampler seed
* [Fix]: Fix log bug
* [Fix]: Change diff_seed to diff_rank_seed
* [Fix]: Fix lint
2022-06-24 14:28:16 +08:00
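The seed change above gives each distributed rank its own random seed (so data augmentation differs across processes) while keeping runs reproducible. The core derivation is tiny; this is a sketch under the assumption that the per-rank seed is simply offset by the rank, with `derive_seed` as a hypothetical name:

```python
# Sketch (assumption): derive the seed each rank should use. With
# diff_rank_seed enabled, rank r gets base_seed + r; otherwise all
# ranks share base_seed.
def derive_seed(base_seed, rank, diff_rank_seed=True):
    return base_seed + rank if diff_rank_seed else base_seed

print([derive_seed(42, r) for r in range(4)])  # [42, 43, 44, 45]
```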
Jiazhen Wang
12f7d3a0d3
[Fix]: fix load_checkpoint ( #332 )
2022-06-23 16:53:53 +08:00
Alex Yang
2994195be2
[Feat] Support training on MPS ( #331 )
...
* [Feat] Support mps
* fix docstring
2022-06-23 16:53:19 +08:00
Zaida Zhou
e877862d5b
[Docs] Improve docstring ( #324 )
...
* Fix docstring format of BaseDataElement
* fix docstring
2022-06-23 16:08:56 +08:00
Mashiro
a4f5533db6
fix torch 1.10 amp error ( #330 )
2022-06-22 23:12:20 +08:00
Haian Huang(深度眸)
2b8a32eca0
[Fix]: fix RuntimeError of SyncBuffersHook ( #309 )
...
* fix RuntimeError of SyncBuffersHook
* add UT
2022-06-22 20:00:46 +08:00
Alex Yang
e18832f046
[Feat] Support revert syncbn ( #326 )
...
* [Feat] Support revert syncbn
* use logger.info but not warning
* fix info string
2022-06-22 19:50:54 +08:00
Mashiro
312f264ecd
[Feature] Add autocast wrapper ( #307 )
...
* add autocast wrapper
* fix docstring
* fix docstring
* fix compare version
* fix unit test
* fix incompatible arguments
* fix as comment
* fix unit test
* rename auto_cast to autocast
2022-06-22 19:49:20 +08:00
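Several entries above (#307, #339) deal with PyTorch version compatibility of autocast: `torch.autocast(device_type=...)` only exists from PyTorch 1.10 onward, so older versions need `torch.cuda.amp.autocast`. A stdlib-only sketch of the version dispatch such a wrapper needs (the helper names are illustrative, not mmengine's):

```python
# Sketch (assumption): choose which autocast API is available for a given
# torch version string. torch.autocast landed in PyTorch 1.10.
def parse_version(v):
    """'1.10.0+cu113' -> (1, 10); local build suffixes are dropped."""
    return tuple(int(p) for p in v.split("+")[0].split(".")[:2])

def pick_autocast_api(torch_version):
    if parse_version(torch_version) >= (1, 10):
        return "torch.autocast"
    return "torch.cuda.amp.autocast"

print(pick_autocast_api("1.8.1"))         # torch.cuda.amp.autocast
print(pick_autocast_api("1.10.0+cu113"))  # torch.autocast
```

Comparing parsed tuples (not raw strings) is what "fix compare version" in the bullets refers to: string comparison would rank "1.9" above "1.10".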
Alex Yang
216521a936
[Feat] Support save best ckpt ( #310 )
...
* [Feat] Support save best ckpt
* reformat code
* rename function and reformat code
* fix logging info
2022-06-22 19:48:46 +08:00
Zaida Zhou
c451e71998
Add storage backends in init file ( #325 )
2022-06-22 19:41:31 +08:00
Mashiro
7154df2618
[Enhance] LogProcessor support custom significant digit ( #311 )
...
* LogProcessor support custom significant digit
* rename to num_digits
2022-06-22 19:35:52 +08:00
Jiazhen Wang
2086bc4554
[Feature] Fully support to use MLU for training ( #313 )
...
* modify cuda() to to()
* rollback load_checkpoint
* refine runner
* add TODO
2022-06-22 19:33:35 +08:00
Mashiro
afeac1c098
[Feature]: support to dump result in LoggerHook.after_test_epoch ( #321 )
2022-06-22 19:10:58 +08:00
Zaida Zhou
6501d21eab
[Fix]: fix mdformat version to support python3.6 ( #315 )
2022-06-21 16:32:58 +08:00
Alex Yang
dceef1f66f
[Refactor] Refactor `after_val_epoch` to make it output metric by epoch ( #278 )
...
* [Refactor]:Refactor `after_val_epoch` to make it output metric by epoch
* add an option for users to choose the way of outputting metric
* rename variable
* reformat docstring
* add type alias
* reformat code
* add test function
* add comment and test code
* add comment and test code
2022-06-21 15:39:59 +08:00
Alex Yang
ef946404e6
[Feat] Support FSDP Training ( #304 )
...
* [Feat] Support FSDP Training
* fix version comparison
* change param format and move `FSDP_WRAP_POLICY` to wrapper file
* add docstring and type hint,reformat code
* fix type hint
* fix typo, reformat code
2022-06-21 15:32:56 +08:00
Zaida Zhou
e76517c63a
[Doc]: Update hooks docs ( #317 )
2022-06-21 15:13:53 +08:00
Zaida Zhou
d09af9ead4
[Doc]: update root registries in docs ( #316 )
2022-06-21 15:12:49 +08:00
Tao Gong
45f5859b50
[Doc]: refactor docs for basedataset ( #318 )
2022-06-21 14:58:10 +08:00
Mashiro
44538e56c5
[Doc]: refine logging doc ( #320 )
2022-06-21 14:55:21 +08:00
Jiazhen Wang
e1422a34a3
[Fix]: Fix missing schedulers in __init__.py of schedulers ( #319 )
2022-06-21 14:40:00 +08:00
RangiLyu
e470c3aa1b
[Fix]: fix SWA in pytorch 1.6 ( #312 )
2022-06-21 14:35:22 +08:00
Mashiro
bc763758d8
Fix resource package in windows ( #308 )
...
* move import resource
* move import resource
2022-06-17 14:43:27 +08:00
Mashiro
4a4d6b1ab2
[Enhance] dump messagehub in runner.resume ( #237 )
...
* [Enhance] dump messagehub in runner.resume
* delete unnecessary code
* delete debugging code
Co-authored-by: imabackstabber <312276423@qq.com>
2022-06-17 11:10:37 +08:00
Mashiro
7129a98e36
[Fix]: fix log processor to log average time and grad norm ( #292 )
2022-06-17 10:54:20 +08:00
Jiazhen Wang
7b55c5bdbf
[Feature] Support resume from Ceph ( #294 )
...
* support resume from ceph
* move func and refine
* delete symlink
* fix unittest
* preserve _allow_symlink and symlink
2022-06-17 10:37:19 +08:00
Jiazhen Wang
d0d7174274
[Feature] Support MLU Devices ( #288 )
...
* support mlu
* add ut and refine docstring
2022-06-16 20:28:09 +08:00
Mashiro
e1ed5669d5
set resource limit in runner ( #306 )
2022-06-15 21:01:13 +08:00
Mashiro
7d3224bf46
[Fix] Fix setLevel of MMLogger ( #297 )
...
* Fix setLevel of MMLogger
* add docstring and comment
2022-06-14 14:54:25 +08:00
RangiLyu
1c18f30854
[Enhance] Support infinite dataloader iterator wrapper for IterBasedTrainLoop. ( #289 )
2022-06-14 14:52:59 +08:00
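The infinite dataloader wrapper above keeps `IterBasedTrainLoop` from ever hitting `StopIteration` mid-training. A minimal stdlib sketch of the idea (an illustration, not the actual mmengine class, which also handles epoch-aware samplers): when the underlying iterable is exhausted, rebuild its iterator and keep going.

```python
import itertools

# Sketch (assumption): wrap a finite iterable so next() never raises
# StopIteration -- exhaustion just starts a new "epoch".
class InfiniteIterator:
    def __init__(self, iterable):
        self._iterable = iterable          # must be re-iterable and non-empty
        self._iterator = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return next(self._iterator)
        except StopIteration:
            self._iterator = iter(self._iterable)  # restart the epoch
            return next(self._iterator)

batches = list(itertools.islice(InfiniteIterator([1, 2, 3]), 7))
print(batches)  # [1, 2, 3, 1, 2, 3, 1]
```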
Alex Yang
5016332588
[Feat] support registering function ( #302 )
2022-06-14 14:50:24 +08:00
RangiLyu
4cd91ffe15
[Feature] Dump predictions to a pickle file for offline evaluation. ( #293 )
...
* [Feature] Dump predictions to pickle file for offline evaluation.
* print_log
2022-06-14 14:48:21 +08:00
Mashiro
b7866021c4
[Refactor] Refactor the accumulate gradient implementation of OptimWrapper ( #284 )
...
* merge context
* update unit test
* add docstring
* fix bug in AmpOptimWrapper
* add docstring for backward
* add warning and docstring for accumulate gradient
* fix docstring
* fix docstring
* add params_group method
* fix as comment
* fix as comment
* make default_value of loss_scale to dynamic
* Fix docstring
* decouple should update and should no sync
* rename attribute in OptimWrapper
* fix docstring
* fix comment
* fix comment
* fix as comment
* fix as comment and add unit test
2022-06-13 23:20:53 +08:00
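The OptimWrapper refactor above decouples "should the optimizer step now" from "should DDP sync gradients now" for gradient accumulation. A hedged sketch of the update-decision bookkeeping (illustrative, not mmengine's code; note the last iteration always updates so trailing gradients are not dropped):

```python
# Sketch (assumption): with accumulative_counts=N, call optimizer.step()
# every N iterations, and also on the final iteration regardless, so the
# last partial accumulation window still updates the weights.
def should_update(idx, accumulative_counts, max_iters):
    """True if 0-based iteration `idx` should trigger optimizer.step()."""
    return (idx + 1) % accumulative_counts == 0 or (idx + 1) == max_iters

flags = [should_update(i, 3, 8) for i in range(8)]
print(flags)  # True at iterations 2, 5, and 7 (0-based)
```

In DDP, the complementary `should_sync` decision skips gradient all-reduce (via `no_sync`) on every non-updating iteration, which is what the "decouple should update and should no sync" bullet refers to.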
Miao Zheng
fd295741ca
[Features]Add OneCycleLR ( #296 )
...
* [Features] Add OneCycleLR
* [Features] Add OneCycleLR
* yapf disable
* build_iter_from_epoch
* add epoch
* fix args
* fix according to comments
* lr-param
* fix according to comments
* defaults -> default to
* remove epoch and steps per step
* variable names
2022-06-13 21:23:59 +08:00
Mashiro
8b0c9c5f6f
[Fix] fix build train_loop during test ( #295 )
...
* fix build train_loop during test
* fix build train_loop during test
* fix build train_loop during test
* fix build train_loop during test
* Fix as comment
2022-06-13 21:23:46 +08:00
RangiLyu
819e10c24c
[Fix] Fix image dtype when enable_normalize=False. ( #301 )
...
* [Fix] Fix image dtype when enable_normalize=False.
* update ut
* move to collate
* update ut
2022-06-13 21:21:19 +08:00