16 Commits (dc778481cbe10a099d98543e6adee728999b7d54)

Author SHA1 Message Date
shilong 1830347f8b
Ema (#421)
* add ema hook

* add ema hook resume

* add ema hook test

* fix typo

* fix according to comment

* delete logger

* fix according to comment

* fix unittest

* fix typo

* fix according to comment

* change to resume_from

* typo

* fix isort
2020-07-30 22:06:19 +08:00
Wang Xinjiang d4da3daa7e
Syncbuf (#447)
* More robust sync buffer hook

* More robust sync buffer hook

* Reformat
2020-07-25 12:51:46 +08:00
Wang Xinjiang 66604e83de
Add syncbuffer hook (#443)
* reformat

* reformat

* Add register hook from cfg

* docstring

* change according to comments
2020-07-24 14:15:44 +08:00
Jiamin 55fadb4c4e
Add runner.meta to checkpoint in save_checkpoint() (#438)
* fix: error when runner.meta is None

* tests: add unittest for epoch-based save_checkpoint
2020-07-20 11:40:04 +08:00
Yawei Li 7730a79fcd
fix typo of annealing (#433)
2020-07-17 23:48:22 +08:00
Harry 5704613e28
Remove all module wrapper's module when saving checkpoint (#399)
* fix: remove all module wrapper when saving checkpoint

* refactor: move position of if

* docs: add docstring

* refactor: add _save_to_state_dict from official torch

* docs: modify docstring of _save_to_state_dict

* docs: modify docstring

* feat: add unittest

* feat: add DataParallel to unittest

* fix: a bug when model has batchnorm

* docs: update docstring
2020-07-08 23:20:22 +08:00
Kai Chen 63b7aa31b6
Fix docstring formats (#383)
* update doc formats

* update docstring
2020-07-04 00:55:25 +08:00
Jintao Lin 1ebd7ea6fb
add unittest for set_random_seed (#376)
2020-07-02 00:13:04 +08:00
Harry 69048ff056
Specifying distributed training port in os.environ when training with slurm (#362)
* feat: support for os.environ port for slurm training

* fix: port data type

* feat: add flawed unittest

* feat: add flawed unittest

* docs: add comments

* fix: unittest

* fix: unittest
2020-06-20 00:49:44 +08:00
Kai Chen 6bb244f255
add train_step() and val_step() for MMDP (#354)
2020-06-18 20:55:53 +08:00
Harry f28a7c7ed7
Add CosineRestartLrUpdaterHook (#319)
* feat: add CosineRestartLrUpdaterHook

* style: rename period to periods

* fix: bug in period 0

* feat: rename eta_min to min_lr and add min_lr_ratio

* docs: fix docstring of restart lr updater

* refactor: use annealing_cos

* docs: add docstring to annealing_cos

* feat: cosine restart lr update hook

* refactor: modify code order for unittest
2020-06-15 23:01:26 +08:00
Harry 67a26da917
Add IterBasedRunner (#314)
* feat: add IterBasedRunner

* fix: unittest

* feat: more unittest

* fix: expose dataloader len

* minor updates of BaseRunner

* refactor: remove CosineRestartLrUpdaterHook

* style: add docstring

* refactor: update IterTextLoggerHook: fstring and exp_name

* fix: epoch_runner unittest

* refactor: remove IterBasedTextLogger

* fix: old IterTextLoggerHook issue

* refactor: remove __len__ of IterLoader

* feat: add IterBasedRunner to init

* feat: add __len__ to IterLoader

* fix some docstrings

* refactor: use is_parallel_module

* fix: import issue

* fix: runner unittest missing logger

* fix checkpoints

* feat: add by_epoch default value to IterBasedRunner registering logger_hook

* refactor: remove setting by_epoch in log_config

* minor refactoring

* docs: add docstring

* fix: remove unused doc

* update the log info for saving checkpoints

Co-authored-by: Kai Chen <chenkaidev@gmail.com>
2020-06-11 13:35:34 +08:00
Kai Chen 821b3ad622
Fix the BC issue of ddp (#325)
* fix the BC issue of ddp

* minor fix for the docstring
2020-06-08 22:34:19 +08:00
Kai Chen 35ba152821
Add a BaseRunner and rename Runner to EpochBasedRunner (#290)
* add a BaseRunner and rename Runner to EpochBasedRunner

* fix the train/val step

* bug fix

* update unit tests

* fix unit tests

* raise an error if both batch_processor and train_step are set

* add a unit test
2020-06-02 22:23:21 +08:00
Wenwei Zhang 19e4a06cbc
Fix CosineAnealingLr register bug (#265)
* Fall back to CosineLr

* Fix consineanealing with unittest

* Cover momentum hook

* Add comments to explain
2020-05-04 00:38:55 +08:00
Kai Chen a338d43d78
Refactor unittests (#241)
* refactor unittests

* split test_video.py to two files
2020-04-26 22:54:27 +08:00