Zaida Zhou
19aa1eb780
[Fix] Save checkpoint again to update best_ckpt of ckpt ( #1168 )
2023-06-02 14:42:56 +08:00
Zaida Zhou
193b7fdfcc
[Refactor] Let unit tests not affect each other ( #1169 )
2023-05-27 22:36:04 +08:00
Mashiro
f1aca8e307
[Fix] Failed to remove the previous best checkpoints ( #1086 )
...
* [Fix] Only reserve one best checkpoint
* [Fix] Only reserve one best checkpoint
* Fix unit test
* shutdown logging
* clean the save_checkpoint logic
2023-04-20 21:28:56 +08:00
Junwei Zheng
d41906fa15
[Fix] Fix publish multiple checkpoints when using multiple GPUs ( #1059 ) ( #1070 )
2023-04-12 10:38:48 +08:00
Mashiro
2dbc8ed253
[Refactor] Refactor checkpointhook unit tests ( #789 )
...
* Enhance config
* add unit test data
* Refactor unit test of checkpointhook
* add comments
* Fix unit test
* remove _get_metric_scope
* tmp save
* Revert "remove _get_metric_scope"
This reverts commit eeb7a8c5ed2766bf773a9ed28f731fddacd10ac1.
* Revert "Revert "remove _get_metric_scope""
This reverts commit 5398255f6fb3dac8341f7d808f0d7d09350fcaae.
* Revert "tmp save"
This reverts commit cdc9919be8e0a78bbf264c060de2a4396c137d5a.
* clean the code
* Fix ut
* minor fix
* use str.replace
2023-04-06 10:55:16 +08:00
KerwinKai
5b35c5b6ad
[Feature] Publish models after training if published_keys is set in CheckpointHook ( #987 )
...
* add publish keys in checkpointhook and update hook.md file
* Update checkpoint_hook.py
To avoid `mypy` warning `mmengine/hooks/checkpoint_hook.py:358: error: Unsupported right operand type for in ("Optional[List[str]]") Found 1 error in 1 file (checked 224 source files)`
* Update hook.md
Try to avoid the trim-trailing-whitespace warning in hook.md
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update checkpoint_hook.py
* Update docs/en/tutorials/hook.md
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
* Update hook.md
add automatic publishing of the best and last weights
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
* Update checkpoint_hook.py
add condition for when there is more than one best checkpoint.
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
* Update checkpoint_hook.py
delete re judge
* Update checkpoint_hook.py
* Update checkpoint_hook.py
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update checkpoint_hook.py
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update checkpoint_hook.py
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
* Add Test for publish model
* Update checkpoint_hook.py
* Update test_checkpoint_hook.py
* Fix file to pass pre-commit check
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Fix mypy warning
* rm not necessary line in checkpoint_hook.py
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
* rm unnecessary messages add to message_hub
* Update mmengine/hooks/checkpoint_hook.py
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
* Update docs/zh_cn/tutorials/hook.md
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
* Update docs/zh_cn/tutorials/hook.md
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
* update checkpoint hook and hook.md file
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
* Update mmengine/hooks/checkpoint_hook.py
---------
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2023-03-29 10:25:14 +08:00
Mashiro
dbae83c52f
[Enhancement] Replace warnings.warn with print_log ( #961 )
...
* Replace warning with print_log
* Add comments for testing warning
2023-03-06 17:25:28 +08:00
Zaida Zhou
cb7e04d3cf
fix typo ( #965 )
2023-02-27 17:13:38 +08:00
Zaida Zhou
646927f62f
[Enhance] Ensure metrics is not empty when saving best ckpts ( #849 )
...
* [Enhance] Ensure metrics is not empty when saving best ckpts
* fix warn to warning
* delete an unnecessary method
2022-12-28 11:34:08 +08:00
Mashiro
0f62a6c091
[Fix] Remove best ckpt only in master rank ( #682 )
2022-11-08 19:13:35 +08:00
Hakjin Lee
0857f9fb40
[Feature] Support torch ZeroRedundancyOptimizer ( #551 )
...
* [Feature] Support torch ZeRORedundancyOptimizer
Co-authored-by: Junhwa Song <ethan9867@gmail.com>
Signed-off-by: Junhwa Song <ethan9867@gmail.com>
Signed-off-by: Hakjin Lee <nijkah@gmail.com>
* lint
* Fix saving optimizer state_dict
* Fix handling import error
* Add test case
* fix UT
* Revert "fix UT"
This reverts commit dd64538960ff7440c6020f533d43945ffc23f2d2.
* fix handling import in UT
* Fix saving zero checkpoint and delete redundant master_only
* lint
* test unittest
* Fix handling import error
* Fix UT condition
* Edit docstrings
* Fix typo
* Skip redundant procedure in checkpoint hook
* fix typo again
* Update mmengine/optim/optimizer/zero_optimizer.py
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
* Add api info
* lint
* Fix lint
* Handling AmpOptimWrapper case
* handling overlap_with_ddp
* Fix error
Signed-off-by: Junhwa Song <ethan9867@gmail.com>
Signed-off-by: Hakjin Lee <nijkah@gmail.com>
Co-authored-by: Junhwa Song <ethan9867@gmail.com>
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2022-10-27 20:31:50 +08:00
Zaida Zhou
ed84dfd34d
[Refactor] Refactor fileio without breaking back compatibility ( #533 )
...
* [Refactor] Refactor fileio but without breaking bc
* handle compatibility
* fix format
* modify io functions
* fix ut
* fix ut
* rename method names
* refine
* refine docstring
* fix ut in windows
* update ut
* minor fix
* ensure client is not None when closing it
* add more examples for list_dir_or_file interface
* refine docstring
* refine deprecated info
* fix ut
* add a description for lmdb docstring
2022-09-26 14:30:40 +08:00
Qian Zhao
c64243aa9e
[Fix] CheckpointHook behavior incorrect if given filename_tmpl argument ( #518 )
2022-09-22 12:47:45 +08:00
Mashiro
8770c6c7fc
[Refactor] Refactor data flow to make the interface more natural ( #468 )
...
* [Refactor]: modify interface of Visualizer.add_datasample (#365 )
* [Refactor] Refactor data flow: refine `data_preprocessor`. (#359 )
* refine data_preprocessor
* remove unused BATCH_DATA alias
* Fix type hints
* rename move_data to cast_data
* [Refactor] Refactor data flow: collate data in `collate_fn` of `DataLoader` (#323 )
* collate data in dataloader
* fix docstring
* refine comment
* fix as comment
* refactor default collate and pseudo collate
* format test file
* fix docstring
* fix as comment
* rename elem to data_item
* minor fix
* fix as comment
* [Refactor] Refactor data flow: `data_batch` argument of `Evaluator.process` is a `dict` (#360 )
* refine evaluator and metric
* compatible with new default collate
* replace default collate with pseudo
* Handle data_batch in metric
* fix unit test
* fix unit test
* fix unit test
* minor refine
* make data_batch optional
* rename outputs to predictions
* fix ut
* rename predictions to outputs
* fix docstring
* fix docstring
* fix unit test
* make outputs and data_batch to kwargs
* fix unit test
* keep signature of metric
* fix ut
* rename pred_sample arguments to data_sample(Visualizer)
* fix loop and ut
* [refactor]: Refactor model dataflow (#398 )
* [Refactor] Refactor data flow: refine `data_preprocessor`. (#359 )
* refine data_preprocessor
* remove unused BATCH_DATA alias
* Fix type hints
* rename move_data to cast_data
* refactor model data flow
tmp_commt
tmp commit
* make val_cfg and test_cfg optional
* roll back runner
* pass test mmdet
* fix as comment
fix ci in DataPreprocessor
* fix ut
* fix ut
* fix rebase main
* [Fix]: Fix test val ddp (#462 )
* [Fix] Fix docstring and type hint of data flow (#463 )
* Fix docstring of data flow
* change signature of hook
* fix unit test
* resolve conflicts
* fix lint
2022-08-24 22:04:55 +08:00
Zaida Zhou
486d8cda56
[Refactor] Refactor the import rule ( #459 )
...
* [Refactor] Refactor the import rule
* minor refinement
* add a comment
2022-08-23 18:58:36 +08:00
Zaida Zhou
6c607bd26f
[Docs] Simplify hook docs ( #428 )
...
* Move the design of hook to design/hook.md
* add relative links in docs
* update docstring of hooks
* refine checkpointhook docs
* refine
* fix comments
* refine
* add logging.md link in hook.md
* resolve comments
* fix typo
2022-08-23 16:20:47 +08:00
Mashiro
b14cbc2576
[Fix] Fix wrong epoch and iter when saving best ckpt ( #400 )
...
* fix wrong epoch and iter when saving best ckpt
* fix ut
* fix resume best ckpt unexpectedly
* minor refine
* fix unit test
2022-08-11 14:52:38 +08:00
LeoXing1996
08602a2385
[Enhancement] Support save best based on multi metrics ( #349 )
...
* support save best based on multi metrics
* add unit test
* resolve bugs after rebasing
* revise docstring
* revise docstring
* fix as comment
* revise as comment
2022-08-08 20:17:17 +08:00
LeoXing1996
d65350a9da
[Fix] Fix bug of not save-best in iteration-based training ( #341 )
...
* fix bug of not save-best in iteration-based training
* revise the unit test
2022-06-30 14:51:31 +08:00
Alex Yang
216521a936
[Feat] Support save best ckpt ( #310 )
...
* [Feat] Support save best ckpt
* reformat code
* rename function and reformat code
* fix logging info
2022-06-22 19:48:46 +08:00
Jiazhen Wang
7b55c5bdbf
[Feature] Support resume from Ceph ( #294 )
...
* support resume from ceph
* move func and refine
* delete symlink
* fix unittest
* preserve _allow_symlink and symlink
2022-06-17 10:37:19 +08:00
RangiLyu
11688507ba
[Fix] Fix some bugs in hooks and runner. ( #242 )
...
* [Fix] Fix some bugs in hooks and runner.
* fix markdown
* fix latex formula
* resolve comments
2022-05-20 17:18:24 +08:00
Zaida Zhou
86ffc19c9c
Add pyupgrade pre-commit hook ( #232 )
...
* Add pyupgrade pre-commit hook
* fix ut
* remove comments
2022-05-19 17:56:31 +08:00
RangiLyu
e37f1f905b
[Refactor] Make loop-related attributes to be runner's properties. ( #236 )
...
* [Enhance] Make loop related attributes to be runner's properties.
* move iter and epoch to loop
* resolve comments
2022-05-18 22:35:10 +08:00
Mashiro
5007825619
[Fix] change CheckPointHook before_run to before train ( #214 )
...
* change CheckPointHook before_run to before train
* using tmp_path in each checkpointhook test case
2022-05-05 20:08:07 +08:00
RangiLyu
59cc08e3ac
[Refactor] Refactor data_batch type and remove cur_dataloader in runner. ( #171 )
...
* [Refactor] Refactor data_batch type.
* fix sampler
* [Refactor] Remove cur_dataloader in runner.
* fix set_epoch
2022-04-08 15:57:10 +08:00
liukuikun
7e246b6f65
[Enhancement] refactor base data element ( #143 )
...
* [Enhancement] refactor base data element
* fix comment
* fix comment
* fix pop not existing key without error
2022-03-31 18:21:45 +08:00
RangiLyu
9a61b389e7
[Refactor] Add batch_idx to hook input. ( #140 )
...
* [Refactor] Add batch_idx to hook input.
* update
2022-03-29 11:40:38 +08:00
Yuan Liu
26f24296db
[Feature]: Add dist semantics in checkpoint hook ( #131 )
...
* [Feature]: Add dist semantics in checkpoint hook
* [Fix]: Delete sync buffer in checkpoint hook
2022-03-25 13:46:31 +08:00
Zaida Zhou
72cf410969
[Refactor] Refactor interface of checkpointhook ( #127 )
...
* [Refactor] Refactor interface of checkpointhook
* fix print format
* minor fix
2022-03-13 23:39:28 +08:00
Mashiro
a7961407e4
[Refactor] Refactor the interfaces of Hook and its subclasses ( #117 )
...
* Fix hook
* Fix
* Fix docs
* Fix
* Fix
* Fix as comment
* update
* Fix hook
* Fix hook
* Fix hook
* Fix itertimerhook
* Fix iter_timer_hook
* Fix
* Fix
* fix logger hook
* Fix loggerhook
* update cur_dataloader
* Fix docstring
* Fix docstring
* Fix as comment
* Fix as comment
* Fix as comment
* rename is_last_epoch, enhance and add after_val, before_val, etc.
* fix typo in docstring
* remove resolved TODO
* refactor docstring
2022-03-13 16:48:09 +08:00
Mashiro
ec3034b765
[Fix] Fix output argument of after_iter, train_after_iter and val_after_iter ( #115 )
...
* Fix hook
* Fix
* Fix docs
* Fix
* Fix
* Fix as comment
2022-03-09 23:10:19 +08:00
Zaida Zhou
ed8dcb4c61
fix type hint in hooks ( #106 )
2022-03-07 19:35:37 +08:00
Yuan Liu
be9971781e
[Fix]: Change the type of runner in docstring to Runner ( #103 )
...
* [Fix]: Change after iter and epoch to after train iter and epoch
* [Fix]: Add new UT to param scheduler hook
* [Fix]: Change the type of runner in docstring to Runner
Co-authored-by: Your <you@example.com>
2022-03-07 14:00:05 +08:00
Yuan Liu
15abb061ef
[Fix]: Fix data batch type in base hook ( #99 )
...
* [Fix]: Fix data batch type in base hook
* [Fix]: Fix the type hint bug in checkpoint, optimizer, param scheduler hooks
Co-authored-by: Your <you@example.com>
2022-03-07 13:25:45 +08:00
Zaida Zhou
fd85156412
fix type hint and format ( #88 )
2022-03-05 17:44:31 +08:00
Yuan Liu
cf239a2b17
[Feature]: Add checkpoint hook ( #66 )
...
* [Feature]: Add checkpoint hook
* [Fix]: Fix lint
* [Fix]: Delete redundant optional and give an example for out_dir
* [Feature]: Add test the last_ckpt in UT
* [Fix]: Fix docstring problem
* [Fix]: Add patch to UT
* [Feature]: Add Test case for by epoch
2022-03-02 22:01:58 +08:00