240 Commits

Author SHA1 Message Date
Range King
273fb2b333
[Feature] Add DVCLiveVisBackend (#1336) 2023-09-08 17:22:23 +08:00
Zaida Zhou
45ee96d0c4
[Docs] Add activation checkpointing usage (#1341) 2023-09-05 11:23:44 +08:00
Mashiro
19ab172b2d
Fix typos and documents of colossalai (#1315) 2023-08-22 16:13:55 +08:00
Mashiro
db32234241
[Feature] Add colossalai strategy (#1299) 2023-08-18 15:09:35 +08:00
Zaida Zhou
03ad86cfd2
[Docs] Add an image for neptune (#1312) 2023-08-18 10:48:55 +08:00
Theodore
43e308caaf
[Feature] Add NeptuneVisBackend (#1311) 2023-08-17 23:29:58 +08:00
Zaida Zhou
a483dba9d1
fix typo (#1298) 2023-08-08 10:49:13 +08:00
Zaida Zhou
a54e814bf8
[Docs] Fix unused parameters (#1288) 2023-08-03 15:45:30 +08:00
Mashiro
5c5ec8b168
Add a segmentation example (#1282) 2023-08-03 15:27:58 +08:00
Zaida Zhou
5ef75fd7a7
[Docs] Introduce how to customize distributed training settings (#1279) 2023-07-31 15:40:45 +08:00
KerwinKai
68360e7ce8
[Feature] Add parameter save_begin for CheckpointHook (#1271) 2023-07-25 19:21:21 +08:00
youkaichao
66d828d8d3
[Enhancement] Rename fast_conv_bn_eval to efficient_conv_bn_eval (#1251) 2023-07-15 22:13:17 +08:00
youkaichao
40e49ff747
[Feature] Enable fast conv bn eval (#1202) 2023-07-14 18:21:55 +08:00
Zaida Zhou
33e30b7cb6
[Docs] how to train a large model (#1228) 2023-07-05 18:20:07 +08:00
Mashiro
529bab815f
[Fix] Fix docs (#1233) 2023-07-04 22:20:30 +08:00
Mashiro
399f76ffa8
[Experimental] Add support for FSDP (#1213) 2023-06-29 15:19:33 +08:00
Zaida Zhou
ccd5dc8b18
[Experimental] Add FlexibleRunner and Strategies (#1183) 2023-06-29 15:19:33 +08:00
Mashiro
22aa46bf56
[Docs] Fix config doc (#1218) 2023-06-28 19:01:08 +08:00
Zaida Zhou
d03a1da9a9
[Docs] Add a document to introduce how to debug with vscode (#1212) 2023-06-27 16:46:47 +08:00
Mashiro
478c952a6d
[Refactor] Replace 'if base' with 'with read_base' context manager (#1207) 2023-06-25 13:53:19 +08:00
Mashiro
6ece63ed35
[Feature] Support Pure Python style Configuration File (#1071) 2023-06-16 12:52:07 +08:00
syo093c
0052873b41
[Docs] Fix typo in document (#1201) 2023-06-14 22:15:20 +08:00
Zaida Zhou
cf477d15a2
[Docs] Add the usage of clearml (#1180) 2023-06-01 21:54:30 +08:00
vugia truong
68414516aa
[Feature] Add vis backend for clearml (#878) (#1091) 2023-06-01 17:41:34 +08:00
Zaida Zhou
4a9e379c1a
[Feature] Support Sophia optimizers (#1170) 2023-05-30 15:44:06 +08:00
Zaida Zhou
691500dce6
[Docs] Move the usage of distributed training to a single document (#1171) 2023-05-28 20:10:16 +08:00
Xin Li
d59acfbd9f
[Docs] Translate data_element.md (#1067) 2023-05-23 12:59:29 +08:00
gy77
ec2e00ae90
[Docs] Fix a missing comma in tutorials/runner.md (#1146) 2023-05-16 14:20:44 +08:00
Zaida Zhou
70c28415db
[Docs] Move translation of infer.md to en (#1138) 2023-05-09 11:45:18 +08:00
XHr
3b7c70fa97
[Docs] Translate infer.md (#1121) 2023-05-09 10:53:51 +08:00
Mashiro
6ba667c8cf
[Fix] Save optimizer.state_dict() in cpu by default (#966) 2023-04-26 16:47:47 +08:00
Mashiro
1c01594c5c
[Docs] Update links (#1108) 2023-04-25 18:51:11 +08:00
Zaida Zhou
43165160e6
[Docs] Replace MMCls with MMPretrain in docs (#1096)
* [Docs] Replace MMCls with MMPretrain in docs

* fix format
2023-04-23 15:29:43 +08:00
luomaoling
5b9a1544b0
[Feature] Add torch_npu optimizer (#1079) 2023-04-21 15:15:10 +08:00
LEFTeyes
6b366f236c
[Docs] Translate tutorials/evaluation.md (#1053)
* [Docs] Translate tutorials/evaluation.md
2023-04-12 12:53:14 +08:00
Zaida Zhou
9207e84aa0
[Docs] Introduce the use of wandb and tensorboard (#912)
* [Docs] Introduce the use of wandb and tensorboard

* fix link

* Update docs/en/common_usage/visualize_training_log.md
2023-04-11 12:31:05 +08:00
sung-hwa kim
8bf1ecad38
[Feature] Add vis backend for MLflow. (#878)
* add vis mlflow backend
2023-04-07 16:35:41 +08:00
KerwinKai
5b35c5b6ad
[Feature] Publish models after training if published_keys is set in CheckpointHook (#987)
* add publish keys in checkpointhook and update hook.md file

* Update checkpoint_hook.py

To avoid `mypy` warning `mmengine/hooks/checkpoint_hook.py:358: error: Unsupported right operand type for in ("Optional[List[str]]") Found 1 error in 1 file (checked 224 source files)`

* Update hook.md

Try to avoid the trim-trailing-whitespace warning in hook.md

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update checkpoint_hook.py

* Update docs/en/tutorials/hook.md

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update hook.md

add automatic publishing of the best and last weights

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update checkpoint_hook.py

add a condition for when there is more than one best checkpoint.

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update checkpoint_hook.py

delete re judge

* Update checkpoint_hook.py

* Update checkpoint_hook.py

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update checkpoint_hook.py

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update checkpoint_hook.py

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Add Test for publish model

* Update checkpoint_hook.py

* Update test_checkpoint_hook.py

* Fix file to pass pre-commit check

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Fix mypy warning

* rm not necessary line in checkpoint_hook.py

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* rm unnecessary messages added to message_hub

* Update mmengine/hooks/checkpoint_hook.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update docs/zh_cn/tutorials/hook.md

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update docs/zh_cn/tutorials/hook.md

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* update checkpoint hook and hook.md file

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update mmengine/hooks/checkpoint_hook.py

---------

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2023-03-29 10:25:14 +08:00
Mashiro
395ebf8d82
[Enhancement] Support dumping logs of different ranks (#968)
* Add hostname

* Update mmengine/logging/logger.py

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Fix lint

* support record device id

* Fix unit test

* Clean the code

* Fix as comment

* Fix as comment

* Fix unit test

* Update doc

* Fix unit test

* Adjust sequence

* Replace \ with ()

* remove unnecessary ()

* does not change filename in single gpu training

* Fix ci

* fix docs

* Fix as comment

---------

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2023-03-13 14:35:31 +08:00
Yijie Zheng
330985d6c1
[Docs] Translate "Model Complexity Analysis" to Chinese (#969)
* [Doc] Translate model complexity analysis into Chinese.

* [Doc] Translate model complexity analysis into Chinese.

* [Docs] fix the description of the interface

* update introduction

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update description of FLOPs

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update activation

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update model description

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Beautify code style

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Modify examples

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update output description

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Update docs/zh_cn/advanced_tutorials/model_analysis.md

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Replace FLOPs with flop; fix typo

* Fix typo

* fix lint error

* Update docs/zh_cn/advanced_tutorials/model_analysis.md

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update docs/zh_cn/advanced_tutorials/model_analysis.md

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update docs/zh_cn/advanced_tutorials/model_analysis.md

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update docs/zh_cn/advanced_tutorials/model_analysis.md

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update docs/zh_cn/advanced_tutorials/model_analysis.md

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update docs/zh_cn/advanced_tutorials/model_analysis.md

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Update model_analysis.md

* Update model_analysis.md

* Apply suggestions from code review

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

---------

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2023-03-13 14:31:06 +08:00
Qian Zhao
0d25625ba2
[Feature] Support torch.compile since PyTorch2.0 (#976)
* enable compile configurations to support torch.compile in Runner

* enable compilation in train, val and test

* fix as comments

* add docstring to illustrate usage

* minor refine error message

* add unittests

* fix ut skip

* add logging message to inform users

* compile `train_step`, `val_step`, `test_step` instead

* fix as comments

* revert to compile `train_step` only due to pt2 issue

* add documentation about torch.compile
2023-03-12 18:26:43 +08:00
Mashiro
44f30f649e
[Enhancement] Add FUNCTIONS root Registry (#983)
* [Enhancement] Add FUNCTIONS Registry

* Refine as comment

* clean the code
2023-03-08 12:53:24 +08:00
Julius Zhang
aeb5c454c5
[Docs] Fix typo in hook document (#980) 2023-03-07 12:53:30 +08:00
Hakjin Lee
b3430e4257
[Feature] Support EarlyStoppingHook (#739)
* [Feature] EarlyStoppingHook

* delete redundant line

* Assert stop_training and rename tests

* Fix UT

* rename `metric` to `monitor`

* Fix UT

* Fix UT

* edit docstring on patience

* Draft for new code

* fix ut

* add test case

* add test case

* fix ut

* Apply suggestions from code review

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>

* Append hook

* Append hook

* Apply suggestions

* Update suggestions

* Update mmengine/hooks/__init__.py

* fix min_delta

* Apply suggestions from code review

* lint

* Apply suggestions from code review

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* delete save_last

* infer rule more robust

* refine unit test

* Update mmengine/hooks/early_stopping_hook.py

---------

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
Co-authored-by: Mashiro <57566630+HAOCHENYE@users.noreply.github.com>
Co-authored-by: zhouzaida <zhouzaida@163.com>
Co-authored-by: HAOCHENYE <21724054@zju.edu.cn>
2023-03-06 13:18:42 +08:00
Qian Zhao
2ed8e343a0
[Feature] Enable bf16 in AmpOptimWrapper (#960)
* support bf16 in AmpOptimWrapper

* add docstring

* modify docs

* add unittests for bf16 in AmpOptimWrapper

* fix type

* fix to pass ci

* fix ut skip logic to pass ci

* fix as comment

* add type hints

* fix docstring and add warning information

* remove check for pytorch>=1.6 in unittest

* modify unittest

* modify unittest

* remove torch.float32 && torch.float64 from valid dtypes

* fix as comments

* minor refine docstring

* fix unittest parameterized to pass CI

* fix unittest && add back torch.float32, torch.float64
2023-03-01 21:35:18 +08:00
Mashiro
6a56ca78e3
Bump version to v0.6.0 (#954)
* update version

* Update change log

* Fix as comment

* Add link to username

* Refine

* Adjust highlight sequence

* Fix as comment

* Fix error format in changelog

* delete chinese changelog

* remove link

* Adjust highlight sequence
2023-02-24 14:30:01 +08:00
Zaida Zhou
fc9518e2c1
[Feature] Add Lion optimizer (#952) 2023-02-23 11:24:50 +08:00
Zaida Zhou
67acdbe245
[Docs] Add a document about debug tricks (#938)
* fix typo

* [Docs] Add debug skills

* minor fix

* refine

* rename debug_skills to debug_tricks

* refine

* Update docs/en/common_usage/debug_tricks.md
2023-02-21 21:40:35 +08:00
Zaida Zhou
4861f034a7
[Docs] Count FLOPs and parameters (#939)
* [Docs] Count FLOPs and parameters

* add the doc to index.rst

* fix table in HTML

* fix

* fix

* fix indent

* refine
2023-02-21 21:16:18 +08:00
Mashiro
346989464c
[Docs] Add the document for the transition between IterBasedTraining and EpochBasedTraining (#926)
* Add epoch 2 iter

* Add epoch 2 iter

* Refine chinese docs

* Add example for training CIFAR10 by iter

* minor refine

* Fix as comment

* Fix as comment

* Refine description

* Fix as comment

* minor refine

* Refine description

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>

* Translate to en

* Adjust indent

---------

Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
2023-02-21 21:12:38 +08:00