* support bf16 in AmpOptimWrapper (see the usage sketch after this group of commits)
* add docstring
* modify docs
* add unittests for bf16 in AmpOptimWrapper
* fix type
* fix to pass CI
* fix ut skip logic to pass CI
* fix as comment
* add type hints
* fix docstring and add warning information
* remove check for pytorch>=1.6 in unittest
* modify unittest
* remove torch.float32 and torch.float64 from valid dtypes
* fix as comments
* minor refine docstring
* fix parameterized unittest to pass CI
* fix unittest and add back torch.float32, torch.float64
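Taken together, the commits above let `AmpOptimWrapper` run its autocast context in `torch.bfloat16`. A minimal sketch of the intended usage, assuming a CUDA device with bf16 support and the `dtype` argument these commits introduce:

```python
import torch
import torch.nn as nn
from mmengine.optim import AmpOptimWrapper

# Toy model; AmpOptimWrapper expects to train on an accelerator.
model = nn.Linear(4, 2).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# dtype='bfloat16' switches autocast from the default float16 to bfloat16;
# per the commits above, float32/float64 are also accepted as valid dtypes.
optim_wrapper = AmpOptimWrapper(optimizer=optimizer, dtype='bfloat16')

inputs = torch.randn(8, 4).cuda()
with optim_wrapper.optim_context(model):
    loss = model(inputs).sum()
optim_wrapper.update_params(loss)  # scale, backward, step, zero_grad
```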
* fix the bug where params in shared modules do not require grad (see the sketch below)
* test DefaultOptimWrapperConstructor when the params in shared modules do not require grad
* fix zero_optimizer error with param groups when pytorch < 1.12.0
* add docstring
* fix docstring
* add unittest
* change ut to use a valid paramwise_cfg
* modify ut
* fix as comments
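A sketch of the case the fix above targets: a model in which two attributes share one frozen submodule, passed through `DefaultOptimWrapperConstructor` (all names in the toy model are hypothetical):

```python
import torch.nn as nn
from mmengine.optim import DefaultOptimWrapperConstructor

class SharedModel(nn.Module):
    """Two branches share one linear layer whose params are frozen."""

    def __init__(self):
        super().__init__()
        shared = nn.Linear(2, 2)
        for param in shared.parameters():
            param.requires_grad = False  # the case that used to break
        self.branch_a = shared
        self.branch_b = shared
        self.head = nn.Linear(2, 1)

constructor = DefaultOptimWrapperConstructor(
    optim_wrapper_cfg=dict(
        type='OptimWrapper',
        optimizer=dict(type='SGD', lr=0.01)),
    # A valid paramwise_cfg, as the unit-test commit above requires.
    paramwise_cfg=dict(bias_lr_mult=2.0))
optim_wrapper = constructor(SharedModel())
```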
* [Enhance] add documents for clip_grad, and support clip grad by value (see the sketch below)
* refine docstring
* fix as comment
* minor refine
* remove error comment for clip grad
* refine docstring
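A hedged sketch of the two clipping modes the documents cover, assuming the `clip_grad` keys described there (a `type` key selecting norm- or value-based clipping, with the remaining keys forwarded to the corresponding `torch.nn.utils` function):

```python
import torch
import torch.nn as nn
from mmengine.optim import OptimWrapper

model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Norm-based clipping (the existing behavior), forwarded to clip_grad_norm_.
wrapper_by_norm = OptimWrapper(optimizer, clip_grad=dict(max_norm=1.0))

# Value-based clipping added here, forwarded to clip_grad_value_.
wrapper_by_value = OptimWrapper(
    optimizer, clip_grad=dict(type='value', clip_value=0.1))
```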
* Rename data to structure (see the import sketch below)
* adjust the way to import module
* rename Structure to Data Structures in docs api
* rename structure to structures
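After the rename, the data containers are imported from `mmengine.structures` rather than the old `mmengine.data`; a short sketch:

```python
# Old path: from mmengine.data import BaseDataElement
from mmengine.structures import BaseDataElement, InstanceData

sample = BaseDataElement(metainfo=dict(img_shape=(224, 224)))
instances = InstanceData(metainfo=dict(img_shape=(224, 224)))
```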
* support using some modules of mmengine without torch (new import paths sketched below)
* fix circleci config
* fix registry ut
* minor fix
* move init method from model/utils to model/weight_init.py
* move sync_bn to model
* move functions depending on torch to dl_utils
* format import
* fix logging ut
* add weight init in model/__init__.py
* move get_config and get_model to mmengine/hub
* move log_processor.py to mmengine/runner
* fix ut
* Add TimeCounter in dl_utils/__init__.py
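The moves above boil down to new import locations; a sketch of the resulting paths, assuming the re-exports these commits describe:

```python
# Weight init moved from model/utils to model/weight_init.py and is
# re-exported in model/__init__.py.
from mmengine.model import constant_init, kaiming_init

# get_config and get_model now live under mmengine/hub.
from mmengine.hub import get_config, get_model

# Torch-dependent helpers, including TimeCounter, moved to dl_utils.
from mmengine.utils.dl_utils import TimeCounter

# log_processor.py moved to mmengine/runner.
from mmengine.runner import LogProcessor
```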
* fix saving the scheduler state dict when built with an optim wrapper (see the sketch below)
* remove for loop and inherit TestParameterScheduler
* minor refine
* merge context
* update unit test
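A sketch of the round trip the fix above repairs: a parameter scheduler built on top of an `OptimWrapper`, whose state dict should now save and restore cleanly (the milestone values are arbitrary):

```python
import torch
import torch.nn as nn
from mmengine.optim import MultiStepLR, OptimWrapper

model = nn.Linear(4, 2)
optim_wrapper = OptimWrapper(torch.optim.SGD(model.parameters(), lr=0.1))

# Schedulers accept an OptimWrapper in place of a raw optimizer.
scheduler = MultiStepLR(optim_wrapper, milestones=[8, 11], gamma=0.1)

state = scheduler.state_dict()      # saving with an optim wrapper...
scheduler.load_state_dict(state)    # ...and restoring, the fixed path
```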
* add docstring
* fix bug in AmpOptimWrapper
* add docstring for backward
* add warning and docstring for gradient accumulation (see the sketch below)
* fix docstring
* add params_group method
* fix as comment
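A sketch of the gradient accumulation behavior the warning and docstrings above document, assuming `accumulative_counts` is the relevant `OptimWrapper` argument:

```python
import torch
import torch.nn as nn
from mmengine.optim import OptimWrapper

model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# With accumulative_counts=2, update_params scales each loss by 1/2 and
# only steps the optimizer on every second call.
optim_wrapper = OptimWrapper(optimizer, accumulative_counts=2)

for _ in range(4):
    loss = model(torch.randn(8, 4)).sum()
    optim_wrapper.update_params(loss)
```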
* change the default value of loss_scale to 'dynamic' (see the sketch below)
* Fix docstring
* decouple the should-update check from the should-not-sync check
* rename attribute in OptimWrapper
* fix docstring
* fix comment
* fix as comment
* fix as comment and add unit test
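Finally, a sketch tying this last group together: `loss_scale` now defaults to `'dynamic'`, and `optim_context` wraps the decoupled update/sync checks (CUDA assumed, since `AmpOptimWrapper` requires an accelerator):

```python
import torch
import torch.nn as nn
from mmengine.optim import AmpOptimWrapper

model = nn.Linear(4, 2).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Default: a dynamically scaled torch.cuda.amp.GradScaler.
amp_wrapper = AmpOptimWrapper(optimizer=optimizer)

# A float still requests a static loss scale.
static_wrapper = AmpOptimWrapper(optimizer=optimizer, loss_scale=512.0)

# optim_context enters autocast and, while gradients are still being
# accumulated, skips DDP gradient synchronization.
with amp_wrapper.optim_context(model):
    loss = model(torch.randn(8, 4).cuda()).sum()
amp_wrapper.update_params(loss)
```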