Mirror of https://github.com/open-mmlab/mmengine.git (synced 2025-06-03 21:54:44 +08:00)
[Feature] Support Sophia optimizers (#1170)
This commit is contained in:
parent a92f87099f
commit 4a9e379c1a
docs/en/common_usage/better_optimizers.md (new file, 92 lines)
# Better performance optimizers

This document provides some third-party optimizers supported by MMEngine, which may bring faster convergence speed or higher performance.

## D-Adaptation

[D-Adaptation](https://github.com/facebookresearch/dadaptation) provides the `DAdaptAdaGrad`, `DAdaptAdam`, and `DAdaptSGD` optimizers.

```{note}
If you use the optimizers provided by D-Adaptation, you need to upgrade mmengine to `0.6.0`.
```

- Installation

  ```bash
  pip install dadaptation
  ```

- Usage

  Take `DAdaptAdaGrad` as an example.

  ```python
  runner = Runner(
      model=ResNet18(),
      work_dir='./work_dir',
      train_dataloader=train_dataloader_cfg,
      # To view the input parameters for DAdaptAdaGrad, you can refer to
      # https://github.com/facebookresearch/dadaptation/blob/main/dadaptation/dadapt_adagrad.py
      optim_wrapper=dict(optimizer=dict(type='DAdaptAdaGrad', lr=0.001, momentum=0.9)),
      train_cfg=dict(by_epoch=True, max_epochs=3),
  )
  runner.train()
  ```

## Lion-Pytorch

[lion-pytorch](https://github.com/lucidrains/lion-pytorch) provides the `Lion` optimizer.

```{note}
If you use the optimizer provided by lion-pytorch, you need to upgrade mmengine to `0.6.0`.
```

- Installation

  ```bash
  pip install lion-pytorch
  ```

- Usage

  ```python
  runner = Runner(
      model=ResNet18(),
      work_dir='./work_dir',
      train_dataloader=train_dataloader_cfg,
      # To view the input parameters for Lion, you can refer to
      # https://github.com/lucidrains/lion-pytorch/blob/main/lion_pytorch/lion_pytorch.py
      optim_wrapper=dict(optimizer=dict(type='Lion', lr=1e-4, weight_decay=1e-2)),
      train_cfg=dict(by_epoch=True, max_epochs=3),
  )
  runner.train()
  ```

## Sophia

[Sophia](https://github.com/kyegomez/Sophia) provides the `Sophia`, `SophiaG`, `DecoupledSophia`, and `Sophia2` optimizers.

```{note}
If you use the optimizers provided by Sophia, you need to upgrade mmengine to `0.7.4`.
```

- Installation

  ```bash
  pip install Sophia-Optimizer
  ```

- Usage

  ```python
  runner = Runner(
      model=ResNet18(),
      work_dir='./work_dir',
      train_dataloader=train_dataloader_cfg,
      # To view the input parameters for SophiaG, you can refer to
      # https://github.com/kyegomez/Sophia/blob/main/Sophia/Sophia.py
      optim_wrapper=dict(optimizer=dict(type='SophiaG', lr=2e-4, betas=(0.965, 0.99), rho=0.01, weight_decay=1e-1)),
      train_cfg=dict(by_epoch=True, max_epochs=3),
  )
  runner.train()
  ```
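Not part of the committed doc, but a quick way to sanity-check these registrations: a minimal sketch, assuming mmengine and the relevant pip packages above are installed, and that `Registry.get` returns `None` for unregistered names.

```python
import mmengine.optim  # noqa: F401 - importing mmengine.optim triggers the optimizer registration
from mmengine.registry import OPTIMIZERS

# Each name resolves to a class only if the matching package is installed.
for name in ('DAdaptAdaGrad', 'Lion', 'SophiaG'):
    cls = OPTIMIZERS.get(name)  # the class if registered, otherwise None
    print(name, 'registered' if cls is not None else 'not registered')
```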
@@ -24,6 +24,7 @@ You can switch between Chinese and English documents in the lower-left corner of
 common_usage/distributed_training.md
 common_usage/speed_up_training.md
 common_usage/save_gpu_memory.md
+common_usage/better_optimizers.md
 common_usage/visualize_training_log.md
 common_usage/set_random_seed.md
 common_usage/debug_tricks.md
@@ -243,7 +243,7 @@ As shown in the above example, `OptimWrapperDict` exports learning rates and momentums
 ### Configure the OptimWrapper in [Runner](runner.md)
 
-We first need to configure the `optimizer` for the OptimWrapper. MMEngine automatically adds all optimizers in PyTorch to the `OPTIMIZERS` registry, and users can specify the optimizers they need in the form of a `dict`. All supported optimizers in PyTorch are listed [here](https://pytorch.org/docs/stable/optim.html#algorithms). In addition, `DAdaptAdaGrad`, `DAdaptAdam`, and `DAdaptSGD` can be used by installing [dadaptation](https://github.com/facebookresearch/dadaptation). The `Lion` optimizer can be used by installing [lion-pytorch](https://github.com/lucidrains/lion-pytorch).
+We first need to configure the `optimizer` for the OptimWrapper. MMEngine automatically adds all optimizers in PyTorch to the `OPTIMIZERS` registry, and users can specify the optimizers they need in the form of a `dict`. All supported optimizers in PyTorch are listed [here](https://pytorch.org/docs/stable/optim.html#algorithms).
 
 Now we take setting up a SGD OptimWrapper as an example.
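The hunk is truncated just before the SGD example it announces. For orientation, here is a minimal sketch of such a config; the model and hyperparameter values are illustrative, not taken from this commit.

```python
import torch.nn as nn
from mmengine.optim import build_optim_wrapper

# Sketch only: any nn.Module works; lr/momentum values are placeholders.
model = nn.Linear(8, 4)
optim_wrapper = build_optim_wrapper(
    model,
    dict(type='OptimWrapper',
         optimizer=dict(type='SGD', lr=0.01, momentum=0.9)))
optim_wrapper.optimizer.zero_grad()
```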
docs/zh_cn/common_usage/better_optimizers.md (new file, 92 lines)
# Better performance optimizers

This document provides some third-party optimizers supported by MMEngine, which may bring faster convergence speed or higher performance.

## D-Adaptation

[D-Adaptation](https://github.com/facebookresearch/dadaptation) provides the `DAdaptAdaGrad`, `DAdaptAdam`, and `DAdaptSGD` optimizers.

```{note}
If you use the optimizers provided by D-Adaptation, you need to upgrade mmengine to `0.6.0`.
```

- Installation

  ```bash
  pip install dadaptation
  ```

- Usage

  Take `DAdaptAdaGrad` as an example.

  ```python
  runner = Runner(
      model=ResNet18(),
      work_dir='./work_dir',
      train_dataloader=train_dataloader_cfg,
      # To view the input parameters for DAdaptAdaGrad, you can refer to
      # https://github.com/facebookresearch/dadaptation/blob/main/dadaptation/dadapt_adagrad.py
      optim_wrapper=dict(optimizer=dict(type='DAdaptAdaGrad', lr=0.001, momentum=0.9)),
      train_cfg=dict(by_epoch=True, max_epochs=3),
  )
  runner.train()
  ```

## Lion

[lion-pytorch](https://github.com/lucidrains/lion-pytorch) provides the `Lion` optimizer.

```{note}
If you use the `Lion` optimizer, you need to upgrade mmengine to `0.6.0`.
```

- Installation

  ```bash
  pip install lion-pytorch
  ```

- Usage

  ```python
  runner = Runner(
      model=ResNet18(),
      work_dir='./work_dir',
      train_dataloader=train_dataloader_cfg,
      # To view the input parameters for Lion, you can refer to
      # https://github.com/lucidrains/lion-pytorch/blob/main/lion_pytorch/lion_pytorch.py
      optim_wrapper=dict(optimizer=dict(type='Lion', lr=1e-4, weight_decay=1e-2)),
      train_cfg=dict(by_epoch=True, max_epochs=3),
  )
  runner.train()
  ```

## Sophia

[Sophia](https://github.com/kyegomez/Sophia) provides the `Sophia`, `SophiaG`, `DecoupledSophia`, and `Sophia2` optimizers.

```{note}
If you use the optimizers provided by Sophia, you need to upgrade mmengine to `0.7.4`.
```

- Installation

  ```bash
  pip install Sophia-Optimizer
  ```

- Usage

  ```python
  runner = Runner(
      model=ResNet18(),
      work_dir='./work_dir',
      train_dataloader=train_dataloader_cfg,
      # To view the input parameters for SophiaG, you can refer to
      # https://github.com/kyegomez/Sophia/blob/main/Sophia/Sophia.py
      optim_wrapper=dict(optimizer=dict(type='SophiaG', lr=2e-4, betas=(0.965, 0.99), rho=0.01, weight_decay=1e-1)),
      train_cfg=dict(by_epoch=True, max_epochs=3),
  )
  runner.train()
  ```
@@ -24,6 +24,7 @@
 common_usage/distributed_training.md
 common_usage/speed_up_training.md
 common_usage/save_gpu_memory.md
+common_usage/better_optimizers.md
 common_usage/visualize_training_log.md
 common_usage/set_random_seed.md
 common_usage/debug_tricks.md
@@ -243,7 +243,7 @@ print(optim_dict.get_momentum())  # {'gen.momentum': [0], 'disc.momentum': [0]}
 ### Configure the optimizer wrapper in [Runner](./runner.md)
 
-The optimizer wrapper needs to accept an `optimizer` argument, so we first need to configure the `optimizer` for it. MMEngine automatically adds all PyTorch optimizers to the `OPTIMIZERS` registry, and users can specify the optimizer they need in the form of a dict; all supported optimizers are listed in the [PyTorch optimizer list](https://pytorch.org/docs/stable/optim.html#algorithms). In addition, the three optimizers `DAdaptAdaGrad`, `DAdaptAdam`, and `DAdaptSGD` can be used by installing [dadaptation](https://github.com/facebookresearch/dadaptation), and the `Lion` optimizer can be used by installing [lion-pytorch](https://github.com/lucidrains/lion-pytorch).
+The optimizer wrapper needs to accept an `optimizer` argument, so we first need to configure the `optimizer` for it. MMEngine automatically adds all PyTorch optimizers to the `OPTIMIZERS` registry, and users can specify the optimizer they need in the form of a dict; all supported optimizers are listed in the [PyTorch optimizer list](https://pytorch.org/docs/stable/optim.html#algorithms).
 
 Take configuring an SGD optimizer wrapper as an example:
@@ -105,6 +105,30 @@ def register_lion_optimizers() -> List[str]:
 LION_OPTIMIZERS = register_lion_optimizers()
 
 
+def register_sophia_optimizers() -> List[str]:
+    """Register Sophia optimizers to the ``OPTIMIZERS`` registry.
+
+    Returns:
+        List[str]: A list of registered optimizers' name.
+    """
+    optimizers = []
+    try:
+        import Sophia
+    except ImportError:
+        pass
+    else:
+        for module_name in dir(Sophia):
+            _optim = getattr(Sophia, module_name)
+            if inspect.isclass(_optim) and issubclass(_optim,
+                                                      torch.optim.Optimizer):
+                OPTIMIZERS.register_module(module=_optim)
+                optimizers.append(module_name)
+    return optimizers
+
+
+SOPHIA_OPTIMIZERS = register_sophia_optimizers()
+
+
 def build_optim_wrapper(model: nn.Module,
                         cfg: Union[dict, Config, ConfigDict]) -> OptimWrapper:
     """Build function of OptimWrapper.
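Not part of the commit: a minimal sketch of how an optimizer registered this way is consumed, assuming `pip install Sophia-Optimizer` succeeded (if the import fails, `register_sophia_optimizers()` silently registers nothing and the build raises a `KeyError`).

```python
import torch.nn as nn
from mmengine.optim import build_optim_wrapper  # importing mmengine.optim runs the register_* calls above

model = nn.Linear(8, 4)
# 'SophiaG' resolves through the OPTIMIZERS registry just like the built-in
# PyTorch optimizers; hyperparameters follow the SophiaG example in the docs
# added by this commit.
optim_wrapper = build_optim_wrapper(
    model,
    dict(type='OptimWrapper',
         optimizer=dict(type='SophiaG', lr=2e-4, betas=(0.965, 0.99),
                        rho=0.01, weight_decay=1e-1)))
print(type(optim_wrapper.optimizer).__name__)  # -> 'SophiaG'
```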