[Docs] Add the usage of ProfilerHook (#1466)
parent 369f15e27a
commit e4600a6993
@ -31,11 +31,12 @@ Each hook has a corresponding priority. At each mount point, hooks with higher p
**custom hooks**

| Name | Function | Priority |
| :---------------------------------: | :----------------------------------------------------------------------: | :---------: |
| [EMAHook](#emahook) | apply Exponential Moving Average (EMA) on the model during training | NORMAL (50) |
| [EmptyCacheHook](#emptycachehook) | Releases all unoccupied cached GPU memory during the process of training | NORMAL (50) |
| [SyncBuffersHook](#syncbuffershook) | Synchronize model buffers at the end of each epoch | NORMAL (50) |

| Name | Function | Priority |
| :---------------------------------: | :----------------------------------------------------------------------: | :-----------: |
| [EMAHook](#emahook) | Apply Exponential Moving Average (EMA) on the model during training | NORMAL (50) |
| [EmptyCacheHook](#emptycachehook) | Releases all unoccupied cached GPU memory during the process of training | NORMAL (50) |
| [SyncBuffersHook](#syncbuffershook) | Synchronize model buffers at the end of each epoch | NORMAL (50) |
| [ProfilerHook](#profilerhook) | Analyze the execution time and GPU memory usage of model operators | VERY_LOW (90) |
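
The priority column determines execution order: a smaller number means a higher priority, so at a given mount point a `NORMAL (50)` hook runs before a `VERY_LOW (90)` hook. The sketch below is a simplified model of that ordering, not MMEngine's actual implementation; `PRIORITIES` and `order_hooks` are hypothetical names, and only the priority values come from the table.

```python
# Simplified model of priority-based ordering (not MMEngine's code).
# Priority values are taken from the table; a smaller number means a
# higher priority.
PRIORITIES = {
    'EMAHook': 50,          # NORMAL
    'EmptyCacheHook': 50,   # NORMAL
    'SyncBuffersHook': 50,  # NORMAL
    'ProfilerHook': 90,     # VERY_LOW
}


def order_hooks(names):
    """Hypothetical helper: return hook names in execution order.

    sorted() is stable, so hooks that share a priority keep the order
    in which they were listed.
    """
    return sorted(names, key=lambda name: PRIORITIES[name])


registered = ['ProfilerHook', 'SyncBuffersHook', 'EMAHook']
execution_order = order_hooks(registered)
print(execution_order)
```

In this model `ProfilerHook` ends up last even though it was listed first, which matches the intent of giving a profiler the lowest priority among the custom hooks.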

```{note}
It is not recommended to modify the priority of the default hooks, as hooks with lower priority may depend on hooks with higher priority. For example, `CheckpointHook` needs to have a lower priority than `ParamSchedulerHook` so that the saved optimizer state is correct. Also, the priority of custom hooks defaults to `NORMAL (50)`.
@ -211,6 +212,20 @@ runner = Runner(custom_hooks=custom_hooks, ...)
runner.train()
```

### ProfilerHook

The [ProfilerHook](mmengine.hooks.ProfilerHook) analyzes the execution time and GPU memory usage of model operators.

```python
custom_hooks = [dict(type='ProfilerHook', on_trace_ready=dict(type='tb_trace'))]
runner = Runner(custom_hooks=custom_hooks, ...)
runner.train()
```

The profiling results are saved in the `tf_tracing_logs` directory under `work_dirs/{timestamp}` and can be visualized in TensorBoard by running `tensorboard --logdir work_dirs/{timestamp}/tf_tracing_logs`.

For more information on using the ProfilerHook, refer to the [ProfilerHook](mmengine.hooks.ProfilerHook) documentation.
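
The one-line configuration relies on the hook's defaults. The fragment below is a sketch that spells out a few additional options. The parameter names `by_epoch`, `profile_times`, `activity_with_cuda`, and `profile_memory` are assumptions based on the `ProfilerHook` API reference and should be checked against the installed MMEngine version; the values are illustrative only.

```python
# Sketch of a more explicit ProfilerHook config. Every parameter other than
# `type` and `on_trace_ready` is an assumption to verify against the
# ProfilerHook API reference for your MMEngine version.
custom_hooks = [
    dict(
        type='ProfilerHook',
        by_epoch=False,           # assumed: count profile_times in iterations
        profile_times=100,        # assumed: number of iterations to profile
        activity_with_cuda=True,  # assumed: also record CUDA activity
        profile_memory=True,      # assumed: track tensor memory usage
        on_trace_ready=dict(type='tb_trace'),  # TensorBoard trace output
    )
]

# The list is passed to the runner in the usual way:
# runner = Runner(custom_hooks=custom_hooks, ...)
# runner.train()
```

Profiling adds overhead, so limiting it to a bounded number of iterations is usually preferable to profiling a whole training run.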

## Customize Your Hooks

If the built-in hooks provided by MMEngine do not meet your needs, you can write your own by inheriting from the base [Hook](mmengine.hooks.Hook) class and overriding the methods for the mount points you need.
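
As a minimal sketch of that pattern, the example below overrides a single mount point method. To stay runnable without MMEngine installed, it uses a stand-in `Hook` class; in a real project you would inherit from `mmengine.hooks.Hook` instead. `LossLoggerHook`, its `interval` argument, and the shape of `outputs` are hypothetical, and the `after_train_iter` signature is modeled on the base class and should be checked against the API reference.

```python
class Hook:
    """Stand-in for mmengine.hooks.Hook: every mount point is a no-op."""

    def before_train(self, runner):
        pass

    def after_train_iter(self, runner, batch_idx, data_batch=None, outputs=None):
        pass


class LossLoggerHook(Hook):
    """Hypothetical hook that records the loss every `interval` iterations."""

    def __init__(self, interval=2):
        self.interval = interval
        self.history = []

    def after_train_iter(self, runner, batch_idx, data_batch=None, outputs=None):
        # Only the mount point of interest is overridden; the others keep
        # the no-op behavior inherited from the base class.
        if outputs is not None and (batch_idx + 1) % self.interval == 0:
            self.history.append((batch_idx, outputs['loss']))


# Simulate a training loop that calls the mount point after each iteration.
hook = LossLoggerHook(interval=2)
for batch_idx, loss in enumerate([0.9, 0.7, 0.6, 0.4]):
    hook.after_train_iter(runner=None, batch_idx=batch_idx, outputs={'loss': loss})

print(hook.history)
```

Because the base class defines every mount point as a no-op, a custom hook only needs to implement the methods where it actually does work.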
@ -31,11 +31,12 @@ MMEngine provides many built-in hooks, divided into two categories, namely
**Custom hooks**

| Name | Purpose | Priority |
| :---------------------------------: | :-------------------: | :---------: |
| [EMAHook](#emahook) | Exponential moving average of model parameters | NORMAL (50) |
| [EmptyCacheHook](#emptycachehook) | PyTorch CUDA cache cleanup | NORMAL (50) |
| [SyncBuffersHook](#syncbuffershook) | Synchronize model buffers | NORMAL (50) |

| Name | Purpose | Priority |
| :---------------------------------: | :--------------------------------: | :-----------: |
| [EMAHook](#emahook) | Exponential moving average of model parameters | NORMAL (50) |
| [EmptyCacheHook](#emptycachehook) | PyTorch CUDA cache cleanup | NORMAL (50) |
| [SyncBuffersHook](#syncbuffershook) | Synchronize model buffers | NORMAL (50) |
| [ProfilerHook](#profilerhook) | Analyze the execution time and GPU memory usage of operators | VERY_LOW (90) |

```{note}
Modifying the priority of the default hooks is not recommended, because lower-priority hooks may depend on higher-priority ones. For example, CheckpointHook needs a lower priority than ParamSchedulerHook so that the saved optimizer state is correct. In addition, the priority of custom hooks defaults to `NORMAL (50)`.
@ -206,6 +207,20 @@ runner = Runner(custom_hooks=custom_hooks, ...)
runner.train()
```

### ProfilerHook

[ProfilerHook](mmengine.hooks.ProfilerHook) analyzes the execution time and GPU memory usage of model operators.

```python
custom_hooks = [dict(type='ProfilerHook', on_trace_ready=dict(type='tb_trace'))]
runner = Runner(custom_hooks=custom_hooks, ...)
runner.train()
```

The profiling results are saved in the `tf_tracing_logs` directory under `work_dirs/{timestamp}` and can be visualized by running `tensorboard --logdir work_dirs/{timestamp}/tf_tracing_logs`.

For more on using ProfilerHook, see the [ProfilerHook](mmengine.hooks.ProfilerHook) documentation.
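
## Customize Your Hooks

If the default hooks provided by MMEngine do not meet your needs, you can define your own hook by inheriting from the hook base class and overriding the corresponding mount point methods.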