Ross Wightman
4cd7fb88b2
clip gradients with update
2023-04-19 23:36:20 -07:00
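The commit above applies gradient clipping as part of the update step. A minimal sketch of that pattern in plain PyTorch (the `train_step` helper and its argument names are illustrative, not timm's actual code):

```python
import torch
from torch.nn.utils import clip_grad_norm_

def train_step(model, optimizer, inputs, targets, loss_fn, max_norm=1.0):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    # Clip the global gradient norm immediately before the parameter update.
    clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return loss.detach()
```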
Ross Wightman
df81d8d85b
Cleanup gradient accumulation, fix a few issues, a few other small cleanups in related code.
2023-04-19 23:11:00 -07:00
Taeksang Kim
7f29a46d44
Add gradient accumulation option to train.py
...
option: iters-to-accum (iterations to accumulate)
Gradient accumulation improves training performance (samples/s).
It can reduce the number of gradient synchronizations between nodes.
This option can be helpful when the network is the bottleneck.
Signed-off-by: Taeksang Kim <voidbag@puzzle-ai.com>
2023-02-06 09:24:48 +09:00
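A minimal sketch of the gradient accumulation technique this commit adds (the `accum_steps` loop below is illustrative, not the actual train.py code):

```python
import torch

def train_epoch(model, optimizer, loader, loss_fn, accum_steps=4):
    optimizer.zero_grad()
    for step, (inputs, targets) in enumerate(loader):
        # Scale the loss so accumulated gradients average over micro-batches.
        loss = loss_fn(model(inputs), targets) / accum_steps
        loss.backward()  # gradients accumulate in each param's .grad
        if (step + 1) % accum_steps == 0:
            optimizer.step()       # one update per accum_steps micro-batches
            optimizer.zero_grad()
```

In distributed training the gradient all-reduce accompanies the update, so stepping once per `accum_steps` micro-batches is what reduces inter-node communication as the message describes.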
Ross Wightman
4f49b94311
Initial AGC impl. Still testing.
2021-02-15 23:22:44 -08:00
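AGC (Adaptive Gradient Clipping, Brock et al. 2021) clips each gradient relative to the norm of its own parameter rather than against a fixed global threshold. A simplified per-tensor sketch (timm's actual implementation operates on unit-wise norms; the `clip_factor` and `eps` defaults here are assumptions):

```python
import torch

def adaptive_clip_grad(parameters, clip_factor=0.01, eps=1e-3):
    # Scale down any gradient whose norm exceeds
    # clip_factor * max(||param||, eps).
    for p in parameters:
        if p.grad is None:
            continue
        p_norm = p.detach().norm()
        g_norm = p.grad.detach().norm()
        max_norm = p_norm.clamp(min=eps) * clip_factor
        if g_norm > max_norm:
            p.grad.detach().mul_(max_norm / (g_norm + 1e-6))
```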
Ross Wightman
80078c47bb
Add Adafactor and Adahessian optimizers, cleanup optimizer arg passing, add gradient clipping support.
2020-10-09 17:24:43 -07:00
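This commit routes gradient clipping options through the optimizer setup. A hedged sketch of a mode-dispatch helper in that spirit (the `dispatch_clip_grad` name mirrors the utility timm later ships, but this body is a simplified illustration):

```python
import torch

def dispatch_clip_grad(parameters, value, mode='norm'):
    # Select the clipping strategy to run before optimizer.step().
    if mode == 'norm':
        torch.nn.utils.clip_grad_norm_(parameters, value)
    elif mode == 'value':
        torch.nn.utils.clip_grad_value_(parameters, value)
    else:
        raise ValueError(f'unknown clip mode: {mode}')
```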
Ross Wightman
fcb6258877
Add missing leaky_relu layer factory defn, update Apex/Native loss scaler interfaces to support unscaled grad clipping. Bump ver to 0.2.2 for pending release.
2020-10-02 16:19:39 -07:00
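Unscaled grad clipping matters under AMP because gradients carry the loss-scale factor until they are unscaled. A minimal sketch of the native path using standard torch.cuda.amp calls (the `amp_train_step` helper and its argument names are illustrative):

```python
import torch
from torch.cuda.amp import GradScaler, autocast

scaler = GradScaler()

def amp_train_step(model, optimizer, inputs, targets, loss_fn, clip_norm=1.0):
    optimizer.zero_grad()
    with autocast():
        loss = loss_fn(model(inputs), targets)
    scaler.scale(loss).backward()
    # Unscale gradients first so the clip threshold applies to true values.
    scaler.unscale_(optimizer)
    torch.nn.utils.clip_grad_norm_(model.parameters(), clip_norm)
    scaler.step(optimizer)  # skips the update if grads are inf/nan
    scaler.update()
    return loss.detach()
```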
Ross Wightman
532e3b417d
Reorg of utils into separate modules
2020-09-07 13:58:09 -07:00