Commit Graph

7 Commits (da75cdd212642980b1db96aea83cf64915cd6ddf)

Author SHA1 Message Date
Ross Wightman 4cd7fb88b2 clip gradients with update 2023-04-19 23:36:20 -07:00
Ross Wightman df81d8d85b Cleanup gradient accumulation, fix a few issues, a few other small cleanups in related code. 2023-04-19 23:11:00 -07:00
Taeksang Kim 7f29a46d44 Add gradient accumulation option to train.py
option: iters-to-accum(iterations to accmulate)

Gradient accumulation improves training performance(samples/s).
It can reduce the number of parameter sharing between each node.
This option can be helpful when network is bottleneck.

Signed-off-by: Taeksang Kim <voidbag@puzzle-ai.com>
2023-02-06 09:24:48 +09:00
Ross Wightman 4f49b94311 Initial AGC impl. Still testing. 2021-02-15 23:22:44 -08:00
Ross Wightman 80078c47bb Add Adafactor and Adahessian optimizers, cleanup optimizer arg passing, add gradient clipping support. 2020-10-09 17:24:43 -07:00
Ross Wightman fcb6258877 Add missing leaky_relu layer factory defn, update Apex/Native loss scaler interfaces to support unscaled grad clipping. Bump ver to 0.2.2 for pending release. 2020-10-02 16:19:39 -07:00
Ross Wightman 532e3b417d Reorg of utils into separate modules 2020-09-07 13:58:09 -07:00