[Feature] Implement layer-wise learning rate decay optimizer constructor

* Use num_layers instead of max_depth to avoid misleading naming
* Add unit tests
* Update docstring
* Update log info
* Update LearningRateDecay configs

---------

Co-authored-by: fangyixiao18 <fangyx18@hotmail.com>
__init__.py
adan_t.py
lamb.py
lars.py
layer_decay_optim_wrapper_constructor.py
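The constructor added in this commit applies layer-wise learning rate decay: each parameter group's learning rate is scaled down geometrically the further the layer sits from the output, indexed by `num_layers` rather than `max_depth`. A minimal sketch of that scaling rule, assuming a hypothetical `layer_wise_lr_scales` helper (not the actual implementation in `layer_decay_optim_wrapper_constructor.py`):

```python
def layer_wise_lr_scales(num_layers: int, decay_rate: float) -> list[float]:
    """Return LR scale factors for layer ids 0..num_layers.

    Layer id 0 (e.g. the embedding layer) gets the smallest scale,
    decay_rate ** num_layers, and the top layer gets scale 1.0, so
    lower layers are updated more conservatively.
    """
    return [decay_rate ** (num_layers - layer_id)
            for layer_id in range(num_layers + 1)]


# Example: a 4-layer backbone with decay rate 0.9 produces 5 scales
# (embeddings + 4 layers), increasing geometrically toward the top.
scales = layer_wise_lr_scales(num_layers=4, decay_rate=0.9)
```

In practice, each scale would multiply the optimizer's base learning rate when building the per-layer parameter groups.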