mirror of https://github.com/huggingface/pytorch-image-models.git (synced 2025-06-03 15:01:08 +08:00)
* init square_avg with one instead of zero as per TF
* match TF order of ops for square_avg accumulation
* move LR scaling to momentum buffer accumulator as per TF
* add decoupled weight decay flag (not in TF)
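
The four changes above can be sketched as a single scalar update step. This is a minimal illustration in plain Python, not the code from this commit. The function name `rmsprop_tf_step`, the `state` dict layout, the default hyperparameters, the placement of `eps` inside the square root, and the exact form of the decoupled decay (shrinking the parameter by `lr * weight_decay`) are all assumptions made for the example.

```python
import math


def rmsprop_tf_step(param, grad, state, lr=0.01, alpha=0.9, eps=1e-10,
                    momentum=0.0, weight_decay=0.0, decoupled_decay=False):
    """One TF-style RMSprop step on a scalar parameter (hypothetical helper)."""
    if not state:
        # square_avg starts at one instead of zero.
        state["square_avg"] = 1.0
        state["momentum_buffer"] = 0.0

    if weight_decay != 0.0:
        if decoupled_decay:
            # Assumed decoupled form: decay the parameter directly,
            # leaving the gradient untouched.
            param = param * (1.0 - lr * weight_decay)
        else:
            # Classic L2 form: fold the decay term into the gradient.
            grad = grad + weight_decay * param

    # Accumulate as square_avg += (1 - alpha) * (grad^2 - square_avg),
    # which equals alpha * square_avg + (1 - alpha) * grad^2 algebraically
    # but applies the operations in a different order.
    state["square_avg"] += (1.0 - alpha) * (grad * grad - state["square_avg"])

    # Assumption: eps is added before the square root.
    avg = math.sqrt(state["square_avg"] + eps)

    # LR scales the term added to the momentum buffer, not the final update.
    state["momentum_buffer"] = (momentum * state["momentum_buffer"]
                                + lr * grad / avg)
    return param - state["momentum_buffer"]
```

Because the learning rate is folded into the momentum buffer, a later change to `lr` affects only the new contributions to the buffer, not the velocity already accumulated in it.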