Commit Graph

33 Commits (26c400bf1e5e82de3841d993366fe832cd92f9f2)

Author SHA1 Message Date
Ross Wightman b1b024dfed Scheduler update, add v2 factory method, support scheduling on updates instead of just epochs. Add LR to summary csv. Add lr_base scaling calculations to train script. Fix #1168 2022-10-07 10:43:04 -07:00
Ross Wightman 87939e6fab Refactor device handling in scripts, distributed init to be less 'cuda' centric. More device args passed through where needed. 2022-09-23 16:08:59 -07:00
Ross Wightman 0dbd9352ce Add bulk_runner script and updates to benchmark.py and validate.py for better error handling in bulk runs (used for benchmark and validation result runs). Improved batch size decay stepping on retry... 2022-07-18 18:04:54 -07:00
Ross Wightman 324a4e58b6 disable nvfuser for jit te/legacy modes (for PT 1.12+) 2022-07-13 10:34:34 -07:00
Ross Wightman 2f2b22d8c7 Disable nvfuser fma / opt level overrides per #1244 2022-05-13 09:27:13 -07:00
jjsjann123 f88c606fcf fixing channels_last on cond_conv2d; update nvfuser debug env variable 2022-04-25 12:41:46 -07:00
Ross Wightman f0f9eccda8 Add --fuser arg to train/validate/benchmark scripts to select jit fuser type 2022-01-17 13:54:25 -08:00
Ross Wightman 57992509f9 Fix some formatting in utils/model.py 2021-10-23 20:35:36 -07:00
Ross Wightman e5da481073 Small post-merge tweak for freeze/unfreeze, add to __init__ for utils 2021-10-06 17:00:27 -07:00
Alexander Soare 431e60c83f Add acknowledgements for freeze_batch_norm inspiration 2021-10-06 14:28:49 +01:00
Alexander Soare 65c3d78b96 Freeze unfreeze functionality finalized. Tests added 2021-10-02 15:55:08 +01:00
Alexander Soare 0cb8ea432c wip 2021-10-02 15:55:08 +01:00
Ross Wightman d667351eac Tweak accuracy topk safety. Fix #807 2021-08-19 14:18:53 -07:00
Yohann Lereclus 35c9740826 Fix accuracy when topk > num_classes 2021-08-19 11:58:59 +02:00
Ross Wightman e685618f45 Merge pull request #550 from amaarora/wandb: Wandb Support 2021-04-15 09:26:35 -07:00
Ross Wightman 7c97e66f7c Remove commented code, add more consistent seed fn 2021-04-12 09:51:36 -07:00
Aman Arora 5772c55c57 Make wandb optional 2021-04-10 01:34:20 -04:00
Aman Arora f54897cc0b make wandb not required but rather optional as huggingface_hub 2021-04-10 01:27:23 -04:00
Aman Arora 3f028ebc0f import wandb in summary.py 2021-04-08 03:48:51 -04:00
Aman Arora 624c9b6949 log to wandb only if using using wandb 2021-04-08 03:40:22 -04:00
Aman Arora 6b18061773 Add GIST to docstring for quick access 2021-03-29 15:33:31 +11:00
Aman Arora 92b1db9a79 update docstrings and add check on and 2021-03-29 10:04:51 +11:00
Aman Arora b85be24054 update to work with fnmatch 2021-03-29 09:36:31 +11:00
Aman Arora 20626e8387 Add to extract stats for SPP 2021-03-27 05:40:04 +11:00
Ross Wightman 4f49b94311 Initial AGC impl. Still testing. 2021-02-15 23:22:44 -08:00
Ross Wightman 4203efa36d Fix #387 so that checkpoint saver works with max history of 1. Add checkpoint-hist arg to train.py. 2021-01-31 20:14:51 -08:00
Ross Wightman 4ca52d73d8 Add separate set and update method to ModelEmaV2 2020-12-03 10:05:09 -08:00
Ross Wightman 27bbc70d71 Add back old ModelEma and rename new one to ModelEmaV2 to avoid compat breaks in dependant code. Shuffle train script, add a few comments, remove DataParallel support, support experimental torchscript training. 2020-11-29 16:22:19 -08:00
Ross Wightman 9214ca0716 Simplifying EMA... 2020-11-16 12:51:52 -08:00
Ross Wightman 4a3df7842a Fix topn metric view regression on PyTorch 1.7 2020-10-29 14:04:15 -07:00
Ross Wightman 80078c47bb Add Adafactor and Adahessian optimizers, cleanup optimizer arg passing, add gradient clipping support. 2020-10-09 17:24:43 -07:00
Ross Wightman fcb6258877 Add missing leaky_relu layer factory defn, update Apex/Native loss scaler interfaces to support unscaled grad clipping. Bump ver to 0.2.2 for pending release. 2020-10-02 16:19:39 -07:00
Ross Wightman 532e3b417d Reorg of utils into separate modules 2020-09-07 13:58:09 -07:00