Commit Graph

60 Commits (d792a69f3f69159ba3ca6341a8be0e6da2b31633)

Author SHA1 Message Date
liaoxingyu 62ad5e1a8b bugfix for cuda tensor to numpy
Move cuda tensors to host memory in the predictor so they can be used by downstream processing.
2021-06-17 16:05:45 +08:00
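
A minimal sketch of the fix pattern (illustrative code, not the repository's exact change): `torch.Tensor.numpy()` only works on CPU tensors, so a CUDA tensor must be moved to host memory first.

```python
import torch

def to_numpy(t: torch.Tensor):
    # Detach from the graph and copy to host memory before conversion;
    # .numpy() raises a TypeError on CUDA tensors.
    return t.detach().cpu().numpy()

device = "cuda" if torch.cuda.is_available() else "cpu"
features = torch.randn(4, 512, device=device)
features_np = to_numpy(features)  # safe for downstream processing
```
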
liaoxingyu 8f8cbf9411 fix lr scheduler warning when amp training
Skip lr scheduler when this iteration creates NaN gradients
2021-06-02 16:35:46 +08:00
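
The warning in question is PyTorch's complaint that `lr_scheduler.step()` ran before `optimizer.step()`, which happens when `GradScaler` skips the step on inf/NaN gradients. A common workaround (a sketch under that assumption, not necessarily the repo's exact code) detects the skip through the loss scale, which only shrinks when a step is skipped:

```python
import torch

def amp_step(loss, optimizer, scheduler, scaler):
    # GradScaler skips optimizer.step() on inf/NaN gradients and then
    # lowers the scale in update(); a lower scale means "step skipped".
    scaler.scale(loss).backward()
    scale_before = scaler.get_scale()
    scaler.step(optimizer)  # may be skipped internally
    scaler.update()
    if scaler.get_scale() >= scale_before:  # step actually ran
        scheduler.step()
    optimizer.zero_grad()
```
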
liaoxingyu 256721cfde Impl `freezebb` in optimizer's step()
Make the implementation of `freezebb` consistent with that of gradient clipping; both are implemented through the optimizer's step().
2021-05-31 17:15:26 +08:00
liaoxingyu 2b65882447 change way of layer freezing
Remove `find_unused_parameters` in DDP and add a new step function in the optimizer for freezing the backbone. This accelerates training.
2021-05-25 15:57:09 +08:00
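
A sketch of the freezing approach the two commits above describe, assuming a plain SGD base (class and argument names are illustrative): zero the gradients of frozen parameters inside step(), so DDP no longer needs `find_unused_parameters=True`.

```python
import torch

class SGDWithFreeze(torch.optim.SGD):
    """Illustrative optimizer that freezes given parameters in step(),
    mirroring how gradient clipping can be hooked into step()."""

    def __init__(self, params, freeze_param_ids=None, **kwargs):
        super().__init__(params, **kwargs)
        self.freeze_param_ids = set(freeze_param_ids or [])

    def step(self, closure=None):
        # Zero the frozen parameters' gradients so the update is a no-op
        # for them. Note: with momentum enabled, a full implementation
        # would also skip the momentum-buffer update for these params.
        with torch.no_grad():
            for group in self.param_groups:
                for p in group["params"]:
                    if id(p) in self.freeze_param_ids and p.grad is not None:
                        p.grad.zero_()
        return super().step(closure)
```
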
Sherlock Liao ff8a958fff bugfix for `plain_train_net.py` and lr scheduler step (#484) 2021-05-11 15:46:17 +08:00
liaoxingyu 0c8e3d9805 update imbalanced sampler
Summary: add a new sampler that is useful for imbalanced or long-tailed datasets; adapted from ufoym/imbalanced-dataset-sampler.
2021-04-21 17:05:10 +08:00
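
The core idea of ufoym/imbalanced-dataset-sampler is to draw each sample with probability inversely proportional to its class frequency. A minimal version using PyTorch's built-in WeightedRandomSampler (how labels are obtained is an assumption about the dataset):

```python
from collections import Counter

import torch
from torch.utils.data import WeightedRandomSampler

def make_balanced_sampler(labels):
    # Weight each sample by the inverse frequency of its class so that
    # rare (long-tail) classes are drawn as often as frequent ones.
    counts = Counter(labels)
    weights = torch.tensor([1.0 / counts[y] for y in labels], dtype=torch.double)
    return WeightedRandomSampler(weights, num_samples=len(labels), replacement=True)

# Usage, assuming the dataset yields (image, label) pairs:
# sampler = make_balanced_sampler([label for _, label in dataset])
# loader = DataLoader(dataset, batch_size=64, sampler=sampler)
```
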
liaoxingyu 44cee30dfc update fastreid v1.2
Summary:
1. refactor dataloader and heads
2. bugfix in fastattr, fastclas, fastface and partialreid
3. partial-fc supported in fastface
2021-04-02 21:33:13 +08:00
Xingyu Liao 890224f25c support classification in fastreid (#443)
Summary: support classification and refactor build_dataloader to support explicit parameter passing
2021-03-26 20:17:39 +08:00
Xingyu Liao 15c556c43a remove apex dependency (#442)
Summary: use PyTorch 1.6 (or above) built-in AMP training
2021-03-23 12:12:35 +08:00
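
For reference, the native AMP pattern that replaces apex looks roughly like this (a generic sketch, not fastreid's trainer):

```python
import torch

scaler = torch.cuda.amp.GradScaler()

def train_one_iter(model, optimizer, images, targets, loss_fn):
    # PyTorch >= 1.6 native AMP: autocast for the forward pass,
    # GradScaler for a numerically safe backward/step. No apex needed.
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        outputs = model(images)
        loss = loss_fn(outputs, targets)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.detach()
```
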
liaoxingyu f57c5764e3 support multi-node training 2021-03-09 20:07:28 +08:00
liaoxingyu a53fd17874 update docs 2021-01-23 15:25:58 +08:00
liaoxingyu e26182e6ec make lr warmup by iter
Summary: perform lr warmup by iteration instead of by epoch, which is more flexible when training for a small number of epochs
2021-01-22 11:17:21 +08:00
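
Per-iteration linear warmup can be expressed with a LambdaLR; a sketch (warmup length and factor are illustrative placeholders):

```python
import torch

def warmup_lambda(warmup_iters=1000, warmup_factor=0.1):
    # Linear warmup computed per iteration rather than per epoch,
    # so even short schedules get a smooth ramp-up.
    def f(it):
        if it >= warmup_iters:
            return 1.0
        alpha = it / warmup_iters
        return warmup_factor * (1 - alpha) + alpha
    return f

# scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, warmup_lambda())
# Call scheduler.step() once per iteration, not once per epoch.
```
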
liaoxingyu 15e1729a27 update fastreid V1.0 2021-01-18 11:36:38 +08:00
liaoxingyu 2c17847980 feat: freeze FC
Summary: update freeze FC in the last stages of training
2020-12-28 14:46:28 +08:00
liaoxingyu 66941cf27a feat: support flip testing 2020-12-22 15:50:50 +08:00
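
Flip testing usually means averaging the embeddings of the original and horizontally flipped image at inference time; a sketch of that pattern:

```python
import torch

@torch.no_grad()
def extract_with_flip(model, images):
    # Test-time augmentation: average features of the original batch
    # and its horizontal mirror (flip along the width axis of NCHW).
    feat = model(images)
    feat_flip = model(torch.flip(images, dims=[3]))
    return (feat + feat_flip) / 2
```
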
liaoxingyu 5469e8ce76 feat: add save best model mechanism 2020-12-22 15:49:46 +08:00
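
A minimal sketch of a save-best mechanism (the metric and path are assumptions): keep the checkpoint with the highest validation metric seen so far.

```python
import torch

class BestCheckpointer:
    """Save the model whenever the validation metric (e.g. mAP) improves."""

    def __init__(self, path="model_best.pth"):
        self.best = float("-inf")
        self.path = path

    def maybe_save(self, model, metric):
        if metric > self.best:
            self.best = metric
            torch.save(model.state_dict(), self.path)
```
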
liaoxingyu a327a70f0d v0.3 update
Summary:
1. change DDP training to the apex way;
2. make the warmup scheduler step by iter and the lr scheduler step by epoch;
3. replace random erasing with the torchvision implementation (see the sketch below);
4. naming modification in config file
2020-12-07 14:19:20 +08:00
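
For item 3, torchvision's RandomErasing operates on tensors, so it must come after ToTensor(); an illustrative pipeline (the sizes are typical reid values, not taken from the repo):

```python
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.Resize((256, 128)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    # torchvision's built-in random erasing replaces the custom one;
    # it expects a tensor input, hence its position after ToTensor().
    transforms.RandomErasing(p=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3)),
])
```
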
liaoxingyu 2724515fd9 save class number to config (#281)
Summary: save the class number computed from the datasets to the config file. If the class number is hard-coded, it is left unchanged.
2020-11-06 16:07:37 +08:00
liaoxingyu 7e9a4775da fixup finetune problem
Summary: support finetuning from another model with a different number of classes, and simplify the calling convention (#325)

close #325
2020-11-06 15:58:22 +08:00
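
Finetuning across a different number of classes typically means loading only the tensors whose shapes still match; a sketch of that filtering (the function name is illustrative):

```python
import torch

def load_matched_weights(model, checkpoint_path):
    # Keep only tensors whose names and shapes match the current model;
    # a classifier trained on a different class count is skipped and
    # stays randomly initialized.
    state = torch.load(checkpoint_path, map_location="cpu")
    model_state = model.state_dict()
    matched = {k: v for k, v in state.items()
               if k in model_state and v.shape == model_state[k].shape}
    model.load_state_dict(matched, strict=False)
    return set(model_state) - set(matched)  # the re-initialized keys
```
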
liaoxingyu 10cbaab155 support finetuning from trained models
Summary: add a flag to support finetuning a model from trained weights; this is very useful for cross-domain reid
2020-09-28 17:10:10 +08:00
liaoxingyu 154a06b875 refactor code 2020-09-23 19:31:46 +08:00
liaoxingyu 4fa3f08a4a fix typo
close #268
2020-09-14 11:34:28 +08:00
liaoxingyu 4d573b8107 refactor reid head
Summary: merge BNneckHead, LinearHead and ReductionHead into EmbeddingHead,
because they are highly similar; this also prepares for ClsHead
2020-09-10 10:57:37 +08:00
liaoxingyu 53fed7451d feat: support amp training
Summary: support automatic mixed precision training #217
2020-09-02 18:03:12 +08:00
liaoxingyu d00ce8fc3c refactor model arch 2020-09-01 16:14:45 +08:00
liaoxingyu ac8409a7da updating for pytorch1.6 2020-08-20 15:51:41 +08:00
liaoxingyu 2430b8ed75 pretrain model bugfix
Fix pretrained-model download bugs and multiprocess testing bugs
2020-07-31 10:42:38 +08:00
liaoxingyu 16655448c2 onnx/trt support
Summary: change the model pretraining mode and support ONNX/TensorRT export
2020-07-29 17:43:39 +08:00
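
A generic ONNX export sketch (shapes and opset are assumptions; the resulting graph can then be consumed by TensorRT, e.g. via trtexec):

```python
import torch

def export_onnx(model, path="reid.onnx", height=256, width=128):
    # Standard export with a dynamic batch dimension.
    model.eval()
    dummy = torch.randn(1, 3, height, width)
    torch.onnx.export(
        model, dummy, path,
        input_names=["images"], output_names=["features"],
        dynamic_axes={"images": {0: "batch"}, "features": {0: "batch"}},
        opset_version=11,
    )
```
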
liaoxingyu 3b57dea49f support regnet backbone 2020-07-17 19:13:45 +08:00
liaoxingyu 3f35eb449d minor update 2020-07-14 11:58:06 +08:00
liaoxingyu e81b13798c change way of loss function
Summary: move loss computation from meta_arch to run_step to accommodate distillation losses
2020-07-10 16:28:53 +08:00
liaoxingyu ea8a3cc534 fix typo 2020-07-10 16:26:35 +08:00
liaoxingyu fec7abc461 finish v0.2 ddp training 2020-07-06 16:57:43 +08:00
liaoxingyu 36c04f0a9f fix resume training problem
Summary: when resuming training, the dataloader must be re-created, because pid_dict is updated in the dataset but a multiprocess dataloader will not pick up that update
2020-05-30 16:44:18 +08:00
liaoxingyu 84c733fa85 fix: remove prefetcher, put normalizer in model
1. remove the messy data prefetcher, which causes confusion
2. put the normalizer in the model to accelerate training via GPU computation
2020-05-25 23:39:11 +08:00
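
Putting the normalizer in the model means registering the pixel statistics as buffers and normalizing inside forward(), so the work runs on the GPU; a sketch (ImageNet statistics as a placeholder):

```python
import torch
from torch import nn

class ModelWithNormalizer(nn.Module):
    """Fold input normalization into the model's forward pass."""

    def __init__(self, backbone,
                 pixel_mean=(0.485, 0.456, 0.406),
                 pixel_std=(0.229, 0.224, 0.225)):
        super().__init__()
        self.backbone = backbone
        # Buffers move with the model (.to/.cuda) but are not trained.
        self.register_buffer("pixel_mean", torch.tensor(pixel_mean).view(1, 3, 1, 1))
        self.register_buffer("pixel_std", torch.tensor(pixel_std).view(1, 3, 1, 1))

    def forward(self, images):
        # images: raw NCHW batch; normalization happens on-device.
        return self.backbone((images - self.pixel_mean) / self.pixel_std)
```
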
liaoxingyu e990cf3e34 style: fix some typos 2020-05-21 15:55:51 +08:00
liaoxingyu d63bf5facc fix: add syncBN options in DefaultTrainer 2020-05-16 22:44:53 +08:00
liaoxingyu b28c0032e8 fix: add monkey-patching to enable syncBN
add a trigger to make syncBN work
2020-05-15 13:33:33 +08:00
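
On PyTorch 1.x the built-in converter achieves what the monkey-patch was for; a sketch of the now-standard route:

```python
from torch import nn

def enable_sync_bn(model, local_rank):
    # Replace every BatchNorm layer with SyncBatchNorm so statistics
    # are reduced across processes, then wrap the model in DDP.
    model = nn.SyncBatchNorm.convert_sync_batchnorm(model)
    return nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])
```
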
liaoxingyu bf18479541 fix: revise syncBN bug 2020-05-14 14:52:37 +08:00
liaoxingyu 651e6ba9c4 feat: support multiprocess predictor
add AsyncPredictor to support multiprocess feature extraction with a dataloader
2020-05-09 18:23:36 +08:00
liaoxingyu 4be4cacb73 fix: add a simple way to reset data prefetcher when resume training
Use the data prefetcher's built-in reset function to reload it rather than
defining a new data prefetcher; otherwise it introduces other problems
in eval-only mode.
2020-05-09 11:58:27 +08:00
liaoxingyu 9fae467adf feat(engine/defaults): add DefaultPredictor to get image reid features
Add a new predictor interface, and modify demo code to predict image features.
2020-05-08 19:24:27 +08:00
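
A minimal predictor in the same spirit (the class name and normalization step are illustrative, not the actual DefaultPredictor API):

```python
import torch
import torch.nn.functional as F

class SimplePredictor:
    """Wrap an eval-mode model and return L2-normalized reid features."""

    def __init__(self, model, device="cuda"):
        self.device = device
        self.model = model.to(device).eval()

    @torch.no_grad()
    def __call__(self, images):
        # images: preprocessed NCHW batch.
        feats = self.model(images.to(self.device))
        return F.normalize(feats, dim=1).cpu()
```
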
liaoxingyu 0b15ac4e03 feat(hooks&optim): update stochastic weight averaging hooks
Update the SWA method, which runs after regular training when the option is enabled.
2020-05-08 12:20:04 +08:00
liaoxingyu afac8aad5d Fix(engine): fix preciseBN dataloader bugs
preciseBN needs a data prefetcher, but a DataLoader is currently passed
2020-05-06 14:26:34 +08:00
liaoxingyu 948af64fd1 feat: add swa algorithm
Add SWA and related config options;
if enabled, the model performs SWA after regular training
2020-05-06 10:17:44 +08:00
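
PyTorch ships the same idea in torch.optim.swa_utils (since 1.6); a sketch of an SWA phase run after regular training (hyperparameters are placeholders, not the repo's defaults):

```python
import torch
from torch.optim.swa_utils import AveragedModel, SWALR, update_bn

def run_swa_phase(model, optimizer, train_loader, swa_epochs, device="cuda"):
    swa_model = AveragedModel(model)           # running average of weights
    swa_scheduler = SWALR(optimizer, swa_lr=0.05)
    for _ in range(swa_epochs):
        for images, targets in train_loader:
            ...  # usual forward/backward/optimizer.step() on `model`
        swa_model.update_parameters(model)     # fold current weights in
        swa_scheduler.step()
    update_bn(train_loader, swa_model, device=device)  # refresh BN stats
    return swa_model
```
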
liaoxingyu 6d96529d4c fix(data): fix resume training bug
Fix the dataset pid dictionary loading bug when resuming training:
the data prefetcher pre-loads a batch of data, which leads to
misalignment between the old pid dict and the updated pid dict.
We address this by redefining a prefetcher in resume_or_load
2020-05-05 23:20:42 +08:00
liaoxingyu a2dcd7b4ab feat(layers/norm): add ghost batchnorm
add a get_norm function to easily switch normalization between batch norm, ghost BN and group norm
2020-05-01 09:02:46 +08:00
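
A sketch of both pieces (names are illustrative): ghost batch norm normalizes each virtual sub-batch independently, and get_norm is a single switch point for the normalization layer.

```python
import torch
from torch import nn

class GhostBatchNorm(nn.BatchNorm2d):
    """Normalize each virtual sub-batch on its own, which regularizes
    like training with a much smaller batch size."""

    def __init__(self, num_features, num_splits=4, **kwargs):
        super().__init__(num_features, **kwargs)
        self.num_splits = num_splits

    def forward(self, x):
        if self.training:
            chunks = x.chunk(self.num_splits, dim=0)
            return torch.cat(
                [super(GhostBatchNorm, self).forward(c) for c in chunks], dim=0)
        return super().forward(x)

def get_norm(name, out_channels):
    # One place to switch the normalization layer of the whole model.
    return {
        "BN": nn.BatchNorm2d(out_channels),
        "GhostBN": GhostBatchNorm(out_channels),
        "GN": nn.GroupNorm(32, out_channels),
    }[name]
```
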
liaoxingyu d27729c5bb refactor(preciseBN): add preciseBN datasets show 2020-04-29 21:05:53 +08:00
liaoxingyu e38a799b63 fix(engine/defaults): fix precise bn bug
fix a problem in precise BN where the precise BN datasets were not used and errors were thrown
2020-04-29 16:16:54 +08:00
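
Precise BN re-estimates the running statistics by forwarding a few hundred batches with a cumulative average; a sketch of the standard procedure:

```python
import torch
from torch import nn

@torch.no_grad()
def compute_precise_bn_stats(model, loader, num_iters=200):
    # momentum=None makes BatchNorm accumulate a true (cumulative) mean
    # over the forwarded batches instead of an exponential average.
    bns = [m for m in model.modules()
           if isinstance(m, nn.modules.batchnorm._BatchNorm)]
    momenta = [bn.momentum for bn in bns]
    for bn in bns:
        bn.reset_running_stats()
        bn.momentum = None
    model.train()
    for i, (images, _) in enumerate(loader):
        if i >= num_iters:
            break
        model(images)
    for bn, m in zip(bns, momenta):
        bn.momentum = m
```
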
liaoxingyu 4d2fa28dbb update freeze layer
update preciseBN
update circle loss with both its metric-learning and cross-entropy forms
update loss call methods
2020-04-06 23:34:27 +08:00
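
For reference, the pair-wise form of circle loss (Sun et al., CVPR 2020) in a numerically stable sketch; sim_p and sim_n are assumed to be 1-D tensors of positive- and negative-pair similarities:

```python
import torch
import torch.nn.functional as F

def circle_loss_pairwise(sim_p, sim_n, m=0.25, gamma=128.0):
    # Adaptive weights push easy pairs' gradients toward zero.
    alpha_p = F.relu(1 + m - sim_p.detach())
    alpha_n = F.relu(sim_n.detach() + m)
    delta_p, delta_n = 1 - m, m
    logit_p = -gamma * alpha_p * (sim_p - delta_p)
    logit_n = gamma * alpha_n * (sim_n - delta_n)
    # softplus(logsumexp(n) + logsumexp(p)) == log(1 + sum_n sum_p exp(...))
    return F.softplus(torch.logsumexp(logit_n, dim=0)
                      + torch.logsumexp(logit_p, dim=0))
```
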