Xingyu Liao 1dce15efad
faster dataloader with pre-fetch and cuda stream (#456)
Summary: Add a background thread that drives a prefetching generator, and create a new CUDA stream so that tensors are copied from CPU to GPU in parallel.

Reviewed by: l1aoxingyu
2021-04-12 15:03:35 +08:00
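
The background-thread prefetch described in the summary can be illustrated with a minimal, standard-library-only sketch. This is not the code from the commit: the function name `prefetch` and the parameter `max_prefetch` are hypothetical, and the CUDA stream copy is left out because it would need PyTorch and a GPU.

```python
import queue
import threading


def prefetch(source, max_prefetch=2):
    """Iterate over `source` in a background thread, buffering results.

    Hypothetical helper for illustration only; `max_prefetch` bounds how
    many items may be produced ahead of the consumer.
    """
    buf = queue.Queue(maxsize=max_prefetch)
    sentinel = object()  # unique marker signalling that the source is exhausted

    def worker():
        try:
            for item in source:
                buf.put(item)  # blocks while the buffer is full
        finally:
            # Always signal the end so the consumer never waits forever,
            # even if iterating over `source` raised an exception.
            buf.put(sentinel)

    # Start producing immediately, before the consumer asks for anything.
    threading.Thread(target=worker, daemon=True).start()

    def consume():
        while True:
            item = buf.get()
            if item is sentinel:
                return
            yield item

    return consume()


if __name__ == "__main__":
    for batch in prefetch(range(5)):
        print(batch)
```

The bounded queue is the main design choice here: it lets the producer run ahead of the consumer without letting buffered batches grow without limit. In a real data loader, the consumer side is presumably where the host-to-device copy on a separate CUDA stream would be issued.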