Summary:
1. change DDP training to follow the apex approach;
2. step the warmup scheduler per iteration and the lr scheduler per epoch;
3. replace random erasing with the torchvision implementation;
4. naming changes in the config file
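
A minimal sketch of item 2, to show how the two schedules might combine. It is not the code from this commit: the function name `lr_at`, the linear warmup, the multi-step decay and all default values are assumptions chosen for illustration.

```python
def lr_at(base_lr, epoch, it, iters_per_epoch,
          warmup_iters=10, warmup_factor=0.1,
          milestones=(2, 4), gamma=0.1):
    """Return the learning rate at iteration `it` of `epoch` (both 0-based).

    Hypothetical helper: warmup advances every iteration, decay every epoch.
    """
    # Epoch-level schedule (assumed multi-step decay): changes only when
    # the epoch counter crosses a milestone.
    decay = gamma ** sum(epoch >= m for m in milestones)

    # Iteration-level schedule (assumed linear warmup): ramps from
    # warmup_factor to 1.0 over the first `warmup_iters` iterations.
    global_iter = epoch * iters_per_epoch + it
    if global_iter < warmup_iters:
        alpha = global_iter / warmup_iters
        warm = warmup_factor * (1.0 - alpha) + alpha
    else:
        warm = 1.0

    return base_lr * decay * warm
```

With 5 iterations per epoch and the defaults above, warmup ends at the start of epoch 2, so the warmup factor changes inside an epoch while the decay factor stays constant within it.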
Summary: fix a classifier init bug where the classifier weights were not initialized when using arcface or circle loss.
This could cause the loss to become NaN.
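
A minimal pure-Python sketch of why missing initialization can produce NaN. It rests on an assumption that the commit does not state: that cosine-based heads L2-normalize the classifier weights, so a row holding a non-finite leftover value turns into NaN. The names `init_classifier_weights` and `cosine_logits` are hypothetical and are not taken from the repository.

```python
import math
import random


def init_classifier_weights(num_classes, feat_dim, std=0.01, seed=0):
    """Explicitly initialize weights with small Gaussian values (assumed scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(feat_dim)]
            for _ in range(num_classes)]


def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_logits(feature, weights):
    """Cosine similarity between one feature vector and every class weight row."""
    f = l2_normalize(feature)
    return [sum(a * b for a, b in zip(f, l2_normalize(w))) for w in weights]


feature = [3.0, 4.0]

# Properly initialized weights give finite cosine logits.
good = cosine_logits(feature, init_classifier_weights(3, 2))

# Stand-in for an uninitialized row: a non-finite value makes
# inf / inf = nan during normalization, and the logit becomes NaN.
bad = cosine_logits(feature, [[float("inf"), 1.0]])
```

Here every entry of `good` is finite, while `bad[0]` is NaN, which would then propagate into the loss.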