* fix grid_sample data type bug when using fp16
* fix v4rec batch size
* fix hang during multi-GPU training (sampler)
* add rec algorithm cppd
* delete useless cppd code
* fix cppd bug
* update cppd trained model url
* add cppd en doc