Summary: Move the div255 step to the GPU and add read/write support for numpy ndarrays, which makes it easier to compare torch and trt results.

Reviewed By: l1aoxingyu
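
A minimal sketch of the comparison workflow this enables, assuming the outputs are dumped as `.npy` files. The helper names (`write_ndarray`, `read_ndarray`, `compare_outputs`), the file names, and the tolerance are hypothetical and not taken from this change. The arrays are numpy stand-ins for the torch and trt outputs, and the divide by 255 runs on the CPU here only to keep the example self-contained.

```python
import os
import tempfile

import numpy as np


def write_ndarray(path, array):
    """Save an array to a .npy file (hypothetical helper name)."""
    np.save(path, np.ascontiguousarray(array))


def read_ndarray(path):
    """Load an array previously saved with write_ndarray."""
    return np.load(path)


def compare_outputs(reference, candidate, atol=1e-3):
    """Return (max absolute difference, True if it is within atol)."""
    reference = np.asarray(reference, dtype=np.float32)
    candidate = np.asarray(candidate, dtype=np.float32)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    max_diff = float(np.max(np.abs(reference - candidate)))
    return max_diff, max_diff <= atol


# Stand-in data: a uint8 image batch normalized in two slightly different ways,
# mimicking two backends that both apply the div255 step.
rng = np.random.default_rng(0)
image_u8 = rng.integers(0, 256, size=(1, 3, 4, 4), dtype=np.uint8)
torch_like = image_u8.astype(np.float32) / 255.0
trt_like = image_u8.astype(np.float32) * np.float32(1.0 / 255.0)

with tempfile.TemporaryDirectory() as tmp:
    torch_path = os.path.join(tmp, "torch_out.npy")  # assumed file name
    trt_path = os.path.join(tmp, "trt_out.npy")      # assumed file name
    write_ndarray(torch_path, torch_like)
    write_ndarray(trt_path, trt_like)
    loaded_torch = read_ndarray(torch_path)
    loaded_trt = read_ndarray(trt_path)

max_diff, ok = compare_outputs(loaded_torch, loaded_trt)
print(f"max abs diff: {max_diff:.3e}, within tolerance: {ok}")
```

Dumping both outputs to disk lets the two backends run in separate processes and still be compared element by element afterwards.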