23 Commits (714a48ea046ec27cc72137dca31536cf2d614f1a)

Author SHA1 Message Date
gaotingquan 64fad361ab
Fix the exception of DALI 2021-04-16 10:55:31 +08:00
liuyuhui c3d401b7ea
add multi xpu support for PaddleClas (#678) 2021-04-14 22:31:36 +08:00
littletomatodonkey 4ba3d47e31
Merge branch 'develop' into cp_fp16_training 2021-03-01 16:18:26 +08:00
Zhang Ting aeccae2128
fix oom for batch_size=208 (#618) 2021-03-01 12:37:17 +08:00
huangxu96 4e43ec6995
new usage of amp training. (#564)
* new usage of amp training.

* change the usage of amp and pure fp16 training.

* modified code as reviews
2021-02-26 09:25:54 +00:00
Tingquan Gao aa8e3c1183
Fix the mertirc_list when 'use_mix==True' (#529) 2020-12-30 17:36:56 +08:00
Tingquan Gao 8c82d89469
Fix the calculation method of batch_time and reader_time (#528) 2020-12-30 17:07:50 +08:00
Tingquan Gao c213c9fc5f
Fix the training log in static graph (#525)
* Adapt to PaddleHub2.0 to eliminate warning
* Fix the training log format
2020-12-30 14:43:00 +08:00
QingshuChen 918e68a934
fix bug for kunlun (#518) 2020-12-28 13:43:05 +08:00
WangXi 62fd192784
optimizer fleet distributed strategy (#500) 2020-12-18 11:18:58 +08:00
littletomatodonkey 8fd56a4503
fix static train (#478) 2020-12-15 16:43:15 +08:00
huangxu96 dc3020ab4a
support fp16 training (#435)
* support fp16 training

* Use compiled training program

* Change timing ips.

* Use dali

* add pure fp16 training

* fix a bug, which will not use fuse pass using pure fp16 training.

* modify code as review

* modify loss, so that it will use different loss when using pure fp16 training.

* remove some fluid API

* add static optimizer.
2020-12-11 11:04:51 +08:00
littletomatodonkey 15b18973f1
fix eval script (#464)
* fix eval script

* fix dali shell
2020-12-10 23:45:58 +08:00
QingshuChen 066d53f8ec
support cpu/xpu/gpu in static graph (#460) 2020-12-08 20:59:23 +08:00
littletomatodonkey a76e404d9c
fix time sta (#457) 2020-12-08 17:05:00 +08:00
littletomatodonkey 049cbb26d7
Update run_dali.sh 2020-12-03 15:58:07 +08:00
littletomatodonkey 1e4925704d
fix dali train (#446) 2020-12-03 15:48:35 +08:00
littletomatodonkey e92cb0b93c
fix init model in static mode (#444) 2020-12-03 12:50:33 +08:00
Tingquan Gao 2b77c71459
Support DALI (#442) 2020-12-02 22:06:23 +08:00
QingshuChen 832364e191
support static graph train for kunlun (#441) 2020-12-02 18:36:51 +08:00
littletomatodonkey 6796bca110
fix logger (#426) 2020-11-25 19:45:43 +08:00
littletomatodonkey e83e3038e1
fix local rank get word size in dist (#402)
* fix local rank
* fix export model
2020-11-18 13:59:34 +08:00
littletomatodonkey 6a5f4626d7
add static running in dygraph (#399)
* add static running in dygraph
2020-11-18 09:48:56 +08:00