190 Commits (c9d5694cf2a23077e678fe3e96fa355fd9b9be97)

Author SHA1 Message Date
weishengyu 920a44b0ce move files 2021-05-24 11:43:47 +08:00
Yiqun Liu 0d832a2539
Enable profiler, for both static and dynamic training. (#729)
* Enable profiler for static training.

* Polish the initialize of ProfilerOptions.

* Enable profiler for dynamic mode.
2021-05-21 10:31:43 +08:00
liuyuhui 5170153ee3 merge with upstream/develop 2021-05-08 11:16:41 +00:00
liuyuhui a9bce0409d fix tools/train.py 2021-05-08 08:42:30 +00:00
liuyuhui 55fa094e60 fix one card eval in multicards training 2021-05-08 05:29:12 +00:00
littletomatodonkey e165897c37
fix drop last for training process (#713) 2021-05-06 14:21:54 +08:00
littletomatodonkey 9c0f049603
fix export model eval (#710)
* adapt to net.eval for the framework just contains training flag setting
* fix bug when export swin transformer
2021-05-06 13:55:11 +08:00
Tingquan Gao dd70cb1bb0
Fix HubServing demo (#708) 2021-04-30 16:10:17 +08:00
liuyuhui 14a93e7933
[Kunlun]add multi xpu support for PaddleClas about dygraph (#690)
* add multi xpu support for PaddleClas about dygraph

* add dygraph multi xpu support
2021-04-29 17:08:26 +08:00
littletomatodonkey f17343b017
add support for eval using inference engine (#696) 2021-04-26 15:18:04 +08:00
gaotingquan 64fad361ab Fix the exception of DALI 2021-04-16 10:55:31 +08:00
liuyuhui c3d401b7ea
add multi xpu support for PaddleClas (#678) 2021-04-14 22:31:36 +08:00
littletomatodonkey a7aa14525c
fix repvgg eval (#677)
* fix repvgg eval

* fix dp training

* fix single card train
2021-04-14 02:16:13 +08:00
littletomatodonkey 2e62e2e25e
fix loss reduce from dict to list (#679)
* fix loss reduce from dict to list

* remove note
2021-04-13 23:19:36 +08:00
littletomatodonkey a6e2114e32
add find_unused_parameters param (#668)
* add find_unused_parameters param

* fix default val
2021-04-06 21:29:38 +08:00
Tingquan Gao 4523d4246d
Adapt paddle_inference 2.0.1 API (#667) 2021-04-06 10:52:40 +08:00
yaohai fee32b555a fix small error 2021-03-30 17:14:25 +08:00
yaohai 5fd7085ddf add multilabel feature 2021-03-30 16:02:32 +08:00
Tingquan Gao 8a469799af
support bs>1 (#651)
* support bs>1
2021-03-26 18:52:50 +08:00
Tingquan Gao 8832a3fa0a
Support Visual DL (#650)
* Support Visual DL

* Fix VDL

* Add doc of VDL, test=document_fix

* Add the en doc of VDL, test=document_fix

Co-authored-by: littletomatodonkey <2120160898@bit.edu.cn>
2021-03-23 12:45:01 +08:00
littletomatodonkey 4ba3d47e31
Merge branch 'develop' into cp_fp16_training 2021-03-01 16:18:26 +08:00
Zhang Ting aeccae2128
fix oom for batch_size=208 (#618) 2021-03-01 12:37:17 +08:00
huangxu96 4e43ec6995 new usage of amp training. (#564)
* new usage of amp training.

* change the usage of amp and pure fp16 training.

* modified code as reviews
2021-02-26 09:25:54 +00:00
littletomatodonkey ba17052a54
add eta info (#613)
* add eta info

* rm duplicate desc
2021-02-26 13:40:40 +08:00
littletomatodonkey c8a155635e
fix train and save (#594) 2021-02-01 22:24:25 +08:00
Tingquan Gao aa8e3c1183
Fix the mertirc_list when 'use_mix==True' (#529) 2020-12-30 17:36:56 +08:00
Tingquan Gao 8c82d89469
Fix the calculation method of batch_time and reader_time (#528) 2020-12-30 17:07:50 +08:00
Tingquan Gao c213c9fc5f
Fix the training log in static graph (#525)
* Adapt to PaddleHub2.0 to eliminate warning
* Fix the training log format
2020-12-30 14:43:00 +08:00
littletomatodonkey e7dbecd22e
fix predict (#527)
* fix predict

* fix export model

* fix doc
2020-12-30 14:28:06 +08:00
QingshuChen 918e68a934
fix bug for kunlun (#518) 2020-12-28 13:43:05 +08:00
Tingquan Gao 11ef05ca32
Update the hubserving (#517)
* Fix the timing of hubserving
2020-12-28 13:28:06 +08:00
Tingquan Gao aa65556433
Adapt to PaddleHub2.0 (#512) 2020-12-24 18:21:37 +08:00
WangXi 62fd192784
optimizer fleet distributed strategy (#500) 2020-12-18 11:18:58 +08:00
Tingquan Gao eb945e5d69
Fix "tar" to "pdparams" to adapt to dygraph (#496)
* Fix "tar" to "pdparams" to adapt to dygraph

* Update the download link of Paddle Inference Library
2020-12-17 23:20:16 +08:00
Tingquan Gao 6ddd8049e0
Add "cpu_num_threads" and "enable_profile" (#494) 2020-12-17 15:23:29 +08:00
littletomatodonkey 8fd56a4503
fix static train (#478) 2020-12-15 16:43:15 +08:00
littletomatodonkey 29b305d228
add profile for pred (#476) 2020-12-15 14:32:07 +08:00
huangxu96 dc3020ab4a
support fp16 training (#435)
* support fp16 training

* Use compiled training program

* Change timing ips.

* Use dali

* add pure fp16 training

* fix a bug, which will not use fuse pass using pure fp16 training.

* modify code as review

* modify loss, so that it will use different loss when using pure fp16 training.

* remove some fluid API

* add static optimizer.
2020-12-11 11:04:51 +08:00
littletomatodonkey 9992415867
add mkldnn speed problm (#465) 2020-12-11 00:39:12 +08:00
littletomatodonkey b9d243f854
Update utils.py 2020-12-11 00:24:14 +08:00
littletomatodonkey 15b18973f1
fix eval script (#464)
* fix eval script

* fix dali shell
2020-12-10 23:45:58 +08:00
Tingquan Gao e3801c5515
Fix the bug about calling create_predictor repeatedly in hubserving (#462) 2020-12-10 01:15:24 +08:00
QingshuChen 066d53f8ec
support cpu/xpu/gpu in static graph (#460) 2020-12-08 20:59:23 +08:00
littletomatodonkey a76e404d9c
fix time sta (#457) 2020-12-08 17:05:00 +08:00
littletomatodonkey 1ecf8334ff
fix pred (#450)
* fix pred

* fix resnest

* fix hrnet se

* fix se for export
2020-12-06 03:08:44 +08:00
littletomatodonkey 0df5ba63df
Update test_hubserving.py 2020-12-04 10:43:56 +08:00
littletomatodonkey 049cbb26d7
Update run_dali.sh 2020-12-03 15:58:07 +08:00
littletomatodonkey 1e4925704d
fix dali train (#446) 2020-12-03 15:48:35 +08:00
littletomatodonkey e92cb0b93c
fix init model in static mode (#444) 2020-12-03 12:50:33 +08:00
lilong12 fff2a92b25
add shell scripts to run training on single node and multiple nodes. (#424)
* add shells, test=develop
2020-12-03 11:11:39 +08:00