Tingquan Gao
ab087065e9
support to specify rank to log when using Fleet API (#3039)
* support to specify rank to log when using Fleet API
* log max mem reserved
* log_ranks support str type
example: -o Global.log_ranks="0,1"
* log max mem allocated
* support to specify rank to log in static mode
* log max mem reserved and max mem allocated in static mode
2023-11-16 11:32:29 +08:00
baocheny
3d0c0eb59d
add 2 more custom devices intel_gpu and apple mps
2023-06-29 19:42:38 +08:00
Bobholamovic
de5c4e1b1c
Change vdl dir
2023-06-26 14:20:38 +08:00
gaotingquan
bdfa1feb2f
update for amp config refactoring
2023-05-29 19:52:09 +08:00
kangguangli
731006f1fc
set seed by configs
2023-04-25 17:39:55 +08:00
kangguangli
293a216a0b
fix random seed
2023-04-25 17:39:55 +08:00
Tingquan Gao
7d41d24ce3
Revert "support Static"
This reverts commit c30df630356604fe0846de769d92a04d0130af61.
2023-03-14 16:47:13 +08:00
gaotingquan
c30df63035
support Static
2023-03-10 16:56:55 +08:00
kangguangli
85f65ce76f
fix paddle2.4 hang problem
2023-02-14 10:50:27 +08:00
gaotingquan
9873236bc8
fix: replace use_gpu, etc. by device
2022-10-31 10:43:00 +08:00
gaotingquan
241572e49a
fix: debug
2022-10-31 10:43:00 +08:00
USTCKAY
c032293a77
change judgment logic for multi device
2022-10-26 10:33:10 +08:00
USTCKAY
0cec70bd22
[CustomDevice]add support for custom NPU, test=develop
2022-10-26 10:33:10 +08:00
gaotingquan
c22bdc7e54
remove fluid
2022-05-26 07:40:15 +00:00
gaotingquan
83ed5195c3
fix: set use_fp16_test to True when AMP O2 is enabled
2022-04-18 06:14:43 +00:00
gaotingquan
b761325faa
fix: fp32 eval by default when enable amp
To eval in fp16 when AMP is enabled, set Amp.use_fp16_test=True (False by default).
2022-04-02 19:22:10 +08:00
dongshuilong
a944603da0
fix log twice bug
2022-03-30 08:31:35 +00:00
huangqipeng
b62b98d79f
feat: support mlu device and amp of mlu
2022-03-14 15:48:26 +08:00
gaotingquan
7040ce8314
refactor: change params to be consistent with amp
2022-01-25 11:58:07 +08:00
Wei Shengyu
0f35f706b6
Fix static training speed (#1590)
* fix training speed
* update config setting method
2021-12-23 11:13:51 +08:00
gaotingquan
ed459a2a16
refactor: adapt to static graph in deprecating MixCELoss
2021-10-27 19:47:43 +08:00
ronnywang
a0eb34a642
Add npu supporting (#1324)
2021-10-22 11:02:29 +08:00
Yiqun Liu
00455839f9
Add the profiler back for static training. (#1094)
2021-07-29 10:18:45 +08:00
littletomatodonkey
9d9cd3719e
add static training (#1037)
* add static training
* fix typo
* add se fp16
* rm note
* fix loader
* fix cfg
2021-07-15 10:30:07 +08:00