Commit Graph

868 Commits (109cd44c7ea2a384ace8a255b8d20c3b9b3dd351)

Author SHA1 Message Date
Zhihao Lin 109cd44c7e
[Fix] Fix dist.collect_results to keep all ranks' elements (#1469) 2024-01-11 10:50:36 +08:00
Zhihao Lin b51bf60964
[Fix] Fix the resume of iteration (#1471) 2024-01-11 10:47:05 +08:00
Mashiro 4a50213c69
[Fix] Fix Config.to_dict (#1465) 2024-01-02 16:07:54 +08:00
Zaida Zhou e4600a6993
[Docs] Add the usage of ProfilerHook (#1466) 2024-01-02 15:59:37 +08:00
XiwuChen 369f15e27a
[Docs] Fix nnodes in the doc of ddp training (#1462) 2024-01-02 10:42:58 +08:00
fanqiNO1 1398e4200e
bump version to v0.10.2 (#1460) 2023-12-26 16:30:01 +08:00
lanzeshun 8e6fb12b1f
[Fix] Support multi-node distributed training with NPU backend (#1459) 2023-12-26 16:14:45 +08:00
fanqiNO1 671f3bcdf4
[Fix] Fix placement policy in ColossalAIStrategy (#1440) 2023-12-23 16:24:39 +08:00
SCZwangxiao efcd364124
[Fix] Fix load_model_state_dict in BaseStrategy (#1447) 2023-12-23 11:17:46 +08:00
del-zhenwu 504fa4f5cb
[Fix] Use ImportError to cover ModuleNotFoundError raised by opencv-python (#1438) 2023-12-23 11:15:20 +08:00
fanqiNO1 85c0976bc2
bump version to v0.10.1 (#1436) 2023-11-22 11:12:04 +08:00
fanqiNO1 e461581e55
[Docs] Add build mmengine-lite from source (#1435) 2023-11-22 11:02:12 +08:00
fanqiNO1 246ec1ff35
[Fix] Fix collect_env without opencv (#1434) 2023-11-22 10:50:23 +08:00
fanqiNO1 bdd653a8c3
[Fix] Fix deploy.yml (#1431) 2023-11-21 11:21:36 +08:00
fanqiNO1 be48e8b2f4
bump version to v0.10.0 (#1430) 2023-11-21 09:33:51 +08:00
fanqiNO1 6be0aeb777
[Feature] Support for installing mmengine without opencv (#1429) 2023-11-20 22:00:46 +08:00
fanqiNO1 a5db5bedb9
[Fix] Fix CI for torch2.1.0 (#1418) 2023-11-20 19:31:14 +08:00
fanqiNO1 fd5d06243f
[Fix] Fix scale_lr in SingleDeviceStrategy (#1428) 2023-11-20 16:36:43 +08:00
whcao 5a90805b1e
[Bugs] Fix bugs in colo optimwrapper (#1426) 2023-11-14 17:09:26 +08:00
Zhihao Lin 26f22ed283
[Fix] Support exclude_frozen_parameters for DeepSpeedStrategy's resume (#1424) 2023-11-08 23:35:12 +08:00
fanqiNO1 46784185cf
bump version to v0.9.1 (#1421) 2023-11-03 16:03:56 +08:00
fanqiNO1 eb4fa73b56
[Enhancement] Enhance inputs_to_half in DeepSpeedStrategy (#1400) 2023-11-02 17:19:42 +08:00
Zhihao Lin 27ab6a69f5
[Feature] Add `exclude_frozen_parameters` for `DeepSpeedStrategy` (#1415) 2023-11-02 14:32:55 +08:00
Jon 2a563f4dd5
[Fix] ConcatDataset raises error when metainfo is np.array (#1407) 2023-10-31 17:19:34 +08:00
Peng Lu e0cf958074
[Fix] Fix a bug when module is missing in low version of bitsandbytes (#1388) 2023-10-31 16:59:39 +08:00
whlook b0c701a4c9
[Fix] Fix func params using without init in OneCycleLR (#1403) 2023-10-31 14:12:50 +08:00
Mashiro e43bbb5e03
[Fix] Fix new config in visualizer (#1390) 2023-10-26 15:31:03 +08:00
Zaida Zhou c65187c6b8
[Docs] Rename master to main (#1397) 2023-10-18 19:13:26 +08:00
Yiyao Yang 7495b33f34
Add torch 2.1.0 checking in CI (#1389) 2023-10-18 18:42:13 +08:00
POI-WX d198b53426
[Feature] Support slurm distributed training for mlu devices (#1396) 2023-10-18 16:22:31 +08:00
fanqiNO1 6c5eebb823
bump version to v0.9.0 (#1384) 2023-10-10 18:55:41 +08:00
fanqiNO1 3b639da1ef
[Docs] Fix typo (#1385) 2023-10-10 11:32:46 +08:00
Mashiro 8015d62202
[Feature] Support using gradient checkpointing in FSDP (#1382) 2023-10-09 21:04:55 +08:00
fanqiNO1 bf30c444de
Update the version info (#1383) 2023-10-09 16:18:40 +08:00
Shu Liqiang b8a31671a4
[Feature] Runner supports setting the number of iterations for per epoch (#1292) 2023-10-08 16:43:44 +08:00
Mashiro 95d875832a
[Enhance] Support for installing minimal runtime dependencies (#1362) 2023-10-08 16:38:40 +08:00
hiyyg eb5834fa66
[Enhance] metainfo of dataset can be a generic dict-like Mapping (#1378) 2023-10-08 14:25:26 +08:00
Zaida Zhou 9cbe0665d9
[Docs] Add the contributing doc in pr template (#1380) 2023-10-08 11:42:37 +08:00
Mashiro 6f0aae4b52
Fix docs building error caused by deepspeed (#1379) 2023-10-08 10:42:31 +08:00
Gu Wang daacb1878b
[Docs] Fix doc typo our_dir in LoggerHook (#1373) 2023-10-06 23:11:23 +08:00
尹傲雄 c863e8b133
[Fix] Ensure from_cfg of Runner have the same defaults values as its __init__ (#1368) 2023-09-27 10:56:38 +08:00
6V e9e08dbb65
[Enhance] Add unit tests for autocast with Ascend device (#1363) 2023-09-27 10:20:13 +08:00
takuoko 88dc1e98b1
[Fix] Delete yapf verify parameter (#1365) 2023-09-24 09:25:52 +08:00
takuoko d617bcafdd
[Feature] Support Adafactor Optimizer (#1361) 2023-09-21 16:30:24 +08:00
Mashiro 53474ef1ba
[Fix] Fix get class attribute from a string (#1345) 2023-09-18 11:50:41 +08:00
Zaida Zhou be3d5c6f6e
Fix pydantic version to fix mlflow unit tests (#1351) 2023-09-16 13:11:02 +08:00
Zaida Zhou 9b94af42b8
[Docs] Update the usage of bitsandbytes in Chinese document (#1359) 2023-09-15 23:14:33 +08:00
takuoko e91bfa4593
[Feature] Support bitsandbytes (#1357) 2023-09-15 22:56:11 +08:00
Zaida Zhou c5274ba326
[Docs] Fix typos (#1348) 2023-09-10 00:18:21 +08:00
huaibovip 00df73cf43
[Fix] The keyword mode appears nested multiple times in the log (#1305) 2023-09-09 20:54:57 +08:00