docs: rm wsl from release/2.5

pull/2347/head
gaotingquan 2022-09-22 08:29:24 +00:00 committed by Tingquan Gao
parent 4b591c9946
commit c4653b7e26
1 changed files with 0 additions and 70 deletions

View File

@ -583,76 +583,6 @@ Loss:
weight: 1.0
```
<<<<<<< HEAD
=======
<a name='1.2.8'></a>
#### 1.2.8 WSL
##### 1.2.8.1 WSL 算法介绍
论文信息:
> [Rethinking Soft Labels For Knowledge Distillation: A Bias-variance Tradeoff Perspective](https://arxiv.org/abs/2102.0650)
>
> Helong Zhou, Liangchen Song, Jiajie Chen, Ye Zhou, Guoli Wang, Junsong Yuan, Qian Zhang
>
> ICLR, 2021
WSL (Weighted Soft Labels) 损失函数根据教师模型与学生模型关于真值标签的 CE Loss 比值,对每个样本的 KD Loss 分别赋予权重。若学生模型相对教师模型在某个样本上预测结果更好,则对该样本赋予较小的权重。该方法简单、有效,使各个样本的权重可自适应调节,提升了蒸馏精度。
在ImageNet1k公开数据集上效果如下所示。
| 策略 | 骨干网络 | 配置文件 | Top-1 acc | 下载链接 |
| --- | --- | --- | --- | --- |
| baseline | ResNet18 | [ResNet18.yaml](../../../../ppcls/configs/ImageNet/ResNet/ResNet18.yaml) | 70.8% | - |
| WSL | ResNet18 | [resnet34_distill_resnet18_wsl.yaml](../../../../ppcls/configs/ImageNet/Distillation/resnet34_distill_resnet18_wsl.yaml) | 72.23%(**+1.43%**) | - |
##### 1.2.8.2 WSL 配置
WSL 配置如下所示。在模型构建Arch字段中需要同时定义学生模型与教师模型教师模型固定参数且需要加载预训练模型。在损失函数Loss字段中需要定义`DistillationGTCELoss`学生与真值标签之间的CE loss以及`DistillationWSLLoss`学生与教师之间的WSL loss作为训练的损失函数。
```yaml
# model architecture
Arch:
name: "DistillationModel"
# if not null, its lengths should be same as models
pretrained_list:
# if not null, its lengths should be same as models
freeze_params_list:
- True
- False
models:
- Teacher:
name: ResNet34
pretrained: True
- Student:
name: ResNet18
pretrained: False
infer_model_name: "Student"
# loss function config for traing/eval process
Loss:
Train:
- DistillationGTCELoss:
weight: 1.0
model_names: ["Student"]
- DistillationWSLLoss:
weight: 2.5
model_name_pairs: [["Student", "Teacher"]]
temperature: 2
Eval:
- CELoss:
weight: 1.0
```
>>>>>>> 1f6f4797 (docs: refactor & fix link & rename)
<a name="2"></a>
## 2. 模型训练、评估和预测