# DSNAS

> [DSNAS: Direct Neural Architecture Search without Parameter Retraining](https://arxiv.org/abs/2002.09128.pdf)

<!-- [ALGORITHM] -->

## Abstract

Most existing NAS methods require two-stage parameter optimization. However, the performance of the same architecture in the two stages correlates poorly. Based on this observation, DSNAS proposes a task-specific, end-to-end differentiable NAS framework that simultaneously optimizes architecture and parameters with a low-biased Monte Carlo estimate. Child networks derived from DSNAS can be deployed directly without parameter retraining.

![pipeline](/docs/en/imgs/model_zoo/dsnas/pipeline.jpg)

## Results and models

### Supernet

| Dataset  | Params (M) | FLOPs (G) | Top-1 Acc (%) | Top-5 Acc (%) |                  Config                   |         Download         |     Remarks      |
| :------: | :--------: | :-------: | :-----------: | :-----------: | :---------------------------------------: | :----------------------: | :--------------: |
| ImageNet |    3.33    |   0.299   |     73.56     |     91.24     | [config](./dsnas_supernet_8xb128_in1k.py) | [model](<>) \| [log](<>) | MMRazor searched |
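
Assuming an MMRazor 1.x checkout, the supernet config above can be launched programmatically with MMEngine's `Runner`; the config path and work dir below are illustrative assumptions, so adjust them to your setup:

```python
# Illustrative launch of the DSNAS supernet search with MMEngine's Runner.
# The config path and work_dir are assumptions; adjust to your checkout.
from mmengine.config import Config
from mmengine.runner import Runner

cfg = Config.fromfile('configs/nas/mmcls/dsnas/dsnas_supernet_8xb128_in1k.py')
cfg.work_dir = './work_dirs/dsnas_supernet_8xb128_in1k'

runner = Runner.from_cfg(cfg)
runner.train()  # single-stage: search and weight training happen together
```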

**Note**:

1. There **may be** some small differences in our experiments (though not in every case), introduced to stay consistent with other OpenMMLab repos: for example, images are normalized during data preprocessing, resizing is done with cv2 rather than PIL in training, and dropout is not used in the network. **Please refer to the corresponding config for details.**
2. We converted the official searched checkpoint `DSNASsearch240.pth` into MMRazor style and evaluated it with PyTorch 1.8 / CUDA 11.0; Top-1 is 74.1 and Top-5 is 91.51.
3. The ShuffleNetV2 implementation in official DSNAS differs from OpenMMLab's; we follow the structure design in OpenMMLab. Note that with the original ShuffleNetV2 design from official DSNAS, Top-1 is 73.92 and Top-5 is 91.59.
4. The finetune stage in our implementation corresponds to the 'search-from-search' stage described in the official DSNAS.
5. We obtain params and FLOPs using `mmrazor.ResourceEstimator`, which may differ from the original repo; see the sketch after these notes.
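
As a rough sketch of note 5, params and FLOPs can be measured as below. The import path and result keys follow the MMRazor 1.x docs as we understand them and may differ across versions, and the torchvision model is only a stand-in for a derived subnet:

```python
# Sketch of measuring params/FLOPs with MMRazor's ResourceEstimator.
# Import path and result keys are per MMRazor 1.x docs; verify for your version.
import torchvision
from mmrazor.models.task_modules import ResourceEstimator

model = torchvision.models.shufflenet_v2_x1_0()  # stand-in for a DSNAS subnet
estimator = ResourceEstimator(
    flops_params_cfg=dict(input_shape=(1, 3, 224, 224)))
results = estimator.estimate(model=model)
print(results['flops'], results['params'])
```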

## Citation

```latex
@inproceedings{hu2020dsnas,
  title={DSNAS: Direct neural architecture search without parameter retraining},
  author={Hu, Shoukang and Xie, Sirui and Zheng, Hehui and Liu, Chunxiao and Shi, Jianping and Liu, Xunying and Lin, Dahua},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={12084--12092},
  year={2020}
}
```