<div align="center">
<img src="../../resources/mmrazor-logo.png" width="600"/>
</div>
# MMRazor for Large Models
## Introduction
MMRazor is dedicated to the development of general-purpose model compression tools. Now, MMRazor not only supports conventional CV model compression but also extends to support large models. This project will provide examples of MMRazor's compression for various large models, including LLaMA, stable diffusion, and more.
An overview of the code structure for large models:
```
mmrazor
├── implementations          # core algorithm components
│   ├── pruning
│   └── quantization
projects
└── mmrazor_large
    ├── algorithms           # introductions to using the algorithms
    └── examples             # examples applying the algorithms to various models
        ├── language_models
        │   ├── LLaMA
        │   └── OPT
        └── ResNet
```
## Model-Algorithm Example Matrix
|                                      | ResNet                                          | OPT                                                          | LLaMA                                                           | Stable diffusion |
| ------------------------------------ | ----------------------------------------------- | ------------------------------------------------------------ | --------------------------------------------------------------- | ---------------- |
| [SparseGPT](algorithms/SparseGPT.md) | [:white_check_mark:](examples/ResNet/README.md) | [:white_check_mark:](examples/language_models/OPT/README.md) | [:white_check_mark:](examples/language_models/LLaMA/README.md)  |                  |
| [GPTQ](algorithms/GPTQ.md)           | [:white_check_mark:](examples/ResNet/README.md) | [:white_check_mark:](examples/language_models/OPT/README.md) | [:white_check_mark:](examples/language_models/LLaMA/README.md)  |                  |
## PaperList
We provide a paper list for researchers in the field of model compression for large models. If you would like your paper added to this list, please submit a PR.
| Paper     | Title                                                                                                                 | Type         | MMRazor                                       |
| --------- | --------------------------------------------------------------------------------------------------------------------- | ------------ | --------------------------------------------- |
| SparseGPT | [SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot](https://arxiv.org/abs/2301.00774)           | Pruning      | [:white_check_mark:](algorithms/SparseGPT.md) |
| GPTQ      | [GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers](https://arxiv.org/abs/2210.17323) | Quantization | [:white_check_mark:](algorithms/GPTQ.md)      |
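Both papers above are one-shot, post-training methods: they compress pretrained weights without retraining. To make the quantization setting concrete, here is a minimal sketch of the round-to-nearest (RTN) baseline that GPTQ improves upon. This is illustrative only; the function name and parameters are hypothetical and are not MMRazor APIs.

```python
# Illustrative RTN baseline (not an MMRazor API): symmetric per-row
# round-to-nearest weight quantization, the simple scheme GPTQ improves upon.
import numpy as np


def quantize_rtn(w: np.ndarray, n_bits: int = 4) -> np.ndarray:
    """Quantize each row of a weight matrix to symmetric n_bits integers,
    then dequantize back to float for inspection."""
    qmax = 2 ** (n_bits - 1) - 1                       # e.g. 7 for 4 bits
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)           # guard all-zero rows
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)  # integer levels
    return q * scale                                   # dequantized weights


rng = np.random.default_rng(0)
w = rng.normal(size=(8, 16)).astype(np.float32)
w_q = quantize_rtn(w)
print(w_q.shape)  # → (8, 16)
```

RTN rounds each weight independently; GPTQ instead rounds weights sequentially and updates the remaining weights to compensate for the error, using second-order (Hessian) information from calibration data, which is what makes low-bit quantization of large language models accurate.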