QServe-benchmarks
ModelScope community. Updated 2025-12-05; indexed 2025-06-14.
Download link:
https://modelscope.cn/datasets/mit-han-lab/QServe-benchmarks
Description:
# QServe benchmarks
This Hugging Face repository contains the configuration and tokenizer files for all models benchmarked in our [QServe](https://github.com/mit-han-lab/qserve) project:
- Llama-3-8B
- Llama-2-7B
- Llama-2-13B
- Llama-2-70B
- Llama-30B
- Mistral-7B
- Yi-34B
- Qwen1.5-72B
Clone this repository if you want to run the QServe benchmark code without downloading the full model weights.
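After cloning (for example with `git clone` against the download link above), it can be useful to check which model configurations are actually present before starting a benchmark. The following is a minimal sketch, not part of the QServe code: it assumes the clone contains one sub-folder per model, named as in the list above, each holding a Hugging Face-style `config.json`. Both the folder layout and the helper name `summarize_clone` are assumptions made for illustration.

```python
import json
from pathlib import Path

# Model names as listed in this README; the on-disk folder names are an assumption.
MODELS = [
    "Llama-3-8B", "Llama-2-7B", "Llama-2-13B", "Llama-2-70B",
    "Llama-30B", "Mistral-7B", "Yi-34B", "Qwen1.5-72B",
]


def summarize_clone(repo_dir):
    """Return {model_name: parsed config.json or None} for a local clone.

    Assumes (hypothetically) one sub-folder per model, each holding a
    Hugging Face-style config.json. Missing folders or files map to None.
    """
    root = Path(repo_dir)
    summary = {}
    for name in MODELS:
        config_path = root / name / "config.json"
        if config_path.is_file():
            with config_path.open(encoding="utf-8") as f:
                summary[name] = json.load(f)
        else:
            summary[name] = None
    return summary


if __name__ == "__main__":
    # "QServe-benchmarks" is the folder name git creates by default on clone.
    for name, config in summarize_clone("QServe-benchmarks").items():
        status = "missing" if config is None else config.get("model_type", "unknown")
        print(f"{name}: {status}")
```

If the real layout differs, only the path construction inside `summarize_clone` needs to change; the rest of the check stays the same.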
Please consider citing our paper if it is helpful:
```
@article{lin2024qserve,
title={QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving},
author={Lin*, Yujun and Tang*, Haotian and Yang*, Shang and Zhang, Zhekai and Xiao, Guangxuan and Gan, Chuang and Han, Song},
year={2024}
}
```
Provider: maas
Created: 2025-04-08



