arena-hard-auto
Source: ModelScope community. Updated 2026-01-06. Indexed 2025-04-26.
Download link:
https://modelscope.cn/datasets/lmarena-ai/arena-hard-auto
Description:
# Arena-Hard-Auto
This repository stores pre-generated model answers and judgments for the following benchmarks:
- Arena-Hard-v0.1
- Arena-Hard-v2.0-Preview
Repo -> https://github.com/lmarena/arena-hard-auto
Paper -> https://arxiv.org/abs/2406.11939
## Citation
The code in this repository was developed from the papers below. Please cite them if you find the repository helpful.
```
@article{li2024crowdsourced,
title={From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline},
author={Li, Tianle and Chiang, Wei-Lin and Frick, Evan and Dunlap, Lisa and Wu, Tianhao and Zhu, Banghua and Gonzalez, Joseph E and Stoica, Ion},
journal={arXiv preprint arXiv:2406.11939},
year={2024}
}
@misc{arenahard2024,
title = {From Live Data to High-Quality Benchmarks: The Arena-Hard Pipeline},
url = {https://lmsys.org/blog/2024-04-19-arena-hard/},
author = {Tianle Li*, Wei-Lin Chiang*, Evan Frick, Lisa Dunlap, Banghua Zhu, Joseph E. Gonzalez, Ion Stoica},
month = {April},
year = {2024}
}
```
Provider:
maas
Created:
2025-04-21



