CodeElo
Source: ModelScope community. Updated 2026-05-09; indexed 2025-05-03.
Download link:
https://modelscope.cn/datasets/Qwen/CodeElo
Description:
This dataset contains the evaluation problems of the CodeElo benchmark, proposed in [CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings](https://arxiv.org/abs/2501.01257).
`description`, `input`, `output`, `interaction` and `note` are in Markdown format.
`input`, `output`, `interaction` and `note` may be empty; `interaction` is non-empty if and only if the problem is interactive.
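The field rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the official tooling: the helper names `is_interactive` and `render_problem`, the section headings, and the sample record are all hypothetical, and treating whitespace-only or missing fields as empty is an assumption.

```python
from typing import Mapping

# Field names come from the dataset description; everything else is illustrative.
OPTIONAL_FIELDS = ("input", "output", "interaction", "note")


def is_interactive(problem: Mapping[str, str]) -> bool:
    """A problem is interactive exactly when `interaction` is non-empty.

    Assumption: a missing or whitespace-only value counts as empty.
    """
    return bool((problem.get("interaction") or "").strip())


def render_problem(problem: Mapping[str, str]) -> str:
    """Join the Markdown fields into one document, skipping empty ones."""
    parts = [problem["description"].strip()]
    for name in OPTIONAL_FIELDS:
        text = (problem.get(name) or "").strip()
        if text:
            # Heading text is a presentation choice, not defined by the dataset.
            parts.append(f"## {name.capitalize()}\n\n{text}")
    return "\n\n".join(parts)


# Hypothetical record, not taken from the dataset.
sample = {
    "description": "Guess a hidden integer between 1 and 100.",
    "input": "",
    "output": "",
    "interaction": "Print `? x` to query; the judge replies `<`, `>` or `=`.",
    "note": "",
}

print(is_interactive(sample))
print(render_problem(sample))
```

Because every text field is already Markdown, the rendered string can be passed directly to a Markdown viewer or used as a prompt body.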
A dedicated data explorer is available on our [main page](https://CodeElo-bench.github.io/).
```
@article{codeelo,
title={CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings},
author={Quan, Shanghaoran and Yang, Jiaxi and Yu, Bowen and Zheng, Bo and Liu, Dayiheng and Yang, An and Ren, Xuancheng and Gao, Bofei and Miao, Yibo and Feng, Yunlong and Wang, Zekun and Yang, Jian and Cui, Zeyu and Fan, Yang and Zhang, Yichang and Hui, Binyuan and Lin, Junyang},
journal={arXiv preprint arXiv:2501.01257},
year={2025}
}
```
Provider:
maas
Created:
2025-04-27



