HumanEval

Name: HumanEval
Creator: maas
Published: 2025-11-06 18:53:29
License: No description available

ModelScope community. Updated: 2025-11-06. Indexed: 2024-08-31.

Download link:

https://modelscope.cn/datasets/OmniData/HumanEval


Resource description:

---
displayName: HumanEval
license:
- MIT
taskTypes:
- Image Classification
- Code Generation
- Program Synthesis
mediaTypes:
- Text
labelTypes: []
tags: []
publisher:
- OpenAI
- Anthropic AI
- Zipline
publishDate: '2021-07-01'
publishUrl: https://github.com/openai/human-eval
paperUrl: ''
---

## Introduction

This is the evaluation harness for the HumanEval problem-solving dataset described in the paper "Evaluating Large Language Models Trained on Code". It measures the functional correctness of programs synthesized from docstrings. The dataset consists of 164 original programming problems that assess language comprehension, algorithms, and simple mathematics; some are comparable to simple software interview questions.

## Citation

```
@article{chen2021evaluating,
  title={Evaluating large language models trained on code},
  author={Chen, Mark and Tworek, Jerry and Jun, Heewoo and Yuan, Qiming and Pinto, Henrique Ponde de Oliveira and Kaplan, Jared and Edwards, Harri and Burda, Yuri and Joseph, Nicholas and Brockman, Greg and others},
  journal={arXiv preprint arXiv:2107.03374},
  year={2021}
}
```
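As a rough illustration of what "functional correctness of programs synthesized from docstrings" means in the description above, the sketch below joins a prompt (a function signature plus docstring) with a model completion and runs the result against unit tests. The toy problem is invented for this example and is not one of the 164 dataset problems; the field names `task_id`, `prompt`, `entry_point`, and `test` are assumed to mirror the record layout in the openai/human-eval repository, and `passes` is a hypothetical helper, not part of that harness.

```python
# Toy record shaped like a HumanEval problem. The content is made up for
# illustration; only the field layout is assumed to match the real dataset.
problem = {
    "task_id": "Toy/0",
    "prompt": 'def add(a: int, b: int) -> int:\n    """Return the sum of a and b."""\n',
    "entry_point": "add",
    "test": (
        "def check(candidate):\n"
        "    assert candidate(1, 2) == 3\n"
        "    assert candidate(-1, 1) == 0\n"
    ),
}


def passes(problem: dict, completion: str) -> bool:
    """Return True if prompt + completion passes the problem's unit tests."""
    # Build one program: the prompt, the generated body, then the test code.
    program = problem["prompt"] + completion + "\n\n" + problem["test"]
    namespace: dict = {}
    try:
        # WARNING: exec runs untrusted code with no sandbox or timeout.
        # Only use this on code you trust; a real harness isolates execution.
        exec(program, namespace)
        namespace["check"](namespace[problem["entry_point"]])
    except Exception:
        # Syntax errors, runtime errors, and failed assertions all count as failures.
        return False
    return True


good = "    return a + b\n"
bad = "    return a - b\n"
print(passes(problem, good))
print(passes(problem, bad))
```

A completion counts as correct only if every assertion in `check` holds, which is a behavioral criterion rather than a textual match against a reference solution.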


Provider:

maas

Created:

2024-06-30
