
RaymondLi/perturbed_humaneval

Source: Hugging Face | Updated: 2023-08-23 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/RaymondLi/perturbed_humaneval
Resource overview:
---
license: apache-2.0
---

# Dataset Card for Dataset Name

## Dataset Description

- **Repository:** https://github.com/amazon-science/recode/tree/main
- **Paper:** https://arxiv.org/abs/2212.10264

### Dataset Summary

The ReCode benchmark applies code and natural-language transformations to code-generation benchmarks in order to evaluate the robustness of code-generation models. This dataset contains the perturbed version of HumanEval released by the ReCode authors. It was automatically generated from the [HumanEval](https://huggingface.co/datasets/openai_humaneval) dataset.

### Subsets

There are four transformation categories that form the subsets of this dataset: `func_name`, `nlaugmenter`, `natgen` and `format`.

### Languages

The programming problems are written in Python and contain docstrings and comments in English.

## Dataset Structure

### Data Instances

[More Information Needed]

### Data Fields

- `task_id`: ID of the original HumanEval example
- `prompt`: the perturbed prompt
- `entry_point`: entry point for the test
- `canonical_solution`: solution for the problem in the `prompt`
- `test`: contains a function to test generated code for correctness
- `seed`: seed of the perturbed prompt
- `perturbation_name`: name of the perturbation
- `partial`: partial solution to the problem. This field is only present for transformation categories that affect a partial solution: `natgen` and `format`.

### Data Splits

The dataset only has a test split.

## Dataset Creation

### Curation Rationale

[More Information Needed]

### Source Data

#### Initial Data Collection and Normalization

[More Information Needed]

#### Who are the source language producers?

[More Information Needed]

### Annotations

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

### Personal and Sensitive Information

[More Information Needed]

## Considerations for Using the Data

### Social Impact of Dataset

[More Information Needed]

### Discussion of Biases

[More Information Needed]

### Other Known Limitations

[More Information Needed]

## Additional Information

### Dataset Curators

[More Information Needed]

### Licensing Information

[More Information Needed]

### Citation Information

```
@article{wang2022recode,
  title={ReCode: Robustness Evaluation of Code Generation Models},
  author={Wang, Shiqi and Li, Zheng and Qian, Haifeng and Yang, Chenghao and Wang, Zijian and Shang, Mingyue and Kumar, Varun and Tan, Samson and Ray, Baishakhi and Bhatia, Parminder and others},
  journal={arXiv preprint arXiv:2212.10264},
  year={2022}
}
```

### Contributions

[More Information Needed]
Provided by:
RaymondLi
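Summary of Original Information

Dataset Overview

Dataset Description

  • Name: ReCode benchmark
  • Purpose: evaluate the robustness of code-generation models
  • Source: an automatically generated, perturbed version of the HumanEval dataset
  • Language: Python, with docstrings and comments in English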

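Dataset Structure

Data Fields

  • task_id: ID of the original HumanEval example
  • prompt: the perturbed prompt
  • entry_point: entry point for the test
  • canonical_solution: solution to the problem
  • test: function used to test generated code for correctness
  • seed: seed of the perturbed prompt
  • perturbation_name: name of the perturbation
  • partial: partial solution to the problem (present only in the natgen and format categories)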
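As an illustration of how these fields fit together, the sketch below assembles a runnable check program from one record. It assumes the `test` field defines a `check(candidate)` function, as in the original HumanEval; the card only says that `test` contains a function for testing generated code. The helper `build_check_program` and the toy record are hypothetical and are not taken from the dataset.

```python
def build_check_program(record: dict, completion: str) -> str:
    """Concatenate prompt, completion, test code, and the final check call.

    Assumption: record["test"] defines check(candidate), as in HumanEval.
    """
    return (
        record["prompt"]
        + completion
        + "\n\n"
        + record["test"]
        + "\n\n"
        + f"check({record['entry_point']})\n"
    )


# Toy record, invented for illustration; it is not a row of the dataset.
toy_record = {
    "task_id": "Toy/0",
    "prompt": 'def add_two(a, b):\n    """Return the sum of a and b."""\n',
    "entry_point": "add_two",
    "canonical_solution": "    return a + b\n",
    "test": (
        "def check(candidate):\n"
        "    assert candidate(1, 2) == 3\n"
        "    assert candidate(-1, 1) == 0\n"
    ),
}

# Use the canonical solution as a stand-in for a model completion.
program = build_check_program(toy_record, toy_record["canonical_solution"])

namespace: dict = {}
exec(program, namespace)  # raises AssertionError if any test fails
```

For the `func_name` subset the function name in `prompt` is perturbed, so using the record's own `entry_point` rather than a hard-coded name is the safer choice. Model-generated completions should be executed in a sandbox rather than with a bare `exec`.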

Data Splits

  • Split: test split only

Dataset Creation

Dataset Subsets

  • Subsets: four transformation categories: func_name, nlaugmenter, natgen, format
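A minimal sketch of loading one subset with the Hugging Face `datasets` library follows. It assumes the four subset names are exposed as dataset configuration names, which the card implies but does not state outright. The helpers `expected_fields` and `load_subset` are hypothetical names introduced here.

```python
SUBSETS = ("func_name", "nlaugmenter", "natgen", "format")

# Per the card, only these categories carry the extra `partial` field.
SUBSETS_WITH_PARTIAL = ("natgen", "format")

COMMON_FIELDS = {
    "task_id",
    "prompt",
    "entry_point",
    "canonical_solution",
    "test",
    "seed",
    "perturbation_name",
}


def expected_fields(subset: str) -> set:
    """Return the field names the card documents for a given subset."""
    if subset not in SUBSETS:
        raise ValueError(f"unknown subset {subset!r}; expected one of {SUBSETS}")
    fields = set(COMMON_FIELDS)
    if subset in SUBSETS_WITH_PARTIAL:
        fields.add("partial")
    return fields


def load_subset(subset: str):
    """Load the test split of one subset.

    Needs the `datasets` package and network access.
    Assumption: subset names double as configuration names.
    """
    expected_fields(subset)  # validates the subset name before any download
    from datasets import load_dataset  # imported lazily so the sketch runs offline

    return load_dataset("RaymondLi/perturbed_humaneval", subset, split="test")
```

Calling `load_subset("natgen")` should then return the only available split, and `expected_fields("natgen")` can be compared against the loaded column names as a quick sanity check.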

Data Source

  • Original data: the HumanEval dataset
  • Perturbation generation: automatic

License

  • License: Apache-2.0