RaymondLi/perturbed_humaneval
Hugging Face | Updated 2023-08-23 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/RaymondLi/perturbed_humaneval
Resource description:
---
license: apache-2.0
---
# Dataset Card for perturbed_humaneval
## Dataset Description
- **Repository:** https://github.com/amazon-science/recode/tree/main
- **Paper:** https://arxiv.org/abs/2212.10264
### Dataset Summary
The ReCode benchmark applies code and natural-language transformations to code-generation benchmarks in order to evaluate the robustness of code-generation models.
This dataset contains the perturbed version of HumanEval released by the ReCode authors.
It was automatically generated from the [HumanEval](https://huggingface.co/datasets/openai_humaneval) dataset.
### Subsets
There are four transformation categories that form the subsets of this dataset: `func_name`, `nlaugmenter`, `natgen` and `format`.
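A minimal sketch of loading one subset with the Hugging Face `datasets` library follows. It assumes that the four category names are exposed as configuration names of the Hub repository and that the `datasets` package is installed; the helper `load_subset` is a hypothetical name introduced here for illustration.

```python
# Subset names as listed in this card.
SUBSETS = ("func_name", "nlaugmenter", "natgen", "format")


def load_subset(name: str):
    """Load the test split of one transformation category.

    Assumption: each category is a configuration name of the Hub dataset.
    """
    if name not in SUBSETS:
        raise ValueError(f"unknown subset {name!r}; expected one of {SUBSETS}")
    # Imported lazily so the name check works without the library installed.
    from datasets import load_dataset

    return load_dataset("RaymondLi/perturbed_humaneval", name, split="test")
```

Calling `load_subset("func_name")` requires network access (or a local cache) and should return the perturbed examples for that category.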
### Languages
The programming problems are written in Python and contain docstrings and comments in English.
## Dataset Structure
### Data Instances
[More Information Needed]
### Data Fields
- `task_id`: ID of the original HumanEval example
- `prompt`: the perturbed prompt
- `entry_point`: entry point for test
- `canonical_solution`: solution for the problem in the `prompt`
- `test`: contains function to test generated code for correctness
- `seed`: seed of the perturbed prompt
- `perturbation_name`: name of the perturbation
- `partial`: partial solution to the problem. This field is only present for transformation categories that affect a partial solution: `natgen` and `format`.
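The fields above can be combined to check a completion, as in the rough sketch below. It assumes the original HumanEval convention, in which `test` defines a function `check(candidate)` that is then called on the function named by `entry_point`; whether every perturbed record follows that convention is not stated in this card. The record `toy_record` is a hand-written stand-in with the same field names, not an actual row of the dataset, and `passes_tests` is a hypothetical helper.

```python
def passes_tests(record: dict, completion: str) -> bool:
    """Return True if prompt + completion passes the record's tests.

    Assumption: `test` defines `check(candidate)` as in HumanEval.
    """
    program = (
        record["prompt"]
        + completion
        + "\n\n"
        + record["test"]
        + "\n\n"
        + f"check({record['entry_point']})\n"
    )
    namespace: dict = {}
    try:
        # exec runs arbitrary code: sandbox this when evaluating model output.
        exec(program, namespace)
    except Exception:
        return False
    return True


# Hand-written illustration only; not taken from the dataset.
toy_record = {
    "task_id": "Toy/0",
    "prompt": 'def add_two(x):\n    """Return x plus two."""\n',
    "entry_point": "add_two",
    "canonical_solution": "    return x + 2\n",
    "test": (
        "def check(candidate):\n"
        "    assert candidate(1) == 3\n"
        "    assert candidate(-2) == 0\n"
    ),
}

ok = passes_tests(toy_record, toy_record["canonical_solution"])
bad = passes_tests(toy_record, "    return x\n")
```

For the `natgen` and `format` subsets, the `partial` field holds a partial solution, so how it is joined with the prompt before generation should be checked against the ReCode repository rather than inferred from this sketch.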
### Data Splits
The dataset only has a test split.
## Dataset Creation
### Curation Rationale
[More Information Needed]
### Source Data
#### Initial Data Collection and Normalization
[More Information Needed]
#### Who are the source language producers?
[More Information Needed]
### Annotations
#### Annotation process
[More Information Needed]
#### Who are the annotators?
[More Information Needed]
### Personal and Sensitive Information
[More Information Needed]
## Considerations for Using the Data
### Social Impact of Dataset
[More Information Needed]
### Discussion of Biases
[More Information Needed]
### Other Known Limitations
[More Information Needed]
## Additional Information
### Dataset Curators
[More Information Needed]
### Licensing Information
[More Information Needed]
### Citation Information
```
@article{wang2022recode,
title={ReCode: Robustness Evaluation of Code Generation Models},
author={Wang, Shiqi and Li, Zheng and Qian, Haifeng and Yang, Chenghao and Wang, Zijian and Shang, Mingyue and Kumar, Varun and Tan, Samson and Ray, Baishakhi and Bhatia, Parminder and others},
journal={arXiv preprint arXiv:2212.10264},
year={2022}
}
```
### Contributions
[More Information Needed]
Provided by:
RaymondLi
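Summary of original information
Dataset overview
Dataset description
- Name: ReCode benchmark
- Purpose: evaluate the robustness of code-generation models
- Source: automatically generated perturbed version of the HumanEval dataset
- Language: Python, with docstrings and comments in English
Dataset structure
Data fields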
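- task_id: ID of the original HumanEval example
- prompt: the perturbed prompt
- entry_point: entry point for the test
- canonical_solution: solution to the problem
- test: function used to test generated code for correctness
- seed: seed of the perturbed prompt
- perturbation_name: name of the perturbation
- partial: partial solution to the problem (present only in the natgen and format categories)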
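Data splits
- Split: test split only
Dataset creation
Dataset subsets
- Subsets: four transformation categories:
func_name, nlaugmenter, natgen, format
Data source
- Original data: HumanEval dataset
- Perturbation generation: automatic
License
- License: Apache-2.0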



