codeai-dteam/oop
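Dataset Card for Object-Oriented Programming
Dataset Description
Dataset Summary
- Dataset name: OOP benchmark
- Size: 431 instances
- Difficulty levels: three levels (easy, moderate, difficult)
- Language: Python
Supported Tasks and Leaderboards
- No specific description provided
Dataset Structure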
```python
from datasets import load_dataset

load_dataset("codeai-dteam/oop")
```

```
DatasetDict({
    test: Dataset({
        features: ['task_id', 'question', 'canonical_solution', 'test_list', 'test_function', 'entry_point', 'test_matching', 'test_match_function'],
        num_rows: 431
    })
})
```
Data Instances
```json
{
  "task_id": "OOP/0",
  "question": "First, write a WDS class using the Python language. Then, within the WDS class, create a public function called without_duplicates to implement finding the length of the longest substring in a given string s that does not contain any duplicate characters.",
  "test_function": "def test_run(content1):\n    return WDS().without_duplicates(content1)",
  "test_list": [
    "assert candidate(\"abcabcbb\")==3",
    "assert candidate(\"bbbbb\")==1",
    "assert candidate(\"pwwkew\")==3"
  ],
  "entry_point": "test_run",
  "test_matching": "assert candidate([[\"class WDS\", \"def without_duplicates\"]]) == True",
  "test_match_function": "def matching_function(content):\n    def run_match(text):\n        for task in text:\n            if task not in str_content:\n                return False\n        return True\n    len_cont = len(content)\n    if len_cont==1 and run_match(content[0]) == True:\n        return True\n    elif (len_cont==2 and run_match(content[0]) == True) or (len_cont==2 and run_match(content[1]) == True):\n        return True\n    else:\n        return False"
}
```
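The instance above suggests how the functional check fits together: `test_function` wraps the class under test, `entry_point` names that wrapper, and each assertion in `test_list` calls it through the name `candidate`. The following is a minimal sketch of such a harness, not the benchmark's official evaluation code. The helper name `run_functional_tests`, the line breaks inside `test_function`, and the sample `solution` string are assumptions made for illustration.

```python
# Hypothetical harness: the benchmark's real evaluation script may differ.
def run_functional_tests(solution_code: str, record: dict) -> bool:
    """Return True if the solution passes every assertion in test_list."""
    namespace: dict = {}
    try:
        # Define the candidate's class, then the wrapper from test_function.
        exec(solution_code, namespace)
        exec(record["test_function"], namespace)
        # The assertions call `candidate`, so bind it to the entry point.
        namespace["candidate"] = namespace[record["entry_point"]]
        for assertion in record["test_list"]:
            exec(assertion, namespace)
    except Exception:
        # Any error or failed assertion counts as a failed task.
        return False
    return True


# Fields copied from the OOP/0 instance (layout of test_function assumed).
record = {
    "test_function": "def test_run(content1):\n"
                     "    return WDS().without_duplicates(content1)",
    "test_list": [
        'assert candidate("abcabcbb")==3',
        'assert candidate("bbbbb")==1',
        'assert candidate("pwwkew")==3',
    ],
    "entry_point": "test_run",
}

# Illustrative solution (sliding window), written for this example only.
solution = '''
class WDS:
    def without_duplicates(self, s):
        last_seen = {}
        start = best = 0
        for i, ch in enumerate(s):
            if ch in last_seen and last_seen[ch] >= start:
                start = last_seen[ch] + 1
            last_seen[ch] = i
            best = max(best, i - start + 1)
        return best
'''

passed = run_functional_tests(solution, record)
```

Because `exec` runs arbitrary code, model-generated solutions should only be evaluated this way inside an isolated environment.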
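Data Fields
- task_id: identifier of the data sample
- question: description of the programming task
- test_function: the function used to run the tests
- test_list: list of tests that verify the solution
- entry_point: entry point for the tests
- test_matching: the matching assertion that checks the solution for the required class and function names
- test_match_function: the matching function used by that check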
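The two matching fields appear to work on the solution's source text rather than on its behavior: `test_match_function` reads a free variable `str_content`, and `test_matching` asserts that the required class and method names occur in it. A minimal sketch of that check follows, assuming `str_content` holds the candidate solution as a string and that the function is always named `matching_function`. Both are inferred from the sample instance rather than stated by the card, and the helper name `run_matching_check` is hypothetical.

```python
# Hypothetical helper: how the benchmark binds str_content is an assumption.
def run_matching_check(solution_code: str, record: dict) -> bool:
    """Return True if the solution text contains the required OOP keywords."""
    # Assumption: the matcher reads the solution source from `str_content`.
    namespace = {"str_content": solution_code}
    try:
        exec(record["test_match_function"], namespace)
        # Assumption: the matcher is always called `matching_function`.
        namespace["candidate"] = namespace["matching_function"]
        exec(record["test_matching"], namespace)
    except Exception:
        return False
    return True


# Fields copied from the OOP/0 instance (line breaks reconstructed).
record = {
    "test_matching":
        'assert candidate([["class WDS", "def without_duplicates"]]) == True',
    "test_match_function": '''
def matching_function(content):
    def run_match(text):
        for task in text:
            if task not in str_content:
                return False
        return True
    len_cont = len(content)
    if len_cont==1 and run_match(content[0]) == True:
        return True
    elif (len_cont==2 and run_match(content[0]) == True) or (len_cont==2 and run_match(content[1]) == True):
        return True
    else:
        return False
''',
}

with_class = "class WDS:\n    def without_duplicates(self, s):\n        return 0\n"
without_class = "def without_duplicates(s):\n    return 0\n"

uses_class = run_matching_check(with_class, record)
skips_class = run_matching_check(without_class, record)
```

Under these assumptions, a solution that solves the task with a bare function instead of the requested class fails the matching check even if its output is correct.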
Data Splits
- Test: 431 samples
Citation Information
@inproceedings{wang2024oop,
  title={OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models},
  author={Shuai Wang and Liang Ding and Li Shen and Yong Luo and Bo Du and Dacheng Tao},
  year={2024},
  booktitle={Findings of the Association for Computational Linguistics: ACL 2024},
  url={https://arxiv.org/abs/2401.06628},
}




