LLM generated Python Compiler Test Dataset

NIAID Data Ecosystem, indexed 2026-05-01

Download link:

https://zenodo.org/record/11062814

Description:

This dataset was generated by integrating Large Language Models (LLMs) with AFL++ fuzzing to enhance compiler testing for CPython. It includes original Python test scripts created by LLMs such as Mistral 7B, Codellama 7B, and Gemma 7B, each targeted at various compiler functionalities. These scripts were then subjected to fuzzing, resulting in a rich collection of test cases that probe for potential vulnerabilities. An optional minimization process with AFL-cmin refined the dataset so that it focuses on test cases that significantly contribute to code coverage and bug discovery. The dataset serves as a resource for improving compiler design and testing efficiency, and supports further research and development in AI-driven software testing methods. Please see the references for citations of the software used in this development.
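As a first look at a corpus like this, it can help to check how many test cases CPython's front end accepts at all, since fuzzed inputs are often not valid Python. The following is a minimal sketch, not part of the dataset's tooling. It assumes the archive has been extracted to a local directory with one test case per file; the directory layout and the helper names `accepts` and `tally` are hypothetical. Because the inputs were produced to stress the interpreter, running this in a sandbox or disposable environment is advisable.

```python
from collections import Counter
from pathlib import Path


def accepts(source: bytes) -> bool:
    """Return True if CPython compiles the bytes as a module, without executing it."""
    try:
        compile(source, "<test-case>", "exec")
    except (SyntaxError, ValueError, OverflowError, RecursionError, MemoryError):
        # SyntaxError covers indentation and encoding problems;
        # ValueError covers inputs such as null bytes on older versions.
        return False
    return True


def tally(corpus_dir: str) -> Counter:
    """Count accepted and rejected files under corpus_dir (hypothetical layout)."""
    counts = Counter()
    for path in Path(corpus_dir).rglob("*"):
        if path.is_file():
            key = "compiles" if accepts(path.read_bytes()) else "rejected"
            counts[key] += 1
    return counts
```

Calling `tally("path/to/extracted/dataset")` returns a `Counter` with `compiles` and `rejected` totals. Only compilation is attempted here; nothing from the test cases is executed.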

Created:

2024-04-26
