LLM generated Python Compiler Test Dataset
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/11062814
Description:
This dataset was generated by combining Large Language Models (LLMs) with AFL++ fuzzing to improve compiler testing for CPython. It includes the original Python test scripts written by LLMs such as Mistral 7B, Codellama 7B, and Gemma 7B, each targeting particular compiler functionality. These scripts were then used as fuzzing inputs, producing a large collection of test cases that probe for potential vulnerabilities. An optional minimization pass with afl-cmin refined the dataset so that it concentrates on test cases that contribute to code coverage and bug discovery. The dataset is intended as a resource for improving compiler design and testing efficiency, and for supporting further research on AI-driven software testing methods.
Please see the references for citations of the software used in this development.
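The idea behind the minimization step can be illustrated with a small sketch. This is not afl-cmin's actual algorithm or interface; it is a plain greedy set-cover heuristic written for illustration, and the function name `minimize_corpus`, the file names, and the edge IDs are all hypothetical. It assumes that a coverage map (test case name to the set of covered edge IDs) has already been collected by some other means.

```python
from typing import Dict, List, Set


def minimize_corpus(coverage: Dict[str, Set[int]]) -> List[str]:
    """Greedy set cover: repeatedly keep the test case that adds the most
    not-yet-covered edges, until every observed edge is covered."""
    # All edges hit by at least one test case.
    remaining: Set[int] = set().union(*coverage.values())
    kept: List[str] = []
    while remaining:
        # Iterate over sorted names so that ties are broken deterministically.
        best = max(sorted(coverage), key=lambda name: len(coverage[name] & remaining))
        kept.append(best)
        remaining -= coverage[best]
    return kept


# Hypothetical coverage data: test case name -> set of covered edge IDs.
example = {
    "case_a.py": {1, 2, 3},
    "case_b.py": {2, 3},
    "case_c.py": {4},
}
selected = minimize_corpus(example)
print(selected)
```

Here `case_b.py` is dropped because every edge it covers is already covered by `case_a.py`, which is the kind of redundancy a minimization pass is meant to remove.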
Created:
2024-04-26



