
Dataset of the Paper: Security Weaknesses of Copilot-Generated Code in GitHub Projects: An Empirical Study

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/10802054
Description:
This dataset, collected from GitHub, was used to conduct an empirical study of security issues in code generated by GitHub Copilot. A brief description of each folder and file follows:

1. source-data folder: contains all the code files from the Code and Repository labels that we collected from GitHub and used in our study. Code snippets are included in these code files.
2. scan-result folder: contains the commands used to perform security scans and all the results of the security scans performed with static analysis tools.
3. filtered-result folder: contains the results we kept after filtering the scan results in Step 5.
4. fix-result folder: contains code snippets before and after the fixes in RQ3, and the results of the fixes for security issues.
5. project-url.xlsx: provides the projects from the Repository label and the source files from the Code label that contain Copilot-generated code from GitHub. Its columns are:
   - SOURCE gives the URL of the project on GitHub.
   - FILE gives the path to the source code file in the project (only for the source files from the Code label).
   - NOTE gives the statement describing the project as generated by Copilot.
   - FUNCTION gives a functional description of the project.
   - DOMAIN gives the application domain of the project that contains the Copilot-generated code.
6. corresponding_cwe.xlsx: provides the warning messages from static analysis tools and the CWEs they correspond to.
7. files_with_security_issues.xlsx: provides information about the code snippets with security issues.
8. cwe-result.xlsx: provides the types and quantities of CWEs identified in the scan results.
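As a quick sanity check after downloading, the layout described above can be verified with a short script. This is only a minimal sketch: the folder and file names come from the description, but the assumption that they all sit at the top level of the extracted archive is ours, and `missing_entries` is a hypothetical helper, not something shipped with the dataset.

```python
from pathlib import Path

# Names taken from the dataset description. That they sit directly under the
# extraction root is an assumption, not something the description confirms.
EXPECTED_FOLDERS = ["source-data", "scan-result", "filtered-result", "fix-result"]
EXPECTED_FILES = [
    "project-url.xlsx",
    "corresponding_cwe.xlsx",
    "files_with_security_issues.xlsx",
    "cwe-result.xlsx",
]


def missing_entries(root):
    """Return the expected folder and file names that are absent under root."""
    root = Path(root)
    # Folders are checked first, then files, each in the order listed above.
    missing = [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing
```

Calling `missing_entries` on the directory where the archive was extracted returns an empty list when every described folder and file is present; any names it returns may simply mean the archive is organized differently than assumed here.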
Created:
2024-12-28