
Software vulnerability detection datasets - function/method level

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/10266598
Description:
This dataset supports software vulnerability detection at the function/method level. It contains source code in eight programming languages (C, C++, Java, JavaScript, Go, PHP, Ruby, Python), all collected from GitHub.

Files:

data{programming language}_vul.json: the vulnerable code samples for one programming language.
data{programming language}_patch.json: the patched code samples for one programming language.

Each source code sample has the following 16 properties:

index: index of the code sample. If is_vulnerable==False, the sample is a patch and its index is that of the vulnerable sample it patches.
code: raw source code (may include comments).
is_vulnerable: True if the code is vulnerable, False if it is a patch.
programming_language: programming language of the code.
method_name: name of the method.
file_name: name of the file from which the source code was extracted.
repo_url: URL of the project repository.
repo_owner: owner of the repository.
committer: developer who pushed the commit.
committer_date: date when the commit was pushed.
commit_msg: the commit message.
cwe_id: the CWE ID if is_vulnerable==True; otherwise None.
cwe_name: the name of the corresponding CWE if is_vulnerable==True; otherwise None.
cwe_description: the description of the corresponding CWE if is_vulnerable==True; otherwise None.
cwe_url: the URL with more details on the corresponding CWE if is_vulnerable==True; otherwise None.
cve_id: the CVE ID if is_vulnerable==True; otherwise None.
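The link between the two file types, where a patch carries the index of the vulnerable sample it fixes, can be illustrated with a short Python sketch. It is not part of the dataset and rests on assumptions the description does not confirm: that each file holds a single JSON array of sample objects, and that several patches may share one index. The helper names load_samples, pair_by_index and count_by_cwe are hypothetical.

```python
import json
from collections import Counter, defaultdict
from pathlib import Path


def load_samples(path):
    """Load one dataset file.

    Assumption: the file is a single JSON array of sample objects.
    """
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def pair_by_index(vul_samples, patch_samples):
    """Pair each vulnerable sample with the patches that share its index.

    Returns a list of (vulnerable_sample, [patch_samples]) tuples.
    A vulnerable sample without a matching patch gets an empty list.
    """
    patches = defaultdict(list)
    for sample in patch_samples:
        patches[sample["index"]].append(sample)
    # .get() avoids creating empty entries for unmatched indexes
    return [(vul, patches.get(vul["index"], [])) for vul in vul_samples]


def count_by_cwe(vul_samples):
    """Count vulnerable samples per cwe_id."""
    return dict(Counter(sample["cwe_id"] for sample in vul_samples))
```

A call such as pair_by_index(load_samples(vul_path), load_samples(patch_path)) would then give vulnerable/patched pairs for one language, where vul_path and patch_path are placeholders for the actual locations of the downloaded files.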
Created:
2024-10-01