
Dataset of duplicate vulnerability records across databases

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14580766
Description:
This dataset contains vulnerability duplicate information from two sources: the cross-database duplicates and the GitHub Advisory Database duplicates. The dataset is provided in JSON format and is intended for use in research related to vulnerability matching and duplication detection.

## Dataset Overview

The dataset consists of two files:

1. **cross_database_duplicates.json**: Contains 22,163 pairs of duplicate vulnerabilities identified across multiple databases.
2. **github_advisory_database_duplicates.json**: Contains 133 pairs of duplicate vulnerabilities specifically from the GitHub Advisory Database.

## File Format

Both files are in JSON format. Each record consists of four attributes:

- `id_1`: The ID of the first vulnerability report.
- `id_2`: The ID of the second vulnerability report.
- `record_1`: The first vulnerability report.
- `record_2`: The second vulnerability report.

These attributes are designed to help users identify and compare vulnerability reports that are considered duplicates.

## Usage

This dataset can be used for studies in vulnerability matching, natural language processing (NLP) applications, and the development of tools for detecting duplicate vulnerabilities in different databases.
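
## Loading Example (illustrative)

The following is a minimal sketch of how the pair records described above might be read in Python. It assumes that each file's top level is a JSON array of objects carrying the four attributes `id_1`, `id_2`, `record_1`, and `record_2`; the description does not confirm that layout, nor the internal structure of `record_1` and `record_2`. The helper names `load_pairs` and `unique_id_pairs` are hypothetical and are not part of the dataset.

```python
import json
from pathlib import Path

# The four attributes named in the dataset description.
REQUIRED = {"id_1", "id_2", "record_1", "record_2"}


def load_pairs(path):
    """Load duplicate pairs from one of the dataset's JSON files.

    Assumption: the top level is a JSON array of objects, one per pair.
    """
    with Path(path).open(encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array of pair records")
    for i, pair in enumerate(data):
        if not isinstance(pair, dict):
            raise ValueError(f"pair {i} is not a JSON object")
        missing = REQUIRED - set(pair)
        if missing:
            raise ValueError(f"pair {i} is missing attributes: {sorted(missing)}")
    return data


def unique_id_pairs(pairs):
    """Return the set of unordered ID pairs, so (a, b) and (b, a) collapse."""
    return {frozenset((p["id_1"], p["id_2"])) for p in pairs}
```

If the array assumption holds, `len(load_pairs("cross_database_duplicates.json"))` would be expected to match the 22,163 pairs stated in the overview. Comparing that count with `len(unique_id_pairs(...))` is one way to check whether any pair appears in both orders.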
Created:
2024-12-31