
The Secret Life of Software Vulnerabilities: A Large-Scale Empirical Study

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://zenodo.org/record/4400211
Resource description:
Online appendix of the paper "The Secret Life of Software Vulnerabilities: A Large-Scale Empirical Study". It contains all scripts and data required to replicate the analyses for the study's four research questions.

Abstract: Software vulnerabilities are weaknesses in source code that can potentially be exploited to cause loss or harm. While researchers have devised a number of methods to deal with vulnerabilities, there is still a noticeable lack of knowledge about their software engineering life cycle, for example how vulnerabilities are introduced and removed by developers. This information can be used to design more effective methods for vulnerability prevention and detection, as well as to understand the granularity that these methods should aim at. To investigate the life cycle of software vulnerabilities, we focus on how, when, and under which circumstances vulnerabilities are introduced in software projects, as well as whether, after how long, and how they are removed. We consider 4,097 vulnerabilities with public patches from the National Vulnerability Database (pertaining to 1,163 open-source software projects on GitHub) and define a six-step process that involves both automated parts (e.g., using the SZZ algorithm to find the vulnerability-inducing commits) and manual analyses (e.g., how vulnerabilities were fixed). The investigated vulnerabilities can be classified into 148 categories, take on average 4.19 commits before being introduced, and remain unfixed for a median of 1,506.50 commits and 691.50 days. Most of them are introduced by developers with a high workload, often during maintenance activities, and are removed mostly through the addition of new source code that implements further checks on inputs. We conclude by distilling practical implications on when and how vulnerability detectors should work to better assist developers in detecting these issues early.
Created:
2021-01-11
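
Illustrative sketch (not part of the appendix): the abstract mentions using the SZZ algorithm to find vulnerability-inducing commits. The core idea of SZZ is to take the lines that a fixing commit deletes or modifies and trace each one back to the commit that last touched it. The Python below is a minimal, simplified sketch of that idea on an in-memory, single-file history. The function names (`blame`, `inducing_commits`), the toy history, and the use of `difflib` in place of `git blame` are all assumptions made for illustration; the actual scripts are the ones in the Zenodo record above.

```python
from difflib import SequenceMatcher


def blame(versions):
    """Return, for each line of the last version, the id of the commit
    that last introduced or changed it.

    `versions` is a hypothetical in-memory history: a list of
    (commit_id, list_of_lines) pairs, oldest first.
    """
    owners, prev = [], []
    for commit_id, lines in versions:
        new_owners = []
        matcher = SequenceMatcher(a=prev, b=lines, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines keep their previous owner.
                new_owners.extend(owners[i1:i2])
            elif tag in ("replace", "insert"):
                # New or rewritten lines belong to this commit.
                new_owners.extend([commit_id] * (j2 - j1))
            # "delete": the lines disappear, nothing to record.
        owners, prev = new_owners, lines
    return owners


def inducing_commits(history, fixed_lines):
    """Simplified SZZ step: blame the lines that the fix deletes or
    modifies and collect the commits that last touched them."""
    owners = blame(history)
    before_fix = history[-1][1]
    matcher = SequenceMatcher(a=before_fix, b=fixed_lines, autojunk=False)
    suspects = set()
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            suspects.update(owners[i1:i2])
    return suspects


# Toy history (invented for this example): commit c2 introduces an
# unchecked slice, and the fix later adds a bounds check on that line.
history = [
    ("c1", ["def read(buf, n):", "    return buf[:n]"]),
    ("c2", ["def read(buf, n):", "    data = buf[:n]", "    return data"]),
    ("c3", ["# reader", "def read(buf, n):", "    data = buf[:n]", "    return data"]),
]
fixed = ["# reader", "def read(buf, n):", "    data = buf[:min(n, len(buf))]", "    return data"]

print(inducing_commits(history, fixed))  # {'c2'}
```

One limitation of this sketch: a fix that only adds lines deletes or modifies nothing, so the function returns an empty set. The abstract notes that most fixes add new code, so tracing those fixes would need extra handling that is not shown here.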