Dataset
Source: DataCite Commons. Updated: 2024-03-10. Indexed: 2024-08-18.
Download link:
https://figshare.com/articles/dataset/Untitled_Item/23821830
Description:
Static Analysis Tools (SATs) show promise in detecting defects, but their usability is severely hindered by massive numbers of unactionable warnings. To improve the usability of SATs, many studies have addressed Actionable Warning Identification (AWI), focusing mainly on mining warning features and improving identification models. However, these studies have not explored well the underlying distribution of the warning dataset, which is closely related to feature mining and thereby affects AWI model performance. Further, a well-prepared warning dataset to support such distribution analysis is lacking. In this paper, we first propose a warning dataset construction approach that uses manual inspection and verification latency to postprocess the labels produced by an advanced closed-warning heuristic, thereby acquiring credible labels. Based on 10 large-scale, real-world projects with 25K+ revisions and 2087K+ warnings, we construct a qualified warning dataset with 11975 distinct warnings. Subsequently, we thoroughly analyze the distribution of actionable warnings within projects in our constructed dataset across six warning characteristics (i.e., category, type, priority, rank, file, and method). Based on the analysis results, we present 16 findings. Finally, a preliminary study demonstrates that our findings can be practical and instructive in improving the usability of SATs.
Provider:
figshare
Created:
2023-08-02



