GitHub Projects for Defect Prediction

Source: arXiv, indexed 2025-09-30
Download link:
https://github.com/snaraya7/early-bird
Description:
This dataset covers 240 GitHub projects, with data mined from each project's commit history, and is intended for evaluating defect prediction models that use early heuristic methods. The projects are split by GitHub star count into two groups: 155 popular and 85 non-popular. They are written in a range of programming languages, including Java, Python, C, and C++. The features used for analysis were extracted with the Commit Guru tool. The dataset's task is defect prediction.
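The popular/non-popular split described above could be reproduced along these lines. This is only a minimal sketch: the description does not state the star cutoff, so `STAR_THRESHOLD`, the `Project` record, and the sample entries are all hypothetical and are not taken from the dataset.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the description does not say which star count
# separates popular from non-popular projects.
STAR_THRESHOLD = 1000


@dataclass
class Project:
    name: str
    language: str
    stars: int


def split_by_popularity(projects, threshold=STAR_THRESHOLD):
    """Return (popular, non_popular) lists based on star count."""
    popular = [p for p in projects if p.stars >= threshold]
    non_popular = [p for p in projects if p.stars < threshold]
    return popular, non_popular


# Made-up records for illustration only; not real dataset entries.
sample = [
    Project("example-a", "Java", 5200),
    Project("example-b", "Python", 40),
    Project("example-c", "C++", 1000),
]

popular, non_popular = split_by_popularity(sample)
print(len(popular), len(non_popular))
```

Applied to the full project list with the dataset's actual cutoff, the two groups would be expected to contain 155 and 85 projects respectively.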
Provider:
Open source projects on GitHub