
Technical Debt Prioritization Using Machine Learning

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/7709534
Description:
Technical debt (TD) identification tools can find thousands of technical debt items (TDIs) in a software project. Remedying all of them would take months or even years, so prioritization and decision-making are needed to make the process efficient. Meanwhile, advances in machine learning over the last few decades have allowed researchers to cluster behaviors and identify patterns in software engineering data. In this study, we aim to develop machine learning methods that decide whether and when a given TDI should be paid off in real software projects. We conducted a survey to collect data from Java open-source software projects hosted on GitHub. From the 2616 survey responses, we created a dataset using three labeling strategies: "pay or not", 3-classes, and priority. We applied nine well-known machine learning methods to 27 source code metrics to build models that predict if and when a TDI should be paid off. The best methods for determining whether an item should be paid off achieved accuracy, precision, and recall of about 0.86. For when to make the payment, we applied four approaches; their performance reached 0.81 using traditional analysis and 0.92 with tuned analysis.
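For readers unfamiliar with the three scores reported for the "pay or not" task, the following is a minimal sketch of how accuracy, precision, and recall are computed for a binary labeling. The function name `binary_metrics`, the encoding (1 = pay off, 0 = do not pay off), and the sample labels are illustrative assumptions, not taken from the dataset or the study; the study may also have used an averaging scheme that this sketch does not reproduce.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, precision, and recall for binary labels.

    Assumed encoding (hypothetical): 1 = the TDI should be paid off, 0 = it should not.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = len(y_true)
    return {
        # Share of all items labeled correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Of the items predicted "pay off", the share that truly should be paid off.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of the items that truly should be paid off, the share the model found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels for eight TDIs, for illustration only.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these made-up labels there are 4 true positives, 2 true negatives, 1 false positive, and 1 false negative, giving an accuracy of 0.75 and a precision and recall of 0.8 each.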
Created:
2024-07-12