"Refactoring Multi-Label Dataset"

Name: "Refactoring Multi-Label Dataset"
Creator: IEEE DataPort
Published: 2025-06-29 11:57:42
License: No description available

Source: DataCite Commons. Updated: 2025-06-29. Indexed: 2026-05-03.

Download link:

https://ieee-dataport.org/documents/refactoring-multi-label-dataset

Description:

"Refactoring is the process of restructuring existing source code to improve its internal structure without altering its external behavior. Refactoring is essential to maintaining software quality; however, its manual application is labor-intensive, and existing automated techniques often fall short by relying on binary classification, neglecting the co-occurrence of multiple refactoring needs. This study addresses this gap by proposing and evaluating multi-label machine learning models capable of predicting combinations from 20 distinct refactoring operations across class, method, and variable granularities. We systematically investigate three multi-label learning strategies (Label Powerset, Classifier Chains, and Binary Relevance) integrated with five base classifiers: Random Forest, Gradient Boosting, XGBoost, Decision Tree, and Artificial Neural Network. Experiments are conducted using 10-fold cross-validation on a real-world dataset, with relevant feature selection techniques applied. Results indicate that variable-level metrics yield the highest predictive performance, with the Label Powerset strategy combined with Random Forest achieving a Jaccard accuracy of 95.30%. These findings highlight the efficacy of multi-label learning in modeling complex, real-world refactoring scenarios, providing a robust foundation for enhancing automated refactoring tools and advancing software maintenance practices."
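To make two terms from the description concrete, here is a minimal sketch of the Label Powerset transformation and a sample-averaged Jaccard score, written in plain Python. It is not the study's implementation. The three label columns and all values are hypothetical toy data, the function names are illustrative, and scoring a row as 1.0 when both the true and predicted label sets are empty is an assumption made here.

```python
from typing import Dict, List, Sequence, Tuple

LabelRow = Tuple[int, ...]


def label_powerset_encode(
    rows: Sequence[Sequence[int]],
) -> Tuple[List[int], Dict[int, LabelRow]]:
    """Map each distinct label combination to a single class id."""
    combo_to_id: Dict[LabelRow, int] = {}
    class_ids: List[int] = []
    for row in rows:
        combo = tuple(row)
        if combo not in combo_to_id:
            combo_to_id[combo] = len(combo_to_id)  # ids assigned in first-seen order
        class_ids.append(combo_to_id[combo])
    id_to_combo = {cid: combo for combo, cid in combo_to_id.items()}
    return class_ids, id_to_combo


def label_powerset_decode(
    class_ids: Sequence[int], id_to_combo: Dict[int, LabelRow]
) -> List[List[int]]:
    """Turn predicted class ids back into binary label rows."""
    return [list(id_to_combo[cid]) for cid in class_ids]


def jaccard_accuracy(
    y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]
) -> float:
    """Mean over rows of |true AND pred| / |true OR pred|."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for t, p in zip(y_true, y_pred):
        inter = sum(1 for a, b in zip(t, p) if a and b)
        union = sum(1 for a, b in zip(t, p) if a or b)
        # Assumption: two empty label sets count as a perfect match.
        total += 1.0 if union == 0 else inter / union
    return total / len(y_true)


# Hypothetical toy labels: one column per refactoring operation.
Y_TRUE = [[1, 0, 1], [0, 1, 0], [1, 0, 1], [0, 0, 0]]
Y_PRED = [[1, 0, 0], [0, 1, 0], [1, 1, 1], [0, 0, 0]]

CLASS_IDS, ID_TO_COMBO = label_powerset_encode(Y_TRUE)
SCORE = jaccard_accuracy(Y_TRUE, Y_PRED)
```

After encoding, any single-label multiclass learner can be trained on `CLASS_IDS`, and its predictions can be passed through `label_powerset_decode` before scoring. Binary Relevance instead trains one independent classifier per label column, and Classifier Chains feed each classifier the predictions of the previous ones.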

Provider:

IEEE DataPort

Created:

2025-06-29
