
On the role of data balancing for Machine Learning-based Code Smell Detection

Source: Figshare. Updated: 2019-06-11. Indexed: 2026-04-29.
Download link:
https://figshare.com/articles/dataset/On_the_role_of_data_balancing_for_Machine_Learning-based_Code_Smell_Detection/8247509
Description:
Code smells can compromise software quality in the long term by inducing technical debt. For this reason, many approaches for identifying these design flaws have been proposed in the last decade. Most of them are based on heuristics in which a set of metrics (e.g., code metrics, process metrics) is used to detect smelly code components. However, these techniques suffer from subjective interpretation, low agreement between detectors, and threshold dependence. To overcome these limitations, previous work applied Machine Learning techniques that can learn from previous datasets without needing any threshold definition. However, more recent work has shown that Machine Learning is not always suitable for code smell detection due to the highly unbalanced nature of the problem. In this study we investigate several approaches able to mitigate data unbalancing issues to understand their impact on ML-based code smell detection algorithms. Our findings highlight a number of limitations and open issues with respect to the usage of data balancing for ML-based code smell detection.
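To make the notion of data balancing concrete, the sketch below shows random oversampling, one common way to even out class sizes before training a classifier. The description does not say which balancing approaches the study evaluated, so this is an illustration only. The function name `random_oversample`, the two metrics, and all data values are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is grown to the size of the largest one.
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: (lines of code, number of methods) per class,
# labelled 1 = smelly, 0 = not smelly. The values are made up.
metrics = [(120, 4), (80, 3), (95, 5), (60, 2), (150, 6), (900, 45)]
is_smelly = [0, 0, 0, 0, 0, 1]

balanced_rows, balanced_labels = random_oversample(metrics, is_smelly)
print(Counter(balanced_labels))
```

In this toy run the single smelly instance is duplicated until both classes have five rows. Balancing of this kind is normally applied to the training split only, so that duplicated instances do not leak into the evaluation data.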
Created:
2019-06-11