five

Feature Representation Repository for Defect Prediction

收藏
IEEE2026-04-17 收录
下载链接:
https://ieee-dataport.org/documents/feature-representation-repository-defect-prediction
下载链接
链接失效反馈
官方服务:
资源简介:
The software engineering community is working to develop reliable metrics to improve software quality. It is estimated that understanding the source code accounts for 60% of the software maintenance effort. Cognitive informatics is important in quantifying the degree of difficulty or the efforts made by developers to understand the source code. Several empirical studies were conducted in 2003 to assign cognitive weights to each possible basic control structure of software, and these cognitive weights are used by several researchers to evaluate the cognitive complexity of software systems. In this paper, an effort has been made to categorize the Control Flow Graphs (CFGs) nodes according to their node features. In our case, we extracted seven unique features from the program, and each unique feature was assigned an integer value that we evaluated through Cognitive Complexity Measures (CCMs). We then incorporated CCMs' results as a node feature value in CFGs and generated the same based on the node connectivity for a graph. In order to obtain the feature representation of the graph, a node vector matrix is then created for the graph and passed to the Graph Convolutional Network (GCN). We prepared our data sets using GCN output and then built Deep Neural Network Defect Prediction (DNN-DP) and Convolutional Neural Network Defect Prediction (CNN-DP) models to predict software defects. The Python programming language is used, along with Keras and TensorFlow. Three hundred twenty Python programs were written by our talented UG and PG students, and all experiments were carried out during laboratory classes. Together with three skilled lab programmers, they compiled and ran each individual program and detected defect/no-defect programs before categorizing them into three different classes, namely Simple, Medium, and Complex programs. Accuracy, Receiver Operating Characteristics (ROC), Area Under Curve (AUC), F-measure, Precision and hyper-parameter tuning procedures are used to evaluate the approaches. The experimental results show that the proposed models outperformed state-of-the-art methods such as Nave Bayes (NB), Decision Tree (DT), Support Vector Machine (SVM), and Random Forest (RF) in all evaluation criteria.
提供机构:
Bhattacharjee, Vandana; Gupta, Mansi; Rajnish, Kumar
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作