Replication Package: "Classifying Code Review Comments"
Source: Figshare. Updated: 2025-10-22. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Replication_Package_Classifying_Code_Review_Comments_/30418735
Description:
This replication package contains all datasets and source code used in our study. The package is organized into two main folders: Datasets and Source_Codes.

Datasets/
  Code_attributes_Our.csv: Our extracted code-level attributes for each reviewed code snippet.
  Our_labeled_dataset.csv: The final labeled dataset combining comment text, corresponding code, and labels.
  Code_attributes_Turzo_et_al.csv: Code attribute features from the dataset of Turzo et al., used for comparative experiments.
  Dataset_Turzo_et_al.xlsx: Original dataset from Turzo et al., containing the comment texts, code snippets, and labels used in prior research.

Source_Codes/
  code_review_comment_ai_assisted_labeling.py: Script for ChatGPT-based annotation of review comments, used for semi-automated dataset labeling.
  C1_Turzo_Model_Turzo_Data.ipynb: Implements Turzo et al.'s baseline model on their original dataset.
  C2_Turzo_Model_Our_Data.ipynb: Applies Turzo et al.'s baseline model to our dataset.
  C3_Our_Model_Turzo_Data.ipynb: Trains our proposed model on the Turzo et al. dataset.
  C4_Our_Model_Our_Data.ipynb: Implements our proposed model on our labeled dataset.
  C4variant_Our_ModelwithLSTM_Our_Data.ipynb: A variant of C4 that uses an LSTM layer instead of a BiLSTM layer.
  C4_Error_analysis.xlsx: Detailed results for the examples misclassified by the C4 model, used for qualitative analysis. In this file, the class labels 0, 1, 2, 3, and 4 represent Discussion, Documentation, False Positive, Functional, and Refactoring, respectively.
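When reading C4_Error_analysis.xlsx, the numeric class labels can be turned back into class names with a small lookup table. The following is a minimal sketch, not part of the package: the names LABEL_NAMES, decode_label, and count_by_class are hypothetical, and only the 0 to 4 mapping itself comes from the description above. It does not assume any particular column name in the spreadsheet.

```python
from collections import Counter

# Mapping taken from the package description for C4_Error_analysis.xlsx.
LABEL_NAMES = {
    0: "Discussion",
    1: "Documentation",
    2: "False Positive",
    3: "Functional",
    4: "Refactoring",
}


def decode_label(label_id):
    """Return the class name for a numeric label (hypothetical helper).

    Accepts ints, numeric strings, and whole-number floats, since
    spreadsheet readers often return integer cells as floats.
    """
    try:
        return LABEL_NAMES[int(label_id)]
    except (KeyError, ValueError, TypeError):
        raise ValueError(f"unknown class label: {label_id!r}") from None


def count_by_class(label_ids):
    """Count how many labels fall into each named class."""
    return dict(Counter(decode_label(label) for label in label_ids))
```

For example, passing a column of predicted labels read from the spreadsheet to count_by_class gives a per-class tally of the misclassified examples, keyed by class name rather than by number.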
Created:
2025-10-22



