Replication Package: "Classifying Code Review Comments"

Source: DataCite Commons. Updated: 2026-04-23. Indexed: 2026-02-09.
Download link:
https://figshare.com/articles/dataset/Replication_Package_Classifying_Code_Review_Comments_/30418735/1
Description:
This replication package contains all datasets and source code used in our study. The package is organized into two main folders, Datasets and Source_Codes.

Datasets/
  Code_attributes_Our.csv: Our extracted code-level attributes for each reviewed code snippet.
  Our_labeled_dataset.csv: The final labeled dataset, combining comment text, the corresponding code, and labels.
  Code_attributes_Turzo_et_al.csv: Code attribute features from the dataset of Turzo et al., used for comparative experiments.
  Dataset_Turzo_et_al.xlsx: The original dataset from Turzo et al., containing the comment texts, code snippets, and labels used in prior research.

Source_Codes/
  code_review_comment_ai_assisted_labeling.py: Script for ChatGPT-based annotation of review comments, used for semi-automated dataset labeling.
  C1_Turzo_Model_Turzo_Data.ipynb: Implements Turzo et al.’s baseline model on their original dataset.
  C2_Turzo_Model_Our_Data.ipynb: Applies Turzo et al.’s baseline model to our dataset.
  C3_Our_Model_Turzo_Data.ipynb: Trains our proposed model on the Turzo et al. dataset.
  C4_Our_Model_Our_Data.ipynb: Trains our proposed model on our labeled dataset.
  C4variant_Our_ModelwithLSTM_Our_Data.ipynb: A variant of C4 that uses an LSTM layer instead of the BiLSTM layer.
  C4_Error_analysis.xlsx: Detailed results for the examples misclassified by the C4 model, used for qualitative analysis. In this file, the class labels 0, 1, 2, 3, and 4 represent Discussion, Documentation, False Positive, Functional, and Refactoring, respectively.
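The numeric class labels described for C4_Error_analysis.xlsx can be decoded with a small lookup table. The following is a minimal sketch, not code from the package: only the 0 to 4 mapping comes from the description above. The helper names `decode_label` and `confusion_pairs` are hypothetical, and because the spreadsheet's column layout is not documented here, the helpers take plain sequences of labels rather than reading the file.

```python
from collections import Counter
from typing import Iterable

# Mapping stated in the package description for C4_Error_analysis.xlsx.
LABEL_NAMES = {
    0: "Discussion",
    1: "Documentation",
    2: "False Positive",
    3: "Functional",
    4: "Refactoring",
}


def decode_label(value) -> str:
    """Return the class name for a numeric label (accepts 3, "3", or 3.0)."""
    return LABEL_NAMES[int(float(value))]


def confusion_pairs(true_labels: Iterable, predicted_labels: Iterable) -> Counter:
    """Count (true, predicted) class-name pairs for misclassified rows only."""
    pairs = Counter()
    for t, p in zip(true_labels, predicted_labels):
        t_name, p_name = decode_label(t), decode_label(p)
        if t_name != p_name:
            pairs[(t_name, p_name)] += 1
    return pairs


# Toy values for illustration only; these are NOT taken from the package.
example = confusion_pairs([3, 4, 0, 1], [4, 4, "3", 1.0])
print(example.most_common())
```

To apply this to the actual spreadsheet, one would first read the true and predicted label columns (whatever they are named in the file) into two lists and pass them to `confusion_pairs`.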
Provider:
figshare
Created:
2025-10-22