Dataset of the Paper "Exploring the Problems, their Causes and Solutions of AI Pair Programming: A Study on GitHub and Stack Overflow"
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/11080113
Description:
The dataset collected from GitHub Discussions, GitHub Issues, and Stack Overflow is used to conduct an empirical study on the problems, causes, and solutions of using GitHub Copilot in practice. A brief description of each document in the dataset is provided below:
1. Dataset(GitHub_Discussions).xlsx
contains the Discussion IDs and URLs in the Copilot category of GitHub Discussions, and the data extracted from the related discussions along with analysis results.
2. Dataset(GitHub_Issues).xlsx
contains the Issue IDs and URLs of the labelled issues which are related to Copilot from GitHub Issues, and the data extracted from the related issues along with analysis results.
3. Dataset(SO_Posts).xlsx
contains the SO Post IDs and URLs of the labelled posts which are related to Copilot from Stack Overflow, and the data extracted from the related posts along with analysis results.
4. Extracted_Data.xlsx
contains the final results of the data extracted from GitHub Discussions, GitHub Issues, and SO posts.
5. pilot labelling folder
contains three .xlsx files (i.e., Pilot_Labelling(GitHub_Discussions).xlsx, Pilot_Labelling(GitHub_Issues), and Pilot_Labelling(SO)), one for each of the three data sources, each containing the pilot data labelling results and the corresponding Cohen's kappa value.
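
For readers unfamiliar with the agreement measure reported in the pilot labelling files, the following is a minimal sketch of how Cohen's kappa is computed for two labellers. The dataset description does not state which tool the authors used, so this is only the standard formula applied to hypothetical labels; the function name `cohens_kappa` and the example label values are assumptions made for illustration, not taken from the dataset.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected chance agreement, from each rater's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if p_e == 1.0:
        # Both raters used one identical label throughout; treated here
        # as perfect agreement by convention.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical pilot labels (not taken from the dataset).
rater_1 = ["relevant", "relevant", "irrelevant", "relevant", "irrelevant", "irrelevant"]
rater_2 = ["relevant", "irrelevant", "irrelevant", "relevant", "irrelevant", "relevant"]
kappa = cohens_kappa(rater_1, rater_2)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is commonly reported for pilot labelling rounds instead of a plain percentage.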
Created:
2024-07-30



