
Dataset of the Paper "Exploring the Problems, their Causes and Solutions of AI Pair Programming: A Study on GitHub and Stack Overflow"

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/11080113
Description:
The dataset collected from GitHub Discussions, GitHub Issues, and Stack Overflow is used to conduct an empirical study on the problems, causes, and solutions of using GitHub Copilot in practice. A brief description of each document in the dataset is provided below:

1. Dataset(GitHub_Discussions).xlsx contains the Discussion IDs and URLs in the Copilot category of GitHub Discussions, and the data extracted from the related discussions along with analysis results.
2. Dataset(GitHub_Issues).xlsx contains the Issue IDs and URLs of the labelled issues which are related to Copilot from GitHub Issues, and the data extracted from the related issues along with analysis results.
3. Dataset(SO_Posts).xlsx contains the SO Post IDs and URLs of the labelled posts which are related to Copilot from Stack Overflow, and the data extracted from the related posts along with analysis results.
4. Extracted_Data.xlsx contains the final results of the data extracted from GitHub Discussions, GitHub Issues, and SO posts.
5. pilot labelling folder contains three .xlsx files (i.e., Pilot_Labelling(GitHub_Discussions).xlsx, Pilot_Labelling(GitHub_Issues), and Pilot_Labelling(SO)), with each file corresponding to one of the three data sources and containing the pilot data labelling results with the Cohen's kappa value.
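The pilot labelling files report a Cohen's kappa value, which measures agreement between two annotators beyond what chance alone would produce. As a minimal sketch of how such a value is computed: the function name `cohens_kappa` and the two label lists below are hypothetical illustrations, not data or code taken from the dataset, and the authors' actual computation may differ.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: sum over labels of the product of marginal rates.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if p_e == 1.0:
        # Both annotators used a single identical label for every item.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels from two annotators (not taken from the dataset).
annotator_1 = ["problem", "cause", "solution", "problem", "cause", "problem"]
annotator_2 = ["problem", "cause", "solution", "cause", "cause", "problem"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))  # 0.739
```

In this example the annotators agree on 5 of 6 items (p_o = 5/6), the chance agreement is p_e = 13/36, and kappa works out to 17/23.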
Created:
2024-07-30