Code diffs and commit messages from top1000-2000 Java projects in GitHub
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://zenodo.org/record/2529946
Description:
This dataset contains (code diff, commit message) pairs from the top 1000-2000 Java projects on GitHub, collected via the GitHub Developer API.
The structure of files is as follows:
./               # Each directory contains the commits of one project.
  ./             # Each directory contains the information for one commit;
                 # the commit SHA is the primary key of this commit in GitHub.
                 # This directory has 3 types of children: the file
                 # "commit_msg", multiple directories (one per changed
                 # file), and the file "error".
    commit_msg   # This file contains one line: the commit message.
    /            # The name of this directory is the changed file name.
      patch      # The code diff of this changed file, showing which
                 # lines were added and which were removed, with a
                 # few unchanged lines of context.
    /
      patch
    error        # This file contains the error message returned when
                 # crawling from GitHub. If this file exists, the
                 # changed-file directories will not exist.
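
The layout above can be traversed with a short script. The following is a minimal sketch, not part of the dataset: the names `iter_commits`, `CommitRecord`, and `read_text` are hypothetical, the root path is wherever the archive was extracted, and UTF-8 encoding is an assumption. It searches each commit directory recursively for `patch` files, so it also works if changed file names containing path separators were stored as nested directories (also an assumption).

```python
from pathlib import Path
from typing import Dict, Iterator, NamedTuple, Optional


class CommitRecord(NamedTuple):
    """One commit directory, as described in the layout above (hypothetical type)."""
    project: str
    sha: str
    message: Optional[str]   # None when "commit_msg" is absent
    patches: Dict[str, str]  # changed file name -> diff text
    error: Optional[str]     # crawl error text when an "error" file exists


def read_text(path: Path) -> str:
    # UTF-8 is an assumption; undecodable bytes are replaced instead of raising.
    return path.read_text(encoding="utf-8", errors="replace")


def iter_commits(root: Path) -> Iterator[CommitRecord]:
    """Yield one record per commit directory under root/<project>/<commit>."""
    for project_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for commit_dir in sorted(p for p in project_dir.iterdir() if p.is_dir()):
            msg_file = commit_dir / "commit_msg"
            err_file = commit_dir / "error"
            message = read_text(msg_file).strip() if msg_file.is_file() else None
            error = read_text(err_file).strip() if err_file.is_file() else None
            # Each changed file is a directory holding a single "patch" file.
            patches = {
                patch.parent.relative_to(commit_dir).as_posix(): read_text(patch)
                for patch in sorted(commit_dir.rglob("patch"))
                if patch.is_file()
            }
            yield CommitRecord(project_dir.name, commit_dir.name, message, patches, error)
```

Records whose `error` field is set have an empty `patches` mapping and can be filtered out before building (code diff, commit message) pairs, for example with `[r for r in iter_commits(Path("dataset")) if r.error is None]`, where `dataset` is a placeholder path.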
Created:
2020-01-24



