
Prioritising GitHub Priority Labels - Data Set and Software

Source: NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/10894271
Description:
This is the data set and software produced for the paper "Prioritising GitHub Priority Labels" by J. Caddy and C. Treude. The CSV file contains a manually categorised set of priority-related GitHub issue labels, ranked and normalised into three values: "High", "Medium", and "Low". The labels were gathered from the 5000 most-starred repositories on GitHub as of 2022-06-01. The Python script uses this data set as an example and retrieves the highest-priority issues from all repositories that the specified author has contributed to. Run the Python script from the same directory as the CSV file, passing the username whose highest-priority issues you want to see as the first command-line argument. Supply your GitHub Personal Access Token either at the prompt, where it is not displayed, or as the second command-line argument.
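
The workflow described above can be illustrated with a minimal Python sketch. This is not the authors' script: the column names `label` and `priority`, the sample rows, and the function names `load_priority_labels`, `issue_priority`, and `read_cli` are all hypothetical, and the sketch makes no GitHub API calls. It only shows how a label-to-priority CSV might be loaded, how the highest normalised priority among an issue's labels might be chosen, and how a token could be read without being echoed.

```python
import csv
import getpass
import io

# Rank of each normalised value; a lower number means a higher priority.
PRIORITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}


def load_priority_labels(csv_file, label_col="label", priority_col="priority"):
    """Map lower-cased label text to its normalised priority.

    The column names are assumptions; check the header of the real CSV.
    """
    reader = csv.DictReader(csv_file)
    return {
        row[label_col].strip().lower(): row[priority_col].strip()
        for row in reader
    }


def issue_priority(issue_labels, label_map):
    """Return the highest priority found among an issue's labels, or None."""
    found = [
        label_map[name.lower()]
        for name in issue_labels
        if label_map.get(name.lower()) in PRIORITY_ORDER
    ]
    return min(found, key=PRIORITY_ORDER.__getitem__) if found else None


def read_cli(argv):
    """Username is argv[1]; the token is argv[2] or prompted for without echo."""
    if len(argv) < 2:
        raise SystemExit("usage: script.py USERNAME [TOKEN]")
    token = argv[2] if len(argv) > 2 else getpass.getpass("GitHub token: ")
    return argv[1], token


# Hypothetical sample rows standing in for the real CSV file.
SAMPLE_CSV = "label,priority\nP0,High\nP2,Medium\npriority: low,Low\n"
label_map = load_priority_labels(io.StringIO(SAMPLE_CSV))
print(issue_priority(["bug", "Priority: Low", "P2"], label_map))  # Medium
```

Reading the token with `getpass.getpass` keeps it out of the terminal display and out of shell history, which is the reason the description offers the prompt as an alternative to the second command-line argument.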

Created:
2024-05-18