aniruddh10124/github-issues
Hugging Face | Updated 2024-05-11 | Indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/aniruddh10124/github-issues
Resource overview:
---
dataset_info:
  features:
  - name: url
    dtype: string
  - name: repository_url
    dtype: string
  - name: labels_url
    dtype: string
  - name: comments_url
    dtype: string
  - name: events_url
    dtype: string
  - name: html_url
    dtype: string
  - name: id
    dtype: int64
  - name: node_id
    dtype: string
  - name: number
    dtype: int64
  - name: title
    dtype: string
  - name: user
    struct:
    - name: login
      dtype: string
    - name: id
      dtype: int64
    - name: node_id
      dtype: string
    - name: avatar_url
      dtype: string
    - name: gravatar_id
      dtype: string
    - name: url
      dtype: string
    - name: html_url
      dtype: string
    - name: followers_url
      dtype: string
    - name: following_url
      dtype: string
    - name: gists_url
      dtype: string
    - name: starred_url
      dtype: string
    - name: subscriptions_url
      dtype: string
    - name: organizations_url
      dtype: string
    - name: repos_url
      dtype: string
    - name: events_url
      dtype: string
    - name: received_events_url
      dtype: string
    - name: type
      dtype: string
    - name: site_admin
      dtype: bool
  - name: labels
    list:
    - name: id
      dtype: int64
    - name: node_id
      dtype: string
    - name: url
      dtype: string
    - name: name
      dtype: string
    - name: color
      dtype: string
    - name: default
      dtype: bool
    - name: description
      dtype: string
  - name: state
    dtype: string
  - name: locked
    dtype: bool
  - name: assignee
    struct:
    - name: login
      dtype: string
    - name: id
      dtype: int64
    - name: node_id
      dtype: string
    - name: avatar_url
      dtype: string
    - name: gravatar_id
      dtype: string
    - name: url
      dtype: string
    - name: html_url
      dtype: string
    - name: followers_url
      dtype: string
    - name: following_url
      dtype: string
    - name: gists_url
      dtype: string
    - name: starred_url
      dtype: string
    - name: subscriptions_url
      dtype: string
    - name: organizations_url
      dtype: string
    - name: repos_url
      dtype: string
    - name: events_url
      dtype: string
    - name: received_events_url
      dtype: string
    - name: type
      dtype: string
    - name: site_admin
      dtype: bool
  - name: assignees
    list:
    - name: login
      dtype: string
    - name: id
      dtype: int64
    - name: node_id
      dtype: string
    - name: avatar_url
      dtype: string
    - name: gravatar_id
      dtype: string
    - name: url
      dtype: string
    - name: html_url
      dtype: string
    - name: followers_url
      dtype: string
    - name: following_url
      dtype: string
    - name: gists_url
      dtype: string
    - name: starred_url
      dtype: string
    - name: subscriptions_url
      dtype: string
    - name: organizations_url
      dtype: string
    - name: repos_url
      dtype: string
    - name: events_url
      dtype: string
    - name: received_events_url
      dtype: string
    - name: type
      dtype: string
    - name: site_admin
      dtype: bool
  - name: milestone
    struct:
    - name: url
      dtype: string
    - name: html_url
      dtype: string
    - name: labels_url
      dtype: string
    - name: id
      dtype: int64
    - name: node_id
      dtype: string
    - name: number
      dtype: int64
    - name: title
      dtype: string
    - name: description
      dtype: string
    - name: creator
      struct:
      - name: login
        dtype: string
      - name: id
        dtype: int64
      - name: node_id
        dtype: string
      - name: avatar_url
        dtype: string
      - name: gravatar_id
        dtype: string
      - name: url
        dtype: string
      - name: html_url
        dtype: string
      - name: followers_url
        dtype: string
      - name: following_url
        dtype: string
      - name: gists_url
        dtype: string
      - name: starred_url
        dtype: string
      - name: subscriptions_url
        dtype: string
      - name: organizations_url
        dtype: string
      - name: repos_url
        dtype: string
      - name: events_url
        dtype: string
      - name: received_events_url
        dtype: string
      - name: type
        dtype: string
      - name: site_admin
        dtype: bool
    - name: open_issues
      dtype: int64
    - name: closed_issues
      dtype: int64
    - name: state
      dtype: string
    - name: created_at
      dtype: timestamp[s]
    - name: updated_at
      dtype: timestamp[s]
    - name: due_on
      dtype: 'null'
    - name: closed_at
      dtype: 'null'
  - name: comments
    sequence: string
  - name: created_at
    dtype: timestamp[s]
  - name: updated_at
    dtype: timestamp[s]
  - name: closed_at
    dtype: timestamp[s]
  - name: author_association
    dtype: string
  - name: active_lock_reason
    dtype: 'null'
  - name: body
    dtype: string
  - name: reactions
    struct:
    - name: url
      dtype: string
    - name: total_count
      dtype: int64
    - name: '+1'
      dtype: int64
    - name: '-1'
      dtype: int64
    - name: laugh
      dtype: int64
    - name: hooray
      dtype: int64
    - name: confused
      dtype: int64
    - name: heart
      dtype: int64
    - name: rocket
      dtype: int64
    - name: eyes
      dtype: int64
  - name: timeline_url
    dtype: string
  - name: performed_via_github_app
    dtype: 'null'
  - name: state_reason
    dtype: string
  - name: draft
    dtype: bool
  - name: pull_request
    struct:
    - name: url
      dtype: string
    - name: html_url
      dtype: string
    - name: diff_url
      dtype: string
    - name: patch_url
      dtype: string
    - name: merged_at
      dtype: timestamp[s]
  - name: is_pull_request
    dtype: bool
  splits:
  - name: train
    num_bytes: 19682385
    num_examples: 2000
  download_size: 5743780
  dataset_size: 19682385
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
The dataset provides detailed information about GitHub issues and pull requests: URLs, IDs, titles, user information, labels, state, assignees, milestones, comments, creation and update times, author association, body, reactions, timeline URL, state reason, draft status, pull request details, and a flag indicating whether the record is a pull request. It has a single train split with 2,000 examples, totaling 19,682,385 bytes.
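As a rough illustration of the split metadata above, the following sketch loads the train split and derives two simple size figures from the card's numbers. It assumes the `datasets` package is installed and the Hub is reachable; the helper names (`load_train_split`, `average_bytes_per_example`, `download_ratio`) are hypothetical and not part of the dataset.

```python
# Figures copied from the dataset card metadata.
TRAIN_NUM_BYTES = 19_682_385
TRAIN_NUM_EXAMPLES = 2_000
DOWNLOAD_SIZE = 5_743_780


def load_train_split():
    """Load the train split from the Hub (needs `datasets` and network access)."""
    # Imported lazily so the arithmetic helpers below also work offline.
    from datasets import load_dataset

    return load_dataset("aniruddh10124/github-issues", split="train")


def average_bytes_per_example(num_bytes=TRAIN_NUM_BYTES, num_examples=TRAIN_NUM_EXAMPLES):
    """Average in-memory size of one record, in bytes."""
    return num_bytes / num_examples


def download_ratio(download_size=DOWNLOAD_SIZE, num_bytes=TRAIN_NUM_BYTES):
    """Download size as a fraction of the in-memory dataset size."""
    return download_size / num_bytes


if __name__ == "__main__":
    print(f"average bytes per example: {average_bytes_per_example():.1f}")
    print(f"download / dataset size:   {download_ratio():.3f}")
```

Calling `load_train_split()` is left out of the `__main__` block on purpose, since it downloads data.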
Provider:
aniruddh10124
Original information summary
Dataset feature overview
Basic features
- url: string
- repository_url: string
- labels_url: string
- comments_url: string
- events_url: string
- html_url: string
- id: integer
- node_id: string
- number: integer
- title: string
User features
- user: struct with the following fields:
  - login: string
  - id: integer
  - node_id: string
  - avatar_url: string
  - gravatar_id: string
  - url: string
  - html_url: string
  - followers_url: string
  - following_url: string
  - gists_url: string
  - starred_url: string
  - subscriptions_url: string
  - organizations_url: string
  - repos_url: string
  - events_url: string
  - received_events_url: string
  - type: string
  - site_admin: boolean
Label features
- labels: list whose items have the following fields:
  - id: integer
  - node_id: string
  - url: string
  - name: string
  - color: string
  - default: boolean
  - description: string
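A minimal sketch of how the `labels` field might be used, for example to count how often each label name occurs. It assumes each record is a dict whose `labels` value is a list of dicts with a `name` key, as the schema suggests; `count_label_names` and `sample_rows` are hypothetical and the sample data is made up.

```python
from collections import Counter


def count_label_names(rows):
    """Count occurrences of each label name across an iterable of records."""
    counts = Counter()
    for row in rows:
        # Treat a missing or None `labels` value as an empty list.
        for label in row.get("labels") or []:
            counts[label["name"]] += 1
    return counts


# Made-up records for illustration only.
sample_rows = [
    {"labels": [{"name": "bug"}, {"name": "help wanted"}]},
    {"labels": [{"name": "bug"}]},
    {"labels": []},
]

if __name__ == "__main__":
    print(count_label_names(sample_rows).most_common())
```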
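State and lock features
- state: string
- locked: boolean
Assignee features
- assignee: struct with the same fields as the user features
- assignees: list whose items have the same fields as the user features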
Milestone features
- milestone: struct with the following fields:
  - url: string
  - html_url: string
  - labels_url: string
  - id: integer
  - node_id: string
  - number: integer
  - title: string
  - description: string
  - creator: struct with the same fields as the user features
  - open_issues: integer
  - closed_issues: integer
  - state: string
  - created_at: timestamp
  - updated_at: timestamp
  - due_on: null
  - closed_at: null
Comment and time features
- comments: sequence of strings
- created_at: timestamp
- updated_at: timestamp
- closed_at: timestamp
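The timestamp fields lend themselves to simple duration calculations. The sketch below computes how long a record stayed open, assuming `created_at` and `closed_at` arrive as `datetime` objects and that `closed_at` is `None` for records that are still open; `time_to_close` is a hypothetical helper and the example values are invented.

```python
from datetime import datetime, timedelta
from typing import Optional


def time_to_close(row) -> Optional[timedelta]:
    """Return closed_at - created_at, or None if either timestamp is missing."""
    created = row.get("created_at")
    closed = row.get("closed_at")
    if created is None or closed is None:
        return None
    return closed - created


# Invented example record.
example = {
    "created_at": datetime(2024, 1, 1, 0, 0, 0),
    "closed_at": datetime(2024, 1, 3, 12, 0, 0),
    "comments": ["first comment", "second comment"],
}

if __name__ == "__main__":
    print(time_to_close(example))
    print(len(example["comments"]), "comments")
```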
Author association and reaction features
- author_association: string
- active_lock_reason: null
- body: string
- reactions: struct with the following fields:
  - url: string
  - total_count: integer
  - +1: integer
  - -1: integer
  - laugh: integer
  - hooray: integer
  - confused: integer
  - heart: integer
  - rocket: integer
  - eyes: integer
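One possible sanity check on the `reactions` struct is to compare `total_count` with the sum of the individual reaction counters. This sketch assumes the total is meant to equal that sum, which the card itself does not state; `reactions_are_consistent` is a hypothetical helper and the sample values are made up.

```python
# The eight per-reaction counters listed in the schema.
REACTION_KEYS = ("+1", "-1", "laugh", "hooray", "confused", "heart", "rocket", "eyes")


def reactions_are_consistent(reactions):
    """Check whether total_count equals the sum of the individual counters."""
    return sum(reactions[key] for key in REACTION_KEYS) == reactions["total_count"]


# Made-up reactions struct for illustration only.
sample_reactions = {
    "url": "",
    "total_count": 3,
    "+1": 2, "-1": 0, "laugh": 0, "hooray": 0,
    "confused": 0, "heart": 1, "rocket": 0, "eyes": 0,
}

if __name__ == "__main__":
    print(reactions_are_consistent(sample_reactions))
```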
Other features
- timeline_url: string
- performed_via_github_app: null
- state_reason: string
- draft: boolean
- pull_request: struct with the following fields:
  - url: string
  - html_url: string
  - diff_url: string
  - patch_url: string
  - merged_at: timestamp
- is_pull_request: boolean
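Because the dataset mixes issues and pull requests, a common first step is to separate them using the `is_pull_request` flag. The sketch below does this over plain dict records; `split_issues_and_pull_requests` and `sample_records` are hypothetical and the records are invented.

```python
def split_issues_and_pull_requests(rows):
    """Partition records into (issues, pull_requests) by the is_pull_request flag."""
    issues, pull_requests = [], []
    for row in rows:
        if row["is_pull_request"]:
            pull_requests.append(row)
        else:
            issues.append(row)
    return issues, pull_requests


# Invented records for illustration only.
sample_records = [
    {"number": 1, "is_pull_request": False},
    {"number": 2, "is_pull_request": True},
    {"number": 3, "is_pull_request": False},
]

if __name__ == "__main__":
    issues, prs = split_issues_and_pull_requests(sample_records)
    print(len(issues), "issues,", len(prs), "pull requests")
```

When working with a `datasets.Dataset` object instead of a list, the same partition can be expressed with `Dataset.filter`, for example `ds.filter(lambda row: not row["is_pull_request"])`.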



