leoli04/github-HF-datasets-issues
Hugging Face, updated 2024-05-28, indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/leoli04/github-HF-datasets-issues
Description:
---
dataset_info:
  features:
  - name: url
    dtype: string
  - name: repository_url
    dtype: string
  - name: labels_url
    dtype: string
  - name: comments_url
    dtype: string
  - name: events_url
    dtype: string
  - name: html_url
    dtype: string
  - name: id
    dtype: int64
  - name: node_id
    dtype: string
  - name: number
    dtype: int64
  - name: title
    dtype: string
  - name: user
    struct:
    - name: avatar_url
      dtype: string
    - name: events_url
      dtype: string
    - name: followers_url
      dtype: string
    - name: following_url
      dtype: string
    - name: gists_url
      dtype: string
    - name: gravatar_id
      dtype: string
    - name: html_url
      dtype: string
    - name: id
      dtype: int64
    - name: login
      dtype: string
    - name: node_id
      dtype: string
    - name: organizations_url
      dtype: string
    - name: received_events_url
      dtype: string
    - name: repos_url
      dtype: string
    - name: site_admin
      dtype: bool
    - name: starred_url
      dtype: string
    - name: subscriptions_url
      dtype: string
    - name: type
      dtype: string
    - name: url
      dtype: string
  - name: labels
    list:
    - name: color
      dtype: string
    - name: default
      dtype: bool
    - name: description
      dtype: string
    - name: id
      dtype: int64
    - name: name
      dtype: string
    - name: node_id
      dtype: string
    - name: url
      dtype: string
  - name: state
    dtype: string
  - name: locked
    dtype: bool
  - name: assignee
    struct:
    - name: avatar_url
      dtype: string
    - name: events_url
      dtype: string
    - name: followers_url
      dtype: string
    - name: following_url
      dtype: string
    - name: gists_url
      dtype: string
    - name: gravatar_id
      dtype: string
    - name: html_url
      dtype: string
    - name: id
      dtype: int64
    - name: login
      dtype: string
    - name: node_id
      dtype: string
    - name: organizations_url
      dtype: string
    - name: received_events_url
      dtype: string
    - name: repos_url
      dtype: string
    - name: site_admin
      dtype: bool
    - name: starred_url
      dtype: string
    - name: subscriptions_url
      dtype: string
    - name: type
      dtype: string
    - name: url
      dtype: string
  - name: assignees
    list:
    - name: avatar_url
      dtype: string
    - name: events_url
      dtype: string
    - name: followers_url
      dtype: string
    - name: following_url
      dtype: string
    - name: gists_url
      dtype: string
    - name: gravatar_id
      dtype: string
    - name: html_url
      dtype: string
    - name: id
      dtype: int64
    - name: login
      dtype: string
    - name: node_id
      dtype: string
    - name: organizations_url
      dtype: string
    - name: received_events_url
      dtype: string
    - name: repos_url
      dtype: string
    - name: site_admin
      dtype: bool
    - name: starred_url
      dtype: string
    - name: subscriptions_url
      dtype: string
    - name: type
      dtype: string
    - name: url
      dtype: string
  - name: milestone
    struct:
    - name: closed_at
      dtype: string
    - name: closed_issues
      dtype: int64
    - name: created_at
      dtype: string
    - name: creator
      struct:
      - name: avatar_url
        dtype: string
      - name: events_url
        dtype: string
      - name: followers_url
        dtype: string
      - name: following_url
        dtype: string
      - name: gists_url
        dtype: string
      - name: gravatar_id
        dtype: string
      - name: html_url
        dtype: string
      - name: id
        dtype: int64
      - name: login
        dtype: string
      - name: node_id
        dtype: string
      - name: organizations_url
        dtype: string
      - name: received_events_url
        dtype: string
      - name: repos_url
        dtype: string
      - name: site_admin
        dtype: bool
      - name: starred_url
        dtype: string
      - name: subscriptions_url
        dtype: string
      - name: type
        dtype: string
      - name: url
        dtype: string
    - name: description
      dtype: string
    - name: due_on
      dtype: string
    - name: html_url
      dtype: string
    - name: id
      dtype: int64
    - name: labels_url
      dtype: string
    - name: node_id
      dtype: string
    - name: number
      dtype: int64
    - name: open_issues
      dtype: int64
    - name: state
      dtype: string
    - name: title
      dtype: string
    - name: updated_at
      dtype: string
    - name: url
      dtype: string
  - name: comments
    sequence: string
  - name: created_at
    dtype: timestamp[ns, tz=UTC]
  - name: updated_at
    dtype: timestamp[ns, tz=UTC]
  - name: closed_at
    dtype: timestamp[ns, tz=UTC]
  - name: author_association
    dtype: string
  - name: active_lock_reason
    dtype: float64
  - name: body
    dtype: string
  - name: reactions
    struct:
    - name: '+1'
      dtype: int64
    - name: '-1'
      dtype: int64
    - name: confused
      dtype: int64
    - name: eyes
      dtype: int64
    - name: heart
      dtype: int64
    - name: hooray
      dtype: int64
    - name: laugh
      dtype: int64
    - name: rocket
      dtype: int64
    - name: total_count
      dtype: int64
    - name: url
      dtype: string
  - name: timeline_url
    dtype: string
  - name: performed_via_github_app
    dtype: float64
  - name: state_reason
    dtype: string
  - name: draft
    dtype: float64
  - name: pull_request
    struct:
    - name: diff_url
      dtype: string
    - name: html_url
      dtype: string
    - name: merged_at
      dtype: string
    - name: patch_url
      dtype: string
    - name: url
      dtype: string
  - name: is_pull_request
    dtype: bool
  splits:
  - name: train
    num_bytes: 21454467
    num_examples: 6884
  download_size: 5305119
  dataset_size: 21454467
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
The dataset contains detailed records of GitHub issues and pull requests: URLs, user details, labels, state, assignees, milestones, comments, creation, update, and close timestamps, author association, active lock reason, body, reactions, timeline URL, the GitHub app used (if any), state reason, draft status, pull request details, and a flag indicating whether the record is a pull request. It has a single train split with 6,884 examples.
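As a rough sketch of how the `is_pull_request` flag might be used, the snippet below separates issues from pull requests. The names `load_train_split` and `split_issues_and_prs` are illustrative and not part of the dataset. The loader assumes the `datasets` library is installed and that the repository id from the download link resolves on the Hub.

```python
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


def load_train_split():
    """Load the train split from the Hub (needs `datasets` and network access)."""
    # Imported lazily so the helper below also works without the library.
    from datasets import load_dataset

    return load_dataset("leoli04/github-HF-datasets-issues", split="train")


def split_issues_and_prs(rows: Iterable[Row]) -> Tuple[List[Row], List[Row]]:
    """Partition rows into (issues, pull_requests) using `is_pull_request`."""
    issues: List[Row] = []
    pull_requests: List[Row] = []
    for row in rows:
        (pull_requests if row["is_pull_request"] else issues).append(row)
    return issues, pull_requests
```

With the library available, `issues, prs = split_issues_and_prs(load_train_split())` would give the two groups. The helper only relies on each row behaving like a dictionary, so it can also be tried on hand-written records.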
Provided by:
leoli04
Summary of original information
Dataset overview
Dataset features
Basic features
- url: string
- repository_url: string
- labels_url: string
- comments_url: string
- events_url: string
- html_url: string
- id: integer
- node_id: string
- number: integer
- title: string
User features
- user: struct with the following fields:
  - avatar_url: string
  - events_url: string
  - followers_url: string
  - following_url: string
  - gists_url: string
  - gravatar_id: string
  - html_url: string
  - id: integer
  - login: string
  - node_id: string
  - organizations_url: string
  - received_events_url: string
  - repos_url: string
  - site_admin: boolean
  - starred_url: string
  - subscriptions_url: string
  - type: string
  - url: string
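To illustrate how the nested `user` struct might be read, here is a minimal sketch that counts records per author login. It assumes each row is a dictionary whose `user` value is itself a dictionary (or `None` when the author is unavailable); `count_by_author` is a hypothetical helper name.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def count_by_author(rows: Iterable[Dict[str, Any]]) -> Counter:
    """Count rows per `user.login`, skipping rows without a user."""
    counts: Counter = Counter()
    for row in rows:
        user = row.get("user")
        if user:  # assumption: `user` may be None or missing
            counts[user["login"]] += 1
    return counts
```

Calling `count_by_author(rows).most_common(5)` would then list the five most active authors in whatever rows are passed in.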
Label features
- labels: list of structs with the following fields:
  - color: string
  - default: boolean
  - description: string
  - id: integer
  - name: string
  - node_id: string
  - url: string
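Since `labels` is a list of structs, filtering by label usually means looking at each entry's `name`. The sketch below assumes a row exposes `labels` as a list of dictionaries; `label_names` and `has_label` are illustrative helpers, and any label value passed in is up to the caller.

```python
from typing import Any, Dict, List


def label_names(row: Dict[str, Any]) -> List[str]:
    """Return the `name` of every label attached to a row."""
    # `or []` covers rows where `labels` is missing or None.
    return [label["name"] for label in (row.get("labels") or [])]


def has_label(row: Dict[str, Any], name: str) -> bool:
    """True if the row carries a label with exactly this name."""
    return name in label_names(row)
```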
State and locking
- state: string
- locked: boolean
Assignee features
- assignee: struct with the same fields as user
- assignees: list of structs with the same fields as user
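Because the schema carries both a single `assignee` and an `assignees` list, a consumer may want one merged view. The following is a sketch under the assumption that either field can be empty or `None`; `assignee_logins` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def assignee_logins(row: Dict[str, Any]) -> List[str]:
    """Collect unique assignee logins from `assignees` and `assignee`."""
    logins = [user["login"] for user in (row.get("assignees") or [])]
    single = row.get("assignee")
    # Add the single assignee only if the list did not already include it.
    if single and single["login"] not in logins:
        logins.append(single["login"])
    return logins
```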
Milestone features
- milestone: struct with the following fields:
  - closed_at: string
  - closed_issues: integer
  - created_at: string
  - creator: struct with the same fields as user
  - description: string
  - due_on: string
  - html_url: string
  - id: integer
  - labels_url: string
  - node_id: string
  - number: integer
  - open_issues: integer
  - state: string
  - title: string
  - updated_at: string
  - url: string
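Other features
- comments: sequence of strings
- created_at: timestamp (nanosecond precision, UTC)
- updated_at: timestamp (nanosecond precision, UTC)
- closed_at: timestamp (nanosecond precision, UTC)
- author_association: string
- active_lock_reason: float
- body: string
- reactions: struct of reaction counts ('+1', '-1', confused, eyes, heart, hooray, laugh, rocket, total_count) plus a url
- timeline_url: string
- performed_via_github_app: float
- state_reason: string
- draft: float
- pull_request: struct with pull request details (diff_url, html_url, merged_at, patch_url, url)
- is_pull_request: boolean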
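The `created_at` and `closed_at` timestamps make it possible to measure how long a record stayed open. The sketch below assumes both values arrive as timezone-aware `datetime` objects and that `closed_at` is `None` for records that are still open; `time_to_close` is an illustrative name.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def time_to_close(row: Dict[str, Any]) -> Optional[timedelta]:
    """Return closed_at - created_at, or None if the row is still open."""
    closed_at = row.get("closed_at")
    if closed_at is None:
        return None
    return closed_at - row["created_at"]


# Hand-written sample row, not taken from the dataset.
sample = {
    "created_at": datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc),
    "closed_at": datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc),
}
elapsed = time_to_close(sample)
```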
Dataset splits
- train: training split with 6,884 examples; total size 21,454,467 bytes, download size 5,305,119 bytes.
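The split figures allow a quick back-of-the-envelope check of record size and of how much smaller the download is than the reported dataset size. This is plain arithmetic on the three numbers above; the variable names are illustrative.

```python
NUM_BYTES = 21_454_467      # reported dataset size in bytes
NUM_EXAMPLES = 6_884        # rows in the train split
DOWNLOAD_SIZE = 5_305_119   # reported download size in bytes

avg_bytes_per_example = NUM_BYTES / NUM_EXAMPLES
download_ratio = DOWNLOAD_SIZE / NUM_BYTES

print(f"{avg_bytes_per_example:.0f} bytes per example on average")
print(f"download is {download_ratio:.1%} of the dataset size")
```

That works out to roughly 3.1 KB per example, with the download at about a quarter of the dataset size.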
Configuration
- config_name: default
- data_files:
  - split: train
  - path: data/train-*
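The `path: data/train-*` entry is a glob pattern selecting the files that make up the train split. As an approximation of how such a pattern selects files, the sketch below uses the standard library's `fnmatch`. The file names are hypothetical examples, not a listing of the repository, and the Hub's own pattern resolution may differ in detail.

```python
from fnmatch import fnmatch

PATTERN = "data/train-*"

# Hypothetical file names used only to exercise the pattern.
candidates = [
    "data/train-00000-of-00001.parquet",
    "data/test-00000-of-00001.parquet",
    "README.md",
]

train_files = [path for path in candidates if fnmatch(path, PATTERN)]
```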



