amasiukevich/github-issues-datasets
Source: Hugging Face. Updated 2024-04-09; indexed 2024-06-11.
Download link:
https://hf-mirror.com/datasets/amasiukevich/github-issues-datasets
Resource description:
---
license: apache-2.0
dataset_info:
  features:
  - name: url
    dtype: string
  - name: repository_url
    dtype: string
  - name: labels_url
    dtype: string
  - name: comments_url
    dtype: string
  - name: events_url
    dtype: string
  - name: html_url
    dtype: string
  - name: id
    dtype: int64
  - name: node_id
    dtype: string
  - name: number
    dtype: int64
  - name: title
    dtype: string
  - name: user
    struct:
    - name: avatar_url
      dtype: string
    - name: events_url
      dtype: string
    - name: followers_url
      dtype: string
    - name: following_url
      dtype: string
    - name: gists_url
      dtype: string
    - name: gravatar_id
      dtype: string
    - name: html_url
      dtype: string
    - name: id
      dtype: int64
    - name: login
      dtype: string
    - name: node_id
      dtype: string
    - name: organizations_url
      dtype: string
    - name: received_events_url
      dtype: string
    - name: repos_url
      dtype: string
    - name: site_admin
      dtype: bool
    - name: starred_url
      dtype: string
    - name: subscriptions_url
      dtype: string
    - name: type
      dtype: string
    - name: url
      dtype: string
  - name: labels
    list:
    - name: color
      dtype: string
    - name: default
      dtype: bool
    - name: description
      dtype: string
    - name: id
      dtype: int64
    - name: name
      dtype: string
    - name: node_id
      dtype: string
    - name: url
      dtype: string
  - name: state
    dtype: string
  - name: locked
    dtype: bool
  - name: assignee
    struct:
    - name: avatar_url
      dtype: string
    - name: events_url
      dtype: string
    - name: followers_url
      dtype: string
    - name: following_url
      dtype: string
    - name: gists_url
      dtype: string
    - name: gravatar_id
      dtype: string
    - name: html_url
      dtype: string
    - name: id
      dtype: int64
    - name: login
      dtype: string
    - name: node_id
      dtype: string
    - name: organizations_url
      dtype: string
    - name: received_events_url
      dtype: string
    - name: repos_url
      dtype: string
    - name: site_admin
      dtype: bool
    - name: starred_url
      dtype: string
    - name: subscriptions_url
      dtype: string
    - name: type
      dtype: string
    - name: url
      dtype: string
  - name: assignees
    list:
    - name: avatar_url
      dtype: string
    - name: events_url
      dtype: string
    - name: followers_url
      dtype: string
    - name: following_url
      dtype: string
    - name: gists_url
      dtype: string
    - name: gravatar_id
      dtype: string
    - name: html_url
      dtype: string
    - name: id
      dtype: int64
    - name: login
      dtype: string
    - name: node_id
      dtype: string
    - name: organizations_url
      dtype: string
    - name: received_events_url
      dtype: string
    - name: repos_url
      dtype: string
    - name: site_admin
      dtype: bool
    - name: starred_url
      dtype: string
    - name: subscriptions_url
      dtype: string
    - name: type
      dtype: string
    - name: url
      dtype: string
  - name: milestone
    struct:
    - name: closed_at
      dtype: 'null'
    - name: closed_issues
      dtype: int64
    - name: created_at
      dtype: string
    - name: creator
      struct:
      - name: avatar_url
        dtype: string
      - name: events_url
        dtype: string
      - name: followers_url
        dtype: string
      - name: following_url
        dtype: string
      - name: gists_url
        dtype: string
      - name: gravatar_id
        dtype: string
      - name: html_url
        dtype: string
      - name: id
        dtype: int64
      - name: login
        dtype: string
      - name: node_id
        dtype: string
      - name: organizations_url
        dtype: string
      - name: received_events_url
        dtype: string
      - name: repos_url
        dtype: string
      - name: site_admin
        dtype: bool
      - name: starred_url
        dtype: string
      - name: subscriptions_url
        dtype: string
      - name: type
        dtype: string
      - name: url
        dtype: string
    - name: description
      dtype: string
    - name: due_on
      dtype: string
    - name: html_url
      dtype: string
    - name: id
      dtype: int64
    - name: labels_url
      dtype: string
    - name: node_id
      dtype: string
    - name: number
      dtype: int64
    - name: open_issues
      dtype: int64
    - name: state
      dtype: string
    - name: title
      dtype: string
    - name: updated_at
      dtype: string
    - name: url
      dtype: string
  - name: comments
    sequence: string
  - name: created_at
    dtype: timestamp[ns, tz=UTC]
  - name: updated_at
    dtype: timestamp[ns, tz=UTC]
  - name: closed_at
    dtype: timestamp[ns, tz=UTC]
  - name: author_association
    dtype: string
  - name: active_lock_reason
    dtype: float64
  - name: draft
    dtype: float64
  - name: pull_request
    struct:
    - name: diff_url
      dtype: string
    - name: html_url
      dtype: string
    - name: merged_at
      dtype: string
    - name: patch_url
      dtype: string
    - name: url
      dtype: string
  - name: body
    dtype: string
  - name: reactions
    struct:
    - name: '+1'
      dtype: int64
    - name: '-1'
      dtype: int64
    - name: confused
      dtype: int64
    - name: eyes
      dtype: int64
    - name: heart
      dtype: int64
    - name: hooray
      dtype: int64
    - name: laugh
      dtype: int64
    - name: rocket
      dtype: int64
    - name: total_count
      dtype: int64
    - name: url
      dtype: string
  - name: timeline_url
    dtype: string
  - name: performed_via_github_app
    dtype: float64
  - name: state_reason
    dtype: string
  - name: is_pull_request
    dtype: bool
  splits:
  - name: train
    num_bytes: 27536364
    num_examples: 4000
  download_size: 8074913
  dataset_size: 27536364
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
Provider:
amasiukevich
Summary of original information
Dataset overview
Dataset features
- Basic information
  - url: string
  - repository_url: string
  - labels_url: string
  - comments_url: string
  - events_url: string
  - html_url: string
  - id: integer
  - node_id: string
  - number: integer
  - title: string
- User information
  - user: struct with fields such as avatar_url, events_url, followers_url, following_url, gists_url, gravatar_id, html_url, id, login, node_id, organizations_url, received_events_url, repos_url, site_admin, starred_url, subscriptions_url, type, url
- Label information
  - labels: list of structs with fields such as color, default, description, id, name, node_id, url
- State information
  - state: string
  - locked: boolean
- Assignee information
  - assignee: struct with fields such as avatar_url, events_url, followers_url, following_url, gists_url, gravatar_id, html_url, id, login, node_id, organizations_url, received_events_url, repos_url, site_admin, starred_url, subscriptions_url, type, url
  - assignees: list of structs with the same fields as assignee
- Milestone information
  - milestone: struct with fields such as closed_at, closed_issues, created_at, creator, description, due_on, html_url, id, labels_url, node_id, number, open_issues, state, title, updated_at, url
- Comment information
  - comments: sequence of strings
- Time information
  - created_at: timestamp, UTC time zone
  - updated_at: timestamp, UTC time zone
  - closed_at: timestamp, UTC time zone
- Other information
  - author_association: string
  - active_lock_reason: float
  - draft: float
  - pull_request: struct with fields such as diff_url, html_url, merged_at, patch_url, url
  - body: string
  - reactions: struct with fields such as +1, -1, confused, eyes, heart, hooray, laugh, rocket, total_count, url
  - timeline_url: string
  - performed_via_github_app: float
  - state_reason: string
  - is_pull_request: boolean
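To make the nested feature layout more concrete, here is a minimal sketch of how one record shaped like this schema might be summarized. The `sample_issue` dictionary and the `summarize` helper are hypothetical and the values are invented for illustration. The sketch also assumes that struct fields such as `assignee` may be empty (`None`), that `labels` arrives as a list of dictionaries, and that timestamps arrive as timezone-aware `datetime` objects.

```python
from datetime import datetime, timezone

# Hypothetical record that mimics a subset of the schema; all values are invented.
sample_issue = {
    "number": 1,
    "title": "Example issue",
    "state": "closed",
    "user": {"login": "octocat", "id": 1},
    "labels": [{"name": "bug", "color": "d73a4a"}],
    "assignee": None,  # assumption: struct fields can be empty
    "milestone": None,
    "comments": ["First comment", "Second comment"],
    "created_at": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    "closed_at": datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc),
    "pull_request": None,
    "is_pull_request": False,
}


def summarize(issue):
    """Return a compact summary of one issue-shaped record."""
    assignee = issue.get("assignee") or {}  # tolerate an empty struct
    closed_at = issue.get("closed_at")
    days_open = None
    if closed_at is not None:
        # Both timestamps are UTC, so plain subtraction is safe.
        days_open = (closed_at - issue["created_at"]).total_seconds() / 86400
    return {
        "kind": "pull_request" if issue["is_pull_request"] else "issue",
        "author": issue["user"]["login"],
        "label_names": [label["name"] for label in issue["labels"]],
        "assignee_login": assignee.get("login"),
        "comment_count": len(issue["comments"]),
        "days_open": days_open,
    }


summary = summarize(sample_issue)
print(summary)
```

The `is_pull_request` flag is used here to tell issues and pull requests apart, which avoids inspecting the `pull_request` struct directly.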
Dataset splits
- Training set
  - train: 27536364 bytes, 4000 examples
Dataset size
- Download size: 8074913 bytes
- Dataset size: 27536364 bytes
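The size figures above allow a quick back-of-the-envelope check. The sketch below only does arithmetic on the three numbers from the card; the variable names are illustrative.

```python
# Figures copied from the dataset card.
NUM_BYTES = 27_536_364      # dataset_size / num_bytes of the train split
NUM_EXAMPLES = 4_000        # num_examples of the train split
DOWNLOAD_SIZE = 8_074_913   # download_size

# Average size of one example in the prepared dataset.
avg_bytes_per_example = NUM_BYTES / NUM_EXAMPLES

# Fraction of the prepared size that has to be downloaded.
download_ratio = DOWNLOAD_SIZE / NUM_BYTES

print(f"{avg_bytes_per_example:.1f} bytes per example")
print(f"download is {download_ratio:.1%} of the dataset size")
```

This works out to roughly 6.9 KB per example, with the download being a little under a third of the prepared dataset size.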
Configuration
- Default configuration
  - config_name: default
  - data_files: split train, path data/train-*
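The configuration maps the `train` split to files matching `data/train-*`. The sketch below approximates that glob with the standard-library `fnmatch` module, which is not necessarily how the Hub resolves patterns. The shard file names are hypothetical. `load_train_split` shows one plausible way to load the split with the `datasets` library, assuming it is installed and the repository is reachable; it is defined but not called here.

```python
from fnmatch import fnmatch

REPO_ID = "amasiukevich/github-issues-datasets"
TRAIN_PATTERN = "data/train-*"  # path pattern from the default config


def matches_train_split(path):
    """Approximate the data_files glob with fnmatch (an assumption, not the Hub's resolver)."""
    return fnmatch(path, TRAIN_PATTERN)


def load_train_split():
    """Load the train split; needs the `datasets` package and network access."""
    from datasets import load_dataset  # imported lazily so the sketch runs without it

    return load_dataset(REPO_ID, split="train")


# Hypothetical file names, used only to exercise the pattern.
candidates = [
    "data/train-00000-of-00001.parquet",
    "data/test-00000-of-00001.parquet",
    "README.md",
]
train_files = [path for path in candidates if matches_train_split(path)]
print(train_files)
```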



