loubnabnl/clean_prs2
Source: Hugging Face. Updated 2023-09-15. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/loubnabnl/clean_prs2
Resource description:
---
dataset_info:
  features:
  - name: bucket
    dtype: string
  - name: pull_request_info
    struct:
    - name: org.id
      dtype: int64
    - name: public
      dtype: bool
    - name: pull_request.additions
      dtype: int64
    - name: pull_request.body
      dtype: string
    - name: pull_request.changed_files
      dtype: int64
    - name: pull_request.closed_at
      dtype: string
    - name: pull_request.comments
      dtype: int64
    - name: pull_request.commits
      dtype: int64
    - name: pull_request.created_at
      dtype: string
    - name: pull_request.deletions
      dtype: int64
    - name: pull_request.guid
      dtype: string
    - name: pull_request.id
      dtype: int64
    - name: pull_request.merged_at
      dtype: string
    - name: pull_request.merged_by.login
      dtype: string
    - name: pull_request.milestone.description
      dtype: string
    - name: pull_request.milestone.number
      dtype: int64
    - name: pull_request.milestone.title
      dtype: string
    - name: pull_request.number
      dtype: int64
    - name: pull_request.review_comments
      dtype: int64
    - name: pull_request.state
      dtype: string
    - name: pull_request.title
      dtype: string
    - name: pull_request.user.id
      dtype: int64
    - name: pull_request.user.login
      dtype: string
    - name: repo.id
      dtype: int64
    - name: repo.name
      dtype: string
  - name: head_repo_info
    struct:
    - name: pull_request.head.label
      dtype: string
    - name: pull_request.head.ref
      dtype: string
    - name: pull_request.head.repo.default_branch
      dtype: string
    - name: pull_request.head.repo.description
      dtype: string
    - name: pull_request.head.repo.homepage
      dtype: string
    - name: pull_request.head.repo.language
      dtype: string
    - name: pull_request.head.repo.license.name
      dtype: string
    - name: pull_request.head.repo.name
      dtype: string
    - name: pull_request.head.repo.owner.login
      dtype: string
    - name: pull_request.head.repo.owner.type
      dtype: string
    - name: pull_request.head.repo.private
      dtype: bool
    - name: pull_request.head.repo.stargazers_count
      dtype: int64
    - name: pull_request.head.sha
      dtype: string
    - name: pull_request.head.user.login
      dtype: string
    - name: pull_request.head.user.type
      dtype: string
  - name: base_repo_info
    struct:
    - name: pull_request.base.label
      dtype: string
    - name: pull_request.base.ref
      dtype: string
    - name: pull_request.base.repo.default_branch
      dtype: string
    - name: pull_request.base.repo.description
      dtype: string
    - name: pull_request.base.repo.forks_count
      dtype: int64
    - name: pull_request.base.repo.homepage
      dtype: string
    - name: pull_request.base.repo.language
      dtype: string
    - name: pull_request.base.repo.license.name
      dtype: string
    - name: pull_request.base.repo.name
      dtype: string
    - name: pull_request.base.repo.open_issues_count
      dtype: int64
    - name: pull_request.base.repo.owner.login
      dtype: string
    - name: pull_request.base.repo.owner.type
      dtype: string
    - name: pull_request.base.repo.private
      dtype: bool
    - name: pull_request.base.repo.stargazers_count
      dtype: int64
    - name: pull_request.base.repo.watchers_count
      dtype: int64
    - name: pull_request.base.sha
      dtype: string
    - name: pull_request.base.user.login
      dtype: string
    - name: pull_request.base.user.type
      dtype: string
    - name: pull_request.comments
      dtype: int64
    - name: pull_request.label.name
      dtype: 'null'
    - name: pull_request.review_comments
      dtype: int64
  - name: events
    list:
    - name: action
      dtype: string
    - name: actor.id
      dtype: int64
    - name: actor.login
      dtype: string
    - name: comment.author_association
      dtype: string
    - name: comment.body
      dtype: string
    - name: comment.commit_id
      dtype: string
    - name: comment.created_at
      dtype: string
    - name: comment.diff_hunk
      dtype: string
    - name: comment.id
      dtype: int64
    - name: comment.in_reply_to_id
      dtype: int64
    - name: comment.line
      dtype: int64
    - name: comment.original_commit_id
      dtype: string
    - name: comment.original_line
      dtype: int64
    - name: comment.original_position
      dtype: int64
    - name: comment.original_start_line
      dtype: int64
    - name: comment.path
      dtype: string
    - name: comment.position
      dtype: int64
    - name: comment.side
      dtype: string
    - name: comment.start_line
      dtype: int64
    - name: comment.start_side
      dtype: string
    - name: comment.updated_at
      dtype: string
    - name: created_at
      dtype: timestamp[us, tz=UTC]
    - name: issue.author
      dtype: string
    - name: issue.comment
      dtype: string
    - name: issue.comment_id
      dtype: float64
    - name: review.author_association
      dtype: string
    - name: review.body
      dtype: string
    - name: review.commit_id
      dtype: string
    - name: review.id
      dtype: int64
    - name: review.state
      dtype: string
    - name: review.submitted_at
      dtype: string
    - name: type
      dtype: string
    - name: user.login
      dtype: string
    - name: user.type
      dtype: string
  splits:
  - name: train
    num_bytes: 54214029
    num_examples: 10000
  download_size: 16095878
  dataset_size: 54214029
---
# Dataset Card for "clean_prs2"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provider:
loubnabnl
Summary of original information
Dataset overview
Dataset information
- Features:
- bucket: string.
- pull_request_info: a struct with the following fields:
org.id (int64); public (bool); pull_request.additions (int64); pull_request.body (string); pull_request.changed_files (int64); pull_request.closed_at (string); pull_request.comments (int64); pull_request.commits (int64); pull_request.created_at (string); pull_request.deletions (int64); pull_request.guid (string); pull_request.id (int64); pull_request.merged_at (string); pull_request.merged_by.login (string); pull_request.milestone.description (string); pull_request.milestone.number (int64); pull_request.milestone.title (string); pull_request.number (int64); pull_request.review_comments (int64); pull_request.state (string); pull_request.title (string); pull_request.user.id (int64); pull_request.user.login (string); repo.id (int64); repo.name (string).
- head_repo_info: a struct with the following fields:
pull_request.head.label (string); pull_request.head.ref (string); pull_request.head.repo.default_branch (string); pull_request.head.repo.description (string); pull_request.head.repo.homepage (string); pull_request.head.repo.language (string); pull_request.head.repo.license.name (string); pull_request.head.repo.name (string); pull_request.head.repo.owner.login (string); pull_request.head.repo.owner.type (string); pull_request.head.repo.private (bool); pull_request.head.repo.stargazers_count (int64); pull_request.head.sha (string); pull_request.head.user.login (string); pull_request.head.user.type (string).
- base_repo_info: a struct with the following fields:
pull_request.base.label (string); pull_request.base.ref (string); pull_request.base.repo.default_branch (string); pull_request.base.repo.description (string); pull_request.base.repo.forks_count (int64); pull_request.base.repo.homepage (string); pull_request.base.repo.language (string); pull_request.base.repo.license.name (string); pull_request.base.repo.name (string); pull_request.base.repo.open_issues_count (int64); pull_request.base.repo.owner.login (string); pull_request.base.repo.owner.type (string); pull_request.base.repo.private (bool); pull_request.base.repo.stargazers_count (int64); pull_request.base.repo.watchers_count (int64); pull_request.base.sha (string); pull_request.base.user.login (string); pull_request.base.user.type (string); pull_request.comments (int64); pull_request.label.name (null); pull_request.review_comments (int64).
- events: a list of structs with the following fields:
action (string); actor.id (int64); actor.login (string); comment.author_association (string); comment.body (string); comment.commit_id (string); comment.created_at (string); comment.diff_hunk (string); comment.id (int64); comment.in_reply_to_id (int64); comment.line (int64); comment.original_commit_id (string); comment.original_line (int64); comment.original_position (int64); comment.original_start_line (int64); comment.path (string); comment.position (int64); comment.side (string); comment.start_line (int64); comment.start_side (string); comment.updated_at (string); created_at (timestamp[us, tz=UTC]); issue.author (string); issue.comment (string); issue.comment_id (float64); review.author_association (string); review.body (string); review.commit_id (string); review.id (int64); review.state (string); review.submitted_at (string); type (string); user.login (string); user.type (string).
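The feature list suggests that each row is one pull request whose `events` field holds a list of per-event structs with flat, dot-separated keys such as `comment.body`. The sketch below shows one way such a row might be walked in Python. The sample record, its values, and the event type strings are invented for illustration and are not taken from the dataset; the helper names `count_event_types` and `review_comment_bodies` are likewise hypothetical. Real rows could presumably be obtained with `datasets.load_dataset("loubnabnl/clean_prs2", split="train")`, assuming the repository is still available.

```python
from collections import Counter
from typing import Any, Dict, List


def count_event_types(record: Dict[str, Any]) -> Counter:
    """Count the events of one pull request by their `type` field."""
    events: List[Dict[str, Any]] = record.get("events") or []
    return Counter(event.get("type") for event in events)


def review_comment_bodies(record: Dict[str, Any]) -> List[str]:
    """Collect the non-empty `comment.body` strings from a record's events."""
    events: List[Dict[str, Any]] = record.get("events") or []
    # Keys contain literal dots, so they are looked up as plain strings.
    return [e["comment.body"] for e in events if e.get("comment.body")]


# Hypothetical record shaped like the schema above; every value is invented.
sample_record = {
    "bucket": "000",
    "pull_request_info": {
        "pull_request.title": "Fix typo",
        "pull_request.state": "closed",
    },
    "events": [
        {"type": "PullRequestEvent", "action": "opened", "comment.body": None},
        {
            "type": "PullRequestReviewCommentEvent",
            "action": "created",
            "comment.body": "Looks good.",
        },
        {"type": "PullRequestEvent", "action": "closed", "comment.body": None},
    ],
}

if __name__ == "__main__":
    print(count_event_types(sample_record))
    print(review_comment_bodies(sample_record))
```

Many event fields are likely to be empty for any given event (a plain issue comment has no `comment.diff_hunk`, for instance), so the helpers treat missing or `None` values as absent rather than raising.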
Dataset splits
- train: 10,000 examples, 54,214,029 bytes in total.
Dataset size
- Download size: 16,095,878 bytes.
- Dataset size: 54,214,029 bytes.
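The split and size figures above can be turned into a rough per-example footprint with a little arithmetic. The following is a minimal sketch using only the three numbers stated in the metadata; the function name `summarize_sizes` and the dictionary keys are illustrative choices, not part of any library.

```python
from typing import Dict

# Figures copied from the dataset metadata.
NUM_EXAMPLES = 10_000
DATASET_SIZE_BYTES = 54_214_029
DOWNLOAD_SIZE_BYTES = 16_095_878


def summarize_sizes(
    num_examples: int, dataset_bytes: int, download_bytes: int
) -> Dict[str, float]:
    """Derive average example size and download-to-dataset ratio."""
    return {
        "avg_bytes_per_example": dataset_bytes / num_examples,
        "download_to_dataset_ratio": download_bytes / dataset_bytes,
        "dataset_mib": dataset_bytes / 2**20,
        "download_mib": download_bytes / 2**20,
    }


summary = summarize_sizes(NUM_EXAMPLES, DATASET_SIZE_BYTES, DOWNLOAD_SIZE_BYTES)

if __name__ == "__main__":
    for key, value in summary.items():
        print(f"{key}: {value:.4f}")
```

On these numbers an average example occupies roughly 5.4 KB, and the download is a little under 30% of the stated dataset size.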



