hwang2006/huggingface-datasets-issues-2024-03-20
Source: Hugging Face. Updated: 2024-03-21. Indexed: 2024-06-11.
Download link:
https://hf-mirror.com/datasets/hwang2006/huggingface-datasets-issues-2024-03-20
Resource overview:
Load Dataset
```
import pandas as pd
from datasets import Dataset, load_dataset
from huggingface_hub import hf_hub_url

REPO_ID = "hwang2006/huggingface-datasets-issues-2024-03-20"

# Loading the repository directly raised an error:
# issues_dataset = load_dataset(REPO_ID, split="train")
# DatasetGenerationError: An error occurred while generating the dataset

# Workaround: resolve the URL of the raw JSONL file and read it with pandas.
data_files = hf_hub_url(repo_id=REPO_ID, filename="datasets-issues-with-comments.jsonl", repo_type="dataset")
print(data_files)
# https://huggingface.co/datasets/hwang2006/huggingface-datasets-issues-2024-03-20/resolve/main/datasets-issues-with-comments.jsonl

# Each line of the file is one JSON record, hence lines=True.
df = pd.read_json(data_files, orient="records", lines=True)
issues_dataset = Dataset.from_pandas(df)
print(issues_dataset)
# Dataset({
#     features: ['url', 'repository_url', 'labels_url', 'comments_url', 'events_url', 'html_url', 'id', 'node_id', 'number', 'title', 'user', 'labels', 'state', 'locked', 'assignee', 'assignees', 'milestone', 'comments', 'created_at', 'updated_at', 'closed_at', 'author_association', 'active_lock_reason', 'body', 'reactions', 'timeline_url', 'performed_via_github_app', 'state_reason', 'draft', 'pull_request', 'is_pull_request'],
#     num_rows: 6707
# })
```
Provided by:
hwang2006
Summary of original information
Dataset overview
Dataset name
- Name: huggingface-datasets-issues-2024-03-20
Dataset files
- File name: datasets-issues-with-comments.jsonl
- Location: https://huggingface.co/datasets/hwang2006/huggingface-datasets-issues-2024-03-20/resolve/main/datasets-issues-with-comments.jsonl
Dataset contents
- Features: url, repository_url, labels_url, comments_url, events_url, html_url, id, node_id, number, title, user, labels, state, locked, assignee, assignees, milestone, comments, created_at, updated_at, closed_at, author_association, active_lock_reason, body, reactions, timeline_url, performed_via_github_app, state_reason, draft, pull_request, is_pull_request.
- Number of rows: 6707
Dataset format
- Data format: JSONL (JSON Lines)
- Reading tool: pandas
- Dataset object: Dataset
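
To illustrate the JSONL format named above, here is a minimal sketch that parses JSON Lines text with only the standard library and separates issues from pull requests. The sample records are hypothetical: the keys are taken from the feature list, but the values are made up, and treating `is_pull_request` as a boolean flag is an assumption based on its name. `read_jsonl` is an illustrative helper, not part of any library.

```python
import io
import json

# Hypothetical sample records: keys come from the feature list,
# values are invented for illustration only.
SAMPLE_JSONL = (
    '{"number": 1, "title": "Example bug report", "state": "open", "is_pull_request": false}\n'
    '{"number": 2, "title": "Example fix", "state": "closed", "is_pull_request": true}\n'
    '{"number": 3, "title": "Example question", "state": "closed", "is_pull_request": false}\n'
)


def read_jsonl(stream):
    """Parse a JSON Lines stream: one JSON object per non-empty line."""
    return [json.loads(line) for line in stream if line.strip()]


records = read_jsonl(io.StringIO(SAMPLE_JSONL))

# Assumption: is_pull_request is a boolean flag marking pull requests.
issues = [r for r in records if not r["is_pull_request"]]
pull_requests = [r for r in records if r["is_pull_request"]]

print(len(records), len(issues), len(pull_requests))
```

The same split could presumably be applied to the real file once it is downloaded, by passing an open file handle to `read_jsonl` instead of the in-memory sample.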



