librarian-bots/dataset_abstracts
Source: Hugging Face | Updated 2023-10-31 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/librarian-bots/dataset_abstracts
Resource description:
---
language:
- en
size_categories:
- n<1K
task_categories:
- text-classification
dataset_info:
- config_name: annotated
  features:
  - name: text
    dtype: string
  - name: inputs
    struct:
    - name: abstract
      dtype: string
    - name: title
      dtype: string
    - name: url
      dtype: string
  - name: prediction
    dtype: 'null'
  - name: prediction_agent
    dtype: 'null'
  - name: annotation
    dtype: string
  - name: annotation_agent
    dtype: string
  - name: vectors
    dtype: 'null'
  - name: multi_label
    dtype: bool
  - name: explanation
    dtype: 'null'
  - name: id
    dtype: string
  - name: metadata
    dtype: 'null'
  - name: status
    dtype: string
  - name: event_timestamp
    dtype: string
  - name: metrics
    struct:
    - name: text_length
      dtype: int64
  - name: label
    dtype:
      class_label:
        names:
          '0': new_dataset
          '1': no_new_dataset
  splits:
  - name: train
    num_bytes: 1099065.75
    num_examples: 390
  - name: test
    num_bytes: 366355.25
    num_examples: 130
  download_size: 865263
  dataset_size: 1465421.0
- config_name: unlabelled
  features:
  - name: text
    dtype: string
  - name: inputs
    struct:
    - name: abstract
      dtype: string
    - name: title
      dtype: string
    - name: url
      dtype: string
  - name: prediction
    dtype: 'null'
  - name: prediction_agent
    dtype: 'null'
  - name: annotation
    dtype: string
  - name: annotation_agent
    dtype: string
  - name: vectors
    dtype: 'null'
  - name: multi_label
    dtype: bool
  - name: explanation
    dtype: 'null'
  - name: id
    dtype: string
  - name: metadata
    dtype: 'null'
  - name: status
    dtype: string
  - name: event_timestamp
    dtype: timestamp[us]
  - name: metrics
    struct:
    - name: text_length
      dtype: int64
  - name: label
    dtype: string
  splits:
  - name: train
    num_bytes: 1372259.876
    num_examples: 494
  download_size: 792644
  dataset_size: 1372259.876
configs:
- config_name: annotated
  data_files:
  - split: train
    path: annotated/train-*
  - split: test
    path: annotated/test-*
- config_name: unlabelled
  data_files:
  - split: train
    path: unlabelled/train-*
tags:
- 'arxiv '
---
# Dataset Card for "dataset_abstracts"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
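Since the card body itself is empty, the metadata above is the only guide to loading the data. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository is reachable. The helper `load_split` and the `CONFIG_SPLITS` table are hypothetical names introduced for illustration; only the repository ID, config names, and split names come from the card.

```python
# Config and split names as listed in the card's YAML metadata.
REPO_ID = "librarian-bots/dataset_abstracts"
CONFIG_SPLITS = {
    "annotated": ("train", "test"),
    "unlabelled": ("train",),
}


def load_split(config: str, split: str = "train"):
    """Load one split of one config (hypothetical helper, not part of the card)."""
    if config not in CONFIG_SPLITS:
        raise ValueError(
            f"unknown config {config!r}; expected one of {sorted(CONFIG_SPLITS)}"
        )
    if split not in CONFIG_SPLITS[config]:
        raise ValueError(f"config {config!r} has no split {split!r}")
    # Imported lazily so the checks above work even without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split=split)


# Example (needs network access and `pip install datasets`):
# train = load_split("annotated", "train")
# print(train.features["label"].names)
```

Validating the names up front gives a clearer error than a failed download, for instance when asking for a `test` split of the `unlabelled` config, which the metadata does not define.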
Provided by:
librarian-bots
Summary of original information
Dataset overview
Language
- English (en)
Dataset size category
- n<1K
Task category
- Text classification (text-classification)
Dataset information
Config name: annotated
- Features:
  - text: string
  - inputs: struct
    - abstract: string
    - title: string
    - url: string
  - prediction: null
  - prediction_agent: null
  - annotation: string
  - annotation_agent: string
  - vectors: null
  - multi_label: bool
  - explanation: null
  - id: string
  - metadata: null
  - status: string
  - event_timestamp: string
  - metrics: struct
    - text_length: int64
  - label: class label
    - names:
      - 0: new_dataset
      - 1: no_new_dataset
- Splits:
  - train:
    - Size in bytes: 1099065.75
    - Examples: 390
  - test:
    - Size in bytes: 366355.25
    - Examples: 130
- Download size: 865263 bytes
- Dataset size: 1465421.0 bytes
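The label names and split sizes of the annotated config can be checked with a few lines of arithmetic. This is a minimal sketch using only the numbers listed above; the variable names are illustrative. The remark about estimated byte counts is an inference from the arithmetic, not something the card states.

```python
# Values copied from the "annotated" config metadata.
LABEL_NAMES = ["new_dataset", "no_new_dataset"]  # list index = class id
SPLITS = {
    "train": {"num_examples": 390, "num_bytes": 1099065.75},
    "test": {"num_examples": 130, "num_bytes": 366355.25},
}
DATASET_SIZE = 1465421.0

# Two-way mapping between integer class ids and label names.
id2label = dict(enumerate(LABEL_NAMES))
label2id = {name: idx for idx, name in id2label.items()}

# Share of examples in each split.
total_examples = sum(s["num_examples"] for s in SPLITS.values())
fractions = {
    name: s["num_examples"] / total_examples for name, s in SPLITS.items()
}

# The fractional byte counts equal dataset_size scaled by each split's share,
# which suggests they are proportional estimates rather than measured sizes.
estimated_bytes = {name: DATASET_SIZE * frac for name, frac in fractions.items()}

print(id2label)
print(fractions)
print(estimated_bytes)
```

The split works out to 520 annotated examples divided 75/25 between train and test.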
Config name: unlabelled
- Features:
  - text: string
  - inputs: struct
    - abstract: string
    - title: string
    - url: string
  - prediction: null
  - prediction_agent: null
  - annotation: string
  - annotation_agent: string
  - vectors: null
  - multi_label: bool
  - explanation: null
  - id: string
  - metadata: null
  - status: string
  - event_timestamp: timestamp (timestamp[us])
  - metrics: struct
    - text_length: int64
  - label: string
- Splits:
  - train:
    - Size in bytes: 1372259.876
    - Examples: 494
- Download size: 792644 bytes
- Dataset size: 1372259.876 bytes
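The two configs share most of their schema but not all of it, which matters if rows from both are ever combined. The sketch below compares the top-level feature types as listed above; `schema_diff` is a hypothetical helper, and the nested `inputs` and `metrics` structs are abbreviated to `"struct"` because they are identical in both configs.

```python
# Top-level feature types shared by both configs, per the metadata.
SHARED = {
    "text": "string",
    "inputs": "struct",
    "prediction": "null",
    "prediction_agent": "null",
    "annotation": "string",
    "annotation_agent": "string",
    "vectors": "null",
    "multi_label": "bool",
    "explanation": "null",
    "id": "string",
    "metadata": "null",
    "status": "string",
    "metrics": "struct",
}
ANNOTATED = {**SHARED, "event_timestamp": "string", "label": "class_label"}
UNLABELLED = {**SHARED, "event_timestamp": "timestamp[us]", "label": "string"}


def schema_diff(a: dict, b: dict) -> dict:
    """Return {feature: (type_in_a, type_in_b)} for features whose types differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


print(schema_diff(ANNOTATED, UNLABELLED))
```

Only `event_timestamp` and `label` differ, so those two columns would need casting before the configs could be concatenated.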
Configs
- Config name: annotated
  - Data files:
    - train: annotated/train-*
    - test: annotated/test-*
- Config name: unlabelled
  - Data files:
    - train: unlabelled/train-*
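The data file entries are glob patterns rather than concrete file names. As a rough illustration of how such patterns assign files to a config and split, here is a sketch using Python's standard `fnmatch` module, which only approximates whatever pattern resolution the Hub performs. The shard file name is hypothetical, as is the `resolve` helper.

```python
from fnmatch import fnmatchcase

# Patterns copied from the configs section.
DATA_FILES = {
    "annotated": {"train": "annotated/train-*", "test": "annotated/test-*"},
    "unlabelled": {"train": "unlabelled/train-*"},
}


def resolve(path: str) -> list:
    """Return the (config, split) pairs whose pattern matches the given path."""
    return [
        (config, split)
        for config, splits in DATA_FILES.items()
        for split, pattern in splits.items()
        if fnmatchcase(path, pattern)
    ]


# Hypothetical shard name, for illustration only.
example = "annotated/train-00000-of-00001.parquet"
print(resolve(example))
```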
Tags
- arxiv



