gayanin/pubmed-abstracts-noised-with-kaggle-dist
Source: Hugging Face | Updated: 2024-02-07 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/gayanin/pubmed-abstracts-noised-with-kaggle-dist
Resource description:
---
dataset_info:
- config_name: prob-01
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 18080692
    num_examples: 74724
  - name: test
    num_bytes: 2316437
    num_examples: 9341
  - name: validation
    num_bytes: 2380973
    num_examples: 9341
  download_size: 12750634
  dataset_size: 22778102
- config_name: prob-02
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 17348001
    num_examples: 74724
  - name: test
    num_bytes: 2221947
    num_examples: 9341
  - name: validation
    num_bytes: 2284820
    num_examples: 9341
  download_size: 12451805
  dataset_size: 21854768
- config_name: prob-03
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 16610860
    num_examples: 74724
  - name: test
    num_bytes: 2128222
    num_examples: 9341
  - name: validation
    num_bytes: 2185283
    num_examples: 9341
  download_size: 12122298
  dataset_size: 20924365
- config_name: prob-04
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 15890091
    num_examples: 74724
  - name: test
    num_bytes: 2031043
    num_examples: 9341
  - name: validation
    num_bytes: 2091710
    num_examples: 9341
  download_size: 11751717
  dataset_size: 20012844
- config_name: prob-05
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 15156449
    num_examples: 74724
  - name: test
    num_bytes: 1944482
    num_examples: 9341
  - name: validation
    num_bytes: 1997171
    num_examples: 9341
  download_size: 11347983
  dataset_size: 19098102
configs:
- config_name: prob-01
  data_files:
  - split: train
    path: prob-01/train-*
  - split: test
    path: prob-01/test-*
  - split: validation
    path: prob-01/validation-*
- config_name: prob-02
  data_files:
  - split: train
    path: prob-02/train-*
  - split: test
    path: prob-02/test-*
  - split: validation
    path: prob-02/validation-*
- config_name: prob-03
  data_files:
  - split: train
    path: prob-03/train-*
  - split: test
    path: prob-03/test-*
  - split: validation
    path: prob-03/validation-*
- config_name: prob-04
  data_files:
  - split: train
    path: prob-04/train-*
  - split: test
    path: prob-04/test-*
  - split: validation
    path: prob-04/validation-*
- config_name: prob-05
  data_files:
  - split: train
    path: prob-05/train-*
  - split: test
    path: prob-05/test-*
  - split: validation
    path: prob-05/validation-*
---
Provider:
gayanin
Summary of original information
Dataset overview
Dataset configurations
- config_name: prob-01
  - Features:
    - name: refs, dtype: string
    - name: trans, dtype: string
  - Splits:
    - train: num_bytes: 18080692, num_examples: 74724
    - test: num_bytes: 2316437, num_examples: 9341
    - validation: num_bytes: 2380973, num_examples: 9341
  - Download size: 12750634 bytes
  - Dataset size: 22778102 bytes
- config_name: prob-02
  - Features:
    - name: refs, dtype: string
    - name: trans, dtype: string
  - Splits:
    - train: num_bytes: 17348001, num_examples: 74724
    - test: num_bytes: 2221947, num_examples: 9341
    - validation: num_bytes: 2284820, num_examples: 9341
  - Download size: 12451805 bytes
  - Dataset size: 21854768 bytes
- config_name: prob-03
  - Features:
    - name: refs, dtype: string
    - name: trans, dtype: string
  - Splits:
    - train: num_bytes: 16610860, num_examples: 74724
    - test: num_bytes: 2128222, num_examples: 9341
    - validation: num_bytes: 2185283, num_examples: 9341
  - Download size: 12122298 bytes
  - Dataset size: 20924365 bytes
- config_name: prob-04
  - Features:
    - name: refs, dtype: string
    - name: trans, dtype: string
  - Splits:
    - train: num_bytes: 15890091, num_examples: 74724
    - test: num_bytes: 2031043, num_examples: 9341
    - validation: num_bytes: 2091710, num_examples: 9341
  - Download size: 11751717 bytes
  - Dataset size: 20012844 bytes
- config_name: prob-05
  - Features:
    - name: refs, dtype: string
    - name: trans, dtype: string
  - Splits:
    - train: num_bytes: 15156449, num_examples: 74724
    - test: num_bytes: 1944482, num_examples: 9341
    - validation: num_bytes: 1997171, num_examples: 9341
  - Download size: 11347983 bytes
  - Dataset size: 19098102 bytes
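The figures above can be cross-checked with a little arithmetic. The sketch below copies the byte counts and example counts from the card and tests two things: that each configuration's dataset size equals the sum of its three split sizes, and that the splits follow roughly an 80/10/10 ratio. The names `CARD_SIZES`, `SPLIT_EXAMPLES`, `size_is_consistent`, and `split_fractions` are illustrative and not part of the dataset or any library.

```python
# Byte counts copied from the card: (train, test, validation, dataset_size).
CARD_SIZES = {
    "prob-01": (18080692, 2316437, 2380973, 22778102),
    "prob-02": (17348001, 2221947, 2284820, 21854768),
    "prob-03": (16610860, 2128222, 2185283, 20924365),
    "prob-04": (15890091, 2031043, 2091710, 20012844),
    "prob-05": (15156449, 1944482, 1997171, 19098102),
}

# Example counts per split; the card lists the same counts for every config.
SPLIT_EXAMPLES = {"train": 74724, "test": 9341, "validation": 9341}


def size_is_consistent(config):
    """Return True if the split byte counts add up to the listed dataset size."""
    train, test, validation, total = CARD_SIZES[config]
    return train + test + validation == total


def split_fractions():
    """Return each split's share of the total number of examples."""
    total = sum(SPLIT_EXAMPLES.values())
    return {name: count / total for name, count in SPLIT_EXAMPLES.items()}


if __name__ == "__main__":
    for config in CARD_SIZES:
        print(config, "sizes consistent:", size_is_consistent(config))
    for name, fraction in split_fractions().items():
        print(f"{name}: {fraction:.4f}")
```

The download size is smaller than the dataset size for every configuration, which is expected if the hosted files are compressed; the card itself does not state the storage format.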
Data file paths
- config_name: prob-01
  - train: prob-01/train-*
  - test: prob-01/test-*
  - validation: prob-01/validation-*
- config_name: prob-02
  - train: prob-02/train-*
  - test: prob-02/test-*
  - validation: prob-02/validation-*
- config_name: prob-03
  - train: prob-03/train-*
  - test: prob-03/test-*
  - validation: prob-03/validation-*
- config_name: prob-04
  - train: prob-04/train-*
  - test: prob-04/test-*
  - validation: prob-04/validation-*
- config_name: prob-05
  - train: prob-05/train-*
  - test: prob-05/test-*
  - validation: prob-05/validation-*
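The path patterns above follow a regular `<config>/<split>-*` scheme, which the sketch below rebuilds and then uses to select a configuration for loading. It assumes the Hugging Face `datasets` package is installed and the repository is reachable; `REPO_ID` is taken from the page title, while `data_file_patterns` and `load_config` are hypothetical helper names. The card does not document what `refs` and `trans` contain, so the code makes no assumption about their meaning.

```python
REPO_ID = "gayanin/pubmed-abstracts-noised-with-kaggle-dist"
CONFIGS = tuple(f"prob-{i:02d}" for i in range(1, 6))  # prob-01 .. prob-05
SPLITS = ("train", "test", "validation")


def data_file_patterns(config):
    """Return the split -> glob pattern mapping listed on the card for one config."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    return {split: f"{config}/{split}-*" for split in SPLITS}


def load_config(config="prob-01"):
    """Load one configuration from the Hub (needs network access and `datasets`)."""
    # Imported lazily so the helpers above stay usable without the package.
    from datasets import load_dataset

    data_file_patterns(config)  # raises ValueError for an unknown config name
    return load_dataset(REPO_ID, config)


if __name__ == "__main__":
    for config in CONFIGS:
        print(config, data_file_patterns(config))
```

If the load succeeds, `load_config("prob-03")["train"][0]` should be a dictionary with the two string fields listed on the card, `refs` and `trans`.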



