gayanin/pubmed-abstracts-dist-noised
Source: Hugging Face. Updated 2024-02-12; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/gayanin/pubmed-abstracts-dist-noised
Resource description:
---
dataset_info:
- config_name: babylon-01
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 18966629
    num_examples: 74724
  - name: test
    num_bytes: 2498780
    num_examples: 9341
  - name: validation
    num_bytes: 2430470
    num_examples: 9341
  download_size: 13371241
  dataset_size: 23895879
- config_name: babylon-02
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 19119149
    num_examples: 74724
  - name: test
    num_bytes: 2518943
    num_examples: 9341
  - name: validation
    num_bytes: 2450189
    num_examples: 9341
  download_size: 13665855
  dataset_size: 24088281
- config_name: babylon-03
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 19273175
    num_examples: 74724
  - name: test
    num_bytes: 2539404
    num_examples: 9341
  - name: validation
    num_bytes: 2470170
    num_examples: 9341
  download_size: 13917268
  dataset_size: 24282749
- config_name: gcd-01
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 18762932
    num_examples: 74724
  - name: test
    num_bytes: 2470354
    num_examples: 9341
  - name: validation
    num_bytes: 2404075
    num_examples: 9341
  download_size: 13219782
  dataset_size: 23637361
- config_name: gcd-02
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 18711610
    num_examples: 74724
  - name: test
    num_bytes: 2464019
    num_examples: 9341
  - name: validation
    num_bytes: 2397279
    num_examples: 9341
  download_size: 13357450
  dataset_size: 23572908
- config_name: gcd-03
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 18656483
    num_examples: 74724
  - name: test
    num_bytes: 2458101
    num_examples: 9341
  - name: validation
    num_bytes: 2391598
    num_examples: 9341
  download_size: 13450620
  dataset_size: 23506182
- config_name: gcd-04
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 18607987
    num_examples: 74724
  - name: test
    num_bytes: 2452163
    num_examples: 9341
  - name: validation
    num_bytes: 2382726
    num_examples: 9341
  download_size: 13518201
  dataset_size: 23442876
- config_name: kaggle-01
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 18741304
    num_examples: 74724
  - name: test
    num_bytes: 2468049
    num_examples: 9341
  - name: validation
    num_bytes: 2401399
    num_examples: 9341
  download_size: 13191893
  dataset_size: 23610752
- config_name: kaggle-02
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 18668842
    num_examples: 74724
  - name: test
    num_bytes: 2458530
    num_examples: 9341
  - name: validation
    num_bytes: 2391012
    num_examples: 9341
  download_size: 13313844
  dataset_size: 23518384
- config_name: kaggle-03
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 18598440
    num_examples: 74724
  - name: test
    num_bytes: 2449161
    num_examples: 9341
  - name: validation
    num_bytes: 2382943
    num_examples: 9341
  download_size: 13399488
  dataset_size: 23430544
- config_name: kaggle-04
  features:
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 18520899
    num_examples: 74724
  - name: test
    num_bytes: 2443154
    num_examples: 9341
  - name: validation
    num_bytes: 2372869
    num_examples: 9341
  download_size: 13447691
  dataset_size: 23336922
configs:
- config_name: babylon-01
  data_files:
  - split: train
    path: babylon-01/train-*
  - split: test
    path: babylon-01/test-*
  - split: validation
    path: babylon-01/validation-*
- config_name: babylon-02
  data_files:
  - split: train
    path: babylon-02/train-*
  - split: test
    path: babylon-02/test-*
  - split: validation
    path: babylon-02/validation-*
- config_name: babylon-03
  data_files:
  - split: train
    path: babylon-03/train-*
  - split: test
    path: babylon-03/test-*
  - split: validation
    path: babylon-03/validation-*
- config_name: gcd-01
  data_files:
  - split: train
    path: gcd-01/train-*
  - split: test
    path: gcd-01/test-*
  - split: validation
    path: gcd-01/validation-*
- config_name: gcd-02
  data_files:
  - split: train
    path: gcd-02/train-*
  - split: test
    path: gcd-02/test-*
  - split: validation
    path: gcd-02/validation-*
- config_name: gcd-03
  data_files:
  - split: train
    path: gcd-03/train-*
  - split: test
    path: gcd-03/test-*
  - split: validation
    path: gcd-03/validation-*
- config_name: gcd-04
  data_files:
  - split: train
    path: gcd-04/train-*
  - split: test
    path: gcd-04/test-*
  - split: validation
    path: gcd-04/validation-*
- config_name: kaggle-01
  data_files:
  - split: train
    path: kaggle-01/train-*
  - split: test
    path: kaggle-01/test-*
  - split: validation
    path: kaggle-01/validation-*
- config_name: kaggle-02
  data_files:
  - split: train
    path: kaggle-02/train-*
  - split: test
    path: kaggle-02/test-*
  - split: validation
    path: kaggle-02/validation-*
- config_name: kaggle-03
  data_files:
  - split: train
    path: kaggle-03/train-*
  - split: test
    path: kaggle-03/test-*
  - split: validation
    path: kaggle-03/validation-*
- config_name: kaggle-04
  data_files:
  - split: train
    path: kaggle-04/train-*
  - split: test
    path: kaggle-04/test-*
  - split: validation
    path: kaggle-04/validation-*
---
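The metadata above declares eleven configurations, each with `train`, `test`, and `validation` splits and two string columns, `refs` and `trans`. A minimal sketch of loading one of them with the Hugging Face `datasets` library follows. The repository id is taken from the download link; the helper names `config_names` and `load_config` and the `CONFIG_FAMILIES` table are illustrative assumptions, not part of the dataset card, and the load step needs network access plus an installed `datasets` package.

```python
# Number of numbered variants per configuration family, as listed in the card.
CONFIG_FAMILIES = {"babylon": 3, "gcd": 4, "kaggle": 4}

# Repository id inferred from the download link (assumption: unchanged on the Hub).
REPO_ID = "gayanin/pubmed-abstracts-dist-noised"


def config_names():
    """Return the configuration names in card order, e.g. 'babylon-01'."""
    return [
        f"{family}-{index:02d}"
        for family, count in CONFIG_FAMILIES.items()
        for index in range(1, count + 1)
    ]


def load_config(name="babylon-01", split="train"):
    """Load one split of one configuration (hypothetical helper)."""
    if name not in config_names():
        raise ValueError(f"unknown configuration: {name}")
    # Imported lazily so the rest of the module works without `datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, name, split=split)


if __name__ == "__main__":
    # Downloads data on first use; prints the two columns of the first row.
    validation = load_config("babylon-01", split="validation")
    first = validation[0]
    print(first["refs"])
    print(first["trans"])
```

Passing the configuration name as the second argument to `load_dataset` selects the matching `data_files` patterns declared under `configs`.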
Provider:
gayanin
Summary of original information
Dataset overview
Dataset configurations
babylon-01
- Features:
  refs: string
  trans: string
- Splits:
  train: 18966629 bytes, 74724 examples
  test: 2498780 bytes, 9341 examples
  validation: 2430470 bytes, 9341 examples
- Download size: 13371241 bytes
- Dataset size: 23895879 bytes
babylon-02
- Features:
  refs: string
  trans: string
- Splits:
  train: 19119149 bytes, 74724 examples
  test: 2518943 bytes, 9341 examples
  validation: 2450189 bytes, 9341 examples
- Download size: 13665855 bytes
- Dataset size: 24088281 bytes
babylon-03
- Features:
  refs: string
  trans: string
- Splits:
  train: 19273175 bytes, 74724 examples
  test: 2539404 bytes, 9341 examples
  validation: 2470170 bytes, 9341 examples
- Download size: 13917268 bytes
- Dataset size: 24282749 bytes
gcd-01
- Features:
  refs: string
  trans: string
- Splits:
  train: 18762932 bytes, 74724 examples
  test: 2470354 bytes, 9341 examples
  validation: 2404075 bytes, 9341 examples
- Download size: 13219782 bytes
- Dataset size: 23637361 bytes
gcd-02
- Features:
  refs: string
  trans: string
- Splits:
  train: 18711610 bytes, 74724 examples
  test: 2464019 bytes, 9341 examples
  validation: 2397279 bytes, 9341 examples
- Download size: 13357450 bytes
- Dataset size: 23572908 bytes
gcd-03
- Features:
  refs: string
  trans: string
- Splits:
  train: 18656483 bytes, 74724 examples
  test: 2458101 bytes, 9341 examples
  validation: 2391598 bytes, 9341 examples
- Download size: 13450620 bytes
- Dataset size: 23506182 bytes
gcd-04
- Features:
  refs: string
  trans: string
- Splits:
  train: 18607987 bytes, 74724 examples
  test: 2452163 bytes, 9341 examples
  validation: 2382726 bytes, 9341 examples
- Download size: 13518201 bytes
- Dataset size: 23442876 bytes
kaggle-01
- Features:
  refs: string
  trans: string
- Splits:
  train: 18741304 bytes, 74724 examples
  test: 2468049 bytes, 9341 examples
  validation: 2401399 bytes, 9341 examples
- Download size: 13191893 bytes
- Dataset size: 23610752 bytes
kaggle-02
- Features:
  refs: string
  trans: string
- Splits:
  train: 18668842 bytes, 74724 examples
  test: 2458530 bytes, 9341 examples
  validation: 2391012 bytes, 9341 examples
- Download size: 13313844 bytes
- Dataset size: 23518384 bytes
kaggle-03
- Features:
  refs: string
  trans: string
- Splits:
  train: 18598440 bytes, 74724 examples
  test: 2449161 bytes, 9341 examples
  validation: 2382943 bytes, 9341 examples
- Download size: 13399488 bytes
- Dataset size: 23430544 bytes
kaggle-04
- Features:
  refs: string
  trans: string
- Splits:
  train: 18520899 bytes, 74724 examples
  test: 2443154 bytes, 9341 examples
  validation: 2372869 bytes, 9341 examples
- Download size: 13447691 bytes
- Dataset size: 23336922 bytes
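The per-configuration figures can be cross-checked with simple arithmetic: the listed dataset size should equal the sum of the three split sizes, and the example counts imply roughly an 80/10/10 split. The sketch below does this for two configurations only, with numbers copied from the listing; the names `SPLITS`, `DATASET_SIZE`, `total_bytes`, and `split_fractions` are illustrative assumptions.

```python
# (num_bytes, num_examples) per split, copied from the listing above.
SPLITS = {
    "babylon-01": {
        "train": (18966629, 74724),
        "test": (2498780, 9341),
        "validation": (2430470, 9341),
    },
    "kaggle-04": {
        "train": (18520899, 74724),
        "test": (2443154, 9341),
        "validation": (2372869, 9341),
    },
}

# Dataset sizes in bytes as stated for the same two configurations.
DATASET_SIZE = {"babylon-01": 23895879, "kaggle-04": 23336922}


def total_bytes(config):
    """Sum the byte sizes of all splits of one configuration."""
    return sum(num_bytes for num_bytes, _ in SPLITS[config].values())


def split_fractions(config):
    """Return each split's share of the examples, rounded to 3 decimals."""
    total = sum(count for _, count in SPLITS[config].values())
    return {
        name: round(count / total, 3)
        for name, (_, count) in SPLITS[config].items()
    }


if __name__ == "__main__":
    for config in SPLITS:
        matches = total_bytes(config) == DATASET_SIZE[config]
        print(config, total_bytes(config), matches, split_fractions(config))
```

The download size is smaller than the dataset size in every configuration, which is consistent with compressed storage, although the listing does not state the file format.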



