gayanin/woz-noised-v2
Source: Hugging Face. Updated 2024-02-12; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/gayanin/woz-noised-v2
Description:
---
dataset_info:
- config_name: babylon-01
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2696523
    num_examples: 20304
  - name: test
    num_bytes: 333956
    num_examples: 2538
  - name: validation
    num_bytes: 336851
    num_examples: 2538
  download_size: 1739698
  dataset_size: 3367330
- config_name: babylon-02
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2731341
    num_examples: 20304
  - name: test
    num_bytes: 338120
    num_examples: 2538
  - name: validation
    num_bytes: 341833
    num_examples: 2538
  download_size: 1804391
  dataset_size: 3411294
- config_name: babylon-03
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2766038
    num_examples: 20304
  - name: test
    num_bytes: 342350
    num_examples: 2538
  - name: validation
    num_bytes: 346241
    num_examples: 2538
  download_size: 1862446
  dataset_size: 3454629
- config_name: babylon-04
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2800053
    num_examples: 20304
  - name: test
    num_bytes: 346431
    num_examples: 2538
  - name: validation
    num_bytes: 351122
    num_examples: 2538
  download_size: 1915713
  dataset_size: 3497606
- config_name: gcd-01
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2674106
    num_examples: 20304
  - name: test
    num_bytes: 331791
    num_examples: 2538
  - name: validation
    num_bytes: 334986
    num_examples: 2539
  download_size: 1729475
  dataset_size: 3340883
- config_name: gcd-02
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2675998
    num_examples: 20304
  - name: test
    num_bytes: 332083
    num_examples: 2538
  - name: validation
    num_bytes: 335001
    num_examples: 2539
  download_size: 1770429
  dataset_size: 3343082
- config_name: gcd-03
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2678045
    num_examples: 20304
  - name: test
    num_bytes: 331741
    num_examples: 2538
  - name: validation
    num_bytes: 335311
    num_examples: 2539
  download_size: 1801804
  dataset_size: 3345097
- config_name: gcd-04
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2679180
    num_examples: 20304
  - name: test
    num_bytes: 331891
    num_examples: 2538
  - name: validation
    num_bytes: 335157
    num_examples: 2539
  download_size: 1824476
  dataset_size: 3346228
- config_name: kaggle-01
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2671119
    num_examples: 20304
  - name: test
    num_bytes: 337356
    num_examples: 2538
  - name: validation
    num_bytes: 325630
    num_examples: 2539
  download_size: 1727525
  dataset_size: 3334105
- config_name: kaggle-02
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2672890
    num_examples: 20304
  - name: test
    num_bytes: 337629
    num_examples: 2538
  - name: validation
    num_bytes: 325210
    num_examples: 2539
  download_size: 1769572
  dataset_size: 3335729
- config_name: kaggle-03
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2675226
    num_examples: 20304
  - name: test
    num_bytes: 337519
    num_examples: 2538
  - name: validation
    num_bytes: 325606
    num_examples: 2539
  download_size: 1798800
  dataset_size: 3338351
- config_name: kaggle-04
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2673658
    num_examples: 20304
  - name: test
    num_bytes: 338746
    num_examples: 2538
  - name: validation
    num_bytes: 325890
    num_examples: 2539
  download_size: 1822791
  dataset_size: 3338294
configs:
- config_name: babylon-01
  data_files:
  - split: train
    path: babylon-01/train-*
  - split: test
    path: babylon-01/test-*
  - split: validation
    path: babylon-01/validation-*
- config_name: babylon-02
  data_files:
  - split: train
    path: babylon-02/train-*
  - split: test
    path: babylon-02/test-*
  - split: validation
    path: babylon-02/validation-*
- config_name: babylon-03
  data_files:
  - split: train
    path: babylon-03/train-*
  - split: test
    path: babylon-03/test-*
  - split: validation
    path: babylon-03/validation-*
- config_name: babylon-04
  data_files:
  - split: train
    path: babylon-04/train-*
  - split: test
    path: babylon-04/test-*
  - split: validation
    path: babylon-04/validation-*
- config_name: gcd-01
  data_files:
  - split: train
    path: gcd-01/train-*
  - split: test
    path: gcd-01/test-*
  - split: validation
    path: gcd-01/validation-*
- config_name: gcd-02
  data_files:
  - split: train
    path: gcd-02/train-*
  - split: test
    path: gcd-02/test-*
  - split: validation
    path: gcd-02/validation-*
- config_name: gcd-03
  data_files:
  - split: train
    path: gcd-03/train-*
  - split: test
    path: gcd-03/test-*
  - split: validation
    path: gcd-03/validation-*
- config_name: gcd-04
  data_files:
  - split: train
    path: gcd-04/train-*
  - split: test
    path: gcd-04/test-*
  - split: validation
    path: gcd-04/validation-*
- config_name: kaggle-01
  data_files:
  - split: train
    path: kaggle-01/train-*
  - split: test
    path: kaggle-01/test-*
  - split: validation
    path: kaggle-01/validation-*
- config_name: kaggle-02
  data_files:
  - split: train
    path: kaggle-02/train-*
  - split: test
    path: kaggle-02/test-*
  - split: validation
    path: kaggle-02/validation-*
- config_name: kaggle-03
  data_files:
  - split: train
    path: kaggle-03/train-*
  - split: test
    path: kaggle-03/test-*
  - split: validation
    path: kaggle-03/validation-*
- config_name: kaggle-04
  data_files:
  - split: train
    path: kaggle-04/train-*
  - split: test
    path: kaggle-04/test-*
  - split: validation
    path: kaggle-04/validation-*
---
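
The metadata above declares twelve configs, each with `train`, `test`, and `validation` splits. A minimal sketch of how one of them might be loaded with the Hugging Face `datasets` library follows. The helper name `load_config` and the constants are illustrative and not part of the dataset card; the call needs network access and an installed `datasets` package, so the import is deferred until the function is actually used.

```python
from itertools import product

# Repository id as given in the card.
REPO_ID = "gayanin/woz-noised-v2"

# Config names listed in the card: three prefixes, four numeric suffixes each.
# The card does not say what the suffixes mean, so no meaning is assumed here.
CONFIG_NAMES = [
    f"{prefix}-{suffix}"
    for prefix, suffix in product(("babylon", "gcd", "kaggle"), ("01", "02", "03", "04"))
]


def load_config(config_name: str, split: str = "train"):
    """Hypothetical helper: load one split of one config from the Hub."""
    if config_name not in CONFIG_NAMES:
        raise ValueError(f"unknown config {config_name!r}; expected one of {CONFIG_NAMES}")
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split=split)


# Example (requires network access):
#   ds = load_config("babylon-01", split="validation")
#   print(ds.column_names)
```

Validating the config name before the network call keeps typos from turning into slow remote errors.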
Provider:
gayanin
Summary of original information
Dataset overview
Dataset configurations
babylon-01
- Features:
  - Unnamed: 0: int64
  - refs: string
  - trans: string
- Splits:
  - train: 2696523 bytes, 20304 examples
  - test: 333956 bytes, 2538 examples
  - validation: 336851 bytes, 2538 examples
- Download size: 1739698 bytes
- Dataset size: 3367330 bytes
babylon-02
- Features:
  - Unnamed: 0: int64
  - refs: string
  - trans: string
- Splits:
  - train: 2731341 bytes, 20304 examples
  - test: 338120 bytes, 2538 examples
  - validation: 341833 bytes, 2538 examples
- Download size: 1804391 bytes
- Dataset size: 3411294 bytes
babylon-03
- Features:
  - Unnamed: 0: int64
  - refs: string
  - trans: string
- Splits:
  - train: 2766038 bytes, 20304 examples
  - test: 342350 bytes, 2538 examples
  - validation: 346241 bytes, 2538 examples
- Download size: 1862446 bytes
- Dataset size: 3454629 bytes
babylon-04
- Features:
  - Unnamed: 0: int64
  - refs: string
  - trans: string
- Splits:
  - train: 2800053 bytes, 20304 examples
  - test: 346431 bytes, 2538 examples
  - validation: 351122 bytes, 2538 examples
- Download size: 1915713 bytes
- Dataset size: 3497606 bytes
gcd-01
- Features:
  - Unnamed: 0: int64
  - refs: string
  - trans: string
- Splits:
  - train: 2674106 bytes, 20304 examples
  - test: 331791 bytes, 2538 examples
  - validation: 334986 bytes, 2539 examples
- Download size: 1729475 bytes
- Dataset size: 3340883 bytes
gcd-02
- Features:
  - Unnamed: 0: int64
  - refs: string
  - trans: string
- Splits:
  - train: 2675998 bytes, 20304 examples
  - test: 332083 bytes, 2538 examples
  - validation: 335001 bytes, 2539 examples
- Download size: 1770429 bytes
- Dataset size: 3343082 bytes
gcd-03
- Features:
  - Unnamed: 0: int64
  - refs: string
  - trans: string
- Splits:
  - train: 2678045 bytes, 20304 examples
  - test: 331741 bytes, 2538 examples
  - validation: 335311 bytes, 2539 examples
- Download size: 1801804 bytes
- Dataset size: 3345097 bytes
gcd-04
- Features:
  - Unnamed: 0: int64
  - refs: string
  - trans: string
- Splits:
  - train: 2679180 bytes, 20304 examples
  - test: 331891 bytes, 2538 examples
  - validation: 335157 bytes, 2539 examples
- Download size: 1824476 bytes
- Dataset size: 3346228 bytes
kaggle-01
- Features:
  - Unnamed: 0: int64
  - refs: string
  - trans: string
- Splits:
  - train: 2671119 bytes, 20304 examples
  - test: 337356 bytes, 2538 examples
  - validation: 325630 bytes, 2539 examples
- Download size: 1727525 bytes
- Dataset size: 3334105 bytes
kaggle-02
- Features:
  - Unnamed: 0: int64
  - refs: string
  - trans: string
- Splits:
  - train: 2672890 bytes, 20304 examples
  - test: 337629 bytes, 2538 examples
  - validation: 325210 bytes, 2539 examples
- Download size: 1769572 bytes
- Dataset size: 3335729 bytes
kaggle-03
- Features:
  - Unnamed: 0: int64
  - refs: string
  - trans: string
- Splits:
  - train: 2675226 bytes, 20304 examples
  - test: 337519 bytes, 2538 examples
  - validation: 325606 bytes, 2539 examples
- Download size: 1798800 bytes
- Dataset size: 3338351 bytes
kaggle-04
- Features:
  - Unnamed: 0: int64
  - refs: string
  - trans: string
- Splits:
  - train: 2673658 bytes, 20304 examples
  - test: 338746 bytes, 2538 examples
  - validation: 325890 bytes, 2539 examples
- Download size: 1822791 bytes
- Dataset size: 3338294 bytes
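
The per-config figures can be sanity-checked with a little arithmetic. The sketch below uses only the `babylon-01` numbers from the listing above; the function names and the dictionary layout are illustrative, not part of the card. It checks that the reported dataset size equals the sum of the split byte counts, and computes each split's share of the examples.

```python
# Split metadata for the babylon-01 config, copied from the card.
BABYLON_01 = {
    "splits": {
        "train": {"num_bytes": 2696523, "num_examples": 20304},
        "test": {"num_bytes": 333956, "num_examples": 2538},
        "validation": {"num_bytes": 336851, "num_examples": 2538},
    },
    "dataset_size": 3367330,
}


def bytes_consistent(info: dict) -> bool:
    """True if the split byte counts add up to the reported dataset size."""
    return sum(s["num_bytes"] for s in info["splits"].values()) == info["dataset_size"]


def split_fractions(info: dict) -> dict:
    """Share of the examples that falls in each split."""
    total = sum(s["num_examples"] for s in info["splits"].values())
    return {name: s["num_examples"] / total for name, s in info["splits"].items()}


if __name__ == "__main__":
    print(bytes_consistent(BABYLON_01))
    for name, fraction in split_fractions(BABYLON_01).items():
        print(f"{name}: {fraction:.1%}")
```

For `babylon-01` the splits hold 20304, 2538, and 2538 of 25380 examples, an 80/10/10 division. The `gcd-*` and `kaggle-*` configs list 2539 validation examples, so their shares differ very slightly.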



