asahi417/seamless-align-enA-esA.tokenized
Source: Hugging Face. Updated 2024-06-06; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/asahi417/seamless-align-enA-esA.tokenized
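The card below lists one configuration per subset (`subset_1`, `subset_10`, ...), each with a single `train` split. A minimal sketch of loading one subset with the Hugging Face `datasets` library follows. The helper names `subset_name` and `load_subset` are hypothetical, and routing requests through the mirror by setting the `HF_ENDPOINT` environment variable is an assumption about the local setup, not something this page states.

```python
import os

# Repository id taken from the page title and download link.
REPO_ID = "asahi417/seamless-align-enA-esA.tokenized"
# Mirror host taken from the download link; using it is optional (assumption).
MIRROR = "https://hf-mirror.com"


def subset_name(index: int) -> str:
    """Build a config name in the style the card uses, e.g. 'subset_1'."""
    if index < 1:
        raise ValueError("subset indices start at 1")
    return f"subset_{index}"


def load_subset(index: int, use_mirror: bool = False):
    """Load the train split of one subset.

    Needs network access and the `datasets` package, so the import is
    deferred until the function is actually called.
    """
    if use_mirror:
        # Assumption: HF_ENDPOINT is read when the Hugging Face libraries
        # are first imported, so it has to be set before that happens.
        os.environ["HF_ENDPOINT"] = MIRROR
    from datasets import load_dataset

    return load_dataset(REPO_ID, subset_name(index), split="train")
```

Calling `load_subset(1)` should return the `train` split of `subset_1`, with the columns listed in the card (`line_no`, `enA.id`, `esA.id`, the two `laser_score` fields, and the two nested `audio.tokens` sequences). Not every index has a configuration, so it is worth checking the name against the card first.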
Resource description:
---
dataset_info:
- config_name: subset_1
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 881153742
num_examples: 2178
download_size: 136537530
dataset_size: 881153742
- config_name: subset_10
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 848240640
num_examples: 2228
download_size: 131540576
dataset_size: 848240640
- config_name: subset_11
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 847993246
num_examples: 2233
download_size: 131548844
dataset_size: 847993246
- config_name: subset_12
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 834738041
num_examples: 2201
download_size: 129408708
dataset_size: 834738041
- config_name: subset_13
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 836467270
num_examples: 2222
download_size: 129616910
dataset_size: 836467270
- config_name: subset_14
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 841267841
num_examples: 2227
download_size: 130498270
dataset_size: 841267841
- config_name: subset_15
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 839688047
num_examples: 2236
download_size: 130223098
dataset_size: 839688047
- config_name: subset_16
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 834174018
num_examples: 2251
download_size: 129353102
dataset_size: 834174018
- config_name: subset_17
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 831362480
num_examples: 2237
download_size: 128946218
dataset_size: 831362480
- config_name: subset_18
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 813152408
num_examples: 2195
download_size: 125981382
dataset_size: 813152408
- config_name: subset_19
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 817411488
num_examples: 2217
download_size: 126613597
dataset_size: 817411488
- config_name: subset_2
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 893806142
num_examples: 2245
download_size: 138721749
dataset_size: 893806142
- config_name: subset_20
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 766895678
num_examples: 2093
download_size: 118850132
dataset_size: 766895678
- config_name: subset_21
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 756362673
num_examples: 2083
download_size: 117248025
dataset_size: 756362673
- config_name: subset_22
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 729448221
num_examples: 1987
download_size: 112997786
dataset_size: 729448221
- config_name: subset_23
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 735809523
num_examples: 2037
download_size: 114005639
dataset_size: 735809523
- config_name: subset_24
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 755941985
num_examples: 2050
download_size: 117126141
dataset_size: 755941985
- config_name: subset_25
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 747368995
num_examples: 2043
download_size: 115820184
dataset_size: 747368995
- config_name: subset_251
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 506050056
num_examples: 1935
download_size: 78530998
dataset_size: 506050056
- config_name: subset_252
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 515800913
num_examples: 1923
download_size: 79986606
dataset_size: 515800913
- config_name: subset_253
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 512358314
num_examples: 1920
download_size: 79481904
dataset_size: 512358314
- config_name: subset_254
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 509917588
num_examples: 1958
download_size: 79056398
dataset_size: 509917588
- config_name: subset_255
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 494190410
num_examples: 1932
download_size: 76637462
dataset_size: 494190410
- config_name: subset_256
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 521326857
num_examples: 1929
download_size: 80771108
dataset_size: 521326857
- config_name: subset_257
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 504059887
num_examples: 1912
download_size: 78138092
dataset_size: 504059887
- config_name: subset_258
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 510132596
num_examples: 1936
download_size: 79149441
dataset_size: 510132596
- config_name: subset_259
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 505369138
num_examples: 1935
download_size: 78319450
dataset_size: 505369138
- config_name: subset_26
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 734808836
num_examples: 2046
download_size: 113729819
dataset_size: 734808836
- config_name: subset_260
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 470579008
num_examples: 1838
download_size: 72971858
dataset_size: 470579008
- config_name: subset_261
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 452873863
num_examples: 1757
download_size: 70242883
dataset_size: 452873863
- config_name: subset_262
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 458970628
num_examples: 1740
download_size: 71192318
dataset_size: 458970628
- config_name: subset_263
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 456531761
num_examples: 1734
download_size: 70794195
dataset_size: 456531761
- config_name: subset_264
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 459015977
num_examples: 1709
download_size: 71227007
dataset_size: 459015977
- config_name: subset_265
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 452473655
num_examples: 1744
download_size: 70138095
dataset_size: 452473655
- config_name: subset_266
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 459551128
num_examples: 1742
download_size: 71184721
dataset_size: 459551128
- config_name: subset_267
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 474554184
num_examples: 1778
download_size: 73589016
dataset_size: 474554184
- config_name: subset_268
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 490600469
num_examples: 1872
download_size: 76057313
dataset_size: 490600469
- config_name: subset_269
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 506671090
num_examples: 1928
download_size: 78559135
dataset_size: 506671090
- config_name: subset_27
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 735739642
num_examples: 2036
download_size: 113886271
dataset_size: 735739642
- config_name: subset_270
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 484405651
num_examples: 1911
download_size: 75128567
dataset_size: 484405651
- config_name: subset_271
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 488402361
num_examples: 1889
download_size: 75820802
dataset_size: 488402361
- config_name: subset_272
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 502900745
num_examples: 1916
download_size: 77939853
dataset_size: 502900745
- config_name: subset_273
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 493576150
num_examples: 1924
download_size: 76584833
dataset_size: 493576150
- config_name: subset_274
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 486450542
num_examples: 1867
download_size: 75369120
dataset_size: 486450542
- config_name: subset_275
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 485568877
num_examples: 1898
download_size: 75316446
dataset_size: 485568877
- config_name: subset_276
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 482407737
num_examples: 1882
download_size: 74822703
dataset_size: 482407737
- config_name: subset_277
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 491235418
num_examples: 1898
download_size: 76181845
dataset_size: 491235418
- config_name: subset_278
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 496336617
num_examples: 1913
download_size: 76966000
dataset_size: 496336617
- config_name: subset_279
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 498547599
num_examples: 1905
download_size: 77238062
dataset_size: 498547599
- config_name: subset_28
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 746358224
num_examples: 2067
download_size: 115565073
dataset_size: 746358224
- config_name: subset_280
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 477679597
num_examples: 1900
download_size: 74089202
dataset_size: 477679597
- config_name: subset_29
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 762157012
num_examples: 2111
download_size: 118020803
dataset_size: 762157012
- config_name: subset_3
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 878633142
num_examples: 2234
download_size: 136305739
dataset_size: 878633142
- config_name: subset_30
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 785009756
num_examples: 2197
download_size: 121587401
dataset_size: 785009756
- config_name: subset_31
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 783207625
num_examples: 2192
download_size: 121401615
dataset_size: 783207625
- config_name: subset_32
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 796526167
num_examples: 2219
download_size: 123345872
dataset_size: 796526167
- config_name: subset_33
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 781004926
num_examples: 2198
download_size: 121058727
dataset_size: 781004926
- config_name: subset_34
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 772054226
num_examples: 2199
download_size: 119759766
dataset_size: 772054226
- config_name: subset_35
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 750251312
num_examples: 2125
download_size: 116229974
dataset_size: 750251312
- config_name: subset_36
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 752658562
num_examples: 2145
download_size: 116617303
dataset_size: 752658562
- config_name: subset_37
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 755359982
num_examples: 2162
download_size: 117079496
dataset_size: 755359982
- config_name: subset_38
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 744287399
num_examples: 2136
download_size: 115267020
dataset_size: 744287399
- config_name: subset_39
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 752666106
num_examples: 2174
download_size: 116586427
dataset_size: 752666106
- config_name: subset_4
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 884721085
num_examples: 2240
download_size: 137345614
dataset_size: 884721085
- config_name: subset_40
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 747976326
num_examples: 2154
download_size: 115748183
dataset_size: 747976326
- config_name: subset_41
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 746698427
num_examples: 2142
download_size: 115561767
dataset_size: 746698427
- config_name: subset_42
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 739896287
num_examples: 2161
download_size: 114492504
dataset_size: 739896287
- config_name: subset_43
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 758975848
num_examples: 2157
download_size: 117505790
dataset_size: 758975848
- config_name: subset_44
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 732483037
num_examples: 2124
download_size: 113482105
dataset_size: 732483037
- config_name: subset_45
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 744844960
num_examples: 2161
download_size: 115469814
dataset_size: 744844960
- config_name: subset_46
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 747016900
num_examples: 2152
download_size: 115579942
dataset_size: 747016900
- config_name: subset_47
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 737115239
num_examples: 2150
download_size: 114149181
dataset_size: 737115239
- config_name: subset_48
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 722342098
num_examples: 2123
download_size: 111741617
dataset_size: 722342098
- config_name: subset_49
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 739176034
num_examples: 2154
download_size: 114604775
dataset_size: 739176034
- config_name: subset_5
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 881051151
num_examples: 2216
download_size: 136696003
dataset_size: 881051151
- config_name: subset_50
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 732994952
num_examples: 2139
download_size: 113516764
dataset_size: 732994952
- config_name: subset_6
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 848715939
num_examples: 2212
download_size: 131752301
dataset_size: 848715939
- config_name: subset_7
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 872942854
num_examples: 2266
download_size: 135365796
dataset_size: 872942854
- config_name: subset_8
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: enA.audio.tokens
sequence:
sequence: int64
- name: esA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 866726842
num_examples: 2240
download_size: 134450881
dataset_size: 866726842
- config_name: subset_9
features:
- name: line_no
dtype: int64
- name: enA.id
dtype: string
- name: enA.laser_score
dtype: float64
- name: esA.id
dtype: string
- name: esA.laser_score
dtype: float64
- name: esA.audio.tokens
sequence:
sequence: int64
- name: enA.audio.tokens
sequence:
sequence: int64
splits:
- name: train
num_bytes: 855496325
num_examples: 2236
download_size: 132772078
dataset_size: 855496325
configs:
- config_name: subset_1
data_files:
- split: train
path: subset_1/train-*
- config_name: subset_10
data_files:
- split: train
path: subset_10/train-*
- config_name: subset_11
data_files:
- split: train
path: subset_11/train-*
- config_name: subset_12
data_files:
- split: train
path: subset_12/train-*
- config_name: subset_13
data_files:
- split: train
path: subset_13/train-*
- config_name: subset_14
data_files:
- split: train
path: subset_14/train-*
- config_name: subset_15
data_files:
- split: train
path: subset_15/train-*
- config_name: subset_16
data_files:
- split: train
path: subset_16/train-*
- config_name: subset_17
data_files:
- split: train
path: subset_17/train-*
- config_name: subset_18
data_files:
- split: train
path: subset_18/train-*
- config_name: subset_19
data_files:
- split: train
path: subset_19/train-*
- config_name: subset_2
data_files:
- split: train
path: subset_2/train-*
- config_name: subset_20
data_files:
- split: train
path: subset_20/train-*
- config_name: subset_21
data_files:
- split: train
path: subset_21/train-*
- config_name: subset_22
data_files:
- split: train
path: subset_22/train-*
- config_name: subset_23
data_files:
- split: train
path: subset_23/train-*
- config_name: subset_24
data_files:
- split: train
path: subset_24/train-*
- config_name: subset_25
data_files:
- split: train
path: subset_25/train-*
- config_name: subset_251
data_files:
- split: train
path: subset_251/train-*
- config_name: subset_252
data_files:
- split: train
path: subset_252/train-*
- config_name: subset_253
data_files:
- split: train
path: subset_253/train-*
- config_name: subset_254
data_files:
- split: train
path: subset_254/train-*
- config_name: subset_255
data_files:
- split: train
path: subset_255/train-*
- config_name: subset_256
data_files:
- split: train
path: subset_256/train-*
- config_name: subset_257
data_files:
- split: train
path: subset_257/train-*
- config_name: subset_258
data_files:
- split: train
path: subset_258/train-*
- config_name: subset_259
data_files:
- split: train
path: subset_259/train-*
- config_name: subset_26
data_files:
- split: train
path: subset_26/train-*
- config_name: subset_260
data_files:
- split: train
path: subset_260/train-*
- config_name: subset_261
data_files:
- split: train
path: subset_261/train-*
- config_name: subset_262
data_files:
- split: train
path: subset_262/train-*
- config_name: subset_263
data_files:
- split: train
path: subset_263/train-*
- config_name: subset_264
data_files:
- split: train
path: subset_264/train-*
- config_name: subset_265
data_files:
- split: train
path: subset_265/train-*
- config_name: subset_266
data_files:
- split: train
path: subset_266/train-*
- config_name: subset_267
data_files:
- split: train
path: subset_267/train-*
- config_name: subset_268
data_files:
- split: train
path: subset_268/train-*
- config_name: subset_269
data_files:
- split: train
path: subset_269/train-*
- config_name: subset_27
data_files:
- split: train
path: subset_27/train-*
- config_name: subset_270
data_files:
- split: train
path: subset_270/train-*
- config_name: subset_271
data_files:
- split: train
path: subset_271/train-*
- config_name: subset_272
data_files:
- split: train
path: subset_272/train-*
- config_name: subset_273
data_files:
- split: train
path: subset_273/train-*
- config_name: subset_274
data_files:
- split: train
path: subset_274/train-*
- config_name: subset_275
data_files:
- split: train
path: subset_275/train-*
- config_name: subset_276
data_files:
- split: train
path: subset_276/train-*
- config_name: subset_277
data_files:
- split: train
path: subset_277/train-*
- config_name: subset_278
data_files:
- split: train
path: subset_278/train-*
- config_name: subset_279
data_files:
- split: train
path: subset_279/train-*
- config_name: subset_28
data_files:
- split: train
path: subset_28/train-*
- config_name: subset_280
data_files:
- split: train
path: subset_280/train-*
- config_name: subset_29
data_files:
- split: train
path: subset_29/train-*
- config_name: subset_3
data_files:
- split: train
path: subset_3/train-*
- config_name: subset_30
data_files:
- split: train
path: subset_30/train-*
- config_name: subset_31
data_files:
- split: train
path: subset_31/train-*
- config_name: subset_32
data_files:
- split: train
path: subset_32/train-*
- config_name: subset_33
data_files:
- split: train
path: subset_33/train-*
- config_name: subset_34
data_files:
- split: train
path: subset_34/train-*
- config_name: subset_35
data_files:
- split: train
path: subset_35/train-*
- config_name: subset_36
data_files:
- split: train
path: subset_36/train-*
- config_name: subset_37
data_files:
- split: train
path: subset_37/train-*
- config_name: subset_38
data_files:
- split: train
path: subset_38/train-*
- config_name: subset_39
data_files:
- split: train
path: subset_39/train-*
- config_name: subset_4
data_files:
- split: train
path: subset_4/train-*
- config_name: subset_40
data_files:
- split: train
path: subset_40/train-*
- config_name: subset_41
data_files:
- split: train
path: subset_41/train-*
- config_name: subset_42
data_files:
- split: train
path: subset_42/train-*
- config_name: subset_43
data_files:
- split: train
path: subset_43/train-*
- config_name: subset_44
data_files:
- split: train
path: subset_44/train-*
- config_name: subset_45
data_files:
- split: train
path: subset_45/train-*
- config_name: subset_46
data_files:
- split: train
path: subset_46/train-*
- config_name: subset_47
data_files:
- split: train
path: subset_47/train-*
- config_name: subset_48
data_files:
- split: train
path: subset_48/train-*
- config_name: subset_49
data_files:
- split: train
path: subset_49/train-*
- config_name: subset_5
data_files:
- split: train
path: subset_5/train-*
- config_name: subset_50
data_files:
- split: train
path: subset_50/train-*
- config_name: subset_6
data_files:
- split: train
path: subset_6/train-*
- config_name: subset_7
data_files:
- split: train
path: subset_7/train-*
- config_name: subset_8
data_files:
- split: train
path: subset_8/train-*
- config_name: subset_9
data_files:
- split: train
path: subset_9/train-*
---
The dataset consists of multiple subsets (configs) that share the following features: line number (line_no, int64), English ID (enA.id, string), English LASER score (enA.laser_score, float64), Spanish ID (esA.id, string), Spanish LASER score (esA.laser_score, float64), English audio tokens (enA.audio.tokens), and Spanish audio tokens (esA.audio.tokens). Both token fields are nested sequences, that is, a sequence of int64 sequences. Each subset contains a single train split, whose byte size and number of examples are listed in the metadata above.
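As a minimal sketch of how one subset might be loaded and filtered, the snippet below uses the `load_dataset` function from the `datasets` library with the repository ID and a config name from the metadata. The helper names `load_subset` and `filter_by_laser`, the threshold of 1.1, and the toy rows are assumptions made for illustration; the snippet also assumes that a higher LASER score indicates a closer alignment.

```python
from typing import Any, Dict, Iterable, List

REPO_ID = "asahi417/seamless-align-enA-esA.tokenized"


def load_subset(config_name: str = "subset_1"):
    """Download one config's train split (needs the `datasets` package and network access)."""
    from datasets import load_dataset  # imported lazily so the helper below works offline

    return load_dataset(REPO_ID, config_name, split="train")


def filter_by_laser(rows: Iterable[Dict[str, Any]], min_score: float) -> List[Dict[str, Any]]:
    """Keep rows whose English and Spanish LASER scores both reach min_score."""
    return [
        row
        for row in rows
        if row["enA.laser_score"] >= min_score and row["esA.laser_score"] >= min_score
    ]


# Hand-made rows that mimic the schema; every value here is invented for illustration.
toy_rows = [
    {"line_no": 0, "enA.id": "a", "enA.laser_score": 1.2, "esA.id": "b",
     "esA.laser_score": 1.2, "enA.audio.tokens": [[1, 2]], "esA.audio.tokens": [[3, 4]]},
    {"line_no": 1, "enA.id": "c", "enA.laser_score": 1.05, "esA.id": "d",
     "esA.laser_score": 1.05, "enA.audio.tokens": [[5]], "esA.audio.tokens": [[6]]},
]

# The threshold 1.1 is a hypothetical value, not one recommended by the dataset.
kept = filter_by_laser(toy_rows, min_score=1.1)
print(len(kept), "row(s) kept")
```

The same filter should work on the rows of a loaded split, for example `filter_by_laser(load_subset("subset_1"), 1.1)`, since iterating over a split yields one dictionary per example.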
Provider:
asahi417
Summary of Original Information
Dataset Overview
Subset 1 (subset_1)
- Features:
line_no: int64
enA.id: string
enA.laser_score: float64
esA.id: string
esA.laser_score: float64
enA.audio.tokens: sequence of int64 sequences
esA.audio.tokens: sequence of int64 sequences
- Splits:
train: 881153742 bytes, 2178 examples
- Download size: 136537530 bytes
- Dataset size: 881153742 bytes
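The size figures for subset_1 can be turned into two quick derived numbers: the average size of one example and the ratio between the dataset size and the download size. The sketch below does that arithmetic; the function name `size_stats` is hypothetical, and the reading of the ratio as a compression factor assumes that the download size refers to the stored files while the dataset size refers to the data once loaded.

```python
def size_stats(num_bytes: int, num_examples: int, download_size: int) -> dict:
    """Derive per-example size and dataset-to-download ratio from the card's figures."""
    return {
        "bytes_per_example": num_bytes / num_examples,
        "size_ratio": num_bytes / download_size,
    }


# Figures for subset_1, copied from the summary above.
stats = size_stats(num_bytes=881153742, num_examples=2178, download_size=136537530)
print(f"{stats['bytes_per_example'] / 1e6:.2f} MB per example on average")
print(f"dataset size is {stats['size_ratio']:.2f}x the download size")
```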
Subset 10 (subset_10)
- Features: same as subset_1
- Splits:
train: 848240640 bytes, 2228 examples
- Download size: 131540576 bytes
- Dataset size: 848240640 bytes
Subset 11 (subset_11)
- Features: same as subset_1
- Splits:
train: 847993246 bytes, 2233 examples
- Download size: 131548844 bytes
- Dataset size: 847993246 bytes
Subset 12 (subset_12)
- Features: same as subset_1
- Splits:
train: 834738041 bytes, 2201 examples
- Download size: 129408708 bytes
- Dataset size: 834738041 bytes
Subset 13 (subset_13)
- Features: same as subset_1
- Splits:
train: 836467270 bytes, 2222 examples
- Download size: 129616910 bytes
- Dataset size: 836467270 bytes
Subset 14 (subset_14)
- Features: same as subset_1
- Splits:
train: 841267841 bytes, 2227 examples
- Download size: 130498270 bytes
- Dataset size: 841267841 bytes
Subset 15 (subset_15)
- Features: same as subset_1
- Splits:
train: 839688047 bytes, 2236 examples
- Download size: 130223098 bytes
- Dataset size: 839688047 bytes
Subset 16 (subset_16)
- Features: same as subset_1
- Splits:
train: 834174018 bytes, 2251 examples
- Download size: 129353102 bytes
- Dataset size: 834174018 bytes
Subset 17 (subset_17)
- Features: same as subset_1
- Splits:
train: 831362480 bytes, 2237 examples
- Download size: 128946218 bytes
- Dataset size: 831362480 bytes
Subset 18 (subset_18)
- Features: same as subset_1
- Splits:
train: 813152408 bytes, 2195 examples
- Download size: 125981382 bytes
- Dataset size: 813152408 bytes
Subset 19 (subset_19)
- Features: same as subset_1
- Splits:
train: 817411488 bytes, 2217 examples
- Download size: 126613597 bytes
- Dataset size: 817411488 bytes
Subset 2 (subset_2)
- Features: same as subset_1
- Splits:
train: 893806142 bytes, 2245 examples
- Download size: 138721749 bytes
- Dataset size: 893806142 bytes
Subset 20 (subset_20)
- Features: same as subset_1
- Splits:
train: 766895678 bytes, 2093 examples
- Download size: 118850132 bytes
- Dataset size: 766895678 bytes
Subset 21 (subset_21)
- Features: same as subset_1
- Splits:
train: 756362673 bytes, 2083 examples
- Download size: 117248025 bytes
- Dataset size: 756362673 bytes
Subset 22 (subset_22)
- Features: same as subset_1
- Splits:
train: 729448221 bytes, 1987 examples
- Download size: 112997786 bytes
- Dataset size: 729448221 bytes
Subset 23 (subset_23)
- Features: same as subset_1
- Splits:
train: 735809523 bytes, 2037 examples
- Download size: 114005639 bytes
- Dataset size: 735809523 bytes
Subset 24 (subset_24)
- Features: same as subset_1
- Splits:
train: 755941985 bytes, 2050 examples
- Download size: 117126141 bytes
- Dataset size: 755941985 bytes
Subset 25 (subset_25)
- Features: same as subset_1
- Splits:
train: 747368995 bytes, 2043 examples
- Download size: 115820184 bytes
- Dataset size: 747368995 bytes
Subset 251 (subset_251)
- Features: same as subset_1
- Splits:
train: 506050056 bytes, 1935 examples
- Download size: 78530998 bytes
- Dataset size: 506050056 bytes
Subset 252 (subset_252)
- Features: same as subset_1
- Splits:
train: 515800913 bytes, 1923 examples
- Download size: 79986606 bytes
- Dataset size: 515800913 bytes
Subset 253 (subset_253)
- Features: same as subset_1
- Splits:
train: 512358314 bytes, 1920 examples
- Download size: 79481904 bytes
- Dataset size: 512358314 bytes
Subset 254 (subset_254)
- Features: same as subset_1
- Splits:
train: 509917588 bytes, 1958 examples
- Download size: 79056398 bytes
- Dataset size: 509917588 bytes
Subset 255 (subset_255)
- Features: same as subset_1
- Splits:
train: 494190410 bytes, 1932 examples
- Download size: 76637462 bytes
- Dataset size: 494190410 bytes
Subset 256 (subset_256)
- Features: same as subset_1
- Splits:
train: 521326857 bytes, 1929 examples
- Download size: 80771108 bytes
- Dataset size: 521326857 bytes
Subset 257 (subset_257)
- Features: same as subset_1
- Splits:
train: 504059887 bytes, 1912 examples
- Download size: 78138092 bytes
- Dataset size: 504059887 bytes
Subset 258 (subset_258)
- Features: same as subset_1
- Splits:
train: 510132596 bytes, 1936 examples
- Download size: 79149441 bytes
- Dataset size: 510132596 bytes
Subset 259 (subset_259)
- Features: same as subset_1
- Splits:
train: 505369138 bytes, 1935 examples
- Download size: 78319450 bytes
- Dataset size: 505369138 bytes
Subset 26 (subset_26)
- Features: same as subset_1
- Splits:
train: 734808836 bytes, 2046 examples
- Download size: 113729819 bytes
- Dataset size: 734808836 bytes
Subset 260 (subset_260)
- Features: same as subset_1
- Splits:
train: 470579008 bytes, 1838 examples
- Download size: 72971858 bytes
- Dataset size: 470579008 bytes
Subset 261 (subset_261)
- Features: same as subset_1
- Splits:
train: 452873863 bytes, 1757 examples
- Download size: 70242883 bytes
- Dataset size: 452873863 bytes
Subset 262 (subset_262)
- Features: same as subset_1
- Splits:
train: 458970628 bytes, 1740 examples
- Download size: 71192318 bytes
- Dataset size: 458970628 bytes
Subset 263 (subset_263)
- Features: same as subset_1
- Splits:
train: 456531761 bytes, 1734 examples
- Download size: 70794195 bytes
- Dataset size: 456531761 bytes
Subset 264 (subset_264)
- Features: same as subset_1
- Splits:
train: 459015977 bytes, 1709 examples
- Download size: 71227007 bytes
- Dataset size: 459015977 bytes
Subset 265 (subset_265)
- Features: same as subset_1
- Splits:
train: 452473655 bytes, 1744 examples
- Download size: 70138095 bytes
- Dataset size: 452473655 bytes
Subset 266 (subset_266)
- Features: same as subset_1
- Splits:
train: 459551128 bytes, 1742 examples
- Download size: 71184721 bytes
- Dataset size: 459551128 bytes
Subset 267 (subset_267)
- Features: same as subset_1
- Splits:
train: 474554184 bytes, 1778 examples
- Download size: 73589016 bytes
- Dataset size: 474554184 bytes
Subset 268 (subset_268)
- Features: same as subset_1
- Splits:
train: 490600469 bytes, 1872 examples
- Download size: 76057313 bytes
- Dataset size: 490600469 bytes
Subset 269 (subset_269)
- Features: same as subset_1
- Splits:
train: 506671090 bytes, 1928 examples
- Download size: 78559135 bytes
- Dataset size: 506671090 bytes
Subset 27 (subset_27)
- Features: same as subset_1
- Splits:
train: 735739642 bytes, 2036 examples
- Download size: 113886271 bytes
- Dataset size: 735739642 bytes
Subset 270 (subset_270)
- Features: same as subset_1
- Splits:
train: 484405651 bytes, 1911 examples
- Download size: 75128567 bytes
- Dataset size: 484405651 bytes
Subset 271 (subset_271)
- Features: same as subset_1
- Splits:
train: 488402361 bytes, 1889 examples
- Download size: 75820802 bytes
- Dataset size: 488402361 bytes
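The subsets in this listing appear in lexicographic order (subset_1, subset_10, ..., subset_2, subset_20, ...), not numeric order. When iterating over configs by number, a natural sort along the following lines may help; the helper name `natural_key` is hypothetical, and the sample names are a small selection from the listing.

```python
import re


def natural_key(config_name: str):
    """Split a name into text and integer chunks so that subset_10 sorts after subset_2."""
    return [
        int(part) if part.isdigit() else part
        for part in re.split(r"(\d+)", config_name)
    ]


# A few config names in the lexicographic order used by the listing.
names = ["subset_1", "subset_10", "subset_2", "subset_251", "subset_26"]
ordered = sorted(names, key=natural_key)
print(ordered)
```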



