mrochk/opengenome-clean
Source: Hugging Face. Updated: 2026-04-09. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/mrochk/opengenome-clean
Dataset description:
---
dataset_info:
  features:
  - name: text
    dtype: string
  splits:
  - name: chunk_0
    num_bytes: 8196000000
    num_examples: 1000000
  - name: chunk_1
    num_bytes: 8200220940
    num_examples: 1000515
  - name: chunk_2
    num_bytes: 8196385212
    num_examples: 1000047
  - name: chunk_3
    num_bytes: 8218694724
    num_examples: 1002769
  - name: chunk_4
    num_bytes: 8200048824
    num_examples: 1000494
  - name: chunk_5
    num_bytes: 8198745660
    num_examples: 1000335
  - name: chunk_6
    num_bytes: 8198049000
    num_examples: 1000250
  - name: chunk_7
    num_bytes: 8207638320
    num_examples: 1001420
  - name: chunk_8
    num_bytes: 8205687672
    num_examples: 1001182
  - name: chunk_9
    num_bytes: 8207908788
    num_examples: 1001453
  - name: chunk_10
    num_bytes: 8196631092
    num_examples: 1000077
  - name: chunk_11
    num_bytes: 8196073764
    num_examples: 1000009
  - name: chunk_12
    num_bytes: 8199368556
    num_examples: 1000411
  - name: chunk_13
    num_bytes: 8206540056
    num_examples: 1001286
  - name: chunk_14
    num_bytes: 8207326872
    num_examples: 1001382
  - name: chunk_15
    num_bytes: 8204359920
    num_examples: 1001020
  - name: chunk_16
    num_bytes: 8225259720
    num_examples: 1003570
  - name: chunk_17
    num_bytes: 8206113864
    num_examples: 1001234
  - name: chunk_18
    num_bytes: 8202745308
    num_examples: 1000823
  - name: chunk_19
    num_bytes: 8221317444
    num_examples: 1003089
  - name: chunk_20
    num_bytes: 8210711820
    num_examples: 1001795
  - name: chunk_21
    num_bytes: 8202155196
    num_examples: 1000751
  - name: chunk_22
    num_bytes: 8199901296
    num_examples: 1000476
  - name: chunk_23
    num_bytes: 8196303252
    num_examples: 1000037
  - name: chunk_24
    num_bytes: 8196737640
    num_examples: 1000090
  - name: chunk_25
    num_bytes: 8204720544
    num_examples: 1001064
  - name: chunk_26
    num_bytes: 8204597604
    num_examples: 1001049
  - name: chunk_27
    num_bytes: 8256863496
    num_examples: 1007426
  - name: chunk_28
    num_bytes: 8196049176
    num_examples: 1000006
  - name: chunk_29
    num_bytes: 8203015776
    num_examples: 1000856
  download_size: 112374241102
  dataset_size: 246166171536
configs:
- config_name: default
  data_files:
  - split: chunk_0
    path: data/chunk_0-*
  - split: chunk_1
    path: data/chunk_1-*
  - split: chunk_2
    path: data/chunk_2-*
  - split: chunk_3
    path: data/chunk_3-*
  - split: chunk_4
    path: data/chunk_4-*
  - split: chunk_5
    path: data/chunk_5-*
  - split: chunk_6
    path: data/chunk_6-*
  - split: chunk_7
    path: data/chunk_7-*
  - split: chunk_8
    path: data/chunk_8-*
  - split: chunk_9
    path: data/chunk_9-*
  - split: chunk_10
    path: data/chunk_10-*
  - split: chunk_11
    path: data/chunk_11-*
  - split: chunk_12
    path: data/chunk_12-*
  - split: chunk_13
    path: data/chunk_13-*
  - split: chunk_14
    path: data/chunk_14-*
  - split: chunk_15
    path: data/chunk_15-*
  - split: chunk_16
    path: data/chunk_16-*
  - split: chunk_17
    path: data/chunk_17-*
  - split: chunk_18
    path: data/chunk_18-*
  - split: chunk_19
    path: data/chunk_19-*
  - split: chunk_20
    path: data/chunk_20-*
  - split: chunk_21
    path: data/chunk_21-*
  - split: chunk_22
    path: data/chunk_22-*
  - split: chunk_23
    path: data/chunk_23-*
  - split: chunk_24
    path: data/chunk_24-*
  - split: chunk_25
    path: data/chunk_25-*
  - split: chunk_26
    path: data/chunk_26-*
  - split: chunk_27
    path: data/chunk_27-*
  - split: chunk_28
    path: data/chunk_28-*
  - split: chunk_29
    path: data/chunk_29-*
---
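
The split metadata above can be sanity-checked with a little arithmetic. The sketch below copies the `num_examples` values, `download_size`, and `dataset_size` from the card; the variable names are illustrative and not part of the dataset. It derives the bytes per example from `chunk_0` and checks that this figure, multiplied by the total example count, reproduces the listed `dataset_size`.

```python
# num_examples for chunk_0 .. chunk_29, copied in order from the card metadata.
NUM_EXAMPLES = [
    1000000, 1000515, 1000047, 1002769, 1000494, 1000335,
    1000250, 1001420, 1001182, 1001453, 1000077, 1000009,
    1000411, 1001286, 1001382, 1001020, 1003570, 1001234,
    1000823, 1003089, 1001795, 1000751, 1000476, 1000037,
    1000090, 1001064, 1001049, 1007426, 1000006, 1000856,
]

CHUNK_0_NUM_BYTES = 8196000000  # num_bytes of chunk_0 from the card
DOWNLOAD_SIZE = 112374241102    # download_size from the card
DATASET_SIZE = 246166171536     # dataset_size from the card

# chunk_0 holds exactly 1,000,000 examples, so the division is exact.
BYTES_PER_EXAMPLE = CHUNK_0_NUM_BYTES // NUM_EXAMPLES[0]

total_examples = sum(NUM_EXAMPLES)

# Ratio of the download size to the in-memory dataset size.
compression_ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(f"splits:             {len(NUM_EXAMPLES)}")
print(f"total examples:     {total_examples:,}")
print(f"bytes per example:  {BYTES_PER_EXAMPLE}")
print(f"size check passes:  {total_examples * BYTES_PER_EXAMPLE == DATASET_SIZE}")
print(f"download / dataset: {compression_ratio:.3f}")
```

By these figures the dataset holds roughly 30 million examples, and the download is a bit under half the size of the unpacked data.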
Provider:
mrochk
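
Because each split is listed at about 8 GB, it may be convenient to stream a single chunk instead of downloading everything. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository is reachable over the network; `split_name` and `preview` are hypothetical helper names introduced here for illustration.

```python
from itertools import islice

REPO_ID = "mrochk/opengenome-clean"
NUM_CHUNKS = 30  # the card lists splits chunk_0 through chunk_29


def split_name(index):
    """Return the split name for a chunk index, e.g. 3 -> 'chunk_3'."""
    if not 0 <= index < NUM_CHUNKS:
        raise ValueError(f"chunk index must be in [0, {NUM_CHUNKS}), got {index}")
    return f"chunk_{index}"


def preview(index, n=3):
    """Stream the first n 'text' values of one split without a full download."""
    # Imported lazily so the helpers above work without the library installed.
    from datasets import load_dataset

    stream = load_dataset(REPO_ID, split=split_name(index), streaming=True)
    return [row["text"] for row in islice(stream, n)]
```

Calling `preview(0)` would then fetch the first three records of `chunk_0`; the only feature the card declares is a string column named `text`.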



