orionweller/dolma_20bn_no_instruct
Hugging Face, updated 2024-06-13, indexed 2024-06-29
Download link:
https://hf-mirror.com/datasets/orionweller/dolma_20bn_no_instruct
Resource description:
---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: text
    dtype: string
  - name: added
    dtype: string
  - name: created
    dtype: string
  - name: source
    dtype: string
  - name: original_shard_dir
    dtype: string
  - name: original_shard_idx
    dtype: int64
  - name: num_tokens
    dtype: int64
  splits:
  - name: shard_0
    num_bytes: 10003124149
    num_examples: 3134249
  - name: shard_1
    num_bytes: 10035960197
    num_examples: 2701189
  - name: shard_2
    num_bytes: 10010637072
    num_examples: 2767789
  - name: shard_3
    num_bytes: 10065278086
    num_examples: 2709704
  - name: shard_4
    num_bytes: 10052996422
    num_examples: 2910410
  - name: shard_5
    num_bytes: 10010888181
    num_examples: 3241524
  - name: shard_6
    num_bytes: 10034980761
    num_examples: 3610526
  - name: shard_7
    num_bytes: 10019688054
    num_examples: 3427831
  - name: shard_8
    num_bytes: 10047461424
    num_examples: 3192115
  - name: shard_9
    num_bytes: 10005163228
    num_examples: 3149657
  - name: shard_10
    num_bytes: 10025903890
    num_examples: 2702469
  - name: shard_11
    num_bytes: 10070926286
    num_examples: 5580676
  - name: shard_12
    num_bytes: 6505250163
    num_examples: 1807809
  download_size: 73438352410
  dataset_size: 126888257913
configs:
- config_name: default
  data_files:
  - split: shard_0
    path: data/shard_0-*
  - split: shard_1
    path: data/shard_1-*
  - split: shard_2
    path: data/shard_2-*
  - split: shard_3
    path: data/shard_3-*
  - split: shard_4
    path: data/shard_4-*
  - split: shard_5
    path: data/shard_5-*
  - split: shard_6
    path: data/shard_6-*
  - split: shard_7
    path: data/shard_7-*
  - split: shard_8
    path: data/shard_8-*
  - split: shard_9
    path: data/shard_9-*
  - split: shard_10
    path: data/shard_10-*
  - split: shard_11
    path: data/shard_11-*
  - split: shard_12
    path: data/shard_12-*
---
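The split metadata above can be checked and used from Python. The following is a minimal sketch, not the dataset author's own tooling: the names `SPLITS`, `DATASET_SIZE`, `total_bytes`, `total_examples`, and `stream_shard` are hypothetical helpers introduced here for illustration. It assumes the Hugging Face `datasets` library is installed for the streaming helper, and that the repository id matches the title of this page. The byte and example counts are copied from the card.

```python
# Split metadata copied from the dataset card: name -> (num_bytes, num_examples).
SPLITS = {
    "shard_0": (10003124149, 3134249),
    "shard_1": (10035960197, 2701189),
    "shard_2": (10010637072, 2767789),
    "shard_3": (10065278086, 2709704),
    "shard_4": (10052996422, 2910410),
    "shard_5": (10010888181, 3241524),
    "shard_6": (10034980761, 3610526),
    "shard_7": (10019688054, 3427831),
    "shard_8": (10047461424, 3192115),
    "shard_9": (10005163228, 3149657),
    "shard_10": (10025903890, 2702469),
    "shard_11": (10070926286, 5580676),
    "shard_12": (6505250163, 1807809),
}
DATASET_SIZE = 126888257913  # dataset_size field from the card


def total_bytes() -> int:
    """Sum num_bytes over all splits listed in the card."""
    return sum(num_bytes for num_bytes, _ in SPLITS.values())


def total_examples() -> int:
    """Sum num_examples over all splits listed in the card."""
    return sum(num_examples for _, num_examples in SPLITS.values())


def stream_shard(split: str = "shard_0",
                 repo_id: str = "orionweller/dolma_20bn_no_instruct"):
    """Return a lazily streamed split so the full download is not required.

    The import is local so the metadata checks run without `datasets` installed.
    """
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    from datasets import load_dataset
    return load_dataset(repo_id, split=split, streaming=True)


if __name__ == "__main__":
    # The per-split byte counts should add up to the card's dataset_size.
    assert total_bytes() == DATASET_SIZE
    print("splits:", len(SPLITS), "examples:", total_examples())
```

Streaming one split at a time is a reasonable default here because the card lists a `download_size` of roughly 73 GB. Each streamed record should expose the fields declared under `features`, such as `text`, `source`, and `num_tokens`.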
Provider:
orionweller



