HydraLM/partitioned_v3_light
Source: Hugging Face. Updated 2023-08-01; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/HydraLM/partitioned_v3_light
Resource description:
---
configs:
- config_name: default
  data_files:
  - split: '0'
    path: data/0-*
  - split: '1'
    path: data/1-*
  - split: '2'
    path: data/2-*
  - split: '3'
    path: data/3-*
  - split: '4'
    path: data/4-*
  - split: '5'
    path: data/5-*
  - split: '6'
    path: data/6-*
  - split: '7'
    path: data/7-*
  - split: '8'
    path: data/8-*
  - split: '9'
    path: data/9-*
  - split: '10'
    path: data/10-*
  - split: '11'
    path: data/11-*
  - split: '12'
    path: data/12-*
  - split: '13'
    path: data/13-*
  - split: '14'
    path: data/14-*
  - split: '15'
    path: data/15-*
  - split: '16'
    path: data/16-*
  - split: '17'
    path: data/17-*
  - split: '18'
    path: data/18-*
  - split: '19'
    path: data/19-*
  - split: '20'
    path: data/20-*
  - split: '21'
    path: data/21-*
  - split: '22'
    path: data/22-*
  - split: '23'
    path: data/23-*
  - split: '24'
    path: data/24-*
  - split: '25'
    path: data/25-*
  - split: '26'
    path: data/26-*
  - split: '27'
    path: data/27-*
  - split: '28'
    path: data/28-*
  - split: '29'
    path: data/29-*
  - split: '30'
    path: data/30-*
  - split: '31'
    path: data/31-*
dataset_info:
  features:
  - name: conversation_id
    dtype: int64
  - name: dataset_id
    dtype: string
  - name: cluster_text
    dtype: string
  - name: unique_id
    dtype: string
  - name: cluster
    dtype: int64
  - name: id
    dtype: int64
  splits:
  - name: '0'
    num_bytes: 30992664
    num_examples: 16523
  - name: '1'
    num_bytes: 52095796
    num_examples: 16425
  - name: '2'
    num_bytes: 47561841
    num_examples: 25909
  - name: '3'
    num_bytes: 2815376
    num_examples: 5684
  - name: '4'
    num_bytes: 58605236
    num_examples: 21059
  - name: '5'
    num_bytes: 8155103
    num_examples: 6470
  - name: '6'
    num_bytes: 128701190
    num_examples: 24422
  - name: '7'
    num_bytes: 38130966
    num_examples: 26253
  - name: '8'
    num_bytes: 11186625
    num_examples: 15819
  - name: '9'
    num_bytes: 39419303
    num_examples: 14042
  - name: '10'
    num_bytes: 21521823
    num_examples: 7654
  - name: '11'
    num_bytes: 120962836
    num_examples: 23956
  - name: '12'
    num_bytes: 36300158
    num_examples: 14898
  - name: '13'
    num_bytes: 24926182
    num_examples: 23098
  - name: '14'
    num_bytes: 10550746
    num_examples: 10271
  - name: '15'
    num_bytes: 50092026
    num_examples: 24944
  - name: '16'
    num_bytes: 22094384
    num_examples: 10785
  - name: '17'
    num_bytes: 18684676
    num_examples: 14417
  - name: '18'
    num_bytes: 26827192
    num_examples: 32254
  - name: '19'
    num_bytes: 7490725
    num_examples: 10446
  - name: '20'
    num_bytes: 23774066
    num_examples: 40593
  - name: '21'
    num_bytes: 23942749
    num_examples: 17353
  - name: '22'
    num_bytes: 79104576
    num_examples: 47188
  - name: '23'
    num_bytes: 65591366
    num_examples: 15443
  - name: '24'
    num_bytes: 29085329
    num_examples: 10707
  - name: '25'
    num_bytes: 14869667
    num_examples: 9539
  - name: '26'
    num_bytes: 14156821
    num_examples: 16207
  - name: '27'
    num_bytes: 13720088
    num_examples: 5294
  - name: '28'
    num_bytes: 12888055
    num_examples: 16797
  - name: '29'
    num_bytes: 24111036
    num_examples: 9189
  - name: '30'
    num_bytes: 27279270
    num_examples: 41940
  - name: '31'
    num_bytes: 56129266
    num_examples: 24350
  download_size: 476510182
  dataset_size: 1141767137
---
# Dataset Card for "partitioned_v3_light"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provider:
HydraLM
Summary of original information
Dataset overview
Configuration
- config_name: default
- data_files: the dataset is divided into 32 subsets, each identified by its `split` parameter, with paths of the form `data/{split}-*`.
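The naming convention described here can be sketched in Python. The helpers `split_names`, `path_pattern`, and `load_split` are hypothetical names introduced for illustration. `load_split` assumes the Hugging Face `datasets` library is installed and that the repository is reachable under the ID shown in the download link; it is not called at import time.

```python
from typing import List

NUM_SPLITS = 32  # the card names its splits '0' through '31'


def split_names(n: int = NUM_SPLITS) -> List[str]:
    """Return split names as strings, matching the quoted keys in the card."""
    return [str(i) for i in range(n)]


def path_pattern(split: str) -> str:
    """Build the glob pattern the card associates with a split."""
    return f"data/{split}-*"


def load_split(split: str):
    """Load one split by name (assumes network access and `datasets`)."""
    from datasets import load_dataset  # lazy import: optional dependency
    return load_dataset("HydraLM/partitioned_v3_light", split=split)


if __name__ == "__main__":
    # Show the first few split names with their file patterns.
    for name in split_names()[:3]:
        print(name, path_pattern(name))
```

Because split names are strings rather than integers, passing `"7"` (not `7`) is the safer choice when selecting a split.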
Dataset information
- features:
  - conversation_id: int64
  - dataset_id: string
  - cluster_text: string
  - unique_id: string
  - cluster: int64
  - id: int64
- splits: the dataset contains 32 splits, each with its own `num_bytes` and `num_examples`.
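As a minimal sketch, the feature list can be turned into a record check. The column names and dtypes come from the card; the mapping of `int64` to Python `int` and `string` to `str`, the function name `schema_errors`, and the sample record values are assumptions made for illustration only.

```python
from typing import Any, Dict, List

# Column name -> Python type, mirroring the card's dtypes
# (assumed mapping: int64 -> int, string -> str).
FEATURES: Dict[str, type] = {
    "conversation_id": int,
    "dataset_id": str,
    "cluster_text": str,
    "unique_id": str,
    "cluster": int,
    "id": int,
}


def schema_errors(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the record fits."""
    errors = []
    for name, expected in FEATURES.items():
        if name not in record:
            errors.append(f"missing column: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(record[name], bool) or not isinstance(record[name], expected):
            errors.append(f"{name}: expected {expected.__name__}")
    for name in record:
        if name not in FEATURES:
            errors.append(f"unexpected column: {name}")
    return errors


if __name__ == "__main__":
    # Placeholder values, not taken from the dataset.
    sample = {
        "conversation_id": 1,
        "dataset_id": "example",
        "cluster_text": "some text",
        "unique_id": "example_1",
        "cluster": 0,
        "id": 1,
    }
    print(schema_errors(sample))
```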
Dataset size
- download_size: 476510182 bytes
- dataset_size: 1141767137 bytes
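To put the two figures in perspective, the short sketch below converts them to MiB and computes their ratio. It assumes the usual Hugging Face convention that `download_size` is the size of the files fetched and `dataset_size` is the size of the prepared dataset; the card itself does not define either term. The helper `to_mib` is a hypothetical name.

```python
DOWNLOAD_SIZE = 476_510_182    # bytes, from the card
DATASET_SIZE = 1_141_767_137   # bytes, from the card


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return n_bytes / 2**20


# Fraction of the prepared size that actually has to be downloaded.
ratio = DOWNLOAD_SIZE / DATASET_SIZE

if __name__ == "__main__":
    print(f"download: {to_mib(DOWNLOAD_SIZE):.1f} MiB")
    print(f"dataset:  {to_mib(DATASET_SIZE):.1f} MiB")
    print(f"download / dataset = {ratio:.3f}")
```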
Dataset details
Split details
| Split | num_bytes | num_examples |
|---|---|---|
| 0 | 30992664 | 16523 |
| 1 | 52095796 | 16425 |
| 2 | 47561841 | 25909 |
| 3 | 2815376 | 5684 |
| 4 | 58605236 | 21059 |
| 5 | 8155103 | 6470 |
| 6 | 128701190 | 24422 |
| 7 | 38130966 | 26253 |
| 8 | 11186625 | 15819 |
| 9 | 39419303 | 14042 |
| 10 | 21521823 | 7654 |
| 11 | 120962836 | 23956 |
| 12 | 36300158 | 14898 |
| 13 | 24926182 | 23098 |
| 14 | 10550746 | 10271 |
| 15 | 50092026 | 24944 |
| 16 | 22094384 | 10785 |
| 17 | 18684676 | 14417 |
| 18 | 26827192 | 32254 |
| 19 | 7490725 | 10446 |
| 20 | 23774066 | 40593 |
| 21 | 23942749 | 17353 |
| 22 | 79104576 | 47188 |
| 23 | 65591366 | 15443 |
| 24 | 29085329 | 10707 |
| 25 | 14869667 | 9539 |
| 26 | 14156821 | 16207 |
| 27 | 13720088 | 5294 |
| 28 | 12888055 | 16797 |
| 29 | 24111036 | 9189 |
| 30 | 27279270 | 41940 |
| 31 | 56129266 | 24350 |
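The table can be cross-checked against the reported `dataset_size`. The sketch below copies the rows verbatim and sums them; the variable and function names are hypothetical. Summing `num_bytes` over all 32 splits reproduces the `dataset_size` value of 1141767137 bytes.

```python
# (split, num_bytes, num_examples), copied from the table.
SPLITS = [
    (0, 30992664, 16523), (1, 52095796, 16425), (2, 47561841, 25909),
    (3, 2815376, 5684), (4, 58605236, 21059), (5, 8155103, 6470),
    (6, 128701190, 24422), (7, 38130966, 26253), (8, 11186625, 15819),
    (9, 39419303, 14042), (10, 21521823, 7654), (11, 120962836, 23956),
    (12, 36300158, 14898), (13, 24926182, 23098), (14, 10550746, 10271),
    (15, 50092026, 24944), (16, 22094384, 10785), (17, 18684676, 14417),
    (18, 26827192, 32254), (19, 7490725, 10446), (20, 23774066, 40593),
    (21, 23942749, 17353), (22, 79104576, 47188), (23, 65591366, 15443),
    (24, 29085329, 10707), (25, 14869667, 9539), (26, 14156821, 16207),
    (27, 13720088, 5294), (28, 12888055, 16797), (29, 24111036, 9189),
    (30, 27279270, 41940), (31, 56129266, 24350),
]
DATASET_SIZE = 1_141_767_137  # bytes, from the card

total_bytes = sum(b for _, b, _ in SPLITS)
total_examples = sum(n for _, _, n in SPLITS)


def bytes_per_example(split: int) -> float:
    """Average record size in bytes for one split."""
    _, num_bytes, num_examples = SPLITS[split]
    return num_bytes / num_examples


# Splits with the most bytes and the most examples, respectively.
largest_by_bytes = max(SPLITS, key=lambda row: row[1])[0]
largest_by_examples = max(SPLITS, key=lambda row: row[2])[0]

if __name__ == "__main__":
    print("bytes match dataset_size:", total_bytes == DATASET_SIZE)
    print("total examples:", total_examples)
    print("largest split by bytes:", largest_by_bytes)
    print("largest split by examples:", largest_by_examples)
```

The splits are far from uniform in size, so per-split averages such as `bytes_per_example` are more informative than a single dataset-wide mean.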



