
HydraLM/partitioned_v3_light

Source: Hugging Face. Updated 2023-08-01. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/HydraLM/partitioned_v3_light
Resource description:
---
configs:
- config_name: default
  data_files:
  - split: '0'
    path: data/0-*
  - split: '1'
    path: data/1-*
  - split: '2'
    path: data/2-*
  - split: '3'
    path: data/3-*
  - split: '4'
    path: data/4-*
  - split: '5'
    path: data/5-*
  - split: '6'
    path: data/6-*
  - split: '7'
    path: data/7-*
  - split: '8'
    path: data/8-*
  - split: '9'
    path: data/9-*
  - split: '10'
    path: data/10-*
  - split: '11'
    path: data/11-*
  - split: '12'
    path: data/12-*
  - split: '13'
    path: data/13-*
  - split: '14'
    path: data/14-*
  - split: '15'
    path: data/15-*
  - split: '16'
    path: data/16-*
  - split: '17'
    path: data/17-*
  - split: '18'
    path: data/18-*
  - split: '19'
    path: data/19-*
  - split: '20'
    path: data/20-*
  - split: '21'
    path: data/21-*
  - split: '22'
    path: data/22-*
  - split: '23'
    path: data/23-*
  - split: '24'
    path: data/24-*
  - split: '25'
    path: data/25-*
  - split: '26'
    path: data/26-*
  - split: '27'
    path: data/27-*
  - split: '28'
    path: data/28-*
  - split: '29'
    path: data/29-*
  - split: '30'
    path: data/30-*
  - split: '31'
    path: data/31-*
dataset_info:
  features:
  - name: conversation_id
    dtype: int64
  - name: dataset_id
    dtype: string
  - name: cluster_text
    dtype: string
  - name: unique_id
    dtype: string
  - name: cluster
    dtype: int64
  - name: id
    dtype: int64
  splits:
  - name: '0'
    num_bytes: 30992664
    num_examples: 16523
  - name: '1'
    num_bytes: 52095796
    num_examples: 16425
  - name: '2'
    num_bytes: 47561841
    num_examples: 25909
  - name: '3'
    num_bytes: 2815376
    num_examples: 5684
  - name: '4'
    num_bytes: 58605236
    num_examples: 21059
  - name: '5'
    num_bytes: 8155103
    num_examples: 6470
  - name: '6'
    num_bytes: 128701190
    num_examples: 24422
  - name: '7'
    num_bytes: 38130966
    num_examples: 26253
  - name: '8'
    num_bytes: 11186625
    num_examples: 15819
  - name: '9'
    num_bytes: 39419303
    num_examples: 14042
  - name: '10'
    num_bytes: 21521823
    num_examples: 7654
  - name: '11'
    num_bytes: 120962836
    num_examples: 23956
  - name: '12'
    num_bytes: 36300158
    num_examples: 14898
  - name: '13'
    num_bytes: 24926182
    num_examples: 23098
  - name: '14'
    num_bytes: 10550746
    num_examples: 10271
  - name: '15'
    num_bytes: 50092026
    num_examples: 24944
  - name: '16'
    num_bytes: 22094384
    num_examples: 10785
  - name: '17'
    num_bytes: 18684676
    num_examples: 14417
  - name: '18'
    num_bytes: 26827192
    num_examples: 32254
  - name: '19'
    num_bytes: 7490725
    num_examples: 10446
  - name: '20'
    num_bytes: 23774066
    num_examples: 40593
  - name: '21'
    num_bytes: 23942749
    num_examples: 17353
  - name: '22'
    num_bytes: 79104576
    num_examples: 47188
  - name: '23'
    num_bytes: 65591366
    num_examples: 15443
  - name: '24'
    num_bytes: 29085329
    num_examples: 10707
  - name: '25'
    num_bytes: 14869667
    num_examples: 9539
  - name: '26'
    num_bytes: 14156821
    num_examples: 16207
  - name: '27'
    num_bytes: 13720088
    num_examples: 5294
  - name: '28'
    num_bytes: 12888055
    num_examples: 16797
  - name: '29'
    num_bytes: 24111036
    num_examples: 9189
  - name: '30'
    num_bytes: 27279270
    num_examples: 41940
  - name: '31'
    num_bytes: 56129266
    num_examples: 24350
  download_size: 476510182
  dataset_size: 1141767137
---

# Dataset Card for "partitioned_v3_light"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provider:
HydraLM
Summary of original information

Dataset overview

Configuration

  • config_name: default
  • data_files:
    • The dataset is divided into 32 subsets, each identified by the split parameter, with paths of the form data/{split}-*
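
The split-to-path convention above can be sketched in a few lines of Python. This is only an illustration of the data/{split}-* pattern from the card; the helper name `split_data_files` is hypothetical and not part of any library.

```python
from typing import Dict

NUM_SPLITS = 32  # the card lists splits '0' through '31'


def split_data_files(num_splits: int = NUM_SPLITS) -> Dict[str, str]:
    """Map each split name to its glob pattern, data/{split}-*.

    Hypothetical helper: it only reproduces the naming pattern on the card.
    """
    return {str(i): f"data/{i}-*" for i in range(num_splits)}


data_files = split_data_files()
print(data_files["0"])
print(data_files["31"])
print(len(data_files))

# Assumption: with the `datasets` library installed and network access,
# a single partition could probably be loaded by its split name, e.g.
#   from datasets import load_dataset
#   ds = load_dataset("HydraLM/partitioned_v3_light", split="7")
```

Note that the split names are strings ('0' to '31'), not integers, so they should be quoted when passed as a split argument.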

Dataset information

  • features:

    • conversation_id: int64
    • dataset_id: string
    • cluster_text: string
    • unique_id: string
    • cluster: int64
    • id: int64
  • splits:

    • The dataset contains 32 splits, each with its own num_bytes and num_examples
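
As a rough sketch of how the six listed features might be checked in code, the snippet below maps each dtype to a Python type (int64 to `int`, string to `str`) and validates a record against it. The function name `schema_errors` and the sample record are hypothetical; the card does not show any actual rows.

```python
from typing import Any, Dict, List

# Field names and dtypes as listed on the card (int64 -> int, string -> str).
SCHEMA: Dict[str, type] = {
    "conversation_id": int,
    "dataset_id": str,
    "cluster_text": str,
    "unique_id": str,
    "cluster": int,
    "id": int,
}


def schema_errors(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one record (empty if it conforms)."""
    errors = []
    for name, expected in SCHEMA.items():
        if name not in record:
            errors.append(f"missing field: {name}")
            continue
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"{name}: expected {expected.__name__}")
    for name in record:
        if name not in SCHEMA:
            errors.append(f"unexpected field: {name}")
    return errors


# Hypothetical record with made-up values, for illustration only.
sample = {
    "conversation_id": 1,
    "dataset_id": "example_source",
    "cluster_text": "example text",
    "unique_id": "example_source_1",
    "cluster": 7,
    "id": 0,
}
print(schema_errors(sample))
```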

Dataset size

  • download_size: 476510182 bytes
  • dataset_size: 1141767137 bytes
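
To put the two figures in perspective, the short calculation below converts them to MiB and computes their ratio. It assumes the usual Hugging Face meaning of the fields (download_size as the size of the files fetched, dataset_size as the size of the prepared dataset), which the card itself does not spell out.

```python
DOWNLOAD_SIZE = 476_510_182    # bytes, from the card
DATASET_SIZE = 1_141_767_137   # bytes, from the card


def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return num_bytes / 2**20


# Fraction of the prepared size that has to be downloaded.
ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(f"download_size: {to_mib(DOWNLOAD_SIZE):.1f} MiB")
print(f"dataset_size:  {to_mib(DATASET_SIZE):.1f} MiB")
print(f"download / dataset ratio: {ratio:.3f}")
```

The download is roughly 42% of the prepared size, so about 2.4 times as much disk space as the download should be budgeted for the loaded data.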

Dataset details

Split details

Split num_bytes num_examples
0 30992664 16523
1 52095796 16425
2 47561841 25909
3 2815376 5684
4 58605236 21059
5 8155103 6470
6 128701190 24422
7 38130966 26253
8 11186625 15819
9 39419303 14042
10 21521823 7654
11 120962836 23956
12 36300158 14898
13 24926182 23098
14 10550746 10271
15 50092026 24944
16 22094384 10785
17 18684676 14417
18 26827192 32254
19 7490725 10446
20 23774066 40593
21 23942749 17353
22 79104576 47188
23 65591366 15443
24 29085329 10707
25 14869667 9539
26 14156821 16207
27 13720088 5294
28 12888055 16797
29 24111036 9189
30 27279270 41940
31 56129266 24350
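
The table above can be cross-checked with a small script. The sketch below parses the rows exactly as listed, totals both columns, and finds the largest and smallest splits by example count; the variable and function names are illustrative only.

```python
from typing import Dict, Tuple

# Rows copied from the split table: split, num_bytes, num_examples.
SPLIT_TABLE = """\
0 30992664 16523
1 52095796 16425
2 47561841 25909
3 2815376 5684
4 58605236 21059
5 8155103 6470
6 128701190 24422
7 38130966 26253
8 11186625 15819
9 39419303 14042
10 21521823 7654
11 120962836 23956
12 36300158 14898
13 24926182 23098
14 10550746 10271
15 50092026 24944
16 22094384 10785
17 18684676 14417
18 26827192 32254
19 7490725 10446
20 23774066 40593
21 23942749 17353
22 79104576 47188
23 65591366 15443
24 29085329 10707
25 14869667 9539
26 14156821 16207
27 13720088 5294
28 12888055 16797
29 24111036 9189
30 27279270 41940
31 56129266 24350
"""


def parse_split_table(text: str) -> Dict[str, Tuple[int, int]]:
    """Parse whitespace-separated rows into {split: (num_bytes, num_examples)}."""
    rows = {}
    for line in text.strip().splitlines():
        name, num_bytes, num_examples = line.split()
        rows[name] = (int(num_bytes), int(num_examples))
    return rows


splits = parse_split_table(SPLIT_TABLE)
total_bytes = sum(b for b, _ in splits.values())
total_examples = sum(n for _, n in splits.values())
largest = max(splits, key=lambda s: splits[s][1])
smallest = min(splits, key=lambda s: splits[s][1])

print(f"splits: {len(splits)}")
print(f"total num_bytes: {total_bytes}")
print(f"total num_examples: {total_examples}")
print(f"largest split by examples: {largest} ({splits[largest][1]})")
print(f"smallest split by examples: {smallest} ({splits[smallest][1]})")
```

The per-split num_bytes values sum to the reported dataset_size of 1141767137 bytes. The splits are far from balanced: split 22 holds 47188 examples, while split 27 holds 5294.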
