
wisenut-nlp-team/llama_ko_chat

Source: Hugging Face. Updated 2024-05-03; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/wisenut-nlp-team/llama_ko_chat
Description:
---
dataset_info:
- config_name: emotion
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 84166474
    num_examples: 28638
  download_size: 44411146
  dataset_size: 84166474
- config_name: knowledge
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 68556571
    num_examples: 16449
  download_size: 36636367
  dataset_size: 68556571
- config_name: multi
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 325699817
    num_examples: 206000
  download_size: 165748903
  dataset_size: 325699817
- config_name: persona
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 81616610
    num_examples: 28768
  download_size: 42397466
  dataset_size: 81616610
- config_name: sns
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 1134808019
    num_examples: 1707453
  download_size: 575737157
  dataset_size: 1134808019
configs:
- config_name: emotion
  data_files:
  - split: train
    path: emotion/train-*
- config_name: knowledge
  data_files:
  - split: train
    path: knowledge/train-*
- config_name: multi
  data_files:
  - split: train
    path: multi/train-*
- config_name: persona
  data_files:
  - split: train
    path: persona/train-*
- config_name: sns
  data_files:
  - split: train
    path: sns/train-*
---

## [한국어 멀티세션 대화](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=71630) (Korean multi-session dialogue)
- subset: multi
- length: 206k

## [SNS 데이터 고도화](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=71343) (SNS data enhancement)
- subset: sns
- length: 1.71M

## [공감형 대화](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=71305) (Empathetic dialogue)
- subset: emotion
- length: 28.6k

## [지식검색 대화](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=71304) (Knowledge-retrieval dialogue)
- subset: knowledge
- length: 16.4k

## [페르소나 대화](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=71302) (Persona dialogue)
- subset: persona
- length: 28.8k
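
Each subset listed in the card is a separate config with a single `train` split, so it can be loaded by name with the Hugging Face `datasets` library. The following is a minimal sketch, assuming `datasets` is installed and the repository is reachable on the Hub. The helper name `load_config` and the constants are hypothetical; the config names and repository ID come from the card.

```python
# Config names as listed in the dataset card's `configs` section.
CONFIGS = ("emotion", "knowledge", "multi", "persona", "sns")
REPO_ID = "wisenut-nlp-team/llama_ko_chat"


def load_config(name: str, streaming: bool = False):
    """Load the train split of one config (hypothetical helper).

    Requires the `datasets` package and network access to the Hub.
    """
    if name not in CONFIGS:
        raise ValueError(f"unknown config {name!r}; expected one of {CONFIGS}")
    # Imported lazily so the name check works without `datasets` installed.
    from datasets import load_dataset

    # With streaming=True, rows are fetched on demand instead of
    # downloading the whole config first.
    return load_dataset(REPO_ID, name, split="train", streaming=streaming)
```

For the `sns` config, which is by far the largest, calling `load_config("sns", streaming=True)` and iterating over the result is likely the more practical choice; the smaller configs can be loaded without streaming.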
Provider:
wisenut-nlp-team
Summary of the original information

Dataset overview

1. Emotion dataset (emotion)

  • Features:
    • instruction: string
    • input: string
    • output: string
  • Train split:
    • Bytes: 84166474
    • Examples: 28638
    • Download size: 44411146 bytes
    • Dataset size: 84166474 bytes

2. Knowledge dataset (knowledge)

  • Features:
    • instruction: string
    • input: string
    • output: string
  • Train split:
    • Bytes: 68556571
    • Examples: 16449
    • Download size: 36636367 bytes
    • Dataset size: 68556571 bytes

3. Multi-session dialogue dataset (multi)

  • Features:
    • instruction: string
    • input: string
    • output: string
  • Train split:
    • Bytes: 325699817
    • Examples: 206000
    • Download size: 165748903 bytes
    • Dataset size: 325699817 bytes

4. Persona dataset (persona)

  • Features:
    • instruction: string
    • input: string
    • output: string
  • Train split:
    • Bytes: 81616610
    • Examples: 28768
    • Download size: 42397466 bytes
    • Dataset size: 81616610 bytes

5. Social network dataset (sns)

  • Features:
    • instruction: string
    • input: string
    • output: string
  • Train split:
    • Bytes: 1134808019
    • Examples: 1707453
    • Download size: 575737157 bytes
    • Dataset size: 1134808019 bytes
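
The per-config figures above can be combined to estimate the footprint of the whole collection. The sketch below only does arithmetic on the numbers listed in the card; the names `STATS`, `totals`, and `avg_bytes_per_example` are hypothetical and introduced here for illustration.

```python
# (num_examples, dataset_size in bytes, download_size in bytes) per config,
# copied from the dataset card.
STATS = {
    "emotion": (28_638, 84_166_474, 44_411_146),
    "knowledge": (16_449, 68_556_571, 36_636_367),
    "multi": (206_000, 325_699_817, 165_748_903),
    "persona": (28_768, 81_616_610, 42_397_466),
    "sns": (1_707_453, 1_134_808_019, 575_737_157),
}


def totals(stats):
    """Sum examples, dataset size, and download size over all configs."""
    examples = sum(s[0] for s in stats.values())
    size = sum(s[1] for s in stats.values())
    download = sum(s[2] for s in stats.values())
    return examples, size, download


def avg_bytes_per_example(stats):
    """Average uncompressed bytes per example for each config."""
    return {name: size / n for name, (n, size, _) in stats.items()}


total_examples, total_size, total_download = totals(STATS)
print(total_examples, total_size, total_download)
# prints: 1987308 1694847491 864931039
```

In total the five configs hold about 1.99 million examples, roughly 1.69 GB on disk and 0.86 GB to download. The `sns` config accounts for most of the rows, while its average example is the smallest in bytes.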