wisenut-nlp-team/llama_ko_chat
Source: Hugging Face. Updated: 2024-05-03. Indexed: 2024-06-12.
Download link:
https://hf-mirror.com/datasets/wisenut-nlp-team/llama_ko_chat
Description:
---
dataset_info:
- config_name: emotion
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 84166474
    num_examples: 28638
  download_size: 44411146
  dataset_size: 84166474
- config_name: knowledge
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 68556571
    num_examples: 16449
  download_size: 36636367
  dataset_size: 68556571
- config_name: multi
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 325699817
    num_examples: 206000
  download_size: 165748903
  dataset_size: 325699817
- config_name: persona
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 81616610
    num_examples: 28768
  download_size: 42397466
  dataset_size: 81616610
- config_name: sns
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 1134808019
    num_examples: 1707453
  download_size: 575737157
  dataset_size: 1134808019
configs:
- config_name: emotion
  data_files:
  - split: train
    path: emotion/train-*
- config_name: knowledge
  data_files:
  - split: train
    path: knowledge/train-*
- config_name: multi
  data_files:
  - split: train
    path: multi/train-*
- config_name: persona
  data_files:
  - split: train
    path: persona/train-*
- config_name: sns
  data_files:
  - split: train
    path: sns/train-*
---
## [한국어 멀티세션 대화 (Korean Multi-Session Dialogue)](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=71630)
- subset: multi
- length: 206k
## [SNS 데이터 고도화 (SNS Data Enhancement)](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=71343)
- subset: sns
- length: 1.71M
## [공감형 대화 (Empathetic Dialogue)](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=71305)
- subset: emotion
- length: 28.6k
## [지식검색 대화 (Knowledge-Search Dialogue)](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=71304)
- subset: knowledge
- length: 16.4k
## [페르소나 대화 (Persona Dialogue)](https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=data&dataSetSn=71302)
- subset: persona
- length: 28.8k
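The five subsets above share the same three string columns (`instruction`, `input`, `output`), so one loader and one formatter can cover all of them. The following is a minimal sketch, assuming the Hugging Face `datasets` package is installed and the repository is reachable. The helper names `load_subset` and `format_example`, and the blank-line prompt template, are hypothetical and not part of the dataset card.

```python
from typing import Dict

# Config names taken from the dataset card's front matter.
CONFIGS = ("emotion", "knowledge", "multi", "persona", "sns")


def load_subset(config: str, streaming: bool = True):
    """Load the train split of one subset from the Hugging Face Hub.

    Streaming is the default here (an assumption, not a card recommendation)
    because the `sns` subset alone is over 1 GB uncompressed.
    """
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    # Imported lazily so format_example works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(
        "wisenut-nlp-team/llama_ko_chat",
        config,
        split="train",
        streaming=streaming,
    )


def format_example(row: Dict[str, str]) -> str:
    """Join a row into one string using a hypothetical blank-line template.

    An empty `input` field is skipped so no stray blank section is emitted.
    """
    parts = [row["instruction"]]
    if row.get("input"):
        parts.append(row["input"])
    parts.append(row["output"])
    return "\n\n".join(parts)
```

For instance, `next(iter(load_subset("emotion")))` would fetch the first row of the `emotion` subset without downloading the whole split, and passing that row to `format_example` yields a single training string.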
Provider:
wisenut-nlp-team
Summary of original information
Dataset overview
1. Emotion dataset (emotion)
   - Features:
     - instruction: string
     - input: string
     - output: string
   - Train split:
     - Bytes: 84166474
     - Examples: 28638
   - Download size: 44411146 bytes
   - Dataset size: 84166474 bytes
2. Knowledge dataset (knowledge)
   - Features:
     - instruction: string
     - input: string
     - output: string
   - Train split:
     - Bytes: 68556571
     - Examples: 16449
   - Download size: 36636367 bytes
   - Dataset size: 68556571 bytes
3. Multi-session dataset (multi)
   - Features:
     - instruction: string
     - input: string
     - output: string
   - Train split:
     - Bytes: 325699817
     - Examples: 206000
   - Download size: 165748903 bytes
   - Dataset size: 325699817 bytes
4. Persona dataset (persona)
   - Features:
     - instruction: string
     - input: string
     - output: string
   - Train split:
     - Bytes: 81616610
     - Examples: 28768
   - Download size: 42397466 bytes
   - Dataset size: 81616610 bytes
5. Social network dataset (sns)
   - Features:
     - instruction: string
     - input: string
     - output: string
   - Train split:
     - Bytes: 1134808019
     - Examples: 1707453
   - Download size: 575737157 bytes
   - Dataset size: 1134808019 bytes
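The per-subset figures above can be totaled to estimate disk and bandwidth needs before downloading. The sketch below copies the numbers from the front matter into a plain dictionary; the `SPLITS` table layout and the `summarize` helper are illustrative names, not part of the dataset or of any library.

```python
# Train-split metadata copied from the dataset card's front matter.
SPLITS = {
    "emotion":   {"num_examples": 28638,   "num_bytes": 84166474,   "download_size": 44411146},
    "knowledge": {"num_examples": 16449,   "num_bytes": 68556571,   "download_size": 36636367},
    "multi":     {"num_examples": 206000,  "num_bytes": 325699817,  "download_size": 165748903},
    "persona":   {"num_examples": 28768,   "num_bytes": 81616610,   "download_size": 42397466},
    "sns":       {"num_examples": 1707453, "num_bytes": 1134808019, "download_size": 575737157},
}


def summarize(splits):
    """Return totals across subsets plus each subset's share of all examples."""
    total_examples = sum(s["num_examples"] for s in splits.values())
    total_bytes = sum(s["num_bytes"] for s in splits.values())
    total_download = sum(s["download_size"] for s in splits.values())
    return {
        "total_examples": total_examples,
        "total_bytes": total_bytes,
        "total_download": total_download,
        # Fraction of the uncompressed size that must actually be downloaded.
        "download_ratio": total_download / total_bytes,
        "example_share": {
            name: s["num_examples"] / total_examples for name, s in splits.items()
        },
    }


if __name__ == "__main__":
    summary = summarize(SPLITS)
    print(f"examples: {summary['total_examples']:,}")
    print(f"dataset size: {summary['total_bytes'] / 1e9:.2f} GB")
    print(f"download size: {summary['total_download'] / 1e9:.2f} GB")
    for name, share in summary["example_share"].items():
        print(f"{name}: {share:.1%} of examples")
```

The `sns` subset accounts for the large majority of examples, so any training mix that samples uniformly over rows will be dominated by it unless the subsets are reweighted.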



