causal-lm/instructions-ko
收藏Hugging Face2024-05-14 更新2024-03-04 收录
下载链接:
https://hf-mirror.com/datasets/causal-lm/instructions-ko
下载链接
链接失效反馈官方服务:
资源简介:
---
language: ko
dataset_info:
features:
- name: instruction
dtype: string
- name: input
dtype: string
- name: output
dtype: string
- name: dialogue
list:
- name: content
dtype: string
- name: role
dtype: string
splits:
- name: train
num_bytes: 138160314
num_examples: 112104
- name: validation
num_bytes: 15418231
num_examples: 12429
download_size: 85992704
dataset_size: 153578545
configs:
- config_name: default
data_files:
- split: train
path: data/train-*
- split: validation
path: data/validation-*
---
# Dataset Card for "instructions-ko"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
The dataset is in Korean language and includes four main features: instruction, input, output, and dialogue. The dialogue feature is a list containing sub-features content and role. The dataset is split into training and validation sets, with 112104 and 12429 examples respectively. The total download size of the dataset is 85992704 bytes, and the total size is 153578545 bytes. The dataset configuration is named default, with paths for training and validation data files.
提供机构:
causal-lm
原始信息汇总
数据集概述
数据集名称
- 名称:instructions-ko
数据结构
- 特征(Features):
- instruction: 字符串类型
- input: 字符串类型
- output: 字符串类型
- dialogue: 列表类型,包含:
- content: 字符串类型
- role: 字符串类型
数据分割(Splits)
- 训练集(train):
- 示例数量:112104
- 数据大小:138160314字节
- 验证集(validation):
- 示例数量:12429
- 数据大小:15418231字节
数据集大小
- 下载大小:85992704字节
- 数据集总大小:153578545字节
数据文件配置
- 默认配置(default):
- 训练集路径:data/train-*
- 验证集路径:data/validation-*



