five

causal-lm/instructions-ko

收藏
Hugging Face2024-05-14 更新2024-03-04 收录
下载链接:
https://hf-mirror.com/datasets/causal-lm/instructions-ko
下载链接
链接失效反馈
官方服务:
资源简介:
--- language: ko dataset_info: features: - name: instruction dtype: string - name: input dtype: string - name: output dtype: string - name: dialogue list: - name: content dtype: string - name: role dtype: string splits: - name: train num_bytes: 138160314 num_examples: 112104 - name: validation num_bytes: 15418231 num_examples: 12429 download_size: 85992704 dataset_size: 153578545 configs: - config_name: default data_files: - split: train path: data/train-* - split: validation path: data/validation-* --- # Dataset Card for "instructions-ko" [More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

The dataset is in Korean language and includes four main features: instruction, input, output, and dialogue. The dialogue feature is a list containing sub-features content and role. The dataset is split into training and validation sets, with 112104 and 12429 examples respectively. The total download size of the dataset is 85992704 bytes, and the total size is 153578545 bytes. The dataset configuration is named default, with paths for training and validation data files.
提供机构:
causal-lm
原始信息汇总

数据集概述

数据集名称

  • 名称:instructions-ko

数据结构

  • 特征(Features)
    • instruction: 字符串类型
    • input: 字符串类型
    • output: 字符串类型
    • dialogue: 列表类型,包含:
      • content: 字符串类型
      • role: 字符串类型

数据分割(Splits)

  • 训练集(train)
    • 示例数量:112104
    • 数据大小:138160314字节
  • 验证集(validation)
    • 示例数量:12429
    • 数据大小:15418231字节

数据集大小

  • 下载大小:85992704字节
  • 数据集总大小:153578545字节

数据文件配置

  • 默认配置(default)
    • 训练集路径:data/train-*
    • 验证集路径:data/validation-*
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作