botp/shibing624_alpaca-zh
Source: Hugging Face. Updated 2024-05-29; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/botp/shibing624_alpaca-zh
Resource description:
---
dataset_info:
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 32150579
    num_examples: 48818
  download_size: 35100559
  dataset_size: 32150579
license: cc-by-4.0
language:
- zh
pretty_name: Instruction Tuning with GPT-4
size_categories:
- 10K<n<100K
task_categories:
- text-generation
tags:
- gpt
- alpaca
- fine-tune
- instruct-tune
- instruction
---
# Dataset Description
- **Project Page:** https://instruction-tuning-with-gpt-4.github.io
- **Repo:** https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM
- **Paper:** https://arxiv.org/abs/2304.03277
# Dataset Card for "alpaca-zh"
This dataset contains about 50,000 self-instruct examples generated with GPT-4, following the Alpaca method.
The data comes from https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM
It is the Chinese-language file from that repository: https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/blob/main/data/alpaca_gpt4_data_zh.json
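
As a minimal sketch of how a record might be consumed, the snippet below turns one example into a single prompt string. The `### Instruction` / `### Input` / `### Response` layout, the `build_prompt` helper, and the sample record are assumptions made for illustration; this card does not prescribe a prompt template. The `load_train_split` helper uses the repository id shown on this page and needs the `datasets` package plus network access, so it is defined but not called here.

```python
from typing import Dict


def build_prompt(record: Dict[str, str]) -> str:
    """Join `instruction` and the optional `input` into one prompt string.

    The "### ..." section layout is an assumption for illustration;
    the dataset card does not prescribe a prompt template.
    """
    instruction = record["instruction"].strip()
    context = record.get("input", "").strip()
    if context:
        return (
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context}\n\n"
            "### Response:\n"
        )
    # Records with an empty `input` skip the Input section entirely.
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


def load_train_split():
    """Load the train split from the Hugging Face Hub.

    Requires the `datasets` package and network access, so the import
    is kept local and this function is not called at import time.
    """
    from datasets import load_dataset

    return load_dataset("botp/shibing624_alpaca-zh", split="train")


# Made-up record with the three string fields listed in the card.
example = {
    "instruction": "Translate the phrase into English.",
    "input": "早上好",
    "output": "Good morning.",
}
prompt = build_prompt(example)
```

During fine-tuning, the `output` field would typically be appended after the `### Response:` marker as the target text.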
# Usage and License Notices
The data is intended and licensed for research use only. The dataset is CC BY NC 4.0 (allowing only non-commercial use) and models trained using the dataset should not be used outside of research purposes.
To train a model with the alpaca-zh dataset, see https://github.com/shibing624/textgen
# English Dataset
[Found here](https://huggingface.co/datasets/c-s-ale/alpaca-gpt4-data)
# Citation
```
@article{peng2023gpt4llm,
  title={Instruction Tuning with GPT-4},
  author={Baolin Peng and Chunyuan Li and Pengcheng He and Michel Galley and Jianfeng Gao},
  journal={arXiv preprint arXiv:2304.03277},
  year={2023}
}
```
Provider:
botp
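Summary of Original Information
Dataset Overview
Basic Information
- Name: Instruction Tuning with GPT-4
- Language: Chinese (zh)
- Size: 10K<n<100K
- Task category: text generation
- Tags: gpt, alpaca, fine-tune, instruct-tune, instruction
- License: CC-BY-4.0
Dataset Structure
- Features:
  - instruction: string
  - input: string
  - output: string
Dataset Splits
- Train:
  - Number of examples: 48818
  - Size in bytes: 32150579
Dataset Size
- Download size: 35100559 bytes
- Dataset size: 32150579 bytes
Dataset Usage
- Intended use: research
- License: CC BY NC 4.0 (non-commercial use only)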
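
The structure and size figures listed above can be checked with a short script. This is a sketch only: `schema_errors` is a hypothetical helper, not part of any library, and the two constants are copied from the train split metadata on this page.

```python
from typing import Any, Dict, List

# Field names as listed in the card's `features` section; all are strings.
EXPECTED_FIELDS = ("instruction", "input", "output")


def schema_errors(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the record conforms."""
    errors = []
    for field in EXPECTED_FIELDS:
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], str):
            errors.append(f"{field} is not a string")
    return errors


# Figures copied from the card's train split metadata.
NUM_BYTES = 32150579
NUM_EXAMPLES = 48818

# Average size of one example, roughly 659 bytes.
avg_bytes_per_example = NUM_BYTES / NUM_EXAMPLES
```

Running `schema_errors` over each record before training is a cheap way to catch rows with a missing or non-string field.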



