alpaca-gpt4-data-zh-500

Name: alpaca-gpt4-data-zh-500
Creator: maas
Published: 2026-05-15 16:42:08
License: No description available

ModelScope community · Updated 2026-05-15 · Indexed 2025-11-03

Download link:

https://modelscope.cn/datasets/jackyan2025/alpaca-gpt4-data-zh-500

Resource overview:

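## Dataset description

This is a Chinese-language dataset generated by GPT-4, intended for instruction fine-tuning and reinforcement learning of LLMs.

### Loading the dataset

```python
from modelscope.msdatasets import MsDataset

# Load the predefined train split from the AI-ModelScope namespace.
ds = MsDataset.load("alpaca-gpt4-data-zh", namespace="AI-ModelScope", split="train")
# Print the first record to inspect its fields.
print(next(iter(ds)))
```

### Data splits

The data ships with a predefined `train` split.

## Copyright information

The dataset is open source under the CC BY-NC 4.0 license (non-commercial use only). If any of the license terms are violated, contact ModelScope at any time to have the data removed.

## Citation

```
@article{peng2023gpt4llm,
  title={Instruction Tuning with GPT-4},
  author={Baolin Peng and Chunyuan Li and Pengcheng He and Michel Galley and Jianfeng Gao},
  journal={arXiv preprint arXiv:2304.03277},
  year={2023}
}
```

## Reference links

```
https://huggingface.co/datasets/c-s-ale/alpaca-gpt4-data-zh
https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM
```

### Clone with HTTP

```bash
git clone https://www.modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-zh.git
```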
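
The dataset card does not list the record fields. As a minimal sketch, assuming each record follows the usual Alpaca layout with `instruction`, `input`, and `output` keys (an assumption, not something stated on this page), one record could be turned into an instruction-tuning prompt roughly as follows. The helper `format_alpaca_prompt`, the `### ...` section markers, and the sample record are all hypothetical and chosen only for illustration.

```python
from typing import Mapping, Optional


def format_alpaca_prompt(record: Mapping[str, Optional[str]]) -> str:
    """Join one Alpaca-style record into a single prompt string.

    Assumes the keys "instruction", "input" and "output"; the "input"
    section is skipped when it is missing or empty.
    """
    instruction = (record.get("instruction") or "").strip()
    context = (record.get("input") or "").strip()
    response = (record.get("output") or "").strip()

    parts = [f"### Instruction:\n{instruction}"]
    if context:
        parts.append(f"### Input:\n{context}")
    parts.append(f"### Response:\n{response}")
    return "\n\n".join(parts)


# Hypothetical record, made up for illustration (not taken from the dataset).
sample = {
    "instruction": "把下面的句子翻译成英文。",
    "input": "今天天气很好。",
    "output": "The weather is nice today.",
}
print(format_alpaca_prompt(sample))
```

If the loaded records expose different field names, only the three `record.get(...)` lookups need to change; printing the first record, as in the loading snippet, is the quickest way to check.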

Provider:

maas

Created:

2025-10-31

Collection summary

Dataset introduction