LongAlpaca12k
Source: arXiv. Indexed: 2025-09-30
Download link:
https://huggingface.co/datasets/Open-Orca/OpenOrca
Description:
This is an existing long-context dataset, used to train models on sequences of up to 32k tokens. It also forms part of the initial data for both the first and second stages of instruction tuning. The associated task is instruction tuning.



