brahmairesearch/OpenHermes-2.5-Formatted
收藏Hugging Face2024-09-02 更新2024-12-14 收录
下载链接:
https://hf-mirror.com/datasets/brahmairesearch/OpenHermes-2.5-Formatted
下载链接
链接失效反馈官方服务:
资源简介:
OpenHermes 2.5数据集是由Teknium创建的,主要用于生成训练内容,并添加了新的文本字段。它是Open Hermes 1数据集的延续,规模更大、多样性更高、质量更好,包含了100万条主要通过合成生成的指令和聊天样本。数据集结构遵循sharegpt格式,每个条目包含对话列表,每个对话包含角色和实际文本。数据集来源于多个开源数据集和自定义合成数据集,如Airoboros 2.2、CamelAI Domain Expert Datasets等。
The OpenHermes 2.5 dataset is a continuation of the Open Hermes 1 dataset, scaled up significantly, more diverse, and of higher quality, containing 1 million primarily synthetically generated instruction and chat samples. This dataset is a compilation of various open-source datasets and custom synthetic datasets, contributing to the significant advancements of state-of-the-art large language models (LLMs) over recent months. The dataset structure follows a sharegpt format, with each entry containing conversations and metadata such as source and category. The dataset has been integrated into Lilac, a data curation and exploration platform, and includes sources such as Airoboros 2.2, CamelAI Domain Expert Datasets, ChatBot Arena, and others.
提供机构:
brahmairesearch



