cognitivecomputations/mlabonne_orca-agentinstruct-1M-v1-cleaned-DolphinLabeled
Source: Hugging Face. Updated: 2025-01-05. Indexed: 2025-02-15.
Download link:
https://hf-mirror.com/datasets/cognitivecomputations/mlabonne_orca-agentinstruct-1M-v1-cleaned-DolphinLabeled
Description:
The orca-agentinstruct-1M-v1-cleaned dataset is part of the DolphinLabeled series, provided by Eric Hartford and Cognitive Computations. It is derived from the cleaned version of orca-agentinstruct-1M-v1, a synthetic dataset that Microsoft generated from raw text publicly available on the web. This version adds a flags column that labels different types of output and removes a small number of censored rows. The underlying AgentInstruct data was used to create the Orca-3-Mistral model, which outperformed Mistral 7B Instruct on several evaluation benchmarks.
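As a rough illustration of how the flags column might be used, the sketch below filters out rows whose flags mark unwanted output types. The listing does not document the column's layout, so the dict-of-booleans shape and the flag names (`refusal`, `unsolicited`, `nsfw`, `pii`, `disclaimer`) are assumptions, as are the helper `keep_unflagged` and the sample rows; check the dataset card for the actual schema.

```python
from typing import Iterable, Iterator

# Assumed flag names, not confirmed by this listing; verify against the dataset card.
ASSUMED_FLAGS = ("refusal", "unsolicited", "nsfw", "pii", "disclaimer")


def keep_unflagged(rows: Iterable[dict], exclude: tuple = ASSUMED_FLAGS) -> Iterator[dict]:
    """Yield only rows in which none of the excluded flags is set.

    Assumes each row carries a "flags" mapping of flag name -> bool.
    Rows without a "flags" entry are treated as unflagged and kept.
    """
    for row in rows:
        flags = row.get("flags") or {}
        if not any(flags.get(name, False) for name in exclude):
            yield row


# Hypothetical rows standing in for real dataset records.
sample = [
    {"id": 1, "flags": {"refusal": False, "disclaimer": False}},
    {"id": 2, "flags": {"refusal": True, "disclaimer": False}},
    {"id": 3, "flags": {"refusal": False, "disclaimer": True}},
]

kept_ids = [row["id"] for row in keep_unflagged(sample, exclude=("refusal",))]
print(kept_ids)  # [1, 3]
```

The same predicate could be applied to records loaded from the download link above, for example by iterating over a split obtained with the Hugging Face `datasets` library, provided the real column matches the assumed shape.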
Provider:
cognitivecomputations



