mlabonne/orca-agentinstruct-1M-v1-cleaned
Source: Hugging Face. Updated: 2025-01-25. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/mlabonne/orca-agentinstruct-1M-v1-cleaned
Description:
This is a cleaned version of the orca-agentinstruct-1M-v1 dataset released by Microsoft. The original orca-agentinstruct-1M-v1 is a fully synthetic dataset that uses only raw text publicly available on the web as seed data. It is a subset of the full AgentInstruct dataset (~25M samples) that was used to create Orca-3-Mistral. Compared to Mistral 7B Instruct, the authors claim improvements of 40% on AGIEval, 19% on MMLU, 54% on GSM8K, 38% on BBH, and 45% on AlpacaEval.
Provider:
mlabonne



