
argilla/magpie-ultra-v1.0

Source: Hugging Face. Updated 2024-11-26. Indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/argilla/magpie-ultra-v1.0
Overview:
`magpie-ultra-v1.0` is a synthetically generated dataset for supervised fine-tuning, produced with the Llama 3.1 405B-Instruct model and other Llama models. It contains challenging instructions and responses across a variety of tasks, including coding and debugging, math, data analysis, creative writing, advice seeking, and brainstorming. The dataset is split into several subsets with different configurations, including the default set and filtered sets of longer and shorter conversations.

The data was generated with a Magpie pipeline that generates instructions, assesses their quality and difficulty, classifies conversations as safe or unsafe, scores and ranks them, and generates embeddings to ensure instruction diversity. Each record includes the conversation itself along with features such as difficulty, quality, and score. The README also describes how this version differs from its predecessor, magpie-ultra-v0.1: it is larger, more diverse, and includes multi-turn conversations.
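To make the role of the per-record features concrete, the following is a minimal sketch of how quality and safety annotations might be used to filter records and separate shorter from longer conversations. It is not the pipeline's actual code. The field names (`conversation`, `quality`, `safe`), the quality label values, and the message-count threshold are all assumptions chosen for illustration; the real dataset's schema and subset criteria should be checked against its README.

```python
from typing import Any

# Assumed ordering of quality labels, worst to best (hypothetical values).
QUALITY_ORDER = ["very poor", "poor", "average", "good", "excellent"]


def filter_records(
    records: list[dict[str, Any]],
    min_quality: str = "good",
    require_safe: bool = True,
) -> list[dict[str, Any]]:
    """Keep records whose quality label meets the threshold and,
    optionally, whose (assumed) `safe` flag is true."""
    threshold = QUALITY_ORDER.index(min_quality)
    kept = []
    for record in records:
        quality = record.get("quality")
        # Skip records with a missing or unrecognized quality label.
        if quality not in QUALITY_ORDER:
            continue
        if QUALITY_ORDER.index(quality) < threshold:
            continue
        if require_safe and not record.get("safe", False):
            continue
        kept.append(record)
    return kept


def split_by_length(
    records: list[dict[str, Any]],
    max_short_messages: int = 2,
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split records into (shorter, longer) conversations by message count.
    The message-count criterion is an assumption for this sketch."""
    shorter, longer = [], []
    for record in records:
        if len(record["conversation"]) <= max_short_messages:
            shorter.append(record)
        else:
            longer.append(record)
    return shorter, longer
```

With records loaded into a list of dictionaries, `split_by_length(filter_records(records))` would yield the two groups; the thresholds are parameters so they can be adjusted once the actual label values are known.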
Provider:
argilla