CCCCCC/SPaR

Name: CCCCCC/SPaR
Creator: CCCCCC
Published: 2024-10-30 07:52:17
License: No description available

Source: Hugging Face; updated 2024-10-30; indexed 2024-12-14

Download link:

https://hf-mirror.com/datasets/CCCCCC/SPaR
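
The link above can also be used programmatically. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the mirror is reachable through the `HF_ENDPOINT` environment variable. The helper names `dataset_url` and `load_spar` are hypothetical, and the available configuration names are not stated on this page, so `config_name` is left for the reader to fill in.

```python
import os

REPO_ID = "CCCCCC/SPaR"            # repository id taken from the listing
MIRROR = "https://hf-mirror.com"   # mirror host taken from the download link


def dataset_url(repo_id: str, endpoint: str = MIRROR) -> str:
    """Build the dataset page URL for a repository on the given endpoint."""
    return f"{endpoint.rstrip('/')}/datasets/{repo_id}"


def load_spar(config_name=None, endpoint: str = MIRROR):
    """Load the dataset through the mirror (hypothetical helper).

    HF_ENDPOINT is read when huggingface_hub is first imported, so it
    must be set before `datasets` is imported anywhere in the process.
    """
    os.environ.setdefault("HF_ENDPOINT", endpoint)
    from datasets import load_dataset  # imported late on purpose

    # config_name is an assumption: this page does not list the configs.
    return load_dataset(REPO_ID, config_name)


if __name__ == "__main__":
    print(dataset_url(REPO_ID))
```

If the repository defines several configurations (for example, separate SFT and DPO subsets), `datasets.get_dataset_config_names("CCCCCC/SPaR")` can be used to list them before calling `load_spar`.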

Description:

The SPaR dataset is designed to improve the instruction-following ability of language models through a self-play framework that generates high-quality preference pairs with minimal interfering factors. It contains an SFT dataset of 8,000 samples curated using gpt-4o-mini, along with DPO datasets derived from llama-3-8b-instruct and mistral-7b-instruct. It is intended primarily for instruction-following tasks, in particular for building foundational instruction-following capability and for preference learning. The data is mostly in English, and the original prompts come from the Infinity-Instruct dataset.

Provided by:

CCCCCC
