Aratako/magpie-sft-v1.0-dpo-judged

Name: Aratako/magpie-sft-v1.0-dpo-judged
Creator: Aratako
Published: 2024-12-15 05:36:57
License: No description available

Source: Hugging Face. Updated 2024-12-15. Indexed 2024-12-21.

Download link:

https://hf-mirror.com/datasets/Aratako/magpie-sft-v1.0-dpo-judged

Description:

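This is a Japanese preference dataset derived from [llm-jp/magpie-sft-v1.0]. Responses were regenerated with the in-development model [Aratako/Llama-Gemma-2-27b-SFT-trial1] and judged by [google/gemma-2-27b-it], with the better response labeled chosen and the worse one labeled rejected. The dataset includes the features id, prompt, chosen, and rejected. It is mainly intended for text generation tasks, and its language is Japanese.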

This dataset is a Japanese preference dataset derived from [llm-jp/magpie-sft-v1.0](https://huggingface.co/datasets/llm-jp/magpie-sft-v1.0). It uses the in-development model [Aratako/Llama-Gemma-2-27b-SFT-trial1](https://huggingface.co/Aratako/Llama-Gemma-2-27b-SFT-trial1) to regenerate responses and compares them with the original dataset's responses, which come from [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct). The [google/gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it) model judges which of the two responses is better; the better response is labeled chosen and the inferior one rejected.
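
The labeling step described above can be illustrated with a minimal Python sketch. It is not the creator's actual pipeline. The function `build_preference_record`, the judge signature, and the stand-in judge `longer_is_better` are all hypothetical names introduced here. A real judge would prompt google/gemma-2-27b-it, which this sketch does not do. Storing each field as a plain string is also an assumption, since the listing gives only the field names id, prompt, chosen, and rejected.

```python
from typing import Callable, Dict

# Hypothetical judge signature: takes (prompt, response_a, response_b)
# and returns "a" or "b" to name the better response.
Judge = Callable[[str, str, str], str]


def build_preference_record(
    record_id: str,
    prompt: str,
    regenerated: str,
    original: str,
    judge: Judge,
) -> Dict[str, str]:
    """Turn two candidate responses into one chosen/rejected record."""
    verdict = judge(prompt, regenerated, original)
    if verdict == "a":
        chosen, rejected = regenerated, original
    elif verdict == "b":
        chosen, rejected = original, regenerated
    else:
        raise ValueError(f"unexpected judge verdict: {verdict!r}")
    # Field names follow the features listed for the dataset.
    return {
        "id": record_id,
        "prompt": prompt,
        "chosen": chosen,
        "rejected": rejected,
    }


def longer_is_better(prompt: str, a: str, b: str) -> str:
    """Stand-in judge for illustration only: prefers the longer response."""
    return "a" if len(a) >= len(b) else "b"


record = build_preference_record(
    record_id="example-0",
    prompt="日本の首都はどこですか？",
    regenerated="日本の首都は東京です。",
    original="東京",
    judge=longer_is_better,
)
print(record["chosen"])
```

Passing the judge in as a callable keeps the pairing logic separate from whichever model makes the comparison, so the stand-in can be replaced by a call to an actual judge model without changing the record format.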

Provider:

Aratako
