Aratako/magpie-sft-v1.0-dpo-judged
Source: Hugging Face. Updated 2024-12-15. Indexed 2024-12-21.
Download link:
https://hf-mirror.com/datasets/Aratako/magpie-sft-v1.0-dpo-judged
Description:
This is a Japanese preference dataset derived from [llm-jp/magpie-sft-v1.0](https://huggingface.co/datasets/llm-jp/magpie-sft-v1.0). Responses were regenerated with the in-development model [Aratako/Llama-Gemma-2-27b-SFT-trial1](https://huggingface.co/Aratako/Llama-Gemma-2-27b-SFT-trial1) and compared against the original dataset's responses, which come from [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct). The [google/gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it) model served as the judge: for each prompt, the response it rated better is labeled chosen and the other is labeled rejected. The dataset's features include id, prompt, chosen, and rejected. It is intended mainly for text generation tasks, and its language is Japanese.
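
The labeling step described above can be illustrated with a minimal sketch. It is not the author's actual pipeline: the function `build_preference_pair`, the `judge` callable and its "A"/"B" return convention, and the toy `longer_is_better` judge are all hypothetical stand-ins. In the real dataset the judging was done by google/gemma-2-27b-it, and the exact prompt and output format used for judging are not stated here. Only the output field names (id, prompt, chosen, rejected) come from the description.

```python
from typing import Callable, Dict


def build_preference_pair(
    record_id: str,
    prompt: str,
    original_response: str,
    regenerated_response: str,
    judge: Callable[[str, str, str], str],
) -> Dict[str, str]:
    """Label two candidate responses as chosen/rejected.

    `judge` is a hypothetical callable that returns "A" when the first
    response is better and "B" when the second one is better.
    """
    verdict = judge(prompt, original_response, regenerated_response)
    if verdict == "A":
        chosen, rejected = original_response, regenerated_response
    elif verdict == "B":
        chosen, rejected = regenerated_response, original_response
    else:
        raise ValueError(f"unexpected judge verdict: {verdict!r}")
    # Field names follow the features listed in the dataset description.
    return {
        "id": record_id,
        "prompt": prompt,
        "chosen": chosen,
        "rejected": rejected,
    }


def longer_is_better(prompt: str, a: str, b: str) -> str:
    """Toy judge for illustration only: prefers the longer response."""
    return "A" if len(a) >= len(b) else "B"


example = build_preference_pair(
    record_id="demo-0",
    prompt="日本の首都はどこですか?",
    original_response="東京です。",
    regenerated_response="日本の首都は東京です。",
    judge=longer_is_better,
)
print(example["chosen"])
```

Swapping `longer_is_better` for a function that queries an LLM judge would give the same record shape. How ties or unparseable verdicts were handled in the actual dataset is not described, so the sketch simply raises an error in that case.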
Provider:
Aratako



