Aratako/iterative-dpo-data-for-SimPO-iter2
Source: Hugging Face | Updated: 2024-12-15 | Indexed: 2024-12-21
Download link:
https://hf-mirror.com/datasets/Aratako/iterative-dpo-data-for-SimPO-iter2
Summary:
This is a Japanese preference dataset built from synthetic instruction data. For each prompt, a model under development generated candidate responses, a separate model scored them, and the highest-scoring and lowest-scoring responses were kept as chosen and rejected. Fields include id, prompt, chosen, and rejected, among others; the dataset metadata also records the byte size and number of examples in the training split.
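The selection step described in the summary (keep the highest-scoring response as chosen and the lowest-scoring as rejected) can be sketched roughly as follows. This is a minimal illustration, not the author's actual pipeline: the function name `build_preference_pair`, the numeric score format, and the choice to skip prompts whose best and worst scores tie are all assumptions made for the example.

```python
from typing import Dict, List, Optional


def build_preference_pair(
    prompt: str, responses: List[str], scores: List[float]
) -> Optional[Dict[str, str]]:
    """Pick the highest- and lowest-scored responses for one prompt.

    Hypothetical helper: the real scoring scale and tie handling used
    for this dataset are not described in the summary.
    """
    if len(responses) != len(scores):
        raise ValueError("responses and scores must have the same length")
    if len(responses) < 2:
        return None  # a pair needs at least two candidates

    # Indices of the best and worst scored candidates.
    best = max(range(len(scores)), key=scores.__getitem__)
    worst = min(range(len(scores)), key=scores.__getitem__)

    # Assumption: a tie carries no preference signal, so skip it.
    if scores[best] == scores[worst]:
        return None

    return {
        "prompt": prompt,
        "chosen": responses[best],
        "rejected": responses[worst],
    }
```

Applied once per prompt, a helper like this yields records with the prompt, chosen, and rejected fields named in the summary; any additional fields in the real dataset would need to be carried along separately.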
Provider:
Aratako



