causal-rewards/sycophancy_dpo_llama3.1_8b_ultrachat200k_iter1_new
收藏Hugging Face2025-09-21 更新2025-10-18 收录
下载链接:
https://hf-mirror.com/datasets/causal-rewards/sycophancy_dpo_llama3.1_8b_ultrachat200k_iter1_new
下载链接
链接失效反馈官方服务:
资源简介:
这是一个包含两个主要字段chosen和rejected的数据集,每个字段下都有内容和角色两个信息,所有信息都是以字符串形式存储。数据集被划分为训练集train,共有847个示例,数据集大小为6504643字节。
This dataset includes two main fields, chosen and rejected, each with content and role information stored as strings. The dataset is split into a training set called train, which contains 847 examples and is 6504643 bytes in size.
提供机构:
causal-rewards



