five

causal-rewards/sycophancy_dpo_llama3.1_8b_ultrachat200k_iter1_new

收藏
Hugging Face2025-09-21 更新2025-10-18 收录
下载链接:
https://hf-mirror.com/datasets/causal-rewards/sycophancy_dpo_llama3.1_8b_ultrachat200k_iter1_new
下载链接
链接失效反馈
官方服务:
资源简介:
这是一个包含两个主要字段chosen和rejected的数据集,每个字段下都有内容和角色两个信息,所有信息都是以字符串形式存储。数据集被划分为训练集train,共有847个示例,数据集大小为6504643字节。

This dataset includes two main fields, chosen and rejected, each with content and role information stored as strings. The dataset is split into a training set called train, which contains 847 examples and is 6504643 bytes in size.
提供机构:
causal-rewards
二维码
社区交流群
二维码
科研交流群
商业服务