MWilinski/hh-rlhf-harmless-base
Source: Hugging Face | Updated: 2025-11-05 | Indexed: 2025-11-15
Download link:
https://hf-mirror.com/datasets/MWilinski/hh-rlhf-harmless-base
Description:
HH-RLHF Harmless Base is a processed variant of the Anthropic HH-RLHF harmless-base data directory. It converts the raw transcript strings into the conversational format expected by TRL's preference and reward-model (RM) training pipelines. Human and assistant turns are preserved, and the chosen and rejected continuations are kept distinct. The dataset is split into train and test sets.
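The conversion described above can be illustrated with a minimal sketch. It is not the dataset author's actual script. It assumes the raw transcripts mark turns with "\n\nHuman:" and "\n\nAssistant:", that the chosen and rejected transcripts differ only in the final assistant turn, and that the target layout uses `prompt`, `chosen`, and `rejected` fields holding lists of role/content messages. The helper names `parse_transcript` and `to_preference_example` are hypothetical.

```python
import re

# Turn markers assumed for the raw HH-RLHF transcripts.
_TURN = re.compile(r"\n\n(Human|Assistant):")
# Mapping from raw speaker names to chat roles (an assumed convention).
_ROLE = {"Human": "user", "Assistant": "assistant"}


def parse_transcript(text):
    """Split a raw transcript into a list of {"role", "content"} messages."""
    # With one capture group, re.split returns
    # [preamble, speaker_1, content_1, speaker_2, content_2, ...].
    parts = _TURN.split(text)
    return [
        {"role": _ROLE[speaker], "content": content.strip()}
        for speaker, content in zip(parts[1::2], parts[2::2])
    ]


def to_preference_example(chosen_text, rejected_text):
    """Build one preference record from a chosen/rejected transcript pair.

    Assumes both transcripts share every turn except the last one.
    """
    chosen = parse_transcript(chosen_text)
    rejected = parse_transcript(rejected_text)
    return {
        "prompt": chosen[:-1],      # shared conversation history
        "chosen": chosen[-1:],      # preferred final assistant turn
        "rejected": rejected[-1:],  # dispreferred final assistant turn
    }


if __name__ == "__main__":
    example = to_preference_example(
        "\n\nHuman: Hi\n\nAssistant: Hello! How can I help?",
        "\n\nHuman: Hi\n\nAssistant: Go away.",
    )
    print(example)
```

The processed dataset itself should be loadable with `datasets.load_dataset("MWilinski/hh-rlhf-harmless-base")`. The column names it actually exposes are not stated here, so inspect a record before relying on the field layout assumed in this sketch.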
Provider:
MWilinski



