KHuss/hh-rlhf-formatted
Source: Hugging Face. Updated: 2024-11-10. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/KHuss/hh-rlhf-formatted
Description:
This is a reformatted version of Anthropic's hh-rlhf dataset, which is originally available on Hugging Face. An RLHF dataset is generally defined as a collection of triples D = {(x, y_1, y_2)_n}, where x is the prompt, y_1 is the chosen response, and y_2 is the rejected response. The original dataset provides only two columns, chosen = x + y_1 and rejected = x + y_2. To accommodate various RLHF setups, this dataset keeps the original chosen and rejected columns and adds x, y1, and y2 columns, where x is the longest common prefix of chosen and rejected, so y1 and y2 are what remains of chosen and rejected after x.
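The split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the dataset author's actual script: the helper name `split_pair` and the sample dialogue are hypothetical, and the prefix is taken character by character, which may differ from how the published columns were produced.

```python
from os.path import commonprefix


def split_pair(chosen: str, rejected: str) -> dict:
    """Split a (chosen, rejected) pair into prompt x and responses y1, y2.

    Hypothetical helper: x is the character-level longest common prefix,
    and y1 / y2 are whatever follows x in each string.
    """
    x = commonprefix([chosen, rejected])
    return {"x": x, "y1": chosen[len(x):], "y2": rejected[len(x):]}


# Made-up sample in the Human/Assistant dialogue style of hh-rlhf.
example = split_pair(
    "\n\nHuman: Hi\n\nAssistant: Hello there.",
    "\n\nHuman: Hi\n\nAssistant: Go away.",
)
print(repr(example["x"]))
print(repr(example["y1"]))
print(repr(example["y2"]))
```

One caveat of a character-level prefix: if both responses happen to start with the same characters, those characters end up in x rather than in y1 and y2. A setup that needs x to end exactly at a turn boundary would have to trim the prefix back to the last turn marker.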
Provider:
KHuss



