alexredna/oasst2_dpo_pairs
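Dataset Card for "oasst2_dpo_pairs"
Dataset Description
This dataset has been converted into a structure suitable for DPO training and can be used with the Alignment Handbook. The structure follows a scheme similar to HuggingFaceH4/ultrafeedback_binarized.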
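To make the structure concrete, here is a hand-built sketch of what a single record might look like. The field names are the ones returned by the conversion script in this card; every value (the prompt, the replies, the `tree-0001` id, the scores) is invented for illustration and is not taken from the dataset.

```python
# Hypothetical record: field names follow the conversion script in this card,
# all values are made up for illustration.
prompt = "What is the capital of France?"

chosen = [
    {"role": "user", "content": prompt},
    {"role": "assistant", "content": "The capital of France is Paris."},
]
rejected = [
    {"role": "user", "content": prompt},
    {"role": "assistant", "content": "I am not sure."},
]

record = {
    "prompt_id": "tree-0001",  # hypothetical message_tree_id
    "prompt": prompt,
    "messages": chosen,        # the script stores the chosen conversation here too
    "chosen": chosen,
    "rejected": rejected,
    "score_chosen": 2,         # number of replies minus rank, e.g. 2 - 0
    "score_rejected": 1,       # e.g. 2 - 1
    "lang": "en",
}

print(sorted(record))
```

Both `chosen` and `rejected` start with the same user turn and differ only in the assistant reply.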
Usage
To load the dataset, run:

    from datasets import load_dataset

    ds = load_dataset("alexredna/oasst2_dpo_pairs")
Languages
The base dataset was filtered to contain only German, English, Spanish, and French conversations.
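The conversion script applies this filter with pandas on the `lang` column. As a rough plain-Python sketch of the same idea, assuming each message is a dict with a `lang` code (the sample rows and the `keep_supported` helper are hypothetical):

```python
# Language codes kept by the filter: German, English, Spanish, French.
KEPT_LANGS = {"de", "en", "es", "fr"}


def keep_supported(rows):
    """Return only the rows whose 'lang' code is one of the kept languages."""
    return [row for row in rows if row["lang"] in KEPT_LANGS]


# Toy rows, invented for illustration.
rows = [
    {"text": "Hello", "lang": "en"},
    {"text": "Привет", "lang": "ru"},
    {"text": "Hallo", "lang": "de"},
]

filtered = keep_supported(rows)
print([row["lang"] for row in filtered])
```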
Dataset Creation
I used the following script to convert the oasst2 dataset:

    from datasets import Dataset, load_dataset
    import pandas as pd


    def build_tree(df):
        # Collect root messages and attach each reply to its direct parent.
        tree = {}
        message_dict = df.set_index("message_id").to_dict(orient="index")

        for message_id, message in message_dict.items():
            parent_id = message["parent_id"]
            if parent_id is None or pd.isna(parent_id):
                # Root message: the initial prompt of a conversation tree.
                tree[message_id] = message
                tree[message_id]["replies"] = []
            else:
                if parent_id in message_dict:
                    if "replies" not in message_dict[parent_id]:
                        message_dict[parent_id]["replies"] = []
                    message_dict[parent_id]["replies"].append(message)
        return tree


    def convert_for_dpo(entry):
        example = dict()
        example["system"] = ""

        prompt_id = entry["message_tree_id"]
        prompt = entry["text"]
        chosen = []
        rejected = []

        # The first two replies to the prompt form the chosen/rejected pair.
        chosen_reply = entry["replies"][0]
        rejected_reply = entry["replies"][1]

        # Score is the number of replies minus the reply's rank.
        score_chosen = len(entry["replies"]) - chosen_reply["rank"]
        score_rejected = len(entry["replies"]) - rejected_reply["rank"]

        chosen.append({"role": "user", "content": prompt})
        chosen.append({"role": "assistant", "content": entry["replies"][0]["text"]})

        rejected.append({"role": "user", "content": prompt})
        rejected.append({"role": "assistant", "content": entry["replies"][1]["text"]})

        return {
            "prompt_id": prompt_id,
            "prompt": prompt,
            "messages": chosen,
            "chosen": chosen,
            "rejected": rejected,
            "score_chosen": score_chosen,
            "score_rejected": score_rejected,
            "lang": entry["lang"],
        }


    oasst2 = load_dataset("OpenAssistant/oasst2")
    df = oasst2["train"].to_pandas()

    # Keep only English, German, Spanish, and French messages.
    df_multi = df.loc[df["lang"].isin(["en", "de", "es", "fr"])]
    tree = build_tree(df_multi)

    transformed_for_dpo = []
    for row in tree.values():
        try:
            transformed_for_dpo.append(convert_for_dpo(row))
        except Exception:
            # Raised, for example, when a prompt has fewer than two replies.
            print("row does not contain chosen or rejected values")

    df = pd.DataFrame.from_records(transformed_for_dpo)
    ds = Dataset.from_pandas(df)

    ds.push_to_hub("oasst2_dpo_pairs", token="<token>")
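To see what the conversion does without downloading anything, the following is a simplified plain-Python sketch of the same two steps (tree building, then pair extraction) on toy data. It is not the script above: it works on a list of dicts instead of a DataFrame, the messages are invented, and `to_pair` is a hypothetical, trimmed-down stand-in for `convert_for_dpo`.

```python
def build_tree(messages):
    """Group replies under their parent and return root messages keyed by id."""
    by_id = {m["message_id"]: dict(m) for m in messages}
    roots = {}
    for message_id, message in by_id.items():
        parent_id = message["parent_id"]
        if parent_id is None:
            message.setdefault("replies", [])
            roots[message_id] = message
        elif parent_id in by_id:
            by_id[parent_id].setdefault("replies", []).append(message)
    return roots


def to_pair(root):
    """Build one preference pair from the first two replies to a prompt."""
    replies = root["replies"]
    chosen, rejected = replies[0], replies[1]  # IndexError if fewer than two
    n = len(replies)
    return {
        "prompt": root["text"],
        "chosen": chosen["text"],
        "rejected": rejected["text"],
        "score_chosen": n - chosen["rank"],
        "score_rejected": n - rejected["rank"],
    }


# Toy messages, invented for illustration.
messages = [
    {"message_id": "p1", "parent_id": None, "text": "Name a prime number.", "rank": None},
    {"message_id": "a1", "parent_id": "p1", "text": "7 is prime.", "rank": 0},
    {"message_id": "a2", "parent_id": "p1", "text": "9 is prime.", "rank": 1},
    {"message_id": "p2", "parent_id": None, "text": "Hello?", "rank": None},
    {"message_id": "b1", "parent_id": "p2", "text": "Hi!", "rank": 0},
]

pairs = []
for root in build_tree(messages).values():
    try:
        pairs.append(to_pair(root))
    except IndexError:
        pass  # prompt has fewer than two replies, so no pair can be formed

print(pairs)
```

The second prompt has only one reply, so it is skipped, which mirrors the "row does not contain chosen or rejected values" case in the script. Note that only direct replies to the initial prompt are used, and the pair is taken from the first two replies in row order; the rank only feeds into the scores.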
Licensing Information
Citation Information
This dataset was converted from OpenAssistant/oasst2.



