five

ultrafeedback-binarized-preferences-cleaned

收藏
魔搭社区2025-10-09 更新2025-03-22 收录
下载链接:
https://modelscope.cn/datasets/mlabonne/ultrafeedback-binarized-preferences-cleaned
下载链接
链接失效反馈
官方服务:
资源简介:
# ultrafeedback-binarized-preferences-cleaned This is a DPO dataset based on [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned). It implements the following features: * **Intel format**: you can directly use this dataset in Axolotl with "type: chatml.intel" * **Filter out low scores**: removed samples with delta scores assistant "):-len(" ")] # Format rejected answer rejected = tokenizer.apply_chat_template(example['rejected'], tokenize=False, add_generation_prompt=False)[len(input) + len("assistant "):-len(" ")] # Calculate score difference delta_score = abs(example['chosen-rating'] - example['rejected-rating']) return { "question": example['prompt'], "chosen": chosen, "rejected": rejected, "delta_score": delta_score, } # Load tokenizer (chatml format) model_name = "mlabonne/NeuralHermes-2.5-Mistral-7B" tokenizer = AutoTokenizer.from_pretrained(model_name) # Format dataset dataset_chatml = dataset.map( chatml_format, remove_columns=['chosen-rating', 'rejected-rating', 'chosen-model', 'rejected-model', 'source', 'prompt'], ) # Remove low delta scores dataset_chatml = dataset_chatml.filter(lambda x: x["delta_score"] > 1.0) # Sort the dataset by the 'delta_score' column in descending order dataset_chatml = dataset_chatml.sort('delta_score', reverse=True) pd.DataFrame(dataset_chatml['train']).iloc[10:20] ```

# ultrafeedback-binarized-preferences-cleaned 本数据集为基于[argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned)构建的直接偏好优化(Direct Preference Optimization,DPO)数据集,具备以下特性: * **Intel格式**:可直接在Axolotl中结合`"type: chatml.intel"`配置使用该数据集 * **过滤低得分差值样本**:移除得分差值过低的样本 # 格式化拒绝候选回答 rejected = tokenizer.apply_chat_template(example['rejected'], tokenize=False, add_generation_prompt=False)[len(input) + len("assistant "):-len(" ")] # 计算得分差值 delta_score = abs(example['chosen-rating'] - example['rejected-rating']) return { "question": example['prompt'], "chosen": chosen, "rejected": rejected, "delta_score": delta_score, } # 加载ChatML格式分词器 model_name = "mlabonne/NeuralHermes-2.5-Mistral-7B" tokenizer = AutoTokenizer.from_pretrained(model_name) # 格式化数据集 dataset_chatml = dataset.map( chatml_format, remove_columns=['chosen-rating', 'rejected-rating', 'chosen-model', 'rejected-model', 'source', 'prompt'], ) # 移除低得分差值样本 dataset_chatml = dataset_chatml.filter(lambda x: x["delta_score"] > 1.0) # 按'delta_score'列降序排序数据集 dataset_chatml = dataset_chatml.sort('delta_score', reverse=True) pd.DataFrame(dataset_chatml['train']).iloc[10:20]
提供机构:
maas
创建时间:
2025-03-18
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作