chatml-OpenHermes2.5-dpo-binarized-alpha

ModelScope community · Updated 2025-11-12 · Indexed 2025-03-22
Download link:
https://modelscope.cn/datasets/mlabonne/chatml-OpenHermes2.5-dpo-binarized-alpha
Description:
# chatml-OpenHermes2.5-dpo-binarized-alpha

This is a DPO dataset based on [argilla/OpenHermes2.5-dpo-binarized-alpha](https://huggingface.co/datasets/argilla/OpenHermes2.5-dpo-binarized-alpha). It implements the following features:

* **Intel format**: you can directly use this dataset in Axolotl with "type: chatml.intel"
* **Filter out low scores**: removed samples with delta scores < 1 (530 in the training set, 66 in the test set).
* **Curriculum learning**: sort the dataset by the 'delta_score' column in descending order.

## 💻 Code

Code to reproduce this dataset:

```python
!pip install -qqq datasets

from transformers import AutoTokenizer
from datasets import load_dataset
import pandas as pd

# Load the dataset
dataset = load_dataset('argilla/OpenHermes2.5-dpo-binarized-alpha')

def chatml_format(example):
    # Format instruction
    message = {"role": "user", "content": example['input']}
    input = tokenizer.apply_chat_template([message], tokenize=False, add_generation_prompt=False)

    # Format chosen answer
    chosen = tokenizer.apply_chat_template(example['chosen'], tokenize=False, add_generation_prompt=False)[len(input) + len("<|im_start|>assistant\n"):-len("<|im_end|>\n")]

    # Format rejected answer
    rejected = tokenizer.apply_chat_template(example['rejected'], tokenize=False, add_generation_prompt=False)[len(input) + len("<|im_start|>assistant\n"):-len("<|im_end|>\n")]

    # Calculate score difference
    delta_score = abs(example['rating'][0] - example['rating'][1])

    return {
        "question": example["input"],
        "chosen": chosen,
        "rejected": rejected,
        "delta_score": delta_score,
    }

# Load tokenizer (chatml format)
model_name = "mlabonne/NeuralHermes-2.5-Mistral-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Format dataset
dataset_chatml = dataset.map(
    chatml_format,
    remove_columns=['input', 'conversations', 'generation_model',
                    'generation_prompt', 'raw_generation_responses', 'generations',
                    'views', 'system_prompt', 'model_name', 'language', 'id', 'hash',
                    'model', 'avatarUrl', 'custom_instruction', 'topic', 'title',
                    'idx', 'rejected_score', 'chosen_score', 'source',
                    'skip_prompt_formatting', 'category', 'rating', 'chosen_model',
                    'rejected_model']
)

# Remove low delta scores
dataset_chatml = dataset_chatml.filter(lambda x: x["delta_score"] > 1.0)

# Sort the dataset by the 'delta_score' column in descending order
dataset_chatml = dataset_chatml.sort('delta_score', reverse=True)

pd.DataFrame(dataset_chatml['train']).iloc[10]
```
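To make the slicing, filtering, and sorting steps easier to follow without downloading a tokenizer, here is a minimal standard-library sketch. `render_chatml` is a hypothetical stand-in for the tokenizer's ChatML chat template, `extract_answer` and `to_dpo` are illustrative helper names, and the three rows are invented toy data rather than samples from the dataset. The filter mirrors the strict `> 1.0` comparison used in the reproduction code.

```python
# Hypothetical stand-in for tokenizer.apply_chat_template with a ChatML template.
def render_chatml(messages):
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )


def extract_answer(prompt, conversation):
    """Slice the assistant's reply out of a rendered ChatML conversation."""
    rendered_prompt = render_chatml([{"role": "user", "content": prompt}])
    rendered_full = render_chatml(conversation)
    # Skip the rendered user turn plus the assistant header,
    # then drop the trailing end-of-turn marker.
    start = len(rendered_prompt) + len("<|im_start|>assistant\n")
    end = -len("<|im_end|>\n")
    return rendered_full[start:end]


def to_dpo(row):
    """Build one DPO record with the same four keys as the dataset."""
    def conversation(answer):
        return [
            {"role": "user", "content": row["input"]},
            {"role": "assistant", "content": answer},
        ]

    return {
        "question": row["input"],
        "chosen": extract_answer(row["input"], conversation(row["chosen"])),
        "rejected": extract_answer(row["input"], conversation(row["rejected"])),
        "delta_score": abs(row["rating"][0] - row["rating"][1]),
    }


# Toy rows, invented for illustration only.
rows = [
    {"input": "2+2?", "chosen": "4", "rejected": "5", "rating": [5, 2]},
    {"input": "Capital of France?", "chosen": "Paris", "rejected": "Lyon", "rating": [4, 4]},
    {"input": "Color of the sky?", "chosen": "Blue", "rejected": "Green", "rating": [5, 1]},
]

formatted = [to_dpo(r) for r in rows]

# Keep only pairs whose ratings differ by more than 1.
kept = [r for r in formatted if r["delta_score"] > 1.0]

# Curriculum ordering: largest rating gap first.
ordered = sorted(kept, key=lambda r: r["delta_score"], reverse=True)

for r in ordered:
    print(r["delta_score"], r["question"], "|", r["chosen"], "vs", r["rejected"])
```

The pair with identical ratings is dropped, and the remaining records come out with the widest rating gap first, which is the ordering the curriculum-learning bullet describes.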

Provider: maas
Created: 2025-03-18