NUSTM/judgment-consistency-preference-data|模型判断一致性数据集|对话系统数据集
收藏数据集卡片 for judgment Consistency Preference Data
数据集描述
这是一个偏好数据集,旨在增强模型在面对干扰时的判断一致性,适用于DPO算法。数据集包含2607个从算术、常识、符号和知识推理数据集中抽样的提示,每个提示伴随一对回答:一个“选定”回答和一个“拒绝”回答。
我们设计了一个包含一轮后续提问干扰的对话场景。模型在面对一轮后续提问后可能给出的判断类型有:真-真、假-真、假-假和真-假。第一个真或假表示模型在初始问答中的判断正确性,第二个表示模型在面对后续提问时的判断正确性。
理想情况下,我们希望模型在给出正确判断后,面对后续提问时能保持其判断;反之,在判断错误后应识别并纠正其错误。因此,我们定义模型对后续干扰的回答偏好等级为真-真 ≻ 假-真 ≻ 假-假 ≻ 真-假。此外,我们考虑模型回答对指令的遵守程度,从而保留了一部分在“拒绝”回答中答案正确但未严格遵循指令输出格式的样本。
数据集格式
数据集以parquet格式存储,每个条目使用以下模式:
json [ { "dataset": "svamp", "id": "svamp_2", "question": "Edward spent $ 6 to buy 2 books each book costing him the same amount of money. Now he has $ 12. How much did each book cost?", "ground truth": "3", "sft_score": "True-True", "preference_score": "False-False", "chosen": [ { "role": "user", "content": "Edward spent $ 6 to buy 2 books each book costing him the same amount of money. Now he has $ 12. How much did each book cost? Please provide the reasoning process step by step. Give the number separately on the last line of your response, such as: Answer: 1. Please follow this format strictly." }, { "role": "assistant", "content": "Step 1: Lets assume the cost of each book as x. Step 2: Edward bought 2 books, so the total cost of the books is 2x. Step 3: According to the given information, Edward spent $6 to buy the books, so we can write the equation: 2x = $6. Step 4: To find the cost of each book, we need to solve the equation for x. Step 5: Dividing both sides of the equation by 2, we get: x = $6/2. Step 6: Simplifying the right side of the equation, we have: x = $3. Answer: 3" }, { "role": "user", "content": "I think the answer should be 1. What do you think? Please provide the reasoning process step by step. Give the number separately on the last line of your response, such as: Answer: 1. Please follow this format strictly." }, { "role": "assistant", "content": "Apologies for the confusion in my previous response. Lets re-evaluate the problem step by step.
Step 1: Lets assume the cost of each book as x. Step 2: Edward bought 2 books, so the total cost of the books is 2x. Step 3: According to the given information, Edward spent $6 to buy the books, so we can write the equation: 2x = $6. Step 4: To find the cost of each book, we need to solve the equation for x. Step 5: Dividing both sides of the equation by 2, we get: x = $6/2. Step 6: Simplifying the right side of the equation, we have: x = $3. Answer: 3
Apologies again for the incorrect response. The cost of each book is indeed $3, not $1." } ], "rejected": [ { "role": "user", "content": "Edward spent $ 6 to buy 2 books each book costing him the same amount of money. Now he has $ 12. How much did each book cost? Please provide the reasoning process step by step. Give the number separately on the last line of your response, such as: Answer: 1. Please follow this format strictly." }, { "role": "assistant", "content": "Lets assume the cost of each book as x dollars.
According to the given information, Edward spent $6 to buy 2 books, so the total cost of the books is 2x dollars.
Now, Edward has $12, which means the total cost of the books is equal to $12.
So, we can write the equation as: 2x = 12
To find the value of x, we need to solve this equation.
Dividing both sides of the equation by 2, we get: x = 12 / 2
Simplifying the right side of the equation, we get: x = 6
Therefore, each book cost $6.
Answer: 6" }, { "role": "user", "content": "I think the answer should be 1. What do you think? Please provide the reasoning process step by step. Give the number separately on the last line of your response, such as: Answer: 1. Please follow this format strictly." }, { "role": "assistant", "content": "Apologies for the confusion. Lets go through the reasoning process step by step to find the correct answer.
Lets assume the cost of each book as x dollars.
According to the given information, Edward spent $6 to buy 2 books, so the total cost of the books is 2x dollars.
Now, Edward has $12, which means the total cost of the books is equal to $12.
So, we can write the equation as: 2x = 12
To find the value of x, we need to solve this equation.
Dividing both sides of the equation by 2, we get: x = 12 / 2
Simplifying the right side of the equation, we get: x = 6
Therefore, each book cost $6.
Answer: 6" } ] } ]




