THU-KEG/PairJudge-432K
Source: Hugging Face. Updated: 2025-02-19. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/THU-KEG/PairJudge-432K
Description:
PAIRJUDGE-432K is a large-scale dataset of 432K annotated pairwise judgments, designed for training reward models on mathematical reasoning tasks. Each sample is a prompt-completion pair consisting of a math problem with two candidate solutions, together with chain-of-thought reasoning that evaluates the correctness of the two responses.
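The sample layout described above can be sketched roughly as follows. This is only an illustration, not the dataset's actual schema: the field names `prompt` and `completion`, the helper `build_prompt`, the prompt template wording, and the toy problem are all hypothetical, and placing the chain-of-thought judgment in the completion is an assumption about how the pair is split.

```python
from dataclasses import dataclass


@dataclass
class PairJudgeSample:
    # Hypothetical field names; the real dataset columns may differ.
    prompt: str      # math problem plus the two candidate solutions
    completion: str  # assumed to hold the chain-of-thought judgment


def build_prompt(problem: str, solution_a: str, solution_b: str) -> str:
    """Assemble an illustrative pairwise-judgment prompt (template is made up)."""
    return (
        f"Problem:\n{problem}\n\n"
        f"Response A:\n{solution_a}\n\n"
        f"Response B:\n{solution_b}\n\n"
        "Judge the correctness of each response."
    )


# Toy example, not taken from the dataset.
sample = PairJudgeSample(
    prompt=build_prompt("What is 2 + 3?", "2 + 3 = 5", "2 + 3 = 6"),
    completion=(
        "Response A adds 2 and 3 and gets 5, which is correct. "
        "Response B gets 6, which is wrong. "
        "Verdict: A is correct, B is incorrect."
    ),
)
```

A reward model trained on such pairs would read `sample.prompt` and learn to produce text like `sample.completion`; the actual column names and prompt wording should be checked against the dataset card before use.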
Provider:
THU-KEG



