YuchenLi01/Math-Step-DPO-10K-augmented-Qwen2.5MathPRM72B
Source: Hugging Face. Updated 2025-04-10; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/YuchenLi01/Math-Step-DPO-10K-augmented-Qwen2.5MathPRM72B
Description:
This dataset contains text and score fields for training models on text understanding and scoring. Each example includes the dataset name, the prompt, the initial reasoning steps, the chosen and rejected answers together with their full versions, and the steps for the different answers. Each example also carries a score for each of the following: initial reasoning steps, chosen answer, rejected answer, full chosen answer, full rejected answer, rejected-steps chosen answer, chosen-steps rejected answer, chosen-steps substitute answer, and rejected-steps substitute answer. The data is provided as a training split with 10,795 examples.
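A minimal sketch of loading the training split with the Hugging Face `datasets` library, assuming it is installed (`pip install datasets`). The helper name `load_train_split` and the `use_mirror` flag are hypothetical, introduced only for this illustration. The listing does not give exact column names, so the sketch does not assume any.

```python
import os
import warnings

REPO_ID = "YuchenLi01/Math-Step-DPO-10K-augmented-Qwen2.5MathPRM72B"
EXPECTED_TRAIN_ROWS = 10795  # example count stated in the description


def load_train_split(use_mirror: bool = False):
    """Download the train split and compare its size with the listed count."""
    if use_mirror:
        # Assumption: the hf-mirror.com endpoint from the download link is wanted.
        # HF_ENDPOINT has to be set before huggingface_hub is first imported.
        os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")

    # Imported lazily so the module can be read without `datasets` installed.
    from datasets import load_dataset

    train = load_dataset(REPO_ID, split="train")
    if train.num_rows != EXPECTED_TRAIN_ROWS:
        # The repository may have changed since the listing was indexed.
        warnings.warn(
            f"expected {EXPECTED_TRAIN_ROWS} rows, found {train.num_rows}"
        )
    return train
```

Calling `load_train_split().column_names` lists the actual field names, which can then be matched against the text and score fields described above.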
Provider:
YuchenLi01



