
prm800k-trl-dedup

ModelScope community · Updated 2025-11-27 · Indexed 2025-02-15
Download link:
https://modelscope.cn/datasets/HuggingFaceH4/prm800k-trl-dedup
Resource description:
## Dataset description

This dataset is a version of [tasksource/PRM800K](https://huggingface.co/datasets/tasksource/PRM800K) prepared for fine-tuning a process reward model (PRM) using [TRL](https://github.com/huggingface/trl). To replicate the dataset creation, see the [main.py](https://huggingface.co/datasets/HuggingFaceH4/prm800k-trl-dedup/blob/main/main.py) script in the files directory.

Following the guide to the data in the [openai/prm800k](https://github.com/openai/prm800k/tree/main?tab=readme-ov-file#data) repository, the dataset contains, for each `prompt`, a list of `completions`. Each completion was obtained by extracting the text from the list of steps, and a new completion was created for the cases with multiple alternative steps.

Sample row:

```json
{
  "prompt": "A right cylindrical oil tank is $15$ feet tall and its circular bases have diameters of $4$ feet each. When the tank is lying flat on its side (not on one of the circular ends), the oil inside is $3$ feet deep. How deep, in feet, would the oil have been if the tank had been standing upright on one of its bases? Express your answer as a decimal to the nearest tenth.",
  "completions": [
    "I need to find the volume of the oil in the tank and then divide it by the area of the base to get the height of the oil when the tank is upright.",
    "To find the volume of the oil, I can think of it as a segment of a cylinder, where the central angle of the segment is determined by the depth of the oil.",
    "If I draw a right triangle inside the tank, where the hypotenuse is the diameter of the base, the adjacent side is the depth of the oil, and the opposite side is half the length of the chord that cuts the oil segment, I can use trigonometry to find the central angle."
  ],
  "labels": [
    true,
    true,
    false
  ],
  "index": 1
}
```

Each prompt corresponds to a row in the original dataset (phase 1 and phase 2 were merged). For a given `index`, only the last step in `completions` and `labels` may differ between rows.
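Because `labels` is aligned one-to-one with `completions`, a row can be unrolled into one supervised example per step. The following is a minimal sketch of that idea, not the logic of `main.py` or of TRL itself: the helper name `stepwise_examples`, the output keys, and the toy row with placeholder strings are all hypothetical and used only for illustration.

```python
from typing import Any


def stepwise_examples(row: dict[str, Any]) -> list[dict[str, Any]]:
    """Unroll one row into per-step examples (hypothetical helper).

    Each example pairs the prompt and the steps up to step i with the
    label of step i.
    """
    completions = row["completions"]
    labels = row["labels"]
    if len(completions) != len(labels):
        raise ValueError("completions and labels must have the same length")

    examples = []
    for i, label in enumerate(labels):
        examples.append(
            {
                "prompt": row["prompt"],
                "steps_so_far": completions[: i + 1],  # steps 0..i inclusive
                "label": label,  # correctness label of step i
            }
        )
    return examples


# Toy row with placeholder text (not taken from the dataset).
toy_row = {
    "prompt": "toy problem",
    "completions": ["step 1", "step 2", "step 3"],
    "labels": [True, True, False],
    "index": 1,
}

examples = stepwise_examples(toy_row)
for example in examples:
    print(len(example["steps_so_far"]), example["label"])
```

Keeping the full prefix in each example makes it explicit that a step's label is judged in the context of the steps before it.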
### Deduplication

The dataset was deduplicated against the training examples present in [Qwen/ProcessBench](https://huggingface.co/datasets/Qwen/ProcessBench) using the following script: [dedup.py](https://huggingface.co/datasets/HuggingFaceH4/prm800k-trl-dedup/blob/main/dedup.py)
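As a rough illustration of what deduplicating against a benchmark can look like, the sketch below drops rows whose prompt also appears in a list of benchmark problems. It does not reproduce `dedup.py`: the matching criterion (whitespace- and case-normalized exact match on the prompt), the function names, and the toy data are all assumptions.

```python
from typing import Any


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase (assumed matching criterion)."""
    return " ".join(text.split()).lower()


def drop_overlapping(
    rows: list[dict[str, Any]], benchmark_problems: list[str]
) -> list[dict[str, Any]]:
    """Keep only rows whose prompt does not appear among the benchmark problems."""
    seen = {normalize(problem) for problem in benchmark_problems}
    return [row for row in rows if normalize(row["prompt"]) not in seen]


# Toy data (placeholders, not taken from either dataset).
toy_rows = [
    {"prompt": "What is 1 + 1?", "index": 0},
    {"prompt": "What  is 2 + 2?", "index": 1},
    {"prompt": "What is 3 + 3?", "index": 2},
]
toy_benchmark = ["what is 2 + 2?"]

kept = drop_overlapping(toy_rows, toy_benchmark)
print([row["index"] for row in kept])
```

Exact matching after normalization is the simplest possible criterion. A real pipeline might use fuzzier matching, so consult the linked script for the actual behavior.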

Provider:
maas
Created:
2025-02-10