five

DiffLean/DPO_Dataset

收藏
Hugging Face2026-04-13 更新2026-04-26 收录
下载链接:
https://hf-mirror.com/datasets/DiffLean/DPO_Dataset
下载链接
链接失效反馈
官方服务:
资源简介:
--- language: - en task_categories: - text-generation tags: - dpo - preference-dataset - formal-reasoning - lean - math dataset_info: features: - name: Question dtype: string - name: Source dtype: string - name: Formal Statement dtype: string - name: Problem Type dtype: string - name: Chosen Formal Proof dtype: string - name: Chosen Reasoning dtype: string - name: Rejected Formal Proof dtype: string - name: Rejected Reasoning dtype: string splits: - name: train num_bytes: 6430157648 num_examples: 128595 download_size: 1442082212 dataset_size: 6430157648 configs: - config_name: default data_files: - split: train path: data/train-* --- # DPO_Dataset ## Dataset Description This repository contains the merged DiffLean DPO preference dataset used for formal reasoning and Lean proof generation experiments. Each training row includes a problem statement, its formal Lean statement, and a chosen/rejected pair of natural-language reasoning traces and formal proofs. ## Data Sources This upload merges the following local files: - `dpo_dataset/paired_pass_fail_dataset.json` - `dpo_dataset/formal_proof_verified_paired_pass_fail_dataset.json` ## Dataset Structure - Split: `train` - Number of rows: `128595` - Number of columns: `8` ### Fields - `Question`: Natural-language math problem statement, including the final answer text. - `Source`: Dataset/source identifier for the example. - `Formal Statement`: Lean theorem statement associated with the problem. - `Problem Type`: Problem category label carried over from the source dataset. - `Chosen Formal Proof`: Preferred Lean proof/program for the statement. - `Chosen Reasoning`: Preferred natural-language reasoning paired with the chosen proof. - `Rejected Formal Proof`: Rejected Lean proof/program for the same statement. - `Rejected Reasoning`: Rejected natural-language reasoning paired with the rejected proof. ### Source Composition - `dart-math-hard`: 61113 - `synthetic`: 54107 - `openr1`: 8755 - `deepscaler`: 4620 ## Intended Use The dataset is intended for preference modeling, reward modeling, or DPO-style training on mathematical reasoning and formal proof generation tasks. ## Limitations The examples are aggregated from generated and filtered pipelines and may still contain noise, formatting issues, or imperfect preference labels. Chosen responses should not be treated as guaranteed ground truth, and rejected responses are not necessarily universally incorrect. ## Ethical Considerations This dataset contains model-generated mathematical reasoning and formal proofs. Please validate downstream outputs before using them in high-stakes settings, especially when correctness matters.
提供机构:
DiffLean
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作