ChristianZ97/SATP-Config-Buffer

Name: ChristianZ97/SATP-Config-Buffer
Creator: ChristianZ97
Published: 2026-04-22 11:42:35
License: 暂无描述

Hugging Face2026-04-22 更新2026-04-26 收录

下载链接：

https://hf-mirror.com/datasets/ChristianZ97/SATP-Config-Buffer

下载链接

链接失效反馈

官方服务：

资源简介：

SATP-Config-Buffer数据集是在SATP（Steering Aesop for Theorem Proving）训练过程中收集的Aesop策略配置集合。该数据集用作训练SATP-aesop-policy模型的正负回放缓冲区。每条数据包含一个定理和一个Aesop配置对，并带有奖励标签：+1.0表示配置成功证明了定理（正例/成功），-1.0表示配置未能证明定理（负例/扰动）。数据集包含以下列：theorem_uuid（定理的唯一标识符）、config_uuid（配置的唯一标识符）、formal_statement（Lean 4定理陈述，包括`import Mathlib`头）、tactic_string（完整的Aesop配置块）、reward（奖励值）、lemma_names（检索到的前提全名列表）和lemma_scores（检索相似度分数列表）。数据集的定理来源于NuminaMath-LEAN数据集，而本数据集仅贡献了SATP训练过程中发现的每个定理的Aesop配置。

The SATP-Config-Buffer dataset is a collection of Aesop tactic configurations gathered during SATP (Steering Aesop for Theorem Proving) training. This dataset serves as the positive and negative replay buffer for training the SATP-aesop-policy model. Each row consists of a (theorem, aesop config) pair labeled with a reward: +1.0 indicates the config successfully proved the theorem (positive/success), while -1.0 indicates it failed (negative/perturbation). The dataset includes columns such as theorem_uuid (unique identifier for the theorem), config_uuid (unique identifier for the config), formal_statement (Lean 4 theorem statement including `import Mathlib` header), tactic_string (full Aesop config block), reward (reward value), lemma_names (list of top-K retrieved premise full names), and lemma_scores (retrieval similarity scores). The theorems are sourced from the NuminaMath-LEAN dataset, and this dataset specifically provides the Aesop configurations discovered during SATP training.

提供机构：

ChristianZ97

5,000+

优质数据集

54 个

任务类型

进入经典数据集