lehduong/proof-pile-2
Source: Hugging Face. Updated: 2025-06-19. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/lehduong/proof-pile-2
Description:
The dataset consists of three sub-datasets: algebraic-stack, arxiv, and open-web-math. Each sub-dataset contains text data and is divided into training, validation, and test splits. algebraic-stack has 2831527 training examples, 500000 validation examples, and 74851 test examples; arxiv has 3965390 training examples, 250000 validation examples, and 31577 test examples; open-web-math has 2908808 training examples, 252080 validation examples, and 500000 test examples.
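As a rough illustration of the layout described above, the following Python sketch records the stated split sizes and sums them per sub-dataset and per split. The `stream_subset` helper is hypothetical: it assumes the Hugging Face `datasets` library is installed, that the configuration names match the sub-dataset names, and that the splits are named `train`, `validation`, and `test`. None of this is confirmed by the listing.

```python
# Split sizes exactly as stated in the dataset description.
SPLIT_SIZES = {
    "algebraic-stack": {"train": 2831527, "validation": 500000, "test": 74851},
    "arxiv": {"train": 3965390, "validation": 250000, "test": 31577},
    "open-web-math": {"train": 2908808, "validation": 252080, "test": 500000},
}


def total_examples(subset: str) -> int:
    """Sum the examples across all splits of one sub-dataset."""
    return sum(SPLIT_SIZES[subset].values())


def total_for_split(split: str) -> int:
    """Sum the examples of one split across all sub-datasets."""
    return sum(sizes[split] for sizes in SPLIT_SIZES.values())


def stream_subset(subset: str, split: str = "train"):
    """Hypothetical helper: stream one sub-dataset without a full download.

    Assumes the configuration name equals the sub-dataset name; check the
    dataset card before relying on this.
    """
    from datasets import load_dataset  # imported lazily; needs `pip install datasets`

    return load_dataset("lehduong/proof-pile-2", subset, split=split, streaming=True)


if __name__ == "__main__":
    for name in SPLIT_SIZES:
        print(name, total_examples(name))
    print("train total", total_for_split("train"))
```

Streaming is used in the helper because each sub-dataset holds millions of examples, so iterating lazily avoids downloading everything before inspecting a few records.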
Provider:
lehduong



