sagea-ai/SAGE-MathInstruct
Source: Hugging Face. Updated 2026-01-05; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/sagea-ai/SAGE-MathInstruct
Resource description:
---
license: apache-2.0
dataset_info:
  features:
  - name: problem
    dtype: string
  - name: generated_solution
    dtype: string
  - name: expected_answer
    dtype: string
  - name: problem_source
    dtype: string
  splits:
  - name: train
    num_bytes: 15558412976
    num_examples: 13972791
  - name: train_1M
    num_bytes: 1350383003
    num_examples: 1000000
  - name: train_2M
    num_bytes: 2760009675
    num_examples: 2000000
  - name: train_5M
    num_bytes: 6546496157
    num_examples: 5000000
  download_size: 12557258602
  dataset_size: 26215301811
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: train_1M
    path: data/train_1M-*
  - split: train_2M
    path: data/train_2M-*
  - split: train_5M
    path: data/train_5M-*
---
License: Apache-2.0
Dataset details:
Features:
- Field: problem, type: string
- Field: generated_solution, type: string
- Field: expected_answer, type: string
- Field: problem_source, type: string
Splits:
- train (training set): 15558412976 bytes, 13972791 examples
- train_1M (1M-example training subset): 1350383003 bytes, 1000000 examples
- train_2M (2M-example training subset): 2760009675 bytes, 2000000 examples
- train_5M (5M-example training subset): 6546496157 bytes, 5000000 examples
Download size: 12557258602 bytes
Total dataset size: 26215301811 bytes
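The size figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the numbers are copied from the metadata, while the variable names (`SPLITS`, `avg_bytes`, and so on) are made up for this example. It confirms that the four split sizes add up to the reported total and derives the average record size per split.

```python
# Figures copied from the dataset metadata: (num_bytes, num_examples) per split.
SPLITS = {
    "train": (15558412976, 13972791),
    "train_1M": (1350383003, 1000000),
    "train_2M": (2760009675, 2000000),
    "train_5M": (6546496157, 5000000),
}
DATASET_SIZE = 26215301811   # dataset_size from the metadata
DOWNLOAD_SIZE = 12557258602  # download_size from the metadata

# The reported total counts every split, including the smaller ones.
total_bytes = sum(num_bytes for num_bytes, _ in SPLITS.values())
sizes_match = total_bytes == DATASET_SIZE

# Average size of one record in each split, in bytes.
avg_bytes = {
    name: num_bytes / num_examples
    for name, (num_bytes, num_examples) in SPLITS.items()
}

# Ratio of the download size to the in-memory dataset size.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

if __name__ == "__main__":
    print("split sizes sum to dataset_size:", sizes_match)
    for name, avg in avg_bytes.items():
        print(f"{name}: {avg:.1f} bytes per example")
    print(f"download / dataset size: {download_ratio:.2f}")
```

Because the total includes train_1M, train_2M, and train_5M as separately stored splits, fetching only one split should need far less disk space than the full figure suggests.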
Configs:
- Config name: default, data file paths:
- split train: data/train-*
- split train_1M: data/train_1M-*
- split train_2M: data/train_2M-*
- split train_5M: data/train_5M-*
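One way to read a single split is with the Hugging Face `datasets` library. This is a minimal sketch that assumes the library is installed and the repository is reachable. The helper names `load_split` and `first_rows` are hypothetical, and streaming is chosen here only to avoid downloading the whole dataset up front.

```python
from typing import Any

REPO_ID = "sagea-ai/SAGE-MathInstruct"
# Split names as listed in the default config.
SPLITS = ("train", "train_1M", "train_2M", "train_5M")


def load_split(split: str = "train_1M", streaming: bool = True) -> Any:
    """Hypothetical helper: open one split of the dataset."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so the module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split, streaming=streaming)


def first_rows(split: str = "train_1M", n: int = 3) -> list:
    """Hypothetical helper: fetch the first n records of a streamed split."""
    dataset = load_split(split, streaming=True)
    return list(dataset.take(n))


if __name__ == "__main__":
    # Requires network access; each record should carry the four string
    # fields problem, generated_solution, expected_answer, problem_source.
    for row in first_rows("train_1M", n=2):
        print(row["problem_source"], "|", row["expected_answer"])
```

Starting with train_1M keeps experiments small. The same call with `split="train"` would address the full 13972791-example split.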
Provider:
sagea-ai



