
mtybilly/MetaMathQA

Hugging Face · Updated 2026-03-28 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/mtybilly/MetaMathQA
Description:
---
license: mit
task_categories:
- text-generation
- question-answering
language:
- en
tags:
- math
- reasoning
- metamath
source_datasets:
- meta-math/MetaMathQA
configs:
- config_name: full
  data_files:
  - split: train
    path: full/train-*.parquet
- config_name: MATH
  data_files:
  - split: train
    path: MATH/train-*.parquet
- config_name: MATH-50K
  data_files:
  - split: train
    path: MATH-50K/train-*.parquet
default_config_name: full
---

# MetaMathQA Subsets

Curated subsets of [meta-math/MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA) for mathematical reasoning experiments.

## Subsets

| Subset | Samples | Description |
|--------|---------|-------------|
| `full` | 395,000 | All MetaMathQA samples (unchanged) |
| `MATH` | 155,000 | MATH_* types only (AnsAug, Rephrased, FOBAR, SV) |
| `MATH-50K` | 50,000 | Stratified 50K sample from MATH subset |

### MATH-50K Type Distribution

| Type | Count | Proportion |
|------|-------|------------|
| MATH_AnsAug | 24,194 | 48.4% |
| MATH_Rephrased | 16,129 | 32.3% |
| MATH_FOBAR | 4,839 | 9.7% |
| MATH_SV | 4,838 | 9.7% |

## Columns

| Column | Description |
|--------|-------------|
| `id` | Unique identifier in format `<type>_<global_idx>` |
| `query` | The question text |
| `response` | Chain-of-thought solution ending with "The answer is: ..." |
| `type` | Question type (e.g., MATH_AnsAug, GSM_Rephrased) |
| `original_question` | Original question from the source dataset |

## Usage

```python
from datasets import load_dataset

# Load the MATH-50K subset
ds = load_dataset("mtybilly/MetaMathQA", "MATH-50K", split="train")

# Load full dataset
ds = load_dataset("mtybilly/MetaMathQA", "full", split="train")
```

## Source

Derived from [MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA) (Yu et al., 2024). Stratified sampling uses seed=42 for reproducibility.
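
The column descriptions above suggest some simple post-processing. The block below is a minimal sketch, not part of the dataset's own tooling. It assumes that `response` ends with the literal marker `The answer is:` and that the `<global_idx>` part of `id` is an integer following the last underscore. The helper names (`extract_answer`, `split_id`, `type_proportions`) and the sample rows are hypothetical and serve only as illustration.

```python
from collections import Counter

# Marker taken from the `response` column description.
ANSWER_MARKER = "The answer is:"


def extract_answer(response):
    """Return the text after the last answer marker, or None if it is absent."""
    _, sep, tail = response.rpartition(ANSWER_MARKER)
    return tail.strip() if sep else None


def split_id(sample_id):
    """Split '<type>_<global_idx>' on the LAST underscore.

    Type names such as MATH_AnsAug contain underscores themselves, so
    splitting on the first underscore would be wrong. Treating the index
    as an integer is an assumption.
    """
    type_name, _, idx = sample_id.rpartition("_")
    return type_name, int(idx)


def type_proportions(rows):
    """Return the share of each `type` value among the given rows."""
    counts = Counter(row["type"] for row in rows)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


# Hypothetical rows shaped like the documented columns (not real dataset rows).
rows = [
    {"id": "MATH_AnsAug_0", "type": "MATH_AnsAug",
     "response": "We have 2 + 2 = 4. The answer is: 4"},
    {"id": "MATH_AnsAug_1", "type": "MATH_AnsAug",
     "response": "Half of 3 is 1.5. The answer is: \\frac{3}{2}"},
    {"id": "MATH_SV_2", "type": "MATH_SV",
     "response": "Solving for x gives x = 7. The answer is: 7"},
    {"id": "MATH_FOBAR_3", "type": "MATH_FOBAR",
     "response": "No marker in this response."},
]

answers = [extract_answer(row["response"]) for row in rows]
parsed_ids = [split_id(row["id"]) for row in rows]
shares = type_proportions(rows)
```

The same helpers can be applied to rows yielded by `load_dataset(...)`, since each row is a dictionary keyed by column name. Comparing `type_proportions` against the MATH-50K distribution table is one way to check a downloaded copy.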
Provider:
mtybilly