anuragsbaghel/MathNet
Source: Hugging Face. Updated 2026-04-22. Indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/anuragsbaghel/MathNet
Description:
MathNet is a large-scale, multimodal, multilingual dataset of Olympiad-level math problems. It comprises 30,676 expert-authored problems with solutions, spanning 17 languages and 47 countries. It is intended as a benchmark for two capabilities: mathematical reasoning in generative models and mathematical retrieval in embedding-based systems. The problems cover domains such as geometry, algebra, number theory, and combinatorics. The dataset defines three benchmark tasks: problem solving, math-aware retrieval, and retrieval-augmented problem solving. The data is sourced from official problem booklets and is extracted and verified by a multi-stage LLM pipeline, which is meant to keep quality and formatting consistent.
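As a rough illustration of what scoring a math-aware retrieval task in an embedding-based system can look like, the sketch below computes recall@k over toy vectors. It is not MathNet's official metric or evaluation code: the function names, the ids, the two-dimensional embeddings, and the one-relevant-problem-per-query setup are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recall_at_k(queries, corpus, relevant, k):
    """Fraction of queries whose relevant problem id appears in the top-k results.

    queries:  dict query_id -> embedding (hypothetical toy vectors)
    corpus:   dict problem_id -> embedding
    relevant: dict query_id -> the single problem_id counted as a correct hit
    """
    hits = 0
    for qid, qvec in queries.items():
        # Rank every corpus problem by similarity to the query, best first.
        ranked = sorted(corpus, key=lambda pid: cosine(qvec, corpus[pid]), reverse=True)
        if relevant[qid] in ranked[:k]:
            hits += 1
    return hits / len(queries)


# Toy data for illustration only; none of it is taken from MathNet.
corpus = {"p1": [1.0, 0.0], "p2": [0.0, 1.0], "p3": [0.7, 0.7]}
queries = {"q1": [0.9, 0.1], "q2": [0.1, 0.9]}
relevant = {"q1": "p1", "q2": "p3"}

r1 = recall_at_k(queries, corpus, relevant, k=1)
r2 = recall_at_k(queries, corpus, relevant, k=2)
print(r1, r2)
```

In a real run the toy vectors would be replaced by embeddings of problem statements produced by the model under evaluation, and the relevance labels would come from the benchmark's own annotations.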
Provided by:
anuragsbaghel



