JEE-Main-2025-Math

Name: JEE-Main-2025-Math
Creator: maas
Published: 2025-12-05 16:43:01
License: No description available

ModelScope community. Updated: 2025-12-05. Indexed: 2025-07-26.

Download link:

https://modelscope.cn/datasets/PhysicsWallahAI/JEE-Main-2025-Math

Resource description:

# JEE Mains 2025 Math Evaluation Set

## 🧾 Dataset Summary

This dataset contains **475 math questions** from the **official JEE Mains 2025** examination, covering both **January** and **April** sessions. It is curated to benchmark mathematical reasoning models under high-stakes exam conditions.

---

## 🚀 How to Load the Dataset

You can load the evaluation data using the `datasets` library from Hugging Face:

```python
from datasets import load_dataset

# Load January session evaluation set
jan_data = load_dataset("PhysicsWallahAI/JEE-Main-2025-Math", "jan", split="test")

# Load April session evaluation set
apr_data = load_dataset("PhysicsWallahAI/JEE-Main-2025-Math", "apr", split="test")
```

---

## 📂 Dataset Structure

Each sample is stored as a JSON object with the following fields:

| Field Name        | Type     | Description                                                                |
|-------------------|----------|----------------------------------------------------------------------------|
| `question`        | `string` | Math problem text (can include LaTeX)                                      |
| `answer`          | `string` | Final answer (NAT or symbolic form)                                        |
| `question_type`   | `int`    | `0 = Numerical Answer Type`, `1 = Multiple Choice Question`                |
| `options`         | `list`   | List of answer choices (present only for MCQ)                              |
| `correct_options` | `list`   | Indices of correct options in `options[]` (for MCQ only)                   |
| `additional_data` | `dict`   | Placeholder for extended fields used during model training or evaluation.  |
| `metadata`        | `dict`   | Optional metadata providing contextual information about the question.     |

---

## 📊 Dataset Statistics

| Split        | Papers | Questions |
| ------------ | ------ | --------- |
| January 2025 | 10     | 250       |
| April 2025   | 9      | 225       |
| **Total**    | 19     | **475**   |

* **MCQs**: ~80%
* **NATs**: ~20%

---

## 📥 Source

All questions were sourced from **official JEE Mains 2025** mathematics papers publicly released by **NTA**. Answer keys were cross-verified with NTA final answer releases.

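The field layout described under Dataset Structure suggests a simple way to score model outputs. The following is a minimal sketch, not the authors' evaluation code: the helper names `is_correct` and `accuracy` are hypothetical, the prediction format (a collection of chosen option indices for MCQs, a number or numeric string for NATs) is an assumption, and the numeric tolerance is an arbitrary choice. The card does not say whether `correct_options` indices start at 0 or 1, so the sketch only compares index sets and never indexes into `options`.

```python
from math import isclose

# question_type codes taken from the field table in the dataset card.
NAT, MCQ = 0, 1


def is_correct(record, prediction):
    """Return True if `prediction` matches the record's answer key.

    Assumed prediction format (not specified by the dataset card):
      - MCQ: an iterable of chosen option indices
      - NAT: a number or a numeric string
    """
    if record["question_type"] == MCQ:
        # Compare index sets so the order of chosen options does not matter.
        return set(prediction) == set(record["correct_options"])
    try:
        # `answer` is stored as a string, so parse both sides before comparing.
        return isclose(float(prediction), float(record["answer"]), abs_tol=1e-6)
    except (TypeError, ValueError):
        # Non-numeric (symbolic) answers fall back to an exact string match.
        return str(prediction).strip() == str(record["answer"]).strip()


def accuracy(records, predictions):
    """Fraction of records answered correctly; 0.0 for empty input."""
    pairs = list(zip(records, predictions))
    if not pairs:
        return 0.0
    return sum(is_correct(r, p) for r, p in pairs) / len(pairs)
```

Exact string matching is a weak check for symbolic answers; a real harness would likely need answer normalization or a symbolic equivalence test, which is outside the scope of this sketch.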

---

## 💼 Intended Uses

* Benchmarking Indian math LLMs
* Evaluating symbolic + numeric reasoning
* Comparing SFT/RLHF/retrieval-based models on real exams

---

## ⚠️ Limitations

* Limited to mathematics domain

---

## 📄 Citation

```bibtex
@misc{jee2025math,
  title        = {JEE Mains 2025 Math Evaluation Set},
  author       = {Physics Wallah AI Research},
  year         = {2025},
  note         = {Official JEE Mains 2025 math questions curated for evaluating educational language models},
  howpublished = {\url{https://huggingface.co/datasets/PhysicsWallahAI/JEE-Main-2025-Math}},
}
```

Provider: maas

Created: 2025-07-23
