
DaydreamerMZM/SenseMath

Source: Hugging Face. Updated 2026-04-01. Indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/DaydreamerMZM/SenseMath
Resource description:
---
license: mit
task_categories:
- question-answering
language:
- en
tags:
- math
- number-sense
- benchmark
- shortcuts
- numerical-reasoning
size_categories:
- 1K<n<10K
---

# SenseMath: Evaluating Number Sense in Large Language Models

**SenseMath** is a controlled benchmark for measuring whether LLMs can exploit number-sense shortcuts.

## Dataset Description

- **1,600 item families** across 8 categories and 4 digit scales
- **3 variants per family**: strong-shortcut, weak-shortcut, control
- **4,800 total items**
- **Categories**: Magnitude Estimation, Structural Shortcuts, Relative Distance, Cancellation, Compatible Numbers, Landmark Comparison, Equation Reasoning, Option Elimination
- **Digit scales**: d=2, 4, 8, 16

## Files

| File | Description |
|------|-------------|
| `data/sensemath_v2_d2.json` | 400 families, 2-digit operands |
| `data/sensemath_v2_d4.json` | 400 families, 4-digit operands |
| `data/sensemath_v2_d8.json` | 400 families, 8-digit operands |
| `data/sensemath_v2_d16.json` | 400 families, 16-digit operands |
| `data/judge_j1.json` | J1 task: shortcut recognition (251 items) |
| `data/judge_j2.json` | J2 task: strategy identification (80 items) |
| `data/judge_j3.json` | J3 task items |

## Usage

```python
from datasets import load_dataset
ds = load_dataset("DaydreamerMZM/SenseMath", split="train")

# Or load directly
import json
with open("data/sensemath_v2_d4.json") as f:
    families = json.load(f)
```

## Citation

```bibtex
@article{zhuang2025sensemath,
  title={SenseMath: Evaluating Number Sense in Large Language Models},
  author={Zhuang, Haomin and Wang, Xiangqi and Shen, Yili and Cheng, Ying and Zhang, Xiangliang},
  journal={arXiv preprint arXiv:XXXX.XXXXX},
  year={2025}
}
```

## Links

- **Paper**: [arXiv](https://arxiv.org/abs/XXXX.XXXXX)
- **Code**: [GitHub](https://github.com/zhmzm/SenseMath)
- **Project Page**: [zhmzm.github.io/SenseMath](https://zhmzm.github.io/SenseMath/)
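
As a rough sanity check on the numbers in the dataset card, the four per-scale files can be loaded and counted in one pass. The sketch below is not part of the official repository: the helper names `load_families`, `count_entries`, and `expected_totals` are hypothetical, and it assumes each `sensemath_v2_d*.json` file parses to a top-level list (or dict) with one entry per family, which the card does not state explicitly.

```python
import json
from pathlib import Path

# Figures taken from the dataset card.
DIGIT_SCALES = (2, 4, 8, 16)
FAMILIES_PER_SCALE = 400
VARIANTS_PER_FAMILY = 3  # strong-shortcut, weak-shortcut, control


def load_families(data_dir, scales=DIGIT_SCALES):
    """Return {digit_scale: parsed JSON} for every per-scale file that exists."""
    data_dir = Path(data_dir)
    loaded = {}
    for d in scales:
        path = data_dir / f"sensemath_v2_d{d}.json"
        if not path.exists():
            continue  # skip scales that have not been downloaded
        with path.open(encoding="utf-8") as f:
            loaded[d] = json.load(f)
    return loaded


def count_entries(loaded):
    """Count top-level entries per scale.

    Assumption: one top-level entry corresponds to one item family.
    """
    return {d: len(families) for d, families in loaded.items()}


def expected_totals():
    """Family and item totals implied by the card's per-file figures."""
    families = FAMILIES_PER_SCALE * len(DIGIT_SCALES)
    return families, families * VARIANTS_PER_FAMILY


if __name__ == "__main__":
    print("expected (families, items):", expected_totals())
    print("entries found per scale:", count_entries(load_families("data")))
```

If the per-scale counts differ from 400, the most likely explanation is that the assumed file layout is wrong (for example, items stored flat rather than grouped by family), so inspect one file's structure before relying on these counts.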
Provider:
DaydreamerMZM