TIGER-Lab/TheoremExplainBench

Name: TIGER-Lab/TheoremExplainBench
Creator: TIGER-Lab
Published: 2025-03-31 21:05:45
License: not specified

Source: Hugging Face. Updated 2025-03-31; indexed 2025-04-12.

Download link:

https://hf-mirror.com/datasets/TIGER-Lab/TheoremExplainBench


Overview:

TheoremExplainBench is a dataset designed to evaluate and improve the ability of large language models (LLMs) to understand and explain mathematical and scientific theorems across multiple domains through long-form multimodal content (e.g. Manim videos). It consists of 240 theorems, categorized by difficulty and subject area to enable structured benchmarking. Each theorem comes with a description, which does not necessarily state the theorem in full; it is provided only as context to help LLMs distinguish the setting in which the theorem is used.
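The structure described above (theorems tagged by difficulty and subject area, each with a short contextual description) can be illustrated with a minimal Python sketch. The field names "theorem", "difficulty", "subject" and "description", the label values, and the three sample rows are all hypothetical placeholders, not the dataset's confirmed schema or contents; the helper functions count_by and build_prompt are likewise invented for illustration.

```python
from collections import Counter

# Hypothetical placeholder rows. The field names and values are assumptions
# made for illustration; check the dataset card for the real schema.
SAMPLE_ROWS = [
    {"theorem": "Pythagorean theorem", "difficulty": "easy",
     "subject": "math", "description": "Relates the side lengths of a right triangle."},
    {"theorem": "Ohm's law", "difficulty": "easy",
     "subject": "physics", "description": "Relates voltage, current and resistance."},
    {"theorem": "Central limit theorem", "difficulty": "hard",
     "subject": "math", "description": "Concerns the distribution of sample means."},
]


def count_by(rows, *keys):
    """Count rows per combination of the given field names."""
    return Counter(tuple(row[key] for key in keys) for row in rows)


def build_prompt(row):
    """Combine the theorem name with its description, used only as context."""
    return (
        f"Explain the following theorem: {row['theorem']}\n"
        f"Context (may not fully state the theorem): {row['description']}"
    )


if __name__ == "__main__":
    # Tally the sample by difficulty and subject, then show one prompt.
    for group, total in sorted(count_by(SAMPLE_ROWS, "difficulty", "subject").items()):
        print(group, total)
    print(build_prompt(SAMPLE_ROWS[0]))
```

With the Hugging Face datasets library installed, real rows could presumably be obtained with load_dataset("TIGER-Lab/TheoremExplainBench") and passed to the same helpers, after adjusting the field names to whatever the dataset card lists.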

Provided by:

TIGER-Lab
