GaussMath/GAUSS
Source: Hugging Face. Updated 2025-09-27. Indexed 2025-11-01.
Download link:
https://hf-mirror.com/datasets/GaussMath/GAUSS
Description:
GAUSS (**G**eneral **A**ssessment of **U**nderlying **S**tructured **S**kills) is a next-generation benchmark designed to evaluate mathematical ability in Large Language Models (LLMs). It decomposes mathematical proficiency into **12 structured skill dimensions**, enabling fine-grained profiling of models across **knowledge and understanding, problem solving and communication, learning, meta skills, and creativity**. The GAUSS dataset contains curated problems, standard solutions, rubrics, and scoring criteria contributed by mathematicians and researchers, aiming to provide an evaluation framework for AI systems in mathematics.
Provided by:
GaussMath



