ReasonMind/UTMath
收藏Hugging Face2025-01-14 更新2025-04-26 收录
下载链接:
https://hf-mirror.com/datasets/ReasonMind/UTMath
下载链接
链接失效反馈官方服务:
资源简介:
UTMath是一个用于评估大型语言模型数学推理能力的全面基准。它包含1053个数学问题,每个问题都有平均68个测试用例。这些问题涵盖了9个数学领域,要求模型生成代码来解决通用问题,而不是特定问题。数据集采用了Reasoning-to-Coding Thoughts (RCoT)方法,该方法鼓励模型在生成代码之前进行显式推理,从而提高解决方案的效率和效果。
UTMath is a comprehensive benchmark designed to evaluate the mathematical reasoning abilities of large language models. It consists of 1,053 math problems, each with an average of 68 test cases. The problems span across nine mathematical domains, and the dataset requires models to generate code for general solutions rather than problem-specific ones. The Reasoning-to-Coding Thoughts (RCoT) approach is used to encourage explicit reasoning before code generation, enhancing the efficiency and effectiveness of the solutions.
提供机构:
ReasonMind



