INSAIT-Institute/BrokenMath
Hugging Face, updated 2025-10-07, indexed 2025-10-18
Download link:
https://hf-mirror.com/datasets/INSAIT-Institute/BrokenMath
Description:
BrokenMath is a benchmark for evaluating sycophancy in large language models (LLMs) in natural language theorem proving. It contains math problems with deliberately flawed premises, which test how readily a model agrees with an incorrect premise.
Provided by:
INSAIT-Institute



