ai4bharat/RiddleBench
Source: Hugging Face. Updated: 2025-11-03. Indexed: 2025-10-18.
Download link:
https://hf-mirror.com/datasets/ai4bharat/RiddleBench
Description:
RiddleBench is a meticulously curated benchmark of 1,737 challenging puzzles designed to test diverse reasoning skills beyond simple pattern matching. The puzzles involve coding-decoding, seating arrangements, sequence prediction, and blood relation analysis. By evaluating models on RiddleBench, researchers can gain deeper insights into their ability to handle abstract reasoning, commonsense inference, and structured problem solving, which are essential skills for robust and trustworthy AI systems.
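A minimal sketch of how per-category results on a benchmark like this might be tallied, assuming exact-match scoring. The field names `id`, `type`, and `answer`, the function `accuracy_by_category`, and the toy records are all hypothetical illustrations; they are not taken from RiddleBench, and the dataset's actual schema and scoring method may differ.

```python
from collections import defaultdict


def accuracy_by_category(records, predictions):
    """Return exact-match accuracy per puzzle category.

    records: list of dicts with hypothetical keys "id", "type", "answer".
    predictions: dict mapping a record id to the model's answer string.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        total[rec["type"]] += 1
        # A missing prediction counts as wrong.
        pred = predictions.get(rec["id"], "")
        # Compare case-insensitively, ignoring surrounding whitespace.
        if pred.strip().lower() == rec["answer"].strip().lower():
            correct[rec["type"]] += 1
    return {cat: correct[cat] / total[cat] for cat in total}


# Made-up toy records for illustration only, not RiddleBench data.
toy_records = [
    {"id": 1, "type": "sequence prediction", "answer": "32"},
    {"id": 2, "type": "sequence prediction", "answer": "13"},
    {"id": 3, "type": "blood relations", "answer": "Aunt"},
]
toy_predictions = {1: "32", 2: "21", 3: " aunt "}

scores = accuracy_by_category(toy_records, toy_predictions)
print(scores)
```

Reporting accuracy per puzzle type rather than as a single number makes it easier to see which reasoning skill a model struggles with.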
Provider:
ai4bharat



