ScaleAI/MultiNRC
Source: Hugging Face. Updated: 2025-07-23. Indexed: 2025-08-09.
Download link:
https://hf-mirror.com/datasets/ScaleAI/MultiNRC
Overview:
MultiNRC is a challenging benchmark that evaluates the multilingual reasoning ability of large language models in French, Spanish, and Chinese. The dataset contains over 1,000 reasoning questions written by native speakers to capture linguistic and cultural nuances. Categories include language-specific linguistic reasoning, wordplay and riddles, cultural reasoning and traditions, and math reasoning with cultural relevance. For the cultural/tradition and math reasoning categories, human-translated English versions of the prompts and answers are provided for direct comparison. Each prompt is accompanied by a short, objective final answer to support automatic evaluation.
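Because each prompt has a short, objective final answer, a simple string comparison is one plausible way to score model outputs automatically. The sketch below is illustrative only: the listing does not say how MultiNRC is actually graded, and the function names `normalize` and `exact_match_accuracy`, along with the sample answers, are hypothetical.

```python
import unicodedata


def normalize(answer: str) -> str:
    """Apply NFKC normalization, casefold, and collapse whitespace.

    NFKC folds full-width characters (common in Chinese text) into
    their ASCII equivalents, so "４２" compares equal to "42".
    """
    text = unicodedata.normalize("NFKC", answer)
    return " ".join(text.casefold().split())


def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Return the fraction of predictions that match their reference
    after normalization. This is an assumed metric, not the official one."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


# Hypothetical answers, not taken from the dataset.
score = exact_match_accuracy(["  Paris ", "４２"], ["paris", "43"])
print(score)
```

Exact matching is strict: it rejects paraphrased but correct answers, so a real evaluation may need a more lenient comparison.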
Provided by:
ScaleAI



