MultiNRC

Name: MultiNRC
Creator: maas
Published: 2025-12-05 16:51:02
License: No description available

ModelScope community · Updated 2025-12-05 · Indexed 2025-12-06

Download link:

https://modelscope.cn/datasets/ScaleAI/MultiNRC


Description:

# MultiNRC: Multilingual Native Reasoning Challenge

MultiNRC is a challenging evaluation benchmark for large language models, designed to assess multilingual reasoning ability in French, Spanish, and Chinese. Unlike existing benchmarks that simply translate English-centric content, MultiNRC consists of over 1,000 native-authored reasoning questions, crafted by native speakers to capture linguistic and cultural nuances.

## Features

- **Languages:** French, Spanish, Chinese
- **Categories:**
  - Language-specific Linguistic Reasoning
  - Wordplay & Riddles
  - Cultural Reasoning & Traditions
  - Math Reasoning with Cultural Relevance
- **English Equivalents:** For Cultural/Tradition and Math, human-translated English versions are provided for direct comparison.
- **Ground Truth Final Answers:** Short, objective answers accompany each prompt for automatic evaluation.

## Dataset Structure

Each entry includes:

- A native-language prompt and answer (`i18n_prompt`, `i18n_gtfa`)
- (For Math Reasoning and Cultural Reasoning category tasks) An English-equivalent prompt and answer (`english_prompt`, `english_gtfa`)
- Metadata: `task_id`, `language`, `category`

## Citation

If you use MultiNRC in your research, please cite:

```bibtex
@article{fabbri2025multinrc,
  title  = {MultiNRC: A Challenging Native Multilingual Reasoning Evaluation Benchmark for LLMs},
  author = {Fabbri, Alexander R. and Mares, Diego and Flores, Jorge and Mankikar, Meher and Hernandez, Ernesto and Lee, Dean and Liu, Bing and Xing, Chen},
  year   = {2025},
  note   = {arXiv preprint, arXiv:XXXX.XXXXX}
}
```
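The fields listed under Dataset Structure, together with the short ground-truth final answers, suggest a simple shape for one record and for an automatic check. The following is a minimal sketch under stated assumptions, not the benchmark's official evaluation code: the class `MultiNRCEntry`, the helpers `normalize` and `exact_match`, the `str` type for `task_id`, the normalization rules, and the sample values are all hypothetical. Only the field names come from the description above.

```python
import unicodedata
from dataclasses import dataclass
from typing import Optional


@dataclass
class MultiNRCEntry:
    """One record, using the field names listed under Dataset Structure."""
    task_id: str          # assumed to be a string; the source does not state the type
    language: str
    category: str
    i18n_prompt: str
    i18n_gtfa: str
    # Only provided for Math Reasoning and Cultural Reasoning tasks.
    english_prompt: Optional[str] = None
    english_gtfa: Optional[str] = None


def normalize(text: str) -> str:
    """Assumed normalization: Unicode NFKC, case folding, collapsed whitespace."""
    return " ".join(unicodedata.normalize("NFKC", text).casefold().split())


def exact_match(prediction: str, entry: MultiNRCEntry, use_english: bool = False) -> bool:
    """Compare a model answer with the ground-truth final answer (assumed metric)."""
    reference = entry.english_gtfa if use_english else entry.i18n_gtfa
    if reference is None:
        raise ValueError(f"task {entry.task_id} has no English equivalent")
    return normalize(prediction) == normalize(reference)


# Illustrative record; the values are invented, not taken from the dataset.
example = MultiNRCEntry(
    task_id="demo-001",
    language="French",
    category="Math Reasoning with Cultural Relevance",
    i18n_prompt="(prompt in French)",
    i18n_gtfa="42 euros",
    english_prompt="(prompt in English)",
    english_gtfa="42 euros",
)

print(exact_match("  42 Euros ", example))
```

Exact string matching is only one possible reading of "automatic evaluation"; a more lenient comparison may be needed for answers that can be phrased in several ways.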

Provider:

maas

Created:

2025-09-23
