BSC-LT/arc_es

Name: BSC-LT/arc_es
Creator: BSC-LT
Published: 2025-12-09 10:26:31
License: cc-by-sa-4.0

Hugging Face | Updated 2025-12-09 | Indexed 2026-01-03

Download link:

https://hf-mirror.com/datasets/BSC-LT/arc_es

Resource overview:

---
license: cc-by-sa-4.0
task_categories:
- question-answering
task_ids:
- open-domain-qa
- multiple-choice-qa
language:
- es
pretty_name: ARC_es
size_categories:
- n<1K
dataset_info:
- config_name: ARC_Easy_es
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer_a
    dtype: string
  - name: answer_b
    dtype: string
  - name: answer_c
    dtype: string
  - name: answer_d
    dtype: string
  - name: answerkey
    dtype: string
  splits:
  - name: validation
    num_examples: 570
- config_name: ARC_Challenge_es
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer_a
    dtype: string
  - name: answer_b
    dtype: string
  - name: answer_c
    dtype: string
  - name: answer_d
    dtype: string
  - name: answerkey
    dtype: string
  splits:
  - name: validation
    num_examples: 299
---

# Dataset Card for ARC (Spanish Version)

## Dataset summary

This dataset provides the Spanish translation and adaptation of the ARC (AI2 Reasoning Challenge) validation set. The original dataset was designed to evaluate scientific reasoning in LLMs by presenting a collection of authentic, grade-school science multiple-choice questions split into two sets of varying difficulty: ARC Easy and ARC Challenge. This Spanish adaptation enables evaluation of the ARC task in Spanish while preserving the original Easy and Challenge splits and maintaining the conceptual difficulty, structure, and intent of the source material.

- **Curated by:** Barcelona Supercomputing Center (BSC)
- **Funded by:** Modelos del Lenguaje
- **Language(s) (NLP):** es (Spanish)
- **License:** CC BY-SA 4.0

## Dataset Details

### Dataset Description

The dataset preserves the structure, features, and overall format of the original ARC validation set. Only minor adjustments were made during translation.
Key adaptations include:

1. Ensuring factual correctness, as some factual mistakes were found in the original.
2. Preserving linguistic and cultural appropriateness in Spanish-speaking contexts.
3. Adapting proper names to Spanish equivalents.
4. Maintaining gender neutrality in items where the English source did not mark gender, in order to avoid introducing gender bias.

## Dataset Structure

### ARC_Easy_es

An example instance looks as follows:

```json
{
  "id": "NYSEDREGENTS_2014_8_3",
  "question": "¿Qué tipo de relación hay cuando las raíces de un tipo determinado de árbol necesitan la presencia de un hongo para crecer con normalidad?",
  "answer_a": "Beneficiosa.",
  "answer_b": "Competitiva.",
  "answer_c": "Nociva.",
  "answer_d": "Infecciosa.",
  "answerkey": "A"
}
```

### ARC_Challenge_es

An example instance looks as follows:

```json
{
  "id": "Mercury_7235813",
  "question": "El carbono circula por diversos depósitos de la Tierra. El tiempo que tardan en formarse estos depósitos varía considerablemente. ¿Qué proceso del ciclo del carbono tarda millones de años en formar el depósito indicado?",
  "answer_a": "La incorporación del carbono a través de la digestión en los tejidos animales.",
  "answer_b": "La liberación de carbono en la atmósfera al respirar.",
  "answer_c": "El carbono atmosférico que se incorpora en los lípidos de las plantas.",
  "answer_d": "La descomposición del carbono en tejidos vegetales para generar petróleo.",
  "answerkey": "D"
}
```

### Dataset Sources

- **Repository:** [ARC](https://huggingface.co/datasets/allenai/ai2_arc)
- **Paper:** Clark, P. et al. (2018). *Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge*. [arXiv:1803.05457](https://arxiv.org/abs/1803.05457)

#### Who are the source data producers?

All credit goes to the creators of the original ARC dataset, Peter Clark et al.

## Citation

**APA:** Clark, P. et al. (2018). *Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge*. [arXiv:1803.05457](https://arxiv.org/abs/1803.05457)

## Uses

### Direct Use

This dataset can be used to:

- Evaluate scientific reasoning and multiple-choice question answering in Spanish.
- Study multilingual performance variation in scientific question answering.
- Assess model robustness on the Easy and Challenge splits in a non-English context.

### Out-of-Scope Use

This dataset is not intended for:

- Training or fine-tuning models, as it is intended strictly for evaluation.
- Drawing conclusions about real-world scientific expertise or student assessment.
- Generating outputs in scientific contexts without human oversight.
- Use in non-Spanish contexts, as the dataset has been culturally and linguistically localized.

## More information

This work has been supported and funded by the Ministerio para la Transformación Digital y de la Función Pública and the Plan de Recuperación, Transformación y Resiliencia – funded by the EU through NextGenerationEU, within the framework of the Modelos del Lenguaje project.

## Contact point

Language Technologies Lab (langtech@bsc.es) at the Barcelona Supercomputing Center (BSC).
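
## Example: scoring a model on the record format (illustrative)

The record layout under Dataset Structure and the evaluation use under Direct Use can be sketched as follows. This is a minimal sketch and not part of the original card: the prompt template, the helper names `format_prompt` and `accuracy`, and the stand-in predictor `always_a` are assumptions made for illustration. Only the field names and the example record come from the card. The commented-out loading call assumes the Hugging Face `datasets` library and the config names listed in the card metadata.

```python
from typing import Callable, Dict, Iterable

# Loading the real data would presumably look like this (assumption, needs network):
#   from datasets import load_dataset
#   easy = load_dataset("BSC-LT/arc_es", "ARC_Easy_es", split="validation")

# Option letters mapped to the answer fields shown in the card.
OPTION_FIELDS = {"A": "answer_a", "B": "answer_b", "C": "answer_c", "D": "answer_d"}


def format_prompt(instance: Dict[str, str]) -> str:
    """Render one record as a Spanish multiple-choice prompt (hypothetical template)."""
    lines = [f"Pregunta: {instance['question']}"]
    for letter, field in OPTION_FIELDS.items():
        lines.append(f"{letter}. {instance[field]}")
    lines.append("Respuesta:")
    return "\n".join(lines)


def accuracy(instances: Iterable[Dict[str, str]],
             predict: Callable[[str], str]) -> float:
    """Fraction of records where the predicted letter equals `answerkey`."""
    records = list(instances)
    if not records:
        return 0.0
    correct = sum(
        1
        for rec in records
        if predict(format_prompt(rec)).strip().upper() == rec["answerkey"].strip().upper()
    )
    return correct / len(records)


# The ARC_Easy_es example record from the card.
example = {
    "id": "NYSEDREGENTS_2014_8_3",
    "question": "¿Qué tipo de relación hay cuando las raíces de un tipo determinado "
                "de árbol necesitan la presencia de un hongo para crecer con normalidad?",
    "answer_a": "Beneficiosa.",
    "answer_b": "Competitiva.",
    "answer_c": "Nociva.",
    "answer_d": "Infecciosa.",
    "answerkey": "A",
}


def always_a(prompt: str) -> str:
    """Stand-in predictor; replace with a call to the model under evaluation."""
    return "A"


if __name__ == "__main__":
    print(format_prompt(example))
    print(accuracy([example], always_a))
```

Passing the predictor as a plain callable keeps the scoring independent of any particular model API. In line with the Out-of-Scope Use section, the sketch only reads records for evaluation and does not use them for training.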

Provided by:

BSC-LT
