BSC-LT/arc_es
Source: Hugging Face. Updated: 2025-12-09. Indexed: 2026-01-03.
Download link:
https://hf-mirror.com/datasets/BSC-LT/arc_es
Resource description:
---
license: cc-by-sa-4.0
task_categories:
- question-answering
task_ids:
- open-domain-qa
- multiple-choice-qa
language:
- es
pretty_name: ARC_es
size_categories:
- n<1K
dataset_info:
- config_name: ARC_Easy_es
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer_a
    dtype: string
  - name: answer_b
    dtype: string
  - name: answer_c
    dtype: string
  - name: answer_d
    dtype: string
  - name: answerkey
    dtype: string
  splits:
  - name: validation
    num_examples: 570
- config_name: ARC_Challenge_es
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer_a
    dtype: string
  - name: answer_b
    dtype: string
  - name: answer_c
    dtype: string
  - name: answer_d
    dtype: string
  - name: answerkey
    dtype: string
  splits:
  - name: validation
    num_examples: 299
---
# Dataset Card for ARC (Spanish Version)
## Dataset summary
This dataset provides a Spanish translation and adaptation of the ARC (AI2 Reasoning Challenge) validation set. The original dataset was designed to evaluate scientific reasoning in LLMs using authentic, grade-school science multiple-choice questions, split into two sets of differing difficulty: ARC Easy and ARC Challenge.
This Spanish adaptation enables evaluation of the ARC task in Spanish while preserving the original Easy and Challenge splits and maintaining the conceptual difficulty, structure, and intent of the source material.
- **Curated by:** Barcelona Supercomputing Center (BSC)
- **Funded by:** Modelos del Lenguaje
- **Language(s) (NLP):** es (Spanish)
- **License:** CC BY-SA 4.0
## Dataset Details
### Dataset Description
The dataset preserves the structure, features, and overall format of the original ARC validation set.
Only minor adjustments were made during translation. Key adaptations include:
1) Ensuring factual correctness, as some factual mistakes were found in the original
2) Preserving linguistic and cultural appropriateness in Spanish-speaking contexts
3) Adapting proper names to Spanish equivalents
4) Maintaining gender neutrality in items where the English source did not mark gender, in order to avoid introducing gender bias
## Dataset Structure
### ARC_Easy_es
An example instance looks as follows:
```json
{
  "id": "NYSEDREGENTS_2014_8_3",
  "question": "¿Qué tipo de relación hay cuando las raíces de un tipo determinado de árbol necesitan la presencia de un hongo para crecer con normalidad?",
  "answer_a": "Beneficiosa.",
  "answer_b": "Competitiva.",
  "answer_c": "Nociva.",
  "answer_d": "Infecciosa.",
  "answerkey": "A"
}
```
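A minimal sketch of loading one configuration with the Hugging Face `datasets` library. It assumes the repository id `BSC-LT/arc_es` (taken from the page title) and the config and split names declared in the card metadata. The helper name `load_arc_es` and the `EXPECTED_SIZES` table are hypothetical, and the call itself needs network access.

```python
# Config names and validation sizes as declared in the card metadata.
EXPECTED_SIZES = {"ARC_Easy_es": 570, "ARC_Challenge_es": 299}


def load_arc_es(config_name: str = "ARC_Easy_es"):
    """Load the validation split of one ARC_es config (hypothetical helper)."""
    if config_name not in EXPECTED_SIZES:
        raise ValueError(f"unknown config: {config_name!r}")
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # Assumption: the repository id is BSC-LT/arc_es, as in the page title.
    return load_dataset("BSC-LT/arc_es", config_name, split="validation")
```

If the metadata is accurate, `load_arc_es("ARC_Easy_es")` should return 570 rows and `load_arc_es("ARC_Challenge_es")` should return 299; comparing `len(...)` against `EXPECTED_SIZES` is a cheap sanity check.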
### ARC_Challenge_es
An example instance looks as follows:
```json
{
  "id": "Mercury_7235813",
  "question": "El carbono circula por diversos depósitos de la Tierra. El tiempo que tardan en formarse estos depósitos varía considerablemente. ¿Qué proceso del ciclo del carbono tarda millones de años en formar el depósito indicado?",
  "answer_a": "La incorporación del carbono a través de la digestión en los tejidos animales.",
  "answer_b": "La liberación de carbono en la atmósfera al respirar.",
  "answer_c": "El carbono atmosférico que se incorpora en los lípidos de las plantas.",
  "answer_d": "La descomposición del carbono en tejidos vegetales para generar petróleo.",
  "answerkey": "D"
}
```
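Because the four options are stored in separate fields rather than in a list, an evaluation harness has to assemble the prompt itself. The sketch below shows one way to do that using the field names from the instances above. The function name `format_prompt` and the Spanish template ("Pregunta", "Respuesta") are illustrative assumptions, not a format prescribed by the dataset.

```python
# Option fields in the order they appear in each instance.
OPTION_FIELDS = ("answer_a", "answer_b", "answer_c", "answer_d")


def format_prompt(instance: dict) -> str:
    """Render one instance as a lettered multiple-choice prompt."""
    lines = [f"Pregunta: {instance['question']}"]
    for field in OPTION_FIELDS:
        letter = field[-1].upper()  # "answer_a" -> "A"
        lines.append(f"{letter}. {instance[field]}")
    lines.append("Respuesta:")
    return "\n".join(lines)


# The ARC_Easy_es instance shown earlier in this card.
example = {
    "id": "NYSEDREGENTS_2014_8_3",
    "question": "¿Qué tipo de relación hay cuando las raíces de un tipo "
                "determinado de árbol necesitan la presencia de un hongo "
                "para crecer con normalidad?",
    "answer_a": "Beneficiosa.",
    "answer_b": "Competitiva.",
    "answer_c": "Nociva.",
    "answer_d": "Infecciosa.",
    "answerkey": "A",
}

print(format_prompt(example))
```

The letters produced here line up with the values used in `answerkey`, so a model's single-letter reply can be compared against it directly.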
### Dataset Sources
- **Repository:** [ARC](https://huggingface.co/datasets/allenai/ai2_arc)
- **Paper:** Clark, P. et al. (2018). *Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge*. [arXiv:1803.05457](https://arxiv.org/abs/1803.05457)
#### Who are the source data producers?
All credit goes to the creators of the original ARC dataset, Peter Clark et al.
## Citation
**APA:**
Clark, P. et al. (2018). *Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge*. [arXiv:1803.05457](https://arxiv.org/abs/1803.05457)
## Uses
### Direct Use
This dataset can be used to:
- Evaluate scientific reasoning and multiple-choice question answering in Spanish.
- Study multilingual performance variation in scientific question answering.
- Assess model robustness on Easy and Challenge splits in a non-English context.
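For the evaluation uses listed above, the natural metric is accuracy against `answerkey`. The following is a minimal sketch under stated assumptions: the function name `accuracy` is hypothetical, predictions are assumed to arrive as a mapping from instance `id` to a predicted letter, and a missing prediction is counted as wrong. It is not the scoring procedure of any particular evaluation harness.

```python
def accuracy(predictions: dict, instances: list) -> float:
    """Fraction of instances whose predicted letter matches `answerkey`.

    `predictions` maps instance id -> predicted letter; an id with no
    prediction counts as incorrect.
    """
    if not instances:
        raise ValueError("no instances to score")
    correct = 0
    for inst in instances:
        predicted = predictions.get(inst["id"], "").strip().upper()
        if predicted == inst["answerkey"].strip().upper():
            correct += 1
    return correct / len(instances)


# Gold labels taken from the two example instances in this card.
gold = [
    {"id": "NYSEDREGENTS_2014_8_3", "answerkey": "A"},
    {"id": "Mercury_7235813", "answerkey": "D"},
]
# Hypothetical model outputs: the first is right, the second is wrong.
preds = {"NYSEDREGENTS_2014_8_3": "a", "Mercury_7235813": "B"}

print(accuracy(preds, gold))  # 0.5
```

Scoring ARC_Easy_es and ARC_Challenge_es separately keeps the two difficulty levels distinguishable in reported results.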
### Out-of-Scope Use
This dataset is not intended for:
- Training or fine-tuning models, as it is intended strictly for evaluation.
- Drawing conclusions about real-world scientific expertise or student assessment.
- Generating outputs in scientific contexts without human oversight.
- Use in non-Spanish contexts, as the dataset has been culturally and linguistically localized.
## More information
This work has been supported and funded by the Ministerio para la Transformación Digital y de la Función Pública and the Plan de Recuperación, Transformación y Resiliencia – funded by the EU through NextGenerationEU, within the framework of the Modelos del Lenguaje project.
## Contact point
Language Technologies Lab (langtech@bsc.es) at the Barcelona Supercomputing Center (BSC).
Provider:
BSC-LT



