sapienzanlp/piqa_italian

Hugging Face · Updated 2025-12-02 · Indexed 2024-07-22
Download link:
https://hf-mirror.com/datasets/sapienzanlp/piqa_italian
Resource description:
---
license: afl-3.0
task_categories:
- text-generation
language:
- it
- en
size_categories:
- 10K<n<100K
configs:
- config_name: default
  data_files:
  - split: train
    path: piqa.train.json
  - split: validation
    path: piqa.validation.json
---

# PIQA - Italian (IT)

This dataset is an Italian translation of [PIQA](https://arxiv.org/abs/1911.11641). PIQA stands for Physical Interaction Question Answering, a dataset of questions about common scenarios that require an understanding of the physical world.

## Dataset Details

The dataset consists of questions about common scenarios that require an understanding of the physical world. Each question is associated with a correct answer and a distractor. The task is to predict the correct answer to the question.

The dataset includes the following splits:

* Train: 16,044 rows
* Validation: 1,830 rows

### Differences with the original dataset

* **Did you know that not all questions in PIQA are questions?** In the original dataset, some instances are not questions but text completions, statements, or even single words. In this version, we categorize all instances so as to give the possibility to filter out non-question instances or treat them differently.
* The number of instances in this dataset is smaller than the original dataset due to the translation process, during which some instances were filtered out.

### Languages

This dataset is **fully parallel** between English and Italian. This allows us to have comparable evaluation setups and results across the two languages.

### Translation Process

The translation has been carried out using [🍱 OBenTO-LLM](https://github.com/c-simone/llm-data-translation), an open-source tool for LLM-based translation. The main motivation for using an open-source LLM is to encourage free, open, reproducible, and transparent research in LLM evaluation. See [🍱 OBenTO-LLM](https://github.com/c-simone/llm-data-translation) for more details on the translation process.

**Model used to translate:** [Unbabel/TowerInstruct-7B-v0.2](https://huggingface.co/Unbabel/TowerInstruct-7B-v0.2)

### Other Information

- **Original dataset by:** [Bisk et al.](https://arxiv.org/abs/1911.11641)
- **Translation by:** [Simone Conia](https://scholar.google.com/citations?user=S1tqbTcAAAAJ)
- **Languages:** Italian, English
- **License:** AFL 3.0

## Dataset Format

This is an example that shows the format of the dataset, where:

* `id`: a unique ID for each sample;
* `category`: type of task;
* `input_text`: the original English sentence in the dataset;
* `input_text_translation`: the translation of the sentence in Italian;
* `choices`: the original English choices;
* `choice_translations`: the translation of the choices in Italian;
* `gold_index`: the index of the correct answer.

#### Example of a question in PIQA

```json
{
  "id": "piqa_3",
  "category": "question",
  "input_text": "How do you shake something?",
  "input_text_translation": "Come si fa a scuotere qualcosa?",
  "choices": [
    "Move it up and down and side to side quickly.",
    "Stir it very quickly."
  ],
  "choice_translations": [
    "Si deve muovere rapidamente avanti e indietro e da un lato all'altro.",
    "Si mescola molto velocemente."
  ],
  "gold_index": 0
}
```

#### Example of a text completion in PIQA

```json
{
  "id": "piqa_1",
  "category": "text_completion",
  "input_text": "To permanently attach metal legs to a chair, you can",
  "input_text_translation": "Per fissare in modo permanente le gambe di metallo a una sedia, si può",
  "choices": [
    "weld the metal together to get it to stay firmly in place.",
    "nail the metal together to get it to stay firmly in place."
  ],
  "choice_translations": [
    "saldare il metallo per farlo rimanere saldamente in posizione.",
    "incollare il metallo per farlo rimanere saldamente in posizione."
  ],
  "gold_index": 0
}
```

#### Example of a "topic" in PIQA

```json
{
  "id": "piqa_29",
  "category": "topic",
  "input_text": "Soothe a painful sunburn.",
  "input_text_translation": "Alleviare una scottatura solare dolorosa.",
  "choices": [
    "Wait until brewed tea bag is cool, then apply on burn.",
    "Wait until brewed tea bag is hot, then apply on burn."
  ],
  "choice_translations": [
    "Attendere fino a quando il sacchetto del tè in infusione è freddo, quindi applicarlo sulla scottatura.",
    "Attendere fino a quando il sacchetto del tè in infusione è caldo, quindi applicarlo sulla scottatura."
  ],
  "gold_index": 0
}
```

#### Example of a "property" in PIQA

```json
{
  "id": "piqa_855",
  "category": "property",
  "input_text": "Sleeves:",
  "input_text_translation": "Maniche:",
  "choices": [
    "Can be cut by sciscors with ease.",
    "Can be cut by a knife with ease."
  ],
  "choice_translations": [
    "possono essere tagliate facilmente con le forbici.",
    "possono essere tagliate con facilità con un coltello."
  ],
  "gold_index": 0
}
```

## License

The dataset is distributed under the AFL 3.0 license.

## Acknowledgements

I would like to thank the authors of the original dataset for making it available to the research community. I would also like to thank [Future AI Research](https://future-ai-research.it/) for supporting this work and funding my research.

### Special Thanks

My special thanks go to:

* Pere-Lluís Huguet Cabot and Riccardo Orlando for their help with [🍱 OBenTO-LLM](https://github.com/c-simone/llm-data-translation).

## Dataset Card Authors

* [Simone Conia](https://scholar.google.com/citations?user=S1tqbTcAAAAJ): simone.conia@uniroma1.it

This dataset is an Italian translation of PIQA (Physical Interaction Question Answering), containing questions about common scenarios requiring an understanding of the physical world. Each question has a correct answer and a distractor, and the task is to predict the correct answer. The dataset is fully parallel between English and Italian, allowing for comparable evaluations. It includes train and validation splits with a total of 17,874 rows. The translation was done using an open-source tool called OBenTO-LLM, aiming for free, open, reproducible, and transparent research. The dataset format includes fields such as id, category, input_text, input_text_translation, choices, choice_translations, and gold_index.
Provider:
sapienzanlp
Summary of original information

PIQA - Italian (IT)

Dataset overview

  • Name: PIQA - Italian (IT)
  • Source: Italian translation of PIQA
  • Task category: text generation
  • Languages: Italian, English
  • Size: 10K<n<100K
  • Configs:
    • default
      • train: piqa.train.json
      • validation: piqa.validation.json
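
The two split files listed above can be read with the Python standard library once they have been downloaded. The sketch below is only an illustration: `load_split` is a hypothetical helper, and because the on-disk layout of `piqa.train.json` is not stated here, it assumes the file is either a single JSON array or JSON Lines and accepts both.

```python
import json
import tempfile
from pathlib import Path


def load_split(path):
    """Load one split from a local file (hypothetical helper).

    Assumption: the file holds either one JSON array of records or
    JSON Lines (one record per line); both layouts are handled.
    """
    text = Path(path).read_text(encoding="utf-8").strip()
    if not text:
        return []
    if text.startswith("["):
        # Whole file is a single JSON array of records.
        return json.loads(text)
    # Otherwise treat each non-empty line as one JSON object.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Small demo on a temporary file standing in for a downloaded split.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "piqa.validation.json"
    demo_path.write_text('{"id": "piqa_3", "gold_index": 0}\n', encoding="utf-8")
    demo_records = load_split(demo_path)
    print(len(demo_records), demo_records[0]["id"])
```

If the Hugging Face `datasets` library is installed, `load_dataset("sapienzanlp/piqa_italian")` is the more usual route; it is not exercised here because it needs network access.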

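Dataset details

  • Content: questions about common scenarios that require an understanding of the physical world. Each question is associated with one correct answer and one distractor.
  • Splits:
    • train: 16,044 rows
    • validation: 1,830 rows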

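Differences from the original dataset

  • Some instances in the original dataset are not questions but text completions, statements, or single words. This version categorizes all instances so that non-question instances can be filtered out or treated differently.
  • Because some instances were filtered out during translation, this dataset has fewer instances than the original.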
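
As a rough illustration of the category-based filtering described above, the sketch below counts instances per `category` and keeps only the ones of a chosen type. The helper names are hypothetical, the records are trimmed to the two fields that matter here, and the four category values are the ones that appear in the card's examples; whether other values exist is not stated.

```python
from collections import Counter

# Records trimmed to the fields used here; ids and categories come from
# the examples in the dataset card.
records = [
    {"id": "piqa_3", "category": "question"},
    {"id": "piqa_1", "category": "text_completion"},
    {"id": "piqa_29", "category": "topic"},
    {"id": "piqa_855", "category": "property"},
]


def count_categories(items):
    """Return how many records fall into each category."""
    return Counter(item["category"] for item in items)


def keep_categories(items, allowed=("question",)):
    """Keep only the records whose category is in `allowed`."""
    allowed = set(allowed)
    return [item for item in items if item["category"] in allowed]


print(count_categories(records))
print([item["id"] for item in keep_categories(records)])
```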

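Languages

  • The dataset is fully parallel between English and Italian, which allows comparable evaluation setups and results across the two languages.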
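
Because every record carries both the English text and its Italian translation, the same item can be presented in either language with the same gold label. The sketch below shows one way to split a record into two language views; `parallel_views` and the shape of its return value are assumptions made for illustration, not part of the dataset.

```python
def parallel_views(record):
    """Split one record into an English view and an Italian view.

    Both views share the same gold index, which is what makes results
    comparable across the two languages.
    """
    english = {
        "prompt": record["input_text"],
        "choices": list(record["choices"]),
        "gold_index": record["gold_index"],
    }
    italian = {
        "prompt": record["input_text_translation"],
        "choices": list(record["choice_translations"]),
        "gold_index": record["gold_index"],
    }
    return {"en": english, "it": italian}


# Record copied from the question example in the dataset card.
example = {
    "id": "piqa_3",
    "category": "question",
    "input_text": "How do you shake something?",
    "input_text_translation": "Come si fa a scuotere qualcosa?",
    "choices": [
        "Move it up and down and side to side quickly.",
        "Stir it very quickly.",
    ],
    "choice_translations": [
        "Si deve muovere rapidamente avanti e indietro e da un lato all'altro.",
        "Si mescola molto velocemente.",
    ],
    "gold_index": 0,
}

views = parallel_views(example)
print(views["en"]["prompt"])
print(views["it"]["prompt"])
```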

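Translation process

  • The translation was carried out with 🍱 OBenTO-LLM, with the aim of encouraging free, open, reproducible, and transparent research in LLM evaluation.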

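Dataset format

  • Fields:
    • id: a unique ID for each sample
    • category: type of task
    • input_text: the original English sentence
    • input_text_translation: the Italian translation
    • choices: the original English choices
    • choice_translations: the Italian translations of the choices
    • gold_index: the index of the correct answer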
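
The field list above maps naturally onto a typed record. The following is a minimal sketch, assuming the field types implied by the card's examples (strings, lists of strings, and an integer index); the class name `PiqaItalianRecord` and the two consistency checks are illustrative additions, not something the dataset itself defines.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class PiqaItalianRecord:
    id: str
    category: str
    input_text: str
    input_text_translation: str
    choices: List[str]
    choice_translations: List[str]
    gold_index: int

    @classmethod
    def from_dict(cls, raw):
        """Build a record from a plain dict and run basic sanity checks."""
        record = cls(**{f.name: raw[f.name] for f in fields(cls)})
        if len(record.choices) != len(record.choice_translations):
            raise ValueError("choices and choice_translations differ in length")
        if not 0 <= record.gold_index < len(record.choices):
            raise ValueError("gold_index does not point at a choice")
        return record


# Record copied from the "property" example in the dataset card.
raw_example = {
    "id": "piqa_855",
    "category": "property",
    "input_text": "Sleeves:",
    "input_text_translation": "Maniche:",
    "choices": [
        "Can be cut by sciscors with ease.",
        "Can be cut by a knife with ease.",
    ],
    "choice_translations": [
        "possono essere tagliate facilmente con le forbici.",
        "possono essere tagliate con facilità con un coltello.",
    ],
    "gold_index": 0,
}

parsed = PiqaItalianRecord.from_dict(raw_example)
print(parsed.choice_translations[parsed.gold_index])
```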

Examples

  • Question example:

```json
{
  "id": "piqa_3",
  "category": "question",
  "input_text": "How do you shake something?",
  "input_text_translation": "Come si fa a scuotere qualcosa?",
  "choices": [
    "Move it up and down and side to side quickly.",
    "Stir it very quickly."
  ],
  "choice_translations": [
    "Si deve muovere rapidamente avanti e indietro e da un lato all'altro.",
    "Si mescola molto velocemente."
  ],
  "gold_index": 0
}
```

  • Text completion example:

```json
{
  "id": "piqa_1",
  "category": "text_completion",
  "input_text": "To permanently attach metal legs to a chair, you can",
  "input_text_translation": "Per fissare in modo permanente le gambe di metallo a una sedia, si può",
  "choices": [
    "weld the metal together to get it to stay firmly in place.",
    "nail the metal together to get it to stay firmly in place."
  ],
  "choice_translations": [
    "saldare il metallo per farlo rimanere saldamente in posizione.",
    "incollare il metallo per farlo rimanere saldamente in posizione."
  ],
  "gold_index": 0
}
```

  • Topic example:

```json
{
  "id": "piqa_29",
  "category": "topic",
  "input_text": "Soothe a painful sunburn.",
  "input_text_translation": "Alleviare una scottatura solare dolorosa.",
  "choices": [
    "Wait until brewed tea bag is cool, then apply on burn.",
    "Wait until brewed tea bag is hot, then apply on burn."
  ],
  "choice_translations": [
    "Attendere fino a quando il sacchetto del tè in infusione è freddo, quindi applicarlo sulla scottatura.",
    "Attendere fino a quando il sacchetto del tè in infusione è caldo, quindi applicarlo sulla scottatura."
  ],
  "gold_index": 0
}
```

  • Property example:

```json
{
  "id": "piqa_855",
  "category": "property",
  "input_text": "Sleeves:",
  "input_text_translation": "Maniche:",
  "choices": [
    "Can be cut by sciscors with ease.",
    "Can be cut by a knife with ease."
  ],
  "choice_translations": [
    "possono essere tagliate facilmente con le forbici.",
    "possono essere tagliate con facilità con un coltello."
  ],
  "gold_index": 0
}
```

License

  • License: AFL 3.0
Curated summary
Dataset introduction
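Construction
The sapienzanlp/piqa_italian dataset was built with LLM-based translation. It derives from the original English PIQA and was translated automatically from English into Italian with the open-source tool OBenTO-LLM, using the Unbabel/TowerInstruct-7B-v0.2 model. Some instances were filtered out during translation, so the final dataset is smaller than the original. Each sample keeps the original structure (question, choices, and the index of the correct answer) and adds a category label that distinguishes questions, text completions, and other forms. The result is a structured, parallel bilingual corpus for multilingual natural language understanding research.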
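Features
The dataset's core feature is its fully parallel bilingual structure: English and Italian texts correspond one to one, which makes it a suitable benchmark for cross-lingual model evaluation. Unlike the original version, every instance is labeled with a category (question, text completion, topic, property, and so on), which makes the data easier to interpret and to filter. The content focuses on everyday physical scenarios, from manipulating objects to common household knowledge, and each item contrasts a correct choice with an incorrect one to test a model's reasoning about the physical world. This combination of fine-grained labels and varied scenarios makes it a useful resource for evaluating multilingual physical commonsense understanding.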
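Usage
Researchers can use the dataset directly to train and evaluate models on multilingual natural language understanding tasks. The data is stored as JSON with a training split and a validation split, and loading the corresponding files gives access to the bilingual text pairs and category labels. For evaluation, the `gold_index` field is used to check whether a prediction is correct, and the `category` field can be used to select specific instance types for targeted analysis. The dataset can be used with mainstream machine learning frameworks and in zero-shot or few-shot setups, and it is particularly suited to comparing a model's performance in English and Italian.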
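
The evaluation step described above (checking predictions against `gold_index`, optionally broken down by `category`) can be sketched as follows. This is an assumption about how one might score predictions, not an official evaluation script; `accuracy_by_category` is a hypothetical helper and the records are trimmed to the two fields it reads.

```python
from collections import defaultdict


def accuracy_by_category(records, predictions):
    """Return accuracy overall and per category.

    `predictions` holds one predicted choice index per record, in order.
    """
    if len(records) != len(predictions):
        raise ValueError("need exactly one prediction per record")
    hits = defaultdict(int)
    totals = defaultdict(int)
    for record, predicted in zip(records, predictions):
        correct = int(predicted == record["gold_index"])
        for key in ("overall", record["category"]):
            hits[key] += correct
            totals[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Toy records and predictions, made up for illustration only.
toy_records = [
    {"category": "question", "gold_index": 0},
    {"category": "question", "gold_index": 1},
    {"category": "topic", "gold_index": 0},
    {"category": "property", "gold_index": 1},
]
toy_predictions = [0, 0, 0, 1]

scores = accuracy_by_category(toy_records, toy_predictions)
print(scores)
```

Running the same function once on predictions made from the English fields and once on predictions made from the Italian fields gives two directly comparable sets of numbers.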
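Background and challenges
Background overview
PIQA, the Physical Interaction Question Answering dataset, was proposed by Bisk et al. in 2019 to evaluate how well models understand everyday physical scenarios. It focuses on questions that require physical commonsense reasoning, such as manipulating objects and carrying out daily activities. sapienzanlp/piqa_italian is a parallel Italian translation produced by Simone Conia with the open-source tool OBenTO-LLM. It extends PIQA's language coverage and adds category labels that make the structure of the data clearer, providing a resource for cross-lingual model evaluation.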
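Current challenges
The core challenge of PIQA is that the task requires implicit commonsense reasoning about the physical world, such as judging an object's properties or predicting the outcome of an action, which goes beyond surface-level text pattern matching. For the Italian version, translation has to preserve meaning across languages, and some instances were filtered out during translation, which reduced the dataset size. In addition, the original data mixes non-question instances such as text completions and topic statements, which complicates the task definition and is why each instance is labeled with a category.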
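Common scenarios
Classic use case
Physical commonsense reasoning tests how well a model understands the real world. As the parallel Italian version of PIQA, sapienzanlp/piqa_italian is typically used to evaluate and train language models on everyday physical interactions. Given a situation such as "how to shake something" or "permanently attaching metal legs to a chair", the model must pick, from two descriptions, the one that is physically plausible. This involves language understanding as well as implicit knowledge of how the real world works.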
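
The two-choice selection described above amounts to scoring each candidate and taking the best one. The sketch below shows that selection step only. `predict_index` is a hypothetical helper, and `stub_score` is a stand-in that looks up hard-coded numbers; in practice it would be replaced by a real model score such as the log-likelihood of the choice given the prompt.

```python
def predict_index(prompt, choices, score):
    """Return the index of the choice with the highest score.

    `score(prompt, choice)` must return a number; ties go to the
    earliest choice.
    """
    values = [score(prompt, choice) for choice in choices]
    return max(range(len(choices)), key=values.__getitem__)


# Prompt and choices copied from the question example in the dataset card.
prompt = "How do you shake something?"
choices = [
    "Move it up and down and side to side quickly.",
    "Stir it very quickly.",
]

# Stand-in scores, invented for illustration; not produced by any model.
STUB_SCORES = {choices[0]: -3.2, choices[1]: -7.9}


def stub_score(prompt_text, choice):
    """Placeholder scorer that ignores the prompt and looks up a fixed value."""
    return STUB_SCORES[choice]


predicted = predict_index(prompt, choices, stub_score)
print(predicted, choices[predicted])
```

The same function works unchanged on the Italian fields (`input_text_translation` and `choice_translations`), since it only needs a prompt string and a list of candidate strings.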
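Academic problems addressed
The dataset targets a known bottleneck in AI research: models that lack commonsense knowledge of the physical world. Language models often perform well on formal text tasks yet struggle with situations that require basic physical knowledge. By providing parallel Italian and English data, sapienzanlp/piqa_italian lets researchers study how well a model's commonsense reasoning generalizes across languages. It offers a standardized benchmark for evaluating physical interaction understanding and helps fill the gap in commonsense reasoning evaluation resources for languages other than English.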
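Derived and related work
Work related to this dataset centers mainly on building multilingual commonsense reasoning evaluation frameworks. Its parallel bilingual structure supports systematic analysis of cross-lingual transfer, that is, how well knowledge carries over from one language to another. Related directions include improving translation methods so that fine semantic distinctions in the original are preserved, and developing unified evaluation protocols that handle questions, text completions, topics, and properties together. This line of work is tied to open-source translation tools such as OBenTO-LLM and to transparent, reproducible multilingual NLP evaluation.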
The content above was collected and summarized by 遇见数据集.