PopQA_robustness
ModelScope community · Updated 2025-11-27 · Indexed 2025-11-03
Download link: https://modelscope.cn/datasets/ibm-research/PopQA_robustness
Resource summary:
# Dataset Card for "PopQA-robustness"
### Dataset Summary
PopQA-robustness is an expanded version of the PopQA dataset (https://aclanthology.org/2023.acl-long.546/) that adds perturbed variants of the original input questions.
It is intended as a benchmark for evaluating how robust question-answering models are to these perturbations.
### Data Instances
#### popqa_robustness
- **Size of downloaded dataset file:** 26.4 MB
### Data Fields
#### popqa_robustness
- `id` (integer): original question grouping ID
- `question` (string): variant of a question from PopQA.
- `variant_id` (integer): identifier of the variant. 0 indicates it is the original unperturbed question.
- `variant_type` (string): name of the expansion variant type. "original" is the original question; "simple" is a superficial non-semantic perturbation; "paraphrase" is a semantic paraphrase of the question.
- `possible_answers` (string): list of strings of possible answers.
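
The fields above can be used to compare model accuracy across variant types. The following is a minimal sketch, not the metric from the cited paper: the sample records, the helper names (`parse_answers`, `is_correct`, `accuracy_by_variant_type`), and the substring-matching rule are all hypothetical, and it assumes `possible_answers` is a JSON-encoded list stored as a string.

```python
import json
from collections import defaultdict

# Hypothetical records that mirror the documented fields; these are not real dataset rows.
records = [
    {"id": 1, "question": "What is the capital of France?", "variant_id": 0,
     "variant_type": "original", "possible_answers": '["Paris"]'},
    {"id": 1, "question": "what is the capital of france", "variant_id": 1,
     "variant_type": "simple", "possible_answers": '["Paris"]'},
    {"id": 1, "question": "Which city serves as France's capital?", "variant_id": 2,
     "variant_type": "paraphrase", "possible_answers": '["Paris"]'},
]


def parse_answers(raw):
    """Return the possible answers as a list.

    Assumption: when `raw` is a string, it holds a JSON-encoded list.
    """
    return json.loads(raw) if isinstance(raw, str) else list(raw)


def is_correct(prediction, answers):
    """Lenient match: any possible answer occurs in the prediction (case-insensitive)."""
    pred = prediction.lower()
    return any(ans.lower() in pred for ans in answers)


def accuracy_by_variant_type(rows, predictions):
    """Compute accuracy per `variant_type`.

    `predictions` maps (id, variant_id) to the model's output string.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for row in rows:
        key = (row["id"], row["variant_id"])
        answers = parse_answers(row["possible_answers"])
        totals[row["variant_type"]] += 1
        hits[row["variant_type"]] += int(is_correct(predictions[key], answers))
    return {vt: hits[vt] / totals[vt] for vt in totals}


# Hypothetical model outputs keyed by (id, variant_id).
predictions = {
    (1, 0): "Paris",
    (1, 1): "The capital is Paris.",
    (1, 2): "Lyon",
}
scores = accuracy_by_variant_type(records, predictions)
```

A gap between the score for `"original"` and the scores for `"simple"` or `"paraphrase"` would suggest sensitivity to that kind of perturbation. Because rows sharing an `id` are variants of the same question, the same loop could be adapted to check per-question consistency instead.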
### Citation Information
```
@misc{ackerman2024novelmetricmeasuringrobustness,
title={A Novel Metric for Measuring the Robustness of Large Language Models in Non-adversarial Scenarios},
author={Samuel Ackerman and Ella Rabinovich and Eitan Farchi and Ateret Anaby-Tavor},
year={2024},
eprint={2408.01963},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2408.01963},
}
```
Provider: maas
Created: 2025-10-03



