ibm-research/BoolQ_robustness
Source: Hugging Face · Updated: 2024-08-19 · Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/ibm-research/BoolQ_robustness
Resource description:
---
license: mit
task_categories:
- question-answering
language:
- en
---
# Dataset Card for "BoolQ-robustness"
### Dataset Summary
BoolQ-robustness is an expanded version of the BoolQ dataset (https://arxiv.org/abs/1905.10044) that adds perturbed variants of the original input questions and passages.
It is intended as a benchmark for evaluating how robust question-answering models are to these perturbations.
### Data Instances
#### boolq_robustness
- **Size of downloaded dataset file:** 21.8 MB
### Data Fields
#### boolq_robustness
- `id` (integer): grouping ID of the original question; all variants of one question share it.
- `question` (string): a variant of a question from BoolQ.
- `variant_id` (integer): identifier of the variant. 0 indicates the original, unperturbed question.
- `variant_type` (string): name of the expansion variant type. "original" is the original question; "simple" is a superficial, non-semantic perturbation; "distraction" is the insertion of a distraction sentence in the passage while retaining the original question.
- `answer` (string): the true answer.
- `passage` (string): the passage on which the answer to the question is based.
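
As a rough illustration of how these fields fit together, the sketch below groups rows by `id` and compares a model's behaviour across `variant_type` values. It is a minimal sketch under stated assumptions: the rows are hand-written examples that follow the schema above (not real dataset records), the "yes"/"no" encoding of `answer` is assumed, and `toy_predict`, `accuracy_by_variant_type`, and `consistency_rate` are hypothetical helper names. The consistency score is a simple agreement rate, not the metric proposed in the cited paper.

```python
from collections import defaultdict

# Hand-written rows following the documented schema. The values, including the
# "yes"/"no" answer encoding, are assumptions made for illustration only.
rows = [
    {"id": 0, "variant_id": 0, "variant_type": "original",
     "question": "is water wet", "answer": "yes",
     "passage": "Water is commonly described as wet."},
    {"id": 0, "variant_id": 1, "variant_type": "simple",
     "question": "Is water wet?", "answer": "yes",
     "passage": "Water is commonly described as wet."},
    {"id": 0, "variant_id": 2, "variant_type": "distraction",
     "question": "is water wet", "answer": "yes",
     "passage": "Water is commonly described as wet. Mars has two moons."},
    {"id": 1, "variant_id": 0, "variant_type": "original",
     "question": "is ice hot", "answer": "no",
     "passage": "Ice is frozen water."},
    {"id": 1, "variant_id": 1, "variant_type": "simple",
     "question": "Is ice hot?", "answer": "no",
     "passage": "Ice is frozen water."},
]


def toy_predict(question, passage):
    """Hypothetical stand-in for a model: answers 'no' when distracted by 'Mars'."""
    return "no" if "Mars" in passage else "yes"


def accuracy_by_variant_type(rows, predict):
    """Return the fraction of correct predictions for each variant_type."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        prediction = predict(row["question"], row["passage"])
        total[row["variant_type"]] += 1
        correct[row["variant_type"]] += int(prediction == row["answer"])
    return {vtype: correct[vtype] / total[vtype] for vtype in total}


def consistency_rate(rows, predict):
    """Return the fraction of id groups whose variants all get the same prediction."""
    predictions = defaultdict(set)
    for row in rows:
        predictions[row["id"]].add(predict(row["question"], row["passage"]))
    return sum(len(p) == 1 for p in predictions.values()) / len(predictions)


accuracy = accuracy_by_variant_type(rows, toy_predict)
consistency = consistency_rate(rows, toy_predict)
print(accuracy)
print(consistency)
```

Comparing accuracy on "original" rows with accuracy on "simple" and "distraction" rows shows how much each perturbation type costs, while the per-`id` agreement rate captures whether a model changes its answer across variants regardless of whether the answer is correct.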
### Citation Information
```
@misc{ackerman2024novelmetricmeasuringrobustness,
title={A Novel Metric for Measuring the Robustness of Large Language Models in Non-adversarial Scenarios},
author={Samuel Ackerman and Ella Rabinovich and Eitan Farchi and Ateret Anaby-Tavor},
year={2024},
eprint={2408.01963},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2408.01963},
}
```
Provider:
ibm-research



