ibm-research/BoolQ_robustness

Name: ibm-research/BoolQ_robustness
Creator: ibm-research
Published: 2024-08-19 14:56:37
License: MIT

Hugging Face | Updated: 2024-08-19 | Indexed: 2025-04-12

Download link:

https://hf-mirror.com/datasets/ibm-research/BoolQ_robustness


Description:

---
license: mit
task_categories:
- question-answering
language:
- en
---

# Dataset Card for "BoolQ-robustness"

### Dataset Summary

BoolQ-robustness is an expanded version of the BoolQ dataset (https://arxiv.org/abs/1905.10044) but with perturbations of the original input questions and passages. It is intended for use as a benchmark for evaluating model robustness on question-answering to these perturbations.

### Data Instances

#### boolq_robustness

- **Size of downloaded dataset file:** 21.8 MB

### Data Fields

#### boolq_robustness

- `id` (integer): original question grouping ID.
- `question` (string): variant of question from BoolQ.
- `variant_id` (integer): identifier of the variant. 0 indicates it is the original unperturbed question.
- `variant_type` (string): name of the expansion variant type. "original" is the original question; "simple" is a superficial non-semantic perturbation; "distraction" is the insertion of a distraction sentence in the passage, while retaining the original question.
- `answer` (string): the true answer.
- `passage` (string): a passage based on which the question is to be answered.

### Citation Information

```
@misc{ackerman2024novelmetricmeasuringrobustness,
      title={A Novel Metric for Measuring the Robustness of Large Language Models in Non-adversarial Scenarios},
      author={Samuel Ackerman and Ella Rabinovich and Eitan Farchi and Ateret Anaby-Tavor},
      year={2024},
      eprint={2408.01963},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.01963},
}
```
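As a rough illustration of how the fields described in the dataset card might be used, the sketch below groups records by `id` and compares model predictions across variants. The toy records, the `predictions` mapping, and the helper names `accuracy_by_variant_type` and `consistency_rate` are all hypothetical, as is the "True"/"False" answer encoding. The consistency rate shown here is a simple stand-in and is not the metric proposed in the cited paper.

```python
from collections import defaultdict

# Toy records mirroring the documented fields. All values are invented for
# illustration; the real answer encoding is not stated in the card.
# In practice the records would presumably be loaded with something like
# datasets.load_dataset("ibm-research/BoolQ_robustness") (split names unknown).
records = [
    {"id": 0, "variant_id": 0, "variant_type": "original",
     "question": "is the sky blue", "answer": "True", "passage": "The sky is blue."},
    {"id": 0, "variant_id": 1, "variant_type": "simple",
     "question": "Is the sky blue?", "answer": "True", "passage": "The sky is blue."},
    {"id": 0, "variant_id": 2, "variant_type": "distraction",
     "question": "is the sky blue", "answer": "True",
     "passage": "The sky is blue. Cats sleep a lot."},
    {"id": 1, "variant_id": 0, "variant_type": "original",
     "question": "is grass red", "answer": "False", "passage": "Grass is green."},
    {"id": 1, "variant_id": 1, "variant_type": "simple",
     "question": "Is grass red?", "answer": "False", "passage": "Grass is green."},
    {"id": 1, "variant_id": 2, "variant_type": "distraction",
     "question": "is grass red", "answer": "False",
     "passage": "Grass is green. Roses are red."},
]

# Hypothetical model outputs keyed by (id, variant_id).
predictions = {
    (0, 0): "True", (0, 1): "True", (0, 2): "True",
    (1, 0): "False", (1, 1): "False", (1, 2): "True",
}


def accuracy_by_variant_type(records, predictions):
    """Return the fraction of correct predictions for each variant_type."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        key = (rec["id"], rec["variant_id"])
        total[rec["variant_type"]] += 1
        correct[rec["variant_type"]] += int(predictions[key] == rec["answer"])
    return {vtype: correct[vtype] / total[vtype] for vtype in total}


def consistency_rate(records, predictions):
    """Return the fraction of question groups (same id) in which every variant
    receives the same prediction as the original question (variant_id 0)."""
    groups = defaultdict(dict)
    for rec in records:
        groups[rec["id"]][rec["variant_id"]] = predictions[(rec["id"], rec["variant_id"])]
    consistent = sum(
        1 for preds in groups.values()
        if 0 in preds and all(p == preds[0] for p in preds.values())
    )
    return consistent / len(groups)


print(accuracy_by_variant_type(records, predictions))
print(consistency_rate(records, predictions))
```

Keying predictions by `(id, variant_id)` keeps each perturbed question tied to its original, which is what makes a per-group comparison possible.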

Provider:

ibm-research
