
ibm-research/BoolQ_robustness

Source: Hugging Face. Updated 2024-08-19; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/ibm-research/BoolQ_robustness
Resource overview:
---
license: mit
task_categories:
- question-answering
language:
- en
---

# Dataset Card for "BoolQ-robustness"

### Dataset Summary

BoolQ-robustness is an expanded version of the BoolQ dataset (https://arxiv.org/abs/1905.10044) but with perturbations of the original input questions and passages. It is intended for use as a benchmark for evaluating model robustness on question-answering to these perturbations.

### Data Instances

#### boolq_robustness

- **Size of downloaded dataset file:** 21.8 MB

### Data Fields

#### boolq_robustness

- `id` (integer): original question grouping ID.
- `question` (string): variant of question from BoolQ.
- `variant_id` (integer): identifier of the variant. 0 indicates it is the original unperturbed question.
- `variant_type` (string): name of the expansion variant type. "original" is the original question; "simple" is a superficial non-semantic perturbation; "distraction" is the insertion of a distraction sentence in the passage, while retaining the original question.
- `answer` (string): the true answer.
- `passage` (string): a passage based on which the question is to be answered.

### Citation Information

```
@misc{ackerman2024novelmetricmeasuringrobustness,
  title={A Novel Metric for Measuring the Robustness of Large Language Models in Non-adversarial Scenarios},
  author={Samuel Ackerman and Ella Rabinovich and Eitan Farchi and Ateret Anaby-Tavor},
  year={2024},
  eprint={2408.01963},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2408.01963},
}
```
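
### Example: scoring predictions per variant type

The field layout above suggests one simple way to use the benchmark: compare a model's accuracy on "original" rows with its accuracy on "simple" and "distraction" rows. The following is a minimal sketch under stated assumptions, not the metric proposed in the cited paper. The function name `accuracy_by_variant_type`, the sample rows, and the `"True"`/`"False"` answer strings are all hypothetical; the card only states that `answer` is a string. In practice the rows would likely come from the Hugging Face `datasets` library, but the card does not name the configuration or split, so loading is left out here.

```python
from collections import defaultdict


def accuracy_by_variant_type(rows, predictions):
    """Return {variant_type: accuracy} for predictions aligned with rows.

    `rows` is a list of dicts with at least `variant_type` and `answer`;
    `predictions` is a list of predicted answer strings in the same order.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for row, pred in zip(rows, predictions):
        total[row["variant_type"]] += 1
        if pred == row["answer"]:
            correct[row["variant_type"]] += 1
    return {vt: correct[vt] / total[vt] for vt in total}


# Hypothetical rows following the documented fields. The values are made up
# for illustration and are NOT taken from the dataset; the answer encoding
# ("True"/"False") is an assumption.
rows = [
    {"id": 1, "variant_id": 0, "variant_type": "original",
     "question": "is water wet", "answer": "True", "passage": "..."},
    {"id": 1, "variant_id": 1, "variant_type": "simple",
     "question": "Is water wet?", "answer": "True", "passage": "..."},
    {"id": 1, "variant_id": 2, "variant_type": "distraction",
     "question": "is water wet", "answer": "True", "passage": "..."},
    {"id": 2, "variant_id": 0, "variant_type": "original",
     "question": "is fire cold", "answer": "False", "passage": "..."},
]

# Hypothetical model outputs, one per row.
predictions = ["True", "True", "False", "False"]

scores = accuracy_by_variant_type(rows, predictions)
print(scores)
```

A gap between the "original" score and the perturbed scores gives a rough indication of sensitivity to the perturbations. Grouping by `id` instead would allow checking whether all variants of one question receive the same prediction.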
Provider:
ibm-research