stanfordnlp/squad_adversarial
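Dataset Overview
Name: Adversarial Examples for SQuAD
Language: English
License: MIT
Multilinguality: Monolingual
Size: 1K<n<10K
Source dataset: Extended from SQuAD
Task category: Question answering
Task ID: extractive-qa
Dataset configurations:
- squad_adversarial
- AddSent
- AddOneSent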
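The configurations listed above could be loaded roughly as follows. This is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the repository id `stanfordnlp/squad_adversarial` resolves on the Hub; `load_adversarial` is a hypothetical helper introduced for illustration, not part of any library.

```python
# Configuration names copied from the list above.
CONFIGS = ("squad_adversarial", "AddSent", "AddOneSent")
REPO_ID = "stanfordnlp/squad_adversarial"


def load_adversarial(config: str = "AddSent"):
    """Hypothetical helper: load one configuration of the dataset."""
    if config not in CONFIGS:
        raise ValueError(
            f"unknown configuration {config!r}; expected one of {CONFIGS}"
        )
    # Imported lazily so the name check above works without the library.
    from datasets import load_dataset

    # Returns whatever splits the chosen configuration defines.
    return load_dataset(REPO_ID, config)
```

Calling `load_adversarial("AddOneSent")` would download that configuration on first use; depending on the installed `datasets` version, additional arguments may be needed.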
Dataset Structure
Data Instances
{
  "answers": {
    "answer_start": [334, 334, 334],
    "text": ["February 7, 2016", "February 7", "February 7, 2016"]
  },
  "context": "Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24–10 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California. As this was the 50th Super Bowl, the league emphasized the \"golden anniversary\" with various gold-themed initiatives, as well as temporarily suspending the tradition of naming each Super Bowl game with Roman numerals (under which the game would have been known as \"Super Bowl L\"), so that the logo could prominently feature the Arabic numerals 50. The Champ Bowl was played on August 18th,1991.",
  "id": "56bea9923aeaaa14008c91bb-high-conf-turk2",
  "question": "What day was the Super Bowl played on?",
  "title": "Super_Bowl_50"
}
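In the instance above, each entry of `answer_start` pairs with the entry of `text` at the same position. A minimal sketch of how the two relate, assuming the usual SQuAD convention that `answer_start` is a character offset into `context`; the helper name and the toy record are illustrative and not taken from the dataset.

```python
from typing import Any, Dict, List


def answer_spans_match(example: Dict[str, Any]) -> List[bool]:
    """Check, per reference answer, that context[start:start+len(text)] == text."""
    context = example["context"]
    answers = example["answers"]
    return [
        context[start:start + len(text)] == text
        for start, text in zip(answers["answer_start"], answers["text"])
    ]


# Toy record (shortened, not a real dataset row); the offset is computed
# with str.index instead of being hard-coded.
toy_context = "The game was played on February 7, 2016, at Levi's Stadium."
toy_start = toy_context.index("February 7")
toy = {
    "context": toy_context,
    "answers": {
        "answer_start": [toy_start, toy_start],
        "text": ["February 7, 2016", "February 7"],
    },
}

print(answer_spans_match(toy))
```

A check like this is a cheap sanity test to run after any preprocessing that edits the context string.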
Data Fields
{
  "id": Value(dtype="string", id=None),
  "title": Value(dtype="string", id=None),
  "context": Value(dtype="string", id=None),
  "question": Value(dtype="string", id=None),
  "answers": Sequence(
    feature={
      "text": Value(dtype="string", id=None),
      "answer_start": Value(dtype="int32", id=None)
    },
    length=-1,
    id=None
  )
}
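The field listing above can be mirrored as a plain-Python check, which may be handy when building records by hand. This is a sketch only: `schema_errors` is a hypothetical helper, and it assumes `answers` is stored as a dict of two parallel lists, as in the sample instance.

```python
from typing import Any, Dict, List

STRING_FIELDS = ("id", "title", "context", "question")
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def schema_errors(record: Dict[str, Any]) -> List[str]:
    """Return a list of schema problems; an empty list means the record fits."""
    errors = []
    for name in STRING_FIELDS:
        if not isinstance(record.get(name), str):
            errors.append(f"{name}: expected a string")

    answers = record.get("answers")
    if not isinstance(answers, dict):
        return errors + ["answers: expected a dict"]
    texts = answers.get("text")
    starts = answers.get("answer_start")
    if not isinstance(texts, list) or not isinstance(starts, list):
        return errors + ["answers: 'text' and 'answer_start' must be lists"]

    if len(texts) != len(starts):
        errors.append("answers: 'text' and 'answer_start' differ in length")
    if not all(isinstance(t, str) for t in texts):
        errors.append("answers.text: expected strings")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not all(
        isinstance(s, int) and not isinstance(s, bool)
        and INT32_MIN <= s <= INT32_MAX
        for s in starts
    ):
        errors.append("answers.answer_start: expected int32 values")
    return errors
```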
Data Splits
- AddSent: 3,560 examples, 3,803,551 bytes in total.
- AddOneSent: 1,787 examples, 1,864,767 bytes in total.
Dataset Creation
Source Data
- Original data: SQuAD dev set
- Processing: adversarial sentences added
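The processing step can be illustrated with a small sketch. It assumes the distractor is appended at the end of the paragraph, as in the sample instance where the "Champ Bowl" sentence comes last; how distractor sentences are generated is not covered here, so the sentence is supplied by hand. `append_distractor` and `spans_ok` are hypothetical helpers.

```python
import copy
from typing import Any, Dict


def append_distractor(example: Dict[str, Any], distractor: str) -> Dict[str, Any]:
    """Return a copy of the example with a distractor sentence appended."""
    out = copy.deepcopy(example)
    # Appending at the end leaves every earlier character offset unchanged.
    out["context"] = example["context"].rstrip() + " " + distractor
    return out


def spans_ok(example: Dict[str, Any]) -> bool:
    """True if every answer_start still points at its answer text."""
    ctx, ans = example["context"], example["answers"]
    return all(
        ctx[s:s + len(t)] == t
        for s, t in zip(ans["answer_start"], ans["text"])
    )


# Toy record (shortened, not a real dataset row).
ctx = "The game was played on February 7, 2016, at Levi's Stadium."
original = {
    "context": ctx,
    "question": "What day was the Super Bowl played on?",
    "answers": {"answer_start": [ctx.index("February")], "text": ["February 7, 2016"]},
}
attacked = append_distractor(original, "The Champ Bowl was played on August 18th,1991.")
print(spans_ok(attacked))
```

Because the gold answer and its offset are untouched, a drop in a model's score on the modified paragraph can be attributed to the added sentence.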
License Information
- License: MIT License
Citation Information
@inproceedings{jia-liang-2017-adversarial,
    title = "Adversarial Examples for Evaluating Reading Comprehension Systems",
    author = "Jia, Robin and Liang, Percy",
    booktitle = "Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing",
    month = sep,
    year = "2017",
    address = "Copenhagen, Denmark",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D17-1215",
    doi = "10.18653/v1/D17-1215",
    pages = "2021--2031",
    abstract = "Standard accuracy metrics indicate that reading comprehension systems are making rapid progress, but the extent to which these systems truly understand language remains unclear. To reward systems with real language understanding abilities, we propose an adversarial evaluation scheme for the Stanford Question Answering Dataset (SQuAD). Our method tests whether systems can answer questions about paragraphs that contain adversarially inserted sentences, which are automatically generated to distract computer systems without changing the correct answer or misleading humans. In this adversarial setting, the accuracy of sixteen published models drops from an average of 75% F1 score to 36%; when the adversary is allowed to add ungrammatical sequences of words, average accuracy on four models decreases further to 7%. We hope our insights will motivate the development of new models that understand language more precisely.",
}
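The abstract above reports results as F1 scores. A sketch of a token-overlap F1 in the style commonly used for SQuAD v1.1 evaluation follows; whether it matches the exact scoring behind the reported numbers is an assumption, and the function names are illustrative.

```python
import re
import string
from collections import Counter
from typing import List


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def best_f1(prediction: str, references: List[str]) -> float:
    """Score against every reference answer and keep the best match."""
    return max(f1_score(prediction, ref) for ref in references)
```

For the sample question, a model that falls for the distractor and answers "August 18th,1991" shares no tokens with any reference answer and scores zero.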




