swag

Name: swag
Creator: maas
Published: 2025-11-27 16:35:08
License: No description available

ModelScope community; updated 2025-11-27; indexed 2025-05-31

Download link:

https://modelscope.cn/datasets/allenai/swag

Resource description:

# Dataset Card for Situations With Adversarial Generations

## Table of Contents

- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)
  - [Contributions](#contributions)

## Dataset Description

- **Homepage:** [SWAG AF](https://rowanzellers.com/swag/)
- **Repository:** [Github repository](https://github.com/rowanz/swagaf/tree/master/data)
- **Paper:** [SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference](https://arxiv.org/abs/1808.05326)
- **Leaderboard:** [SWAG Leaderboard](https://leaderboard.allenai.org/swag)
- **Point of Contact:** [Rowan Zellers](https://rowanzellers.com/#contact)

### Dataset Summary

Given a partial description like "she opened the hood of the car," humans can reason about the situation and anticipate what might come next ("then, she examined the engine"). SWAG (Situations With Adversarial Generations) is a large-scale dataset for this task of grounded commonsense inference, unifying natural language inference and physically grounded reasoning.

The dataset consists of 113k multiple choice questions about grounded situations (73k training, 20k validation, 20k test). Each question is a video caption from LSMDC or ActivityNet Captions, with four answer choices about what might happen next in the scene. The correct answer is the (real) video caption for the next event in the video; the three incorrect answers are adversarially generated and human verified, so as to fool machines but not humans. SWAG aims to be a benchmark for evaluating grounded commonsense NLI and for learning representations.

### Supported Tasks and Leaderboards

The dataset introduces the task of grounded commonsense inference, unifying natural language inference and commonsense reasoning.

### Languages

The text in the dataset is in English. The associated BCP-47 code is `en`.

## Dataset Structure

### Data Instances

The `regular` configuration should be used for modeling. An example looks like this:

```
{
  "video-id": "anetv_dm5WXFiQZUQ",
  "fold-ind": "18419",
  "startphrase": "He rides the motorcycle down the hall and into the elevator. He",
  "sent1": "He rides the motorcycle down the hall and into the elevator.",
  "sent2": "He",
  "gold-source": "gold",
  "ending0": "looks at a mirror in the mirror as he watches someone walk through a door.",
  "ending1": "stops, listening to a cup of coffee with the seated woman, who's standing.",
  "ending2": "exits the building and rides the motorcycle into a casino where he performs several tricks as people watch.",
  "ending3": "pulls the bag out of his pocket and hands it to someone's grandma.",
  "label": 2
}
```

Note that the test set is reserved for blind submission on the leaderboard. The full train and validation sets provide more information regarding the collection process.

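The instance above can be turned into four complete candidate texts by joining the context with each ending. The following is a minimal sketch, not the authors' preprocessing: the helper name `build_candidates` is hypothetical, and joining the pieces with single spaces is an assumption the card does not prescribe.

```python
# Example record copied from the card (regular configuration).
record = {
    "video-id": "anetv_dm5WXFiQZUQ",
    "fold-ind": "18419",
    "startphrase": "He rides the motorcycle down the hall and into the elevator. He",
    "sent1": "He rides the motorcycle down the hall and into the elevator.",
    "sent2": "He",
    "gold-source": "gold",
    "ending0": "looks at a mirror in the mirror as he watches someone walk through a door.",
    "ending1": "stops, listening to a cup of coffee with the seated woman, who's standing.",
    "ending2": "exits the building and rides the motorcycle into a casino where he performs several tricks as people watch.",
    "ending3": "pulls the bag out of his pocket and hands it to someone's grandma.",
    "label": 2,
}


def build_candidates(rec):
    """Return the four candidate texts: sent1 + sent2 + ending_i.

    Hypothetical helper; single-space joining is an assumption.
    """
    prefix = rec["sent1"] + " " + rec["sent2"]
    return [prefix + " " + rec["ending" + str(i)] for i in range(4)]


candidates = build_candidates(record)
# `label` is the index of the correct ending among ending0..ending3.
gold = candidates[record["label"]]

for i, text in enumerate(candidates):
    marker = "*" if i == record["label"] else " "
    print(marker, i, text)
```

In this record, `startphrase` equals `sent1` and `sent2` joined by a space, so either form can serve as the shared context.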
### Data Fields

- `video-id`: identifier of the source video
- `fold-ind`: fold identifier
- `startphrase`: the context to be completed
- `sent1`: the first sentence
- `sent2`: the start of the second sentence (to be completed)
- `gold-source`: whether the ending was generated or comes from the found completion
- `ending0`: first candidate ending
- `ending1`: second candidate ending
- `ending2`: third candidate ending
- `ending3`: fourth candidate ending
- `label`: index of the correct ending

More information about the fields can be found [on the original repo](https://github.com/rowanz/swagaf/tree/master/data).

### Data Splits

The dataset consists of 113k multiple choice questions about grounded situations: 73k for training, 20k for validation, and 20k for (blind) test.

## Dataset Creation

### Curation Rationale

The authors seek dataset diversity while minimizing annotation artifacts (conditional stylistic patterns such as length and word-preference biases). To avoid introducing easily "gamed" patterns, they introduce Adversarial Filtering (AF), a generally applicable treatment involving the iterative refinement of a set of assignments to increase the entropy under a chosen model family. The dataset is then human verified by paid crowdsourcers.

### Source Data

#### Initial Data Collection and Normalization

The dataset is derived from pairs of consecutive video captions from [ActivityNet Captions](https://cs.stanford.edu/people/ranjaykrishna/densevid/) and the [Large Scale Movie Description Challenge](https://sites.google.com/site/describingmovies/). The two datasets are slightly different in nature and allow us to achieve broader coverage: ActivityNet contains 20k YouTube clips, each containing one of 203 activity types (such as doing gymnastics or playing guitar); LSMDC consists of 128k movie captions (audio descriptions and scripts).

#### Who are the source language producers?

[More Information Needed]

### Annotations

#### Annotation process

Annotations are first machine generated and then adversarially filtered. Finally, the remaining examples are human-verified by paid crowdsourcers.

#### Who are the annotators?

[More Information Needed]

### Personal and Sensitive Information

[More Information Needed]

## Considerations for Using the Data

### Social Impact of Dataset

[More Information Needed]

### Discussion of Biases

[More Information Needed]

### Other Known Limitations

[More Information Needed]

## Additional Information

### Dataset Curators

[More Information Needed]

### Licensing Information

Unknown

### Citation Information

```
@inproceedings{zellers2018swagaf,
    title={SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference},
    author={Zellers, Rowan and Bisk, Yonatan and Schwartz, Roy and Choi, Yejin},
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    year={2018}
}
```

### Contributions

Thanks to [@VictorSanh](https://github.com/VictorSanh) for adding this dataset.
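
Since `label` indexes one of four endings, predictions on the labeled splits reduce to four-way classification, where uniform guessing gets 25%. The sketch below scores predictions under the assumption that accuracy is the metric of interest, which the card does not name. The helpers `predict` and `accuracy` and all toy scores are hypothetical, not part of any SWAG tooling.

```python
from typing import Sequence


def predict(scores: Sequence[float]) -> int:
    """Return the index (0-3) of the highest-scoring ending."""
    if len(scores) != 4:
        raise ValueError("expected one score per ending (4 total)")
    return max(range(4), key=lambda i: scores[i])


def accuracy(all_scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of examples whose top-scoring ending matches `label`."""
    if len(all_scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    correct = sum(predict(s) == y for s, y in zip(all_scores, labels))
    return correct / len(labels)


# Hypothetical per-ending scores from some model, one row per question.
toy_scores = [
    [0.1, 0.2, 0.6, 0.1],    # predicts ending2
    [0.7, 0.1, 0.1, 0.1],    # predicts ending0
    [0.25, 0.25, 0.3, 0.2],  # predicts ending2
]
toy_labels = [2, 0, 1]

toy_accuracy = accuracy(toy_scores, toy_labels)
print(f"accuracy = {toy_accuracy:.3f}")
```

Test-set labels are withheld for blind leaderboard submission, so a check like this applies only to the training and validation splits.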

Provider:

maas

Created:

2025-05-27

Collected summary

Dataset introduction

Background and challenges

Background overview

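SWAG (Situations With Adversarial Generations) is a large-scale dataset focused on commonsense inference, combining natural language inference with physically grounded reasoning. It contains 113k multiple-choice questions built from video captions. The correct answer is the real description of the next event, while the incorrect answers are adversarially generated and human-verified to challenge machine understanding. The dataset is intended as a benchmark for evaluating commonsense reasoning and for learning representations.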

The content above was collected and summarized by 遇见数据集.