nlphuji/winogavil

Name: nlphuji/winogavil
Creator: nlphuji
Published: 2022-11-26 19:56:27
License: CC BY 4.0

Source: Hugging Face. Updated: 2022-11-26. Indexed: 2024-03-04.

Download link:

https://hf-mirror.com/datasets/nlphuji/winogavil

Summary:

WinoGAViL is a challenging dataset for evaluating vision-and-language commonsense reasoning. Given a set of candidate images, a textual cue, and an integer K, the task is to select the K images that best fit the cue. The dataset was collected through the online WinoGAViL game, which is inspired by the popular card game Codenames. It is designed to challenge state-of-the-art vision-and-language models while remaining solvable by human players. Each record contains several fields, including the list of candidate images, the cue, and the images the user selected as associated with the cue. The dataset consists of a single test split, and the number of candidate images varies to provide different difficulty levels. Annotation was carried out by Amazon Mechanical Turk workers. Use of the dataset is subject to the CC BY 4.0 license.
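The selection step described above can be sketched as a top-K choice over per-candidate scores. This is only an illustration under stated assumptions: the function name `select_top_k`, the candidate names, and the score values are hypothetical, and how a model would produce the scores is not specified here.

```python
from typing import Dict, List


def select_top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k candidate names with the highest cue-association scores."""
    if not 0 <= k <= len(scores):
        raise ValueError("k must be between 0 and the number of candidates")
    # Sort by descending score; break ties by name so the result is deterministic.
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:k]


# Hypothetical scores for one instance; a real model would produce these.
example_scores = {
    "image_a": 0.81,
    "image_b": 0.12,
    "image_c": 0.93,
    "image_d": 0.20,
}
selected = select_top_k(example_scores, k=2)
print(selected)  # ['image_c', 'image_a']
```

Because K is given as part of each instance, no score threshold is needed; the ranking alone determines the selection.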

Provider:

nlphuji

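Original information summary

Dataset overview

Name: WinoGAViL

Language: English

License: CC BY 4.0

Dataset size: 10K<n<100K

Data source: original

Multilinguality: monolingual

Tags:

Commonsense reasoning
Visual reasoning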

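Dataset description

WinoGAViL is a challenging dataset for evaluating vision-and-language commonsense reasoning. Given a set of images, a cue, and a number K, the task is to select the K images that are best associated with the cue. The dataset was collected through the online WinoGAViL game, which is inspired by the popular card game Codenames. In the game, a "spymaster" gives a textual cue related to several visual candidates, and another player must identify those candidates. Human players are rewarded for creating associations that are challenging for a rival AI model yet still solvable by other human players.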

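Dataset structure

Data fields

candidates (list): the list of candidate images.
cue (string): the generated cue.
associations (string): the images the user selected as associated with the cue.
score_fool_the_ai (integer): the spymaster's score for fooling the AI.
num_associations (integer): the number of images selected as associated.
num_candidates (integer): the total number of candidates.
solvers_jaccard_mean (float): the mean of the solvers' scores.
solvers_jaccard_std (float): the standard deviation of the solvers' scores.
ID (integer): the association ID.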
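The two solver fields suggest that each solver's selection is scored against the spymaster's associations with the Jaccard index. A minimal sketch of that computation follows. Several things here are assumptions: the helper names `jaccard` and `solver_stats`, the 0 to 1 score scale, the use of the population standard deviation, and the premise that the associations value has already been parsed into a list of names. The image names are made up.

```python
from statistics import mean, pstdev
from typing import Iterable, List, Tuple


def jaccard(predicted: Iterable[str], gold: Iterable[str]) -> float:
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(predicted), set(gold)
    if not a and not b:
        return 1.0  # convention chosen here: two empty selections agree fully
    return len(a & b) / len(a | b)


def solver_stats(solver_selections: List[List[str]], gold: List[str]) -> Tuple[float, float]:
    """Mean and population standard deviation of per-solver Jaccard scores."""
    scores = [jaccard(selection, gold) for selection in solver_selections]
    return mean(scores), pstdev(scores)


# Hypothetical instance: the spymaster associated two images with the cue.
gold_associations = ["image_a", "image_c"]
solver_selections = [
    ["image_a", "image_c"],  # exact match, Jaccard 1.0
    ["image_a", "image_b"],  # one of two correct, Jaccard 1/3
]
jaccard_mean, jaccard_std = solver_stats(solver_selections, gold_associations)
print(round(jaccard_mean, 3), round(jaccard_std, 3))
```

Whether the published values use the sample or the population standard deviation is not stated in the field descriptions, so `pstdev` may need to be swapped for `stdev`.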

Data splits

A single test split.
Different numbers of candidates create different difficulty levels.
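Since difficulty is tied to the number of candidates, one way to analyze the test split is to group records by num_candidates and summarize each group. The sketch below assumes the records are available as plain dictionaries keyed by the field names above. The function name `mean_solver_score_by_level` is hypothetical, and the toy records and their values are invented for illustration, not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def mean_solver_score_by_level(records: List[dict]) -> Dict[int, float]:
    """Average solvers_jaccard_mean for each distinct num_candidates value."""
    by_level: Dict[int, List[float]] = defaultdict(list)
    for record in records:
        by_level[record["num_candidates"]].append(record["solvers_jaccard_mean"])
    # Sort by candidate count so levels are reported from fewest to most candidates.
    return {level: mean(scores) for level, scores in sorted(by_level.items())}


# Toy records with made-up values.
toy_records = [
    {"ID": 1, "num_candidates": 5, "solvers_jaccard_mean": 1.0},
    {"ID": 2, "num_candidates": 5, "solvers_jaccard_mean": 0.5},
    {"ID": 3, "num_candidates": 10, "solvers_jaccard_mean": 0.75},
]
per_level = mean_solver_score_by_level(toy_records)
print(per_level)  # {5: 0.75, 10: 0.75}
```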

Dataset creation

The dataset is inspired by the game Codenames. The data were collected by paying Amazon Mechanical Turk workers to play the game.

Considerations for using the data

All associations were obtained from human annotators.
The dataset is released under the CC BY 4.0 license.

Citation information

@article{bitton2022winogavil,
  title={WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models},
  author={Bitton, Yonatan and Guetta, Nitzan Bitton and Yosef, Ron and Elovici, Yuval and Bansal, Mohit and Stanovsky, Gabriel and Schwartz, Roy},
  journal={arXiv preprint arXiv:2207.12576},
  year={2022}
}
