asuglia/compguesswhat

Name: asuglia/compguesswhat
Creator: asuglia
Published: 2024-02-07 17:39:43
License: No description available

Hugging Face: updated 2024-02-07, indexed 2024-05-25

Download link:

https://hf-mirror.com/datasets/asuglia/compguesswhat

Resource description:

---
annotations_creators:
- machine-generated
language_creators:
- found
language:
- en
license:
- unknown
multilinguality:
- monolingual
size_categories:
- 100K<n<1M
source_datasets:
- extended|other-guesswhat
task_categories:
- visual-question-answering
task_ids:
- visual-question-answering
paperswithcode_id: compguesswhat
pretty_name: CompGuessWhat?!
dataset_info:
- config_name: compguesswhat-original
  features:
  - name: id
    dtype: int32
  - name: target_id
    dtype: int32
  - name: timestamp
    dtype: string
  - name: status
    dtype: string
  - name: image
    struct:
    - name: id
      dtype: int32
    - name: file_name
      dtype: string
    - name: flickr_url
      dtype: string
    - name: coco_url
      dtype: string
    - name: height
      dtype: int32
    - name: width
      dtype: int32
    - name: visual_genome
      struct:
      - name: width
        dtype: int32
      - name: height
        dtype: int32
      - name: url
        dtype: string
      - name: coco_id
        dtype: int32
      - name: flickr_id
        dtype: string
      - name: image_id
        dtype: string
  - name: qas
    sequence:
    - name: question
      dtype: string
    - name: answer
      dtype: string
    - name: id
      dtype: int32
  - name: objects
    sequence:
    - name: id
      dtype: int32
    - name: bbox
      sequence: float32
      length: 4
    - name: category
      dtype: string
    - name: area
      dtype: float32
    - name: category_id
      dtype: int32
    - name: segment
      sequence:
        sequence: float32
  splits:
  - name: train
    num_bytes: 123556580
    num_examples: 46341
  - name: validation
    num_bytes: 25441428
    num_examples: 9738
  - name: test
    num_bytes: 25369227
    num_examples: 9621
  download_size: 105349759
  dataset_size: 174367235
- config_name: compguesswhat-zero_shot
  features:
  - name: id
    dtype: int32
  - name: target_id
    dtype: string
  - name: status
    dtype: string
  - name: image
    struct:
    - name: id
      dtype: int32
    - name: file_name
      dtype: string
    - name: coco_url
      dtype: string
    - name: height
      dtype: int32
    - name: width
      dtype: int32
    - name: license
      dtype: int32
    - name: open_images_id
      dtype: string
    - name: date_captured
      dtype: string
  - name: objects
    sequence:
    - name: id
      dtype: string
    - name: bbox
      sequence: float32
      length: 4
    - name: category
      dtype: string
    - name: area
      dtype: float32
    - name: category_id
      dtype: int32
    - name: IsOccluded
      dtype: int32
    - name: IsTruncated
      dtype: int32
    - name: segment
      sequence:
      - name: MaskPath
        dtype: string
      - name: LabelName
        dtype: string
      - name: BoxID
        dtype: string
      - name: BoxXMin
        dtype: string
      - name: BoxXMax
        dtype: string
      - name: BoxYMin
        dtype: string
      - name: BoxYMax
        dtype: string
      - name: PredictedIoU
        dtype: string
      - name: Clicks
        dtype: string
  splits:
  - name: nd_valid
    num_bytes: 13510589
    num_examples: 5343
  - name: nd_test
    num_bytes: 36228021
    num_examples: 13836
  - name: od_valid
    num_bytes: 14051972
    num_examples: 5372
  - name: od_test
    num_bytes: 32950869
    num_examples: 13300
  download_size: 6548812
  dataset_size: 96741451
configs:
- config_name: compguesswhat-original
  data_files:
  - split: train
    path: compguesswhat-original/train-*
  - split: validation
    path: compguesswhat-original/validation-*
  - split: test
    path: compguesswhat-original/test-*
- config_name: compguesswhat-zero_shot
  data_files:
  - split: nd_valid
    path: compguesswhat-zero_shot/nd_valid-*
  - split: nd_test
    path: compguesswhat-zero_shot/nd_test-*
  - split: od_valid
    path: compguesswhat-zero_shot/od_valid-*
  - split: od_test
    path: compguesswhat-zero_shot/od_test-*
---

# Dataset Card for "compguesswhat"

## Table of Contents

- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)
  - [Contributions](#contributions)

## Dataset Description

- **Homepage:** [https://compguesswhat.github.io/](https://compguesswhat.github.io/)
- **Repository:** [More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
- **Paper:** https://arxiv.org/abs/2006.02174
- **Paper:** https://doi.org/10.18653/v1/2020.acl-main.682
- **Point of Contact:** [Alessandro Suglia](mailto:alessandro.suglia@gmail.com)
- **Size of downloaded dataset files:** 112.05 MB
- **Size of the generated dataset:** 271.11 MB
- **Total amount of disk used:** 383.16 MB

### Dataset Summary

CompGuessWhat?! is an instance of a multi-task framework for evaluating the quality of learned neural representations, in particular concerning attribute grounding. Use this dataset if you want to use the set of games whose reference scene is an image in VisualGenome. Visit the website for more details: https://compguesswhat.github.io

### Supported Tasks and Leaderboards

[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

### Languages

[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

## Dataset Structure

### Data Instances

#### compguesswhat-original

- **Size of downloaded dataset files:** 107.21 MB
- **Size of the generated dataset:** 174.37 MB
- **Total amount of disk used:** 281.57 MB

An example of 'validation' looks as follows.
```
This example was too long and was cropped:

{
    "id": 2424,
    "image": "{\"coco_url\": \"http://mscoco.org/images/270512\", \"file_name\": \"COCO_train2014_000000270512.jpg\", \"flickr_url\": \"http://farm6.stat...",
    "objects": "{\"area\": [1723.5133056640625, 4838.5361328125, 287.44476318359375, 44918.7109375, 3688.09375, 522.1935424804688], \"bbox\": [[5.61...",
    "qas": {
        "answer": ["Yes", "No", "No", "Yes"],
        "id": [4983, 4996, 5006, 5017],
        "question": ["Is it in the foreground?", "Does it have wings?", "Is it a person?", "Is it a vehicle?"]
    },
    "status": "success",
    "target_id": 1197044,
    "timestamp": "2016-07-08 15:07:38"
}
```

#### compguesswhat-zero_shot

- **Size of downloaded dataset files:** 4.84 MB
- **Size of the generated dataset:** 96.74 MB
- **Total amount of disk used:** 101.59 MB

An example of 'nd_valid' looks as follows.

```
This example was too long and was cropped:

{
    "id": 0,
    "image": {
        "coco_url": "https://s3.amazonaws.com/nocaps/val/004e21eb2e686f40.jpg",
        "date_captured": "2018-11-06 11:04:33",
        "file_name": "004e21eb2e686f40.jpg",
        "height": 1024,
        "id": 6,
        "license": 0,
        "open_images_id": "004e21eb2e686f40",
        "width": 768
    },
    "objects": "{\"IsOccluded\": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], \"IsTruncated\": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], \"area\": [3...",
    "status": "incomplete",
    "target_id": "004e21eb2e686f40_30"
}
```

### Data Fields

The data fields are the same among all splits.

#### compguesswhat-original

- `id`: an `int32` feature.
- `target_id`: an `int32` feature.
- `timestamp`: a `string` feature.
- `status`: a `string` feature.
- `image`: a dictionary feature containing:
  - `id`: an `int32` feature.
  - `file_name`: a `string` feature.
  - `flickr_url`: a `string` feature.
  - `coco_url`: a `string` feature.
  - `height`: an `int32` feature.
  - `width`: an `int32` feature.
  - `visual_genome`: a dictionary feature containing:
    - `width`: an `int32` feature.
    - `height`: an `int32` feature.
    - `url`: a `string` feature.
    - `coco_id`: an `int32` feature.
    - `flickr_id`: a `string` feature.
    - `image_id`: a `string` feature.
- `qas`: a dictionary feature containing:
  - `question`: a `string` feature.
  - `answer`: a `string` feature.
  - `id`: an `int32` feature.
- `objects`: a dictionary feature containing:
  - `id`: an `int32` feature.
  - `bbox`: a `list` of `float32` features.
  - `category`: a `string` feature.
  - `area`: a `float32` feature.
  - `category_id`: an `int32` feature.
  - `segment`: a dictionary feature containing:
    - `feature`: a `float32` feature.

#### compguesswhat-zero_shot

- `id`: an `int32` feature.
- `target_id`: a `string` feature.
- `status`: a `string` feature.
- `image`: a dictionary feature containing:
  - `id`: an `int32` feature.
  - `file_name`: a `string` feature.
  - `coco_url`: a `string` feature.
  - `height`: an `int32` feature.
  - `width`: an `int32` feature.
  - `license`: an `int32` feature.
  - `open_images_id`: a `string` feature.
  - `date_captured`: a `string` feature.
- `objects`: a dictionary feature containing:
  - `id`: a `string` feature.
  - `bbox`: a `list` of `float32` features.
  - `category`: a `string` feature.
  - `area`: a `float32` feature.
  - `category_id`: an `int32` feature.
  - `IsOccluded`: an `int32` feature.
  - `IsTruncated`: an `int32` feature.
  - `segment`: a dictionary feature containing:
    - `MaskPath`: a `string` feature.
    - `LabelName`: a `string` feature.
    - `BoxID`: a `string` feature.
    - `BoxXMin`: a `string` feature.
    - `BoxXMax`: a `string` feature.
    - `BoxYMin`: a `string` feature.
    - `BoxYMax`: a `string` feature.
    - `PredictedIoU`: a `string` feature.
    - `Clicks`: a `string` feature.
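
The `qas` field is stored column-wise: one list per sub-field, aligned by position, as in the 'validation' example shown earlier. The sketch below zips those columns back into per-turn rows. The helper name `pair_questions_with_answers` is hypothetical and not part of the dataset or of any library; the sample values are copied from that example.

```python
from typing import Dict, List, Tuple


def pair_questions_with_answers(qas: Dict[str, List]) -> List[Tuple[int, str, str]]:
    """Turn the column-wise `qas` dictionary into (id, question, answer) rows.

    Hypothetical helper: it assumes the three columns are aligned by position.
    """
    ids = qas["id"]
    questions = qas["question"]
    answers = qas["answer"]
    if not (len(ids) == len(questions) == len(answers)):
        raise ValueError("qas columns have different lengths")
    return list(zip(ids, questions, answers))


# Values copied from the 'validation' example in this card.
example_qas = {
    "answer": ["Yes", "No", "No", "Yes"],
    "id": [4983, 4996, 5006, 5017],
    "question": [
        "Is it in the foreground?",
        "Does it have wings?",
        "Is it a person?",
        "Is it a vehicle?",
    ],
}

dialogue = pair_questions_with_answers(example_qas)
for turn_id, question, answer in dialogue:
    print(f"[{turn_id}] {question} -> {answer}")
```

Keeping the length check explicit makes a misaligned record fail loudly instead of silently dropping turns, which is what a bare `zip` would do.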
### Data Splits

#### compguesswhat-original

|                      |train|validation|test|
|----------------------|----:|---------:|---:|
|compguesswhat-original|46341|      9738|9621|

#### compguesswhat-zero_shot

|                       |nd_valid|od_valid|nd_test|od_test|
|-----------------------|-------:|-------:|------:|------:|
|compguesswhat-zero_shot|    5343|    5372|  13836|  13300|

## Dataset Creation

### Curation Rationale

[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

### Source Data

#### Initial Data Collection and Normalization

[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

#### Who are the source language producers?

[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

### Annotations

#### Annotation process

[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

#### Who are the annotators?

[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

### Personal and Sensitive Information

[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

## Considerations for Using the Data

### Social Impact of Dataset

[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

### Discussion of Biases

[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

### Other Known Limitations

[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

## Additional Information

### Dataset Curators

[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

### Licensing Information

[More Information Needed](https://github.com/huggingface/datasets/blob/master/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

### Citation Information

```
@inproceedings{suglia-etal-2020-compguesswhat,
    title = "{C}omp{G}uess{W}hat?!: A Multi-task Evaluation Framework for Grounded Language Learning",
    author = "Suglia, Alessandro and Konstas, Ioannis and Vanzo, Andrea and Bastianelli, Emanuele and Elliott, Desmond and Frank, Stella and Lemon, Oliver",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.682",
    pages = "7625--7641",
    abstract = "Approaches to Grounded Language Learning are commonly focused on a single task-based final performance measure which may not depend on desirable properties of the learned hidden representations, such as their ability to predict object attributes or generalize to unseen situations. To remedy this, we present GroLLA, an evaluation framework for Grounded Language Learning with Attributes based on three sub-tasks: 1) Goal-oriented evaluation; 2) Object attribute prediction evaluation; and 3) Zero-shot evaluation. We also propose a new dataset CompGuessWhat?! as an instance of this framework for evaluating the quality of learned neural representations, in particular with respect to attribute grounding. To this end, we extend the original GuessWhat?! dataset by including a semantic layer on top of the perceptual one. Specifically, we enrich the VisualGenome scene graphs associated with the GuessWhat?! images with several attributes from resources such as VISA and ImSitu. We then compare several hidden state representations from current state-of-the-art approaches to Grounded Language Learning. By using diagnostic classifiers, we show that current models{'} learned representations are not expressive enough to encode object attributes (average F1 of 44.27). In addition, they do not learn strategies nor representations that are robust enough to perform well when novel scenes or objects are involved in gameplay (zero-shot best accuracy 50.06{\%}).",
}
```

### Contributions

Thanks to [@thomwolf](https://github.com/thomwolf), [@aleSuglia](https://github.com/aleSuglia), [@lhoestq](https://github.com/lhoestq) for adding this dataset.

Provider:

asuglia

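Summary of original information

Dataset overview

Basic dataset information

Name: CompGuessWhat?!
Language: English (en)
License: unknown
Multilinguality: monolingual
Size: 100K<n<1M
Source datasets: extended from GuessWhat
Task categories: visual question answering

Dataset structure

Data instances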

compguesswhat-original

Download size: 107.21 MB
Generated dataset size: 174.37 MB
Total disk usage: 281.57 MB

compguesswhat-zero_shot

Download size: 4.84 MB
Generated dataset size: 96.74 MB
Total disk usage: 101.59 MB

Data fields

compguesswhat-original

id: int32
target_id: int32
timestamp: string
status: string
image: struct
- id: int32
- file_name: string
- flickr_url: string
- coco_url: string
- height: int32
- width: int32
- visual_genome: struct
  - width: int32
  - height: int32
  - url: string
  - coco_id: int32
  - flickr_id: string
  - image_id: string
qas: dictionary
- question: string
- answer: string
- id: int32
objects: dictionary
- id: int32
- bbox: list of float32
- category: string
- area: float32
- category_id: int32
- segment: dictionary
  - feature: float32

compguesswhat-zero_shot

id: int32
target_id: string
status: string
image: struct
- id: int32
- file_name: string
- coco_url: string
- height: int32
- width: int32
- license: int32
- open_images_id: string
- date_captured: string
objects: dictionary
- id: string
- bbox: list of float32
- category: string
- area: float32
- category_id: int32
- IsOccluded: int32
- IsTruncated: int32
- segment: dictionary
  - MaskPath: string
  - LabelName: string
  - BoxID: string
  - BoxXMin: string
  - BoxXMax: string
  - BoxYMin: string
  - BoxYMax: string
  - PredictedIoU: string
  - Clicks: string
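
Each game names its target through `target_id`, while the candidate objects are stored column-wise under `objects`. The sketch below looks up the target's row, assuming (the card does not state this explicitly) that `target_id` equals one entry of `objects["id"]`. The function name `find_target_object` and the toy record are hypothetical; the toy values are invented for illustration and are not taken from the dataset.

```python
from typing import Any, Dict, Optional


def find_target_object(game: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the object whose id equals the game's target_id, or None.

    Assumption: `objects` is a dict of equally long lists and `target_id`
    appears in `objects["id"]`.
    """
    objects = game["objects"]
    try:
        index = objects["id"].index(game["target_id"])
    except ValueError:
        return None  # target id not among the listed objects
    # Slice every column at the same position to rebuild one object record.
    return {name: column[index] for name, column in objects.items()}


# Toy record with invented values, shaped like the field list above.
toy_game = {
    "id": 0,
    "target_id": "img_2",
    "objects": {
        "id": ["img_1", "img_2"],
        "category": ["dog", "ball"],
        "bbox": [[0.0, 0.0, 10.0, 10.0], [5.0, 5.0, 2.0, 2.0]],
        "area": [100.0, 4.0],
    },
}

target = find_target_object(toy_game)
print(target)
```

The same lookup should apply to compguesswhat-original, where both `target_id` and the object ids are `int32` rather than strings.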

Data splits

compguesswhat-original

Split	Number of examples
train	46341
validation	9738
test	9621

compguesswhat-zero_shot

Split	Number of examples
nd_valid	5343
od_valid	5372
nd_test	13836
od_test	13300
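
Because the two configurations expose different split names, it can help to check a (configuration, split) pair before requesting it. The sketch below records the split names listed above and wraps the Hugging Face `datasets.load_dataset` call. `CONFIG_SPLITS`, `check_split`, and `load_games` are hypothetical names, and the code assumes the `datasets` package is installed and that the dataset is reachable under the identifier `asuglia/compguesswhat` shown at the top of this page.

```python
from typing import Dict, Tuple

# Split names per configuration, as listed in the split tables.
CONFIG_SPLITS: Dict[str, Tuple[str, ...]] = {
    "compguesswhat-original": ("train", "validation", "test"),
    "compguesswhat-zero_shot": ("nd_valid", "nd_test", "od_valid", "od_test"),
}


def check_split(config_name: str, split: str) -> None:
    """Raise ValueError if the split does not belong to the configuration."""
    if config_name not in CONFIG_SPLITS:
        raise ValueError(f"unknown configuration: {config_name!r}")
    if split not in CONFIG_SPLITS[config_name]:
        raise ValueError(
            f"{split!r} is not a split of {config_name!r}; "
            f"expected one of {CONFIG_SPLITS[config_name]}"
        )


def load_games(config_name: str, split: str):
    """Load one split; needs network access and the `datasets` package."""
    check_split(config_name, split)
    # Imported lazily so the validation helper works without `datasets`.
    from datasets import load_dataset

    return load_dataset("asuglia/compguesswhat", config_name, split=split)
```

For instance, `load_games("compguesswhat-zero_shot", "nd_valid")` would request the 5343-example `nd_valid` split, while `load_games("compguesswhat-original", "nd_valid")` fails fast with a `ValueError` before any download starts.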

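Dataset creation

Annotation process

Annotation creators: machine-generated
Language creators: found

Personal and sensitive information

No specific information provided

Considerations for using the data

Social impact

No specific information provided

Discussion of biases

No specific information provided

Other known limitations

No specific information provided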
