floschne/marvl

Name: floschne/marvl
Creator: floschne
Published: 2024-05-16 09:58:22
License: CC-BY-4.0

Source: Hugging Face. Updated 2024-05-16; indexed 2024-06-12.

Download link:

https://hf-mirror.com/datasets/floschne/marvl

Resource description:

---
language:
- id
- sw
- ta
- tr
- zh
- en
license: cc-by-4.0
size_categories:
- 1K<n<10K
task_categories:
- visual-question-answering
pretty_name: MaRVL
dataset_info:
  features:
  - name: id
    dtype: string
  - name: hypothesis
    dtype: string
  - name: hypo_en
    dtype: string
  - name: language
    dtype: string
  - name: label
    dtype: bool
  - name: chapter
    dtype: string
  - name: concept
    dtype: string
  - name: annotator_info
    struct:
    - name: age
      dtype: int64
    - name: annotator_id
      dtype: string
    - name: country_of_birth
      dtype: string
    - name: country_of_residence
      dtype: string
    - name: gender
      dtype: string
  - name: left_img_id
    dtype: string
  - name: right_img_id
    dtype: string
  - name: left_img
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: 'null'
  - name: right_img
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: 'null'
  - name: resized_left_img
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: 'null'
  - name: resized_right_img
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: 'null'
  - name: vertically_stacked_img
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: 'null'
  - name: horizontally_stacked_img
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: 'null'
  splits:
  - name: id
    num_bytes: 2079196646
    num_examples: 1128
  - name: sw
    num_bytes: 899838181
    num_examples: 1108
  - name: ta
    num_bytes: 801784098
    num_examples: 1242
  - name: tr
    num_bytes: 1373652829
    num_examples: 1180
  - name: zh
    num_bytes: 1193602152
    num_examples: 1012
  download_size: 6234764237
  dataset_size: 6348073906
configs:
- config_name: default
  data_files:
  - split: id
    path: data/id-*
  - split: sw
    path: data/sw-*
  - split: ta
    path: data/ta-*
  - split: tr
    path: data/tr-*
  - split: zh
    path: data/zh-*
---

# MaRVL

### This is a copy from the original repo: https://github.com/marvl-challenge/marvl-code

If you use this dataset, please cite the original authors:

```bibtex
@inproceedings{liu-etal-2021-visually,
    title = "Visually Grounded Reasoning across Languages and Cultures",
    author = "Liu, Fangyu and
      Bugliarello, Emanuele and
      Ponti, Edoardo Maria and
      Reddy, Siva and
      Collier, Nigel and
      Elliott, Desmond",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.818",
    pages = "10467--10485",
}
```

### Additional data

In addition to the data available in the original repo, this dataset contains the following columns:

* `en_translation` --> English translation of the `hypothesis`, created using Bing Translate (the feature list in the metadata above names this column `hypo_en`)
* `left_img` --> PIL Image
* `right_img` --> PIL Image
* `resized_left_img` --> PIL Image resized
* `resized_right_img` --> PIL Image resized
* `vertically_stacked_img` --> PIL Image that contains the left and right resized images stacked vertically with a black gutter of `10px`
* `horizontally_stacked_img` --> PIL Image that contains the left and right resized images stacked horizontally with a black gutter of `10px`

The images were resized using [`img2dataset`](https://github.com/rom1504/img2dataset/blob/main/img2dataset/resizer.py):

<details>
<summary>Show code snippet</summary>

```python
Resizer(
    image_size=640,
    resize_mode=ResizeMode.keep_ratio,
    resize_only_if_bigger=True,
)
```

</details>

### How to read the images

Due to a [bug](https://github.com/huggingface/datasets/issues/4796), the images cannot be stored as `PIL.Image.Image`s directly but need to be converted to `datasets.Image`s. Hence, to load them, this additional step is required:

```python
from datasets import Image, load_dataset

ds = load_dataset("floschne/marvl", split="sw")

# Decode every image column from its {"bytes", "path"} struct into a PIL image.
# batched=True is needed here: each column then arrives as a list of structs,
# which is what the list comprehensions below iterate over.
# The result is assigned back to `ds`, since `map` returns a new dataset.
ds = ds.map(
    lambda sample: {
        "left_img_t": [Image().decode_example(img) for img in sample["left_img"]],
        "right_img_t": [Image().decode_example(img) for img in sample["right_img"]],
        "resized_left_img_t": [
            Image().decode_example(img) for img in sample["resized_left_img"]
        ],
        "resized_right_img_t": [
            Image().decode_example(img) for img in sample["resized_right_img"]
        ],
        "vertically_stacked_img_t": [
            Image().decode_example(img) for img in sample["vertically_stacked_img"]
        ],
        "horizontally_stacked_img_t": [
            Image().decode_example(img) for img in sample["horizontally_stacked_img"]
        ],
    },
    batched=True,
    remove_columns=[
        "left_img",
        "right_img",
        "resized_left_img",
        "resized_right_img",
        "vertically_stacked_img",
        "horizontally_stacked_img",
    ],
).rename_columns(
    {
        # Restore the original column names after decoding.
        "left_img_t": "left_img",
        "right_img_t": "right_img",
        "resized_left_img_t": "resized_left_img",
        "resized_right_img_t": "resized_right_img",
        "vertically_stacked_img_t": "vertically_stacked_img",
        "horizontally_stacked_img_t": "horizontally_stacked_img",
    }
)
```
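The resizing settings and the `10px` gutter described above determine the dimensions of the derived image columns. The following is a rough sketch of that arithmetic, not the code used to build the dataset. It assumes that `keep_ratio` scales the shorter side to `image_size` and that `resize_only_if_bigger=True` leaves images whose shorter side is already at most `image_size` untouched. It also assumes the stacked canvas is as large as the larger image along the non-stacking axis. The helper names `resized_size` and `stacked_size` are hypothetical, and rounding in the actual resizer may differ by a pixel.

```python
IMAGE_SIZE = 640  # image_size from the Resizer call in the card
GUTTER_PX = 10    # black gutter width stated in the card


def resized_size(width, height, image_size=IMAGE_SIZE):
    """Size after a keep-ratio resize that only ever downscales.

    Assumption: the shorter side is scaled to `image_size` when it is larger
    than `image_size`; otherwise the image is returned unchanged.
    """
    shorter = min(width, height)
    if shorter <= image_size:
        return width, height
    scale = image_size / shorter
    return round(width * scale), round(height * scale)


def stacked_size(left, right, direction, gutter=GUTTER_PX):
    """Canvas size for two (width, height) images separated by a gutter.

    Assumption: along the non-stacking axis the canvas matches the larger image.
    """
    (lw, lh), (rw, rh) = left, right
    if direction == "horizontal":
        return lw + gutter + rw, max(lh, rh)
    if direction == "vertical":
        return max(lw, rw), lh + gutter + rh
    raise ValueError(f"unknown direction: {direction!r}")


# Example with made-up input sizes: one large image, one small image.
left = resized_size(1280, 960)
right = resized_size(500, 400)
print(left, right, stacked_size(left, right, "horizontal"))
```

Under these assumptions, a small image passes through unchanged, so the two halves of a stacked image can differ in size.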

Provider:

floschne

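Summary of original information

Dataset overview

Dataset name

MaRVL

Dataset languages

Supported languages: Indonesian (id), Swahili (sw), Tamil (ta), Turkish (tr), Chinese (zh), and English (en).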

License

CC-BY-4.0

Dataset size

Download size: 6234764237 bytes
Dataset size: 6348073906 bytes

Task categories

Visual question answering (visual-question-answering)
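
The per-split byte counts in the card metadata can be cross-checked against the reported dataset size. A minimal sketch using only the numbers listed in that metadata (the `SPLITS` dictionary and the `summarize` helper are illustrative names, not part of the dataset):

```python
# Per-split (num_bytes, num_examples) as listed in the dataset card metadata.
SPLITS = {
    "id": (2079196646, 1128),
    "sw": (899838181, 1108),
    "ta": (801784098, 1242),
    "tr": (1373652829, 1180),
    "zh": (1193602152, 1012),
}
REPORTED_DATASET_SIZE = 6348073906  # bytes, from the card metadata


def summarize(splits):
    """Return total bytes, total examples, and mean bytes per example per split."""
    total_bytes = sum(num_bytes for num_bytes, _ in splits.values())
    total_examples = sum(num_examples for _, num_examples in splits.values())
    mean_bytes = {name: b / n for name, (b, n) in splits.items()}
    return total_bytes, total_examples, mean_bytes


total_bytes, total_examples, mean_bytes = summarize(SPLITS)
print(f"total: {total_bytes} bytes ({total_bytes / 1024**3:.2f} GiB), "
      f"{total_examples} examples")
for name, value in mean_bytes.items():
    print(f"{name}: {value / 1e6:.2f} MB per example on average")
```

The split sizes sum to exactly the reported 6348073906 bytes, and the 5670 examples in total are consistent with the `1K<n<10K` size category.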

Dataset features

Basic features:
- id: string
- hypothesis: string
- hypo_en: string
- language: string
- label: boolean
- chapter: string
- concept: string
Annotator information (`annotator_info` struct):
- age: integer (int64)
- annotator_id: string
- country_of_birth: string
- country_of_residence: string
- gender: string
Image-related features:
- left_img_id: string
- right_img_id: string
- left_img: struct of bytes (binary) and path (null)
- right_img: struct of bytes (binary) and path (null)
- resized_left_img: struct of bytes (binary) and path (null)
- resized_right_img: struct of bytes (binary) and path (null)
- vertically_stacked_img: struct of bytes (binary) and path (null)
- horizontally_stacked_img: struct of bytes (binary) and path (null)
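
The feature list above can be written down as a typed record, which is handy for checking rows in downstream code. This is a sketch only: the class names `ImageStruct`, `AnnotatorInfo`, and `MarvlRow` and the `missing_fields` helper are hypothetical, and the Python types are a plain-Python reading of the listed dtypes rather than anything the dataset itself defines.

```python
from typing import TypedDict


class ImageStruct(TypedDict):
    bytes: bytes  # encoded image file contents (dtype: binary)
    path: None    # dtype 'null' in the metadata, so no path is stored


class AnnotatorInfo(TypedDict):
    age: int
    annotator_id: str
    country_of_birth: str
    country_of_residence: str
    gender: str


class MarvlRow(TypedDict):
    id: str
    hypothesis: str
    hypo_en: str
    language: str
    label: bool
    chapter: str
    concept: str
    annotator_info: AnnotatorInfo
    left_img_id: str
    right_img_id: str
    left_img: ImageStruct
    right_img: ImageStruct
    resized_left_img: ImageStruct
    resized_right_img: ImageStruct
    vertically_stacked_img: ImageStruct
    horizontally_stacked_img: ImageStruct


def missing_fields(row):
    """Return the top-level field names of MarvlRow that `row` lacks."""
    return set(MarvlRow.__annotations__) - set(row)
```

Calling `missing_fields(sample)` on a row loaded from any split should return an empty set if the row matches the listed features.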