
floschne/marvl

Source: Hugging Face. Updated 2024-05-16. Indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/floschne/marvl
Resource description:
---
language:
- id
- sw
- ta
- tr
- zh
- en
license: cc-by-4.0
size_categories:
- 1K<n<10K
task_categories:
- visual-question-answering
pretty_name: MaRVL
dataset_info:
  features:
  - name: id
    dtype: string
  - name: hypothesis
    dtype: string
  - name: hypo_en
    dtype: string
  - name: language
    dtype: string
  - name: label
    dtype: bool
  - name: chapter
    dtype: string
  - name: concept
    dtype: string
  - name: annotator_info
    struct:
    - name: age
      dtype: int64
    - name: annotator_id
      dtype: string
    - name: country_of_birth
      dtype: string
    - name: country_of_residence
      dtype: string
    - name: gender
      dtype: string
  - name: left_img_id
    dtype: string
  - name: right_img_id
    dtype: string
  - name: left_img
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: 'null'
  - name: right_img
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: 'null'
  - name: resized_left_img
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: 'null'
  - name: resized_right_img
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: 'null'
  - name: vertically_stacked_img
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: 'null'
  - name: horizontally_stacked_img
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: 'null'
  splits:
  - name: id
    num_bytes: 2079196646
    num_examples: 1128
  - name: sw
    num_bytes: 899838181
    num_examples: 1108
  - name: ta
    num_bytes: 801784098
    num_examples: 1242
  - name: tr
    num_bytes: 1373652829
    num_examples: 1180
  - name: zh
    num_bytes: 1193602152
    num_examples: 1012
  download_size: 6234764237
  dataset_size: 6348073906
configs:
- config_name: default
  data_files:
  - split: id
    path: data/id-*
  - split: sw
    path: data/sw-*
  - split: ta
    path: data/ta-*
  - split: tr
    path: data/tr-*
  - split: zh
    path: data/zh-*
---

# MaRVL

### This is a copy from the original repo: https://github.com/marvl-challenge/marvl-code

If you use this dataset, please cite the original authors:

```bibtex
@inproceedings{liu-etal-2021-visually,
    title = "Visually Grounded Reasoning across Languages and Cultures",
    author = "Liu, Fangyu and
      Bugliarello, Emanuele and
      Ponti, Edoardo Maria and
      Reddy, Siva and
      Collier, Nigel and
      Elliott, Desmond",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.818",
    pages = "10467--10485",
}
```

### Additional data

In addition to the data available in the original repo, this dataset contains the following columns:

* `en_translation` --> English translation of the `hypothesis` created using Bing Translate (the feature schema above has no `en_translation` column; `hypo_en` appears to be its name there)
* `left_img` --> PIL Image
* `right_img` --> PIL Image
* `resized_left_img` --> PIL Image resized
* `resized_right_img` --> PIL Image resized
* `vertically_stacked_img` --> PIL image that contains the left and right resized images stacked vertically with a black gutter of `10px`
* `horizontally_stacked_img` --> PIL image that contains the left and right resized images stacked horizontally with a black gutter of `10px`

The images were resized using [`img2dataset`](https://github.com/rom1504/img2dataset/blob/main/img2dataset/resizer.py):

<details>
<summary>Show code snippet</summary>

```python
from img2dataset.resizer import ResizeMode, Resizer

Resizer(
    image_size=640,
    resize_mode=ResizeMode.keep_ratio,
    resize_only_if_bigger=True,
)
```

</details>

### How to read the images

Due to a [bug](https://github.com/huggingface/datasets/issues/4796), the images cannot be stored as `PIL.Image.Image`s directly but need to be converted to `datasets.Image`s. Hence, to load them, this additional step is required:

```python
from datasets import Image, load_dataset

# Columns stored as {"bytes": ..., "path": None} structs.
IMAGE_COLUMNS = [
    "left_img",
    "right_img",
    "resized_left_img",
    "resized_right_img",
    "vertically_stacked_img",
    "horizontally_stacked_img",
]

ds = load_dataset("floschne/marvl", split="sw")

# batched=True is required: the lambda iterates over a list of cells per column.
# A small batch_size limits how many decoded images are held in memory at once.
# map() returns a new dataset, so the result is assigned back to ds.
ds = ds.map(
    lambda batch: {
        f"{col}_t": [Image().decode_example(img) for img in batch[col]]
        for col in IMAGE_COLUMNS
    },
    batched=True,
    batch_size=16,
    remove_columns=IMAGE_COLUMNS,
).rename_columns({f"{col}_t": col for col in IMAGE_COLUMNS})
```
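The README describes the two stacked-image columns only in words: the resized left and right images are placed on one canvas with a black gutter of `10px`. As a rough sketch of what that implies for the canvas dimensions, the helper below computes the expected size from the two input sizes. The function name `stacked_canvas_size` is hypothetical, and the assumption that the canvas spans the larger of the two images along the non-stacking axis is not stated in the README.

```python
GUTTER_PX = 10  # gutter width stated in the README


def stacked_canvas_size(left, right, direction, gutter=GUTTER_PX):
    """Return (width, height) of a canvas holding two images and a gutter.

    `left` and `right` are (width, height) tuples of the resized images.
    Assumption: along the non-stacking axis the canvas is as large as the
    larger image; the README does not say how differing sizes are handled.
    """
    (left_w, left_h), (right_w, right_h) = left, right
    if direction == "vertical":
        return max(left_w, right_w), left_h + gutter + right_h
    if direction == "horizontal":
        return left_w + gutter + right_w, max(left_h, right_h)
    raise ValueError(f"unknown direction: {direction!r}")


print(stacked_canvas_size((640, 480), (640, 480), "vertical"))
print(stacked_canvas_size((640, 480), (640, 480), "horizontal"))
```

Comparing this estimate with the `.size` of a decoded `vertically_stacked_img` is one quick way to check the assumption on real rows.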
Provider:
floschne
Summary of original information

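Dataset overview

Dataset name

  • MaRVL

Dataset languages

  • Supported languages: Indonesian (id), Swahili (sw), Tamil (ta), Turkish (tr), Chinese (zh), and English (en).

License

  • CC-BY-4.0

Dataset size

  • Download size: 6234764237 bytes
  • Dataset size: 6348073906 bytes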
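
The raw byte counts are hard to read at a glance. A minimal sketch for converting them to binary gigabytes follows; the helper name `format_gib` is made up for this example, and the two constants are copied from the dataset card.

```python
def format_gib(num_bytes: int) -> str:
    """Render a byte count in binary gigabytes (1 GiB = 2**30 bytes)."""
    return f"{num_bytes / 2**30:.2f} GiB"


DOWNLOAD_SIZE = 6234764237  # download_size from the dataset card
DATASET_SIZE = 6348073906   # dataset_size from the dataset card

print("download size:", format_gib(DOWNLOAD_SIZE))
print("dataset size: ", format_gib(DATASET_SIZE))
```

Both values come out just under 6 GiB, so a full download needs roughly that much free disk space plus room for any decoded copies.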

Task category

  • Visual question answering (visual-question-answering)

Dataset features

  • Basic features
    • id: string
    • hypothesis: string
    • hypo_en: string
    • language: string
    • label: bool
    • chapter: string
    • concept: string
  • Annotator information (annotator_info)
    • age: integer (int64)
    • annotator_id: string
    • country_of_birth: string
    • country_of_residence: string
    • gender: string
  • Image-related features
    • left_img_id: string
    • right_img_id: string
    • left_img: struct with bytes (binary) and path (null)
    • right_img: struct with bytes (binary) and path (null)
    • resized_left_img: struct with bytes (binary) and path (null)
    • resized_right_img: struct with bytes (binary) and path (null)
    • vertically_stacked_img: struct with bytes (binary) and path (null)
    • horizontally_stacked_img: struct with bytes (binary) and path (null)
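
The feature list above can be turned into a small structural check for a single row. The sketch below is only an illustration: `schema_problems` is a hypothetical helper, it treats every field as required and non-null (which the card does not state), and it expects each row as a plain Python `dict` with the image columns still in their undecoded `{"bytes", "path"}` form.

```python
SCALAR_COLUMNS = {
    "id": str, "hypothesis": str, "hypo_en": str, "language": str,
    "label": bool, "chapter": str, "concept": str,
    "left_img_id": str, "right_img_id": str,
}
ANNOTATOR_FIELDS = {
    "age": int, "annotator_id": str, "country_of_birth": str,
    "country_of_residence": str, "gender": str,
}
IMAGE_COLUMNS = (
    "left_img", "right_img", "resized_left_img", "resized_right_img",
    "vertically_stacked_img", "horizontally_stacked_img",
)


def schema_problems(record: dict) -> list:
    """Return the names of fields that do not match the listed types."""
    problems = []
    for name, expected in SCALAR_COLUMNS.items():
        if not isinstance(record.get(name), expected):
            problems.append(name)
    info = record.get("annotator_info")
    if not isinstance(info, dict):
        problems.append("annotator_info")
    else:
        for name, expected in ANNOTATOR_FIELDS.items():
            if not isinstance(info.get(name), expected):
                problems.append(f"annotator_info.{name}")
    for name in IMAGE_COLUMNS:
        cell = record.get(name)
        # Undecoded image cell: raw bytes plus a null path.
        ok = (
            isinstance(cell, dict)
            and isinstance(cell.get("bytes"), bytes)
            and cell.get("path") is None
        )
        if not ok:
            problems.append(name)
    return problems
```

An empty list means the row has the expected shape; any other result names the offending fields, which is handy when a loading step silently changes a column type.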

Dataset splits

  • Split details
    • id: 1128 examples, 2079196646 bytes
    • sw: 1108 examples, 899838181 bytes
    • ta: 1242 examples, 801784098 bytes
    • tr: 1180 examples, 1373652829 bytes
    • zh: 1012 examples, 1193602152 bytes
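
As a consistency check on the figures above, the per-split byte counts add up exactly to the reported dataset size of 6348073906 bytes, and the 5,670 examples in total fall inside the card's `1K<n<10K` size category. A short sketch of that arithmetic, with the numbers copied from the split list:

```python
# (num_examples, num_bytes) per split, copied from the dataset card.
SPLITS = {
    "id": (1128, 2079196646),
    "sw": (1108, 899838181),
    "ta": (1242, 801784098),
    "tr": (1180, 1373652829),
    "zh": (1012, 1193602152),
}
DATASET_SIZE = 6348073906  # dataset_size from the dataset card

total_examples = sum(n for n, _ in SPLITS.values())
total_bytes = sum(b for _, b in SPLITS.values())

for name, (n, b) in SPLITS.items():
    # Average storage per example, in decimal megabytes.
    print(f"{name}: {n} examples, {b / n / 1e6:.2f} MB per example")

print("total examples:", total_examples)
print("bytes match dataset_size:", total_bytes == DATASET_SIZE)
```

The per-example averages differ noticeably between splits, which is worth keeping in mind when choosing a batch size for decoding images.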

Configuration

  • Default configuration
    • Data file paths are split by language, e.g. data/id-*, data/sw-*, and so on.
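
To illustrate how the glob patterns in the default configuration assign files to splits, here is a small sketch using Python's `fnmatch`. It is not how the `datasets` library resolves the patterns internally, and the shard file names in the usage lines are hypothetical; only the patterns themselves come from the card.

```python
from fnmatch import fnmatchcase
from typing import Optional

# split name -> data_files pattern, copied from the dataset card
SPLIT_PATTERNS = {
    "id": "data/id-*",
    "sw": "data/sw-*",
    "ta": "data/ta-*",
    "tr": "data/tr-*",
    "zh": "data/zh-*",
}


def split_for(path: str) -> Optional[str]:
    """Return the split whose pattern matches `path`, or None."""
    for split, pattern in SPLIT_PATTERNS.items():
        if fnmatchcase(path, pattern):
            return split
    return None


# Hypothetical shard names, for illustration only.
print(split_for("data/sw-00000-of-00002.parquet"))
print(split_for("README.md"))
```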

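Image processing

  • Images were resized with the img2dataset tool (image_size=640), keeping the aspect ratio and resizing only when an image is larger than the target size.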
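
The resizing rule can be sketched as a pure function on image dimensions. This is an approximation under stated assumptions, not img2dataset's own code: it assumes that `keep_ratio` scales the shorter side to `image_size` (as the img2dataset documentation describes that mode), that "bigger" is judged on the shorter side, and that results are rounded to the nearest pixel. The function name `keep_ratio_size` is hypothetical.

```python
def keep_ratio_size(width, height, image_size=640, only_if_bigger=True):
    """Estimate the output (width, height) of an aspect-preserving resize.

    Assumptions: the shorter side is scaled to `image_size`, and with
    `only_if_bigger` an image whose shorter side is already at most
    `image_size` is left unchanged. Rounding is also an assumption.
    """
    shorter = min(width, height)
    if only_if_bigger and shorter <= image_size:
        return width, height
    return (
        round(width * image_size / shorter),
        round(height * image_size / shorter),
    )


print(keep_ratio_size(1920, 1280))  # larger than the target, gets scaled down
print(keep_ratio_size(500, 400))    # smaller than the target, left as is
```

Under these assumptions the resized columns never have a shorter side above 640 pixels, while images that were small to begin with keep their original dimensions.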

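Image loading

  • Because of a bug in the datasets library (huggingface/datasets issue 4796), the image columns are stored as raw bytes and must be decoded with Image().decode_example after loading; see the code example in the README.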
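Dataset introduction
Background overview
This is a multilingual visual question answering dataset with text and images in Indonesian, Swahili, Tamil, Turkish, and Chinese. Each image pair is provided in several forms (the original images, resized images, and vertically and horizontally stacked images), which makes the dataset convenient for multimodal research.
The above content was collected and summarized by 遇见数据集.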