neulab/PangeaBench-maxm

Name: neulab/PangeaBench-maxm
Creator: neulab
Published: 2024-10-31 20:29:56
License: No description available

Source: Hugging Face. Updated: 2024-10-31. Indexed: 2025-04-12.

Download link:

https://hf-mirror.com/datasets/neulab/PangeaBench-maxm

Description:

---
language:
- en
- fr
- hi
- ro
- th
- he
- zh
size_categories:
- 1K<n<10K
task_categories:
- visual-question-answering
pretty_name: MaXM
dataset_info:
  features:
  - name: image_id
    dtype: string
  - name: image_url
    dtype: string
  - name: image
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: string
  - name: image_locale
    dtype: string
  - name: image_captions
    sequence: string
  - name: question_id
    dtype: string
  - name: question
    dtype: string
  - name: answers
    sequence: string
  - name: processed_answers
    sequence: string
  - name: language
    dtype: string
  - name: is_collection
    dtype: bool
  - name: method
    dtype: string
  splits:
  - name: hi
    num_bytes: 23640810
    num_examples: 260
  - name: th
    num_bytes: 23960076
    num_examples: 268
  - name: zh
    num_bytes: 24634226
    num_examples: 277
  - name: fr
    num_bytes: 23188830
    num_examples: 264
  - name: en
    num_bytes: 23067651
    num_examples: 257
  - name: iw
    num_bytes: 25044532
    num_examples: 280
  - name: ro
    num_bytes: 26229952
    num_examples: 284
  download_size: 106887693
  dataset_size: 169766077
configs:
- config_name: default
  data_files:
  - split: hi
    path: data/hi-*
  - split: th
    path: data/th-*
  - split: zh
    path: data/zh-*
  - split: fr
    path: data/fr-*
  - split: en
    path: data/en-*
  - split: iw
    path: data/iw-*
  - split: ro
    path: data/ro-*
---

# MaXM

### This is a clone of the MaXM dataset by Google LLC ("Google")!

Please find the original repository here: https://github.com/google-research-datasets/maxm

If you use this dataset, please cite the original authors:

```bibtex
@inproceedings{changpinyo2023maxm,
  title = {{MaXM}: Towards Multilingual Visual Question Answering},
  author = {Changpinyo, Soravit and Xue, Linting and Yarom, Michal and Thapliyal, Ashish V. and Szpektor, Idan and Amelot, Julien and Chen, Xi and Soricut, Radu},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP},
  year = {2023},
}
```

### It additionally contains the captions and image locales from the respective XM3600 images.
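The per-split figures in the metadata can be cross-checked against the declared `dataset_size` and `size_categories`. The following is a minimal sketch that uses only numbers copied from the card; the names `SPLITS`, `DECLARED_DATASET_SIZE`, and `summarize` are illustrative and not part of the dataset or any library.

```python
# Split sizes copied from the dataset_info block of the card.
SPLITS = {
    "hi": {"num_bytes": 23640810, "num_examples": 260},
    "th": {"num_bytes": 23960076, "num_examples": 268},
    "zh": {"num_bytes": 24634226, "num_examples": 277},
    "fr": {"num_bytes": 23188830, "num_examples": 264},
    "en": {"num_bytes": 23067651, "num_examples": 257},
    "iw": {"num_bytes": 25044532, "num_examples": 280},
    "ro": {"num_bytes": 26229952, "num_examples": 284},
}

# dataset_size as declared in the card.
DECLARED_DATASET_SIZE = 169766077


def summarize(splits):
    """Return (total number of examples, total number of bytes) over all splits."""
    total_examples = sum(s["num_examples"] for s in splits.values())
    total_bytes = sum(s["num_bytes"] for s in splits.values())
    return total_examples, total_bytes


total_examples, total_bytes = summarize(SPLITS)
print(f"examples: {total_examples}, bytes: {total_bytes}")
print("matches declared dataset_size:", total_bytes == DECLARED_DATASET_SIZE)
```

The seven splits add up to 1,890 question-answer examples, which falls inside the declared `1K<n<10K` size category, and the byte counts sum to the declared `dataset_size`.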
### How to read the image

Due to a [bug](https://github.com/huggingface/datasets/issues/4796), the images cannot be stored as `PIL.Image.Image` objects directly but need to be converted to `datasets.Image` features. Hence, to load them, this additional step is required:

```python
from datasets import Image, load_dataset

ds = load_dataset("floschne/maxm", split="en")

# Decode each stored {"bytes", "path"} struct into a PIL image.
# batched=True makes sample["image"] a list of structs, which the list
# comprehension expects; map() returns a new dataset, so reassign ds.
ds = ds.map(
    lambda sample: {
        "image_t": [Image().decode_example(img) for img in sample["image"]],
    },
    remove_columns=["image"],
    batched=True,
).rename_columns({"image_t": "image"})
```
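One detail of the metadata is easy to trip over: Hebrew is listed as `he` under `language`, while the corresponding split is named `iw`, the legacy ISO 639-1 code for the same language. The sketch below maps a language code to its split name before loading. The helper names `split_for_language` and `load_maxm_split` are hypothetical, and the default `repo_id` is taken from this page's dataset name on the assumption that the dataset can be loaded under that identifier.

```python
# Split names as declared in the card's configs block.
AVAILABLE_SPLITS = ("hi", "th", "zh", "fr", "en", "iw", "ro")

# Language codes whose split name differs from the code itself.
# "iw" is the legacy code for Hebrew ("he").
LANGUAGE_TO_SPLIT = {"he": "iw"}


def split_for_language(code: str) -> str:
    """Map a language code from the card to the split that holds its data."""
    split = LANGUAGE_TO_SPLIT.get(code, code)
    if split not in AVAILABLE_SPLITS:
        raise ValueError(f"no split for language code {code!r}")
    return split


def load_maxm_split(code: str, repo_id: str = "neulab/PangeaBench-maxm"):
    """Load one language split (hypothetical helper; repo_id is an assumption)."""
    # Imported lazily so the mapping above works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id, split=split_for_language(code))
```

With this in place, `load_maxm_split("he")` would request the `iw` split, and an unsupported code such as `"de"` raises a `ValueError` before any download is attempted.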

