neulab/PangeaBench-maxm
Source: Hugging Face. Updated: 2024-10-31. Indexed: 2025-04-12.
Download link: https://hf-mirror.com/datasets/neulab/PangeaBench-maxm
Description:
---
language:
- en
- fr
- hi
- ro
- th
- he
- zh
size_categories:
- 1K<n<10K
task_categories:
- visual-question-answering
pretty_name: MaXM
dataset_info:
  features:
  - name: image_id
    dtype: string
  - name: image_url
    dtype: string
  - name: image
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: string
  - name: image_locale
    dtype: string
  - name: image_captions
    sequence: string
  - name: question_id
    dtype: string
  - name: question
    dtype: string
  - name: answers
    sequence: string
  - name: processed_answers
    sequence: string
  - name: language
    dtype: string
  - name: is_collection
    dtype: bool
  - name: method
    dtype: string
  splits:
  - name: hi
    num_bytes: 23640810
    num_examples: 260
  - name: th
    num_bytes: 23960076
    num_examples: 268
  - name: zh
    num_bytes: 24634226
    num_examples: 277
  - name: fr
    num_bytes: 23188830
    num_examples: 264
  - name: en
    num_bytes: 23067651
    num_examples: 257
  - name: iw
    num_bytes: 25044532
    num_examples: 280
  - name: ro
    num_bytes: 26229952
    num_examples: 284
  download_size: 106887693
  dataset_size: 169766077
configs:
- config_name: default
  data_files:
  - split: hi
    path: data/hi-*
  - split: th
    path: data/th-*
  - split: zh
    path: data/zh-*
  - split: fr
    path: data/fr-*
  - split: en
    path: data/en-*
  - split: iw
    path: data/iw-*
  - split: ro
    path: data/ro-*
---
# MaXM
### This is a clone of the MaXM dataset by Google LLC ("Google")!
Please find the original repository here: https://github.com/google-research-datasets/maxm
If you use this dataset, please cite the original authors:
```bibtex
@inproceedings{changpinyo2023maxm,
title = {{MaXM}: Towards Multilingual Visual Question Answering},
author = {Changpinyo, Soravit and Xue, Linting and Yarom, Michal and Thapliyal, Ashish V. and Szpektor, Idan and Amelot, Julien and Chen, Xi and Soricut, Radu},
booktitle={Findings of the Association for Computational Linguistics: EMNLP},
year = {2023},
}
```
### It additionally contains the captions and image locales from the respective XM3600 images.
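The `image_locale` and `image_captions` columns are present in every split, and each split is selected by its language code. One detail in the metadata is easy to miss: Hebrew is listed as `he` under `language`, while the split itself is named `iw` (the older ISO 639-1 code for Hebrew). The sketch below is a minimal illustration of resolving a language code to a split name and totalling the per-split example counts from the card. The helper `resolve_split` and the alias table are hypothetical conveniences, not part of the dataset or of any library.

```python
# Per-split example counts, copied from the dataset card metadata.
SPLIT_EXAMPLES = {
    "hi": 260, "th": 268, "zh": 277, "fr": 264,
    "en": 257, "iw": 280, "ro": 284,
}

# Assumption for illustration: accept "he" as an alias for the "iw" split,
# because the card lists Hebrew as "he" but names the split "iw".
SPLIT_ALIASES = {"he": "iw"}


def resolve_split(language_code: str) -> str:
    """Map a language code to the split name used in the card (hypothetical helper)."""
    split = SPLIT_ALIASES.get(language_code, language_code)
    if split not in SPLIT_EXAMPLES:
        raise KeyError(f"no split for language code {language_code!r}")
    return split


# Sum of all splits; falls inside the declared size category 1K<n<10K.
total_examples = sum(SPLIT_EXAMPLES.values())
print(resolve_split("he"), total_examples)  # prints: iw 1890
```

The resolved name is what would be passed as `split=` when loading the dataset.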
### How to read the image
Due to a [bug](https://github.com/huggingface/datasets/issues/4796), the images cannot be stored as `PIL.Image.Image` objects directly and need to be converted to `datasets.Image`. Hence, loading them requires this additional step:
```python
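from datasets import Image, load_dataset

# Repository ID as given in the original card; this page's copy is neulab/PangeaBench-maxm.
ds = load_dataset("floschne/maxm", split="en")

# batched=True makes sample["image"] a list of {"bytes", "path"} structs,
# which is what the list comprehension below iterates over.
ds = ds.map(
    lambda sample: {
        "image_t": [Image().decode_example(img) for img in sample["image"]],
    },
    batched=True,
    remove_columns=["image"],
).rename_columns({"image_t": "image"})  # map returns a new dataset, so keep the result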
```
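According to the metadata, each `image` value is a struct with a binary `bytes` field and a string `path` field. As a rough sketch of what that struct holds, the raw bytes can be inspected with the standard library alone, for example to check the encoding before decoding. The helper name `sniff_image_format` and the synthetic `example` struct are hypothetical and for illustration only; which encodings the dataset actually uses is not stated in the card.

```python
from typing import Optional

# File signatures (magic bytes) for two common image encodings.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def sniff_image_format(image_struct: dict) -> Optional[str]:
    """Guess the encoding of an {"bytes", "path"} struct from its leading bytes."""
    data = image_struct.get("bytes") or b""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None  # unknown or empty payload


# Synthetic struct (not taken from the dataset): a PNG signature plus padding.
example = {"bytes": PNG_SIGNATURE + b"\x00" * 8, "path": None}
print(sniff_image_format(example))  # prints: png
```

With Pillow installed, the same bytes can also be opened by wrapping them in `io.BytesIO` and passing the result to `PIL.Image.open`.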
Provider:
neulab



