
neulab/PangeaBench-maxm

Source: Hugging Face. Updated 2024-10-31. Indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/neulab/PangeaBench-maxm
Resource description:
---
language:
- en
- fr
- hi
- ro
- th
- he
- zh
size_categories:
- 1K<n<10K
task_categories:
- visual-question-answering
pretty_name: MaXM
dataset_info:
  features:
  - name: image_id
    dtype: string
  - name: image_url
    dtype: string
  - name: image
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: string
  - name: image_locale
    dtype: string
  - name: image_captions
    sequence: string
  - name: question_id
    dtype: string
  - name: question
    dtype: string
  - name: answers
    sequence: string
  - name: processed_answers
    sequence: string
  - name: language
    dtype: string
  - name: is_collection
    dtype: bool
  - name: method
    dtype: string
  splits:
  - name: hi
    num_bytes: 23640810
    num_examples: 260
  - name: th
    num_bytes: 23960076
    num_examples: 268
  - name: zh
    num_bytes: 24634226
    num_examples: 277
  - name: fr
    num_bytes: 23188830
    num_examples: 264
  - name: en
    num_bytes: 23067651
    num_examples: 257
  - name: iw
    num_bytes: 25044532
    num_examples: 280
  - name: ro
    num_bytes: 26229952
    num_examples: 284
  download_size: 106887693
  dataset_size: 169766077
configs:
- config_name: default
  data_files:
  - split: hi
    path: data/hi-*
  - split: th
    path: data/th-*
  - split: zh
    path: data/zh-*
  - split: fr
    path: data/fr-*
  - split: en
    path: data/en-*
  - split: iw
    path: data/iw-*
  - split: ro
    path: data/ro-*
---

# MaXM

### This is a clone of the MaXM dataset by Google LLC ("Google")!

Please find the original repository here: https://github.com/google-research-datasets/maxm

If you use this dataset, please cite the original authors:

```bibtex
@inproceedings{changpinyo2023maxm,
  title = {{MaXM}: Towards Multilingual Visual Question Answering},
  author = {Changpinyo, Soravit and Xue, Linting and Yarom, Michal and Thapliyal, Ashish V. and Szpektor, Idan and Amelot, Julien and Chen, Xi and Soricut, Radu},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP},
  year = {2023},
}
```

### It additionally contains the captions and image locales from the respective XM3600 images.
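The split figures in the card header can be cross-checked with a few lines of Python. The sketch below copies `num_bytes` and `num_examples` from the card into a plain dict and confirms that they add up to the declared `dataset_size` and fall within the `1K<n<10K` size category. The names `SPLITS`, `DECLARED_DATASET_SIZE`, and `summarize` are illustrative and not part of any library. Note that the language list spells Hebrew `he`, while the corresponding split uses the legacy code `iw`.

```python
# Split metadata copied from the dataset card header:
# split name -> (num_bytes, num_examples).
SPLITS = {
    "hi": (23640810, 260),
    "th": (23960076, 268),
    "zh": (24634226, 277),
    "fr": (23188830, 264),
    "en": (23067651, 257),
    "iw": (25044532, 280),  # Hebrew; the language list spells it "he"
    "ro": (26229952, 284),
}
DECLARED_DATASET_SIZE = 169766077  # dataset_size from the card


def summarize(splits):
    """Return (total_bytes, total_examples) summed over all splits."""
    total_bytes = sum(num_bytes for num_bytes, _ in splits.values())
    total_examples = sum(num_examples for _, num_examples in splits.values())
    return total_bytes, total_examples


total_bytes, total_examples = summarize(SPLITS)
print(total_bytes == DECLARED_DATASET_SIZE)  # True
print(total_examples)                        # 1890
```

The seven splits hold 1,890 question-answer examples in total, and the per-split byte counts sum exactly to `dataset_size`.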
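### How to read the image

Due to a [bug](https://github.com/huggingface/datasets/issues/4796), the images cannot be stored as `PIL.Image.Image` objects directly but need to be converted to `datasets.Image`. Hence, to load them, this additional step is required:

```python
from datasets import Image, load_dataset

ds = load_dataset("floschne/maxm", split="en")

# Decode every {bytes, path} struct into a PIL image. batched=True is needed
# because the list comprehension iterates over a batch of image structs.
# The result is assigned back to ds; map() does not modify the dataset in place.
ds = ds.map(
    lambda batch: {
        "image_t": [Image().decode_example(img) for img in batch["image"]],
    },
    batched=True,
    remove_columns=["image"],
).rename_columns({"image_t": "image"})
```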
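For inspecting a single record without remapping the whole split, the `image` struct can also be opened by hand. The following is a minimal sketch, assuming the `bytes` field holds an encoded image file (for example PNG or JPEG) and that `path` is only a fallback when `bytes` is empty. The helper names `open_image_stream` and `to_pil` are hypothetical and do not belong to `datasets` or Pillow.

```python
import io
from typing import Any, BinaryIO, Mapping


def open_image_stream(image: Mapping[str, Any]) -> BinaryIO:
    """Return a binary stream for one `image` struct, preferring bytes over path."""
    data = image.get("bytes")
    if data is not None:
        return io.BytesIO(data)
    path = image.get("path")
    if path:
        return open(path, "rb")
    raise ValueError("image struct has neither 'bytes' nor 'path'")


def to_pil(image: Mapping[str, Any]):
    """Decode one `image` struct into a PIL image (requires Pillow)."""
    from PIL import Image as PILImage  # imported lazily; Pillow is an extra dependency

    with open_image_stream(image) as stream:
        img = PILImage.open(stream)
        img.load()  # read the pixel data before the stream is closed
    return img
```

Assuming an undecoded record exposes `image` as a dict with `bytes` and `path` keys, `to_pil(record["image"])` would return the decoded picture for that record.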

Provided by:
neulab