
m-WildVision

ModelScope community, updated 2026-01-08, indexed 2025-03-08
Download link:
https://modelscope.cn/datasets/CohereForAI/m-WildVision
Resource description:
## Dataset Card for m-WildVision

<img src="mwildvision2.png" width="650" style="margin-left:'auto' margin-right:'auto' display:'block'"/>

### Dataset Details

The m-WildVision dataset is a multilingual multimodal LLM evaluation set covering **23 languages**. It was created by translating prompts from the original English-only [WildVision (vision_bench_0617)](https://arxiv.org/abs/2406.11069) test set.

The original prompts, developed by [Lu et al. (2024)](https://arxiv.org/abs/2406.11069), consist of 500 challenging user queries sourced from the WildVision-Arena platform. The authors demonstrated that these prompts enable automatic LLM judge evaluations, which strongly correlate with WildVision-Arena rankings.

### Languages

To ensure multilingual coverage, the non-English portion of the dataset was generated by translating the English subset into 22 additional languages using Google Translate API v3. The dataset spans a range of language families and scripts, so that model generalizability and robustness can be evaluated across them.

The languages included are: Arabic (arb_Arab), Chinese (zho_Hans), Czech (ces_Latn), Dutch (nld_Latn), English (eng_Latn), French (fra_Latn), German (deu_Latn), Greek (ell_Grek), Hebrew (heb_Hebr), Hindi (hin_Deva), Indonesian (ind_Latn), Italian (ita_Latn), Japanese (jpn_Jpan), Korean (kor_Hang), Persian (fas_Arab), Polish (pol_Latn), Portuguese (por_Latn), Romanian (ron_Latn), Russian (rus_Cyrl), Spanish (spa_Latn), Turkish (tur_Latn), Ukrainian (ukr_Cyrl), and Vietnamese (vie_Latn).

By incorporating languages from different families and scripts, this benchmark enables a **comprehensive assessment of vision-language models**, particularly their ability to generalize across diverse languages.
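The language codes listed above appear to follow a `language_Script` pattern (a three-letter language code followed by a four-letter script code). As a minimal sketch under that assumption, the snippet below groups the 23 codes by script to make the script coverage explicit. The names `LANGUAGE_CODES`, `group_by_script`, and `SCRIPTS` are illustrative and not part of the dataset.

```python
from collections import defaultdict

# The 23 subset codes exactly as listed in the dataset card.
LANGUAGE_CODES = [
    "arb_Arab", "zho_Hans", "ces_Latn", "nld_Latn", "eng_Latn", "fra_Latn",
    "deu_Latn", "ell_Grek", "heb_Hebr", "hin_Deva", "ind_Latn", "ita_Latn",
    "jpn_Jpan", "kor_Hang", "fas_Arab", "pol_Latn", "por_Latn", "ron_Latn",
    "rus_Cyrl", "spa_Latn", "tur_Latn", "ukr_Cyrl", "vie_Latn",
]


def group_by_script(codes):
    """Map each script code to the language codes written in it.

    Assumes every code has the form '<language>_<Script>'.
    """
    groups = defaultdict(list)
    for code in codes:
        language, script = code.split("_")
        groups[script].append(language)
    return dict(groups)


SCRIPTS = group_by_script(LANGUAGE_CODES)

# Print one line per script, for example the Latin-script languages.
for script, languages in sorted(SCRIPTS.items()):
    print(f"{script}: {', '.join(languages)}")
```

A grouping like this can help when reporting results per script rather than per language, although the card itself does not prescribe any particular aggregation.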
### Load with Datasets

To load this dataset with the Datasets library, install it with `pip install datasets --upgrade` and then use the following code:

```python
from datasets import load_dataset

# The second argument selects the language subset (here: English).
dataset = load_dataset("CohereLabs/m-WildVision", "eng_Latn")
```

The code above loads only the English subset of the dataset. You can load other subsets by specifying another supported language code, or the entire dataset by leaving that argument blank.

### Dataset Structure

An instance of the data from the English subset looks as follows:

```python
{
    'question_id': 'a711a80b19c040c2a98364b5e181b020',
    'language': 'eng_Latn',
    'question': 'How many workers are working in the construction site? Are all of them wearing the safety equipments? If no, who are not wearing them?',
    'image': [PIL.Image],
}
```

### Dataset Fields

The following are the fields in the dataset:

- question_id: a unique ID for the example
- language: the language of the sample, indicating the subset to which it belongs
- instruction: text of the prompt (question or instruction); the example instance above shows this text under the key `question`
- image: the raw image data in .jpg format

All language subsets of the dataset share the same fields as above.

### Authorship

- Publishing Organization: [Cohere Labs](https://cohere.com/research)
- Industry Type: Not-for-profit - Tech
- Contact Details: https://cohere.com/research/aya

### Licensing Information

This dataset can be used for any purpose, whether academic or commercial, under the terms of the Apache 2.0 License.
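Returning to the loading instructions above: when only a few languages are needed, one option is to load the subsets one by one and keep them in a dictionary keyed by language code. The following is a minimal sketch, not an official loader. `load_subsets` and its `loader` parameter are hypothetical names introduced here; only `load_dataset` and the repository ID come from the card. Whether omitting the subset name loads everything depends on how the repository defines its configurations, so this sketch always passes an explicit code.

```python
from typing import Any, Callable, Dict, Iterable, Optional

REPO_ID = "CohereLabs/m-WildVision"  # repository ID as given in the card


def load_subsets(
    codes: Iterable[str],
    loader: Optional[Callable[[str, str], Any]] = None,
) -> Dict[str, Any]:
    """Load one subset per language code and return them keyed by code.

    `loader` is a hypothetical hook (called as loader(repo_id, code)) that
    defaults to `datasets.load_dataset`; passing a stub makes the helper
    usable without network access.
    """
    if loader is None:
        # Imported lazily so the module can be imported without `datasets`.
        from datasets import load_dataset
        loader = load_dataset
    return {code: loader(REPO_ID, code) for code in codes}
```

A call such as `load_subsets(["eng_Latn", "fra_Latn"])` would then download and return the English and French subsets, assuming both codes are valid subset names in the repository.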

Provider:
maas
Created:
2025-03-05