
MBZUAI/ALM-Bench

Source: Hugging Face | Updated: 2025-02-28 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/MBZUAI/ALM-Bench
Description:
The All Languages Matter Benchmark (ALM-Bench) is a dataset for evaluating Large Multimodal Models (LMMs) across 100 languages, with a particular focus on low-resource languages and cultural diversity. It contains 22.7K high-quality question-answer pairs spanning 100 languages and 24 scripts. Each entry includes the image, language, category, question type, the English question and answer, the translated question and answer, and the image URL. The question formats are diverse (multiple-choice, true/false, short-answer, and long-answer), so the benchmark tests models' visual and linguistic reasoning from several angles. The dataset also covers 13 distinct cultural aspects, ranging from traditions and rituals to famous personalities and celebrations, emphasizing cultural and linguistic inclusivity and encouraging the development of models that can effectively serve diverse global populations.
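
The entry structure described above can be sketched in Python as follows. This is an illustrative sketch only: the field names in `Record` are assumptions derived from the description, not the dataset's confirmed column names, and the exact-match scoring in `accuracy_by` is a simple stand-in rather than the benchmark's official metric. The two sample records are invented toy data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical field names based on the description; real columns may differ.
    language: str
    category: str
    question_type: str
    english_question: str
    english_answer: str
    translated_question: str
    translated_answer: str
    image_url: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def accuracy_by(records, predictions, key):
    """Exact-match accuracy grouped by a Record attribute, e.g. 'language'."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record, prediction in zip(records, predictions):
        group = getattr(record, key)
        total[group] += 1
        if normalize(prediction) == normalize(record.translated_answer):
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Toy records, invented for illustration (not taken from the dataset).
records = [
    Record("French", "Festivals", "true/false",
           "Is this a festival?", "Yes",
           "Est-ce un festival ?", "Oui",
           "https://example.com/1.jpg"),
    Record("Spanish", "Food", "true/false",
           "Is this a dessert?", "Yes",
           "¿Es esto un postre?", "Sí",
           "https://example.com/2.jpg"),
]
predictions = ["  oui ", "No"]  # hypothetical model outputs

scores = accuracy_by(records, predictions, "language")
print(scores)
```

Passing "question_type" or "category" as the key gives the same breakdown per question format or cultural aspect. Exact match is only reasonable for the closed formats (multiple-choice, true/false); short-answer and long-answer questions would need a more lenient scoring method.
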
Provided by:
MBZUAI