vector-institute/HumaniBench
Source: Hugging Face | Updated: 2026-04-29 | Indexed: 2026-01-03
Download link:
https://hf-mirror.com/datasets/vector-institute/HumaniBench
Description:
HumaniBench is a human-centric benchmark for evaluating large multimodal models (LMMs). It consists of more than 32,000 image-question pairs across 7 tasks, covering open and closed visual question answering (VQA), multilingual QA, visual grounding, empathetic captioning, robustness, reasoning, and ethics. Each example is first annotated with a GPT-4o draft and then verified by experts to ensure quality and alignment. The dataset supports eight languages (English, Tamil, Urdu, Spanish, Persian, Portuguese, Korean, and French) and is intended for benchmarking multimodal models, studying robustness, and evaluating multilingual reasoning.
Provided by:
vector-institute



