CII-Bench
Source: arXiv. Indexed 2025-09-30.
Download link:
https://huggingface.co/datasets/m-a-p/cii-bench
Description:
CII-Bench is a benchmark for evaluating how well Multimodal Large Language Models (MLLMs) perceive and understand the higher-order meaning of Chinese images, especially images that incorporate elements of traditional Chinese culture. It includes a large collection of images depicting traditional Chinese culture, manually curated to ensure their authenticity. The benchmark reveals a significant performance gap between humans and MLLMs in understanding cultural nuances. The evaluation covers a range of MLLMs with parameter counts from 7 billion to 100 billion. Its core task is to assess the ability of MLLMs to comprehend Chinese visual content and its cultural implications.



