CII-Bench

Source: arXiv; indexed 2025-09-30

Download link:

https://huggingface.co/datasets/m-a-p/cii-bench

Overview:

CII-Bench is a benchmark for evaluating the higher-order perception and understanding capabilities of multimodal large language models (MLLMs) on Chinese images, particularly images that incorporate elements of traditional Chinese culture. It contains a large collection of images depicting traditional Chinese culture, each manually curated to ensure authenticity. The benchmark reveals a significant performance gap between humans and MLLMs in understanding cultural nuances. The evaluation covers a variety of MLLMs with parameter counts ranging from 7 billion to 100 billion. Its core purpose is to assess how well MLLMs comprehend Chinese visual content and its cultural implications.
