"ChartNexus"
Source: DataCite Commons | Updated: 2026-02-28 | Indexed: 2026-05-03
Download link:
https://ieee-dataport.org/documents/chartnexus
Description:
"While Multimodal Large Language Models (MLLMs) achieve remarkable success on single-chart question-answering tasks, reaching over 90% accuracy on benchmarks like PlotQA, this apparent success masks a critical limitation in complex, real-world analytical scenarios. To bridge this gap, ChartNexus introduces a novel, highly challenging benchmark specifically designed to evaluate the multi-chart reasoning capabilities of MLLMs within authentic document contexts. Unlike existing single-chart datasets, ChartNexus requires complex cross-modal synthesis between visual elements and surrounding text, reflecting how users integrate information in professional workflows. It comprises 1,370 high-quality, human-annotated question-answering pairs derived from 6,793 real-world charts spanning 18 domains, such as scientific papers, government reports, and industry analyses. The benchmark utilizes a comprehensive taxonomy featuring 4 high-level difficulty categories and 11 fine-grained sub-categories to systematically evaluate skills like comparative analysis and sequential information integration. Extensive evaluation of 23 state-of-the-art MLLMs reveals significant performance degradation, with top models dropping by more than half compared to simpler tasks. Through systematic failure analysis, ChartNexus exposes critical weaknesses in current models' working memory and contextual integration capabilities, highlighting its value as a rigorous diagnostic tool for advancing next-generation multi-modal understanding."
Provider:
IEEE DataPort
Created:
2026-02-28



