ChartNexus

DataCite Commons | Updated 2026-02-28 | Indexed 2026-05-03
Download link:
https://ieee-dataport.org/documents/chartnexus
Description:
While Multimodal Large Language Models (MLLMs) achieve remarkable success on single-chart question-answering tasks, reaching over 90% accuracy on benchmarks like PlotQA, this apparent success masks a critical limitation in complex, real-world analytical scenarios. To bridge this gap, ChartNexus introduces a novel, highly challenging benchmark specifically designed to evaluate the multi-chart reasoning capabilities of MLLMs within authentic document contexts. Unlike existing single-chart datasets, ChartNexus requires complex cross-modal synthesis between visual elements and surrounding text, reflecting how users integrate information in professional workflows. It comprises 1,370 high-quality, human-annotated question-answering pairs derived from 6,793 real-world charts spanning 18 domains, such as scientific papers, government reports, and industry analyses. The benchmark utilizes a comprehensive taxonomy featuring 4 high-level difficulty categories and 11 fine-grained sub-categories to systematically evaluate skills like comparative analysis and sequential information integration. Extensive evaluation of 23 state-of-the-art MLLMs reveals significant performance degradation, with top models dropping by more than half compared to simpler tasks. Through systematic failure analysis, ChartNexus exposes critical weaknesses in current models' working memory and contextual integration capabilities, highlighting its value as a rigorous diagnostic tool for advancing next-generation multi-modal understanding.
Provider:
IEEE DataPort
Created:
2026-02-28