Name: ckchaos/ChartDiff
Creator: ckchaos
Published: 2026-04-01 13:44:18
License: No description provided

Download link:

https://hf-mirror.com/datasets/ckchaos/ChartDiff

Resource description:

---
dataset: ChartDiff
license: cc-by-4.0
task_categories:
- summarization
- image-text-to-text
- image-to-text
- tabular-to-text
pretty_name: ChartDiff
configs:
- config_name: default
  data_files:
  - split: train
    path: train/metadata.json
  - split: validation
    path: validation/metadata.json
  - split: test
    path: test/metadata.json
---

# ChartDiff: A Large-Scale Benchmark for Comprehending Pairs of Charts

[![Project Page](https://img.shields.io/badge/Project-Page-blue)](https://ckchaos.github.io/ChartDiff) [![arXiv](https://img.shields.io/badge/arXiv-2603.28902-brightgreen)](https://arxiv.org/abs/2603.28902)

## Overview

**ChartDiff** is a large-scale benchmark for **cross-chart comparative summarization**, designed to evaluate whether vision-language models can identify differences and generate coherent comparative descriptions across pairs of charts.

Unlike existing chart understanding datasets that emphasize single-chart interpretation, ChartDiff requires models to compare **two charts jointly** and generate a **concise, structured summary of their differences**, including:

- Overall trends
- Local fluctuations
- Notable anomalies

## Dataset Structure

The dataset is organized into three splits:

```
ChartDiff/
├── train/
├── validation/
└── test/
```

Each split contains:

- `metadata.json`: data information
- `{PAIR_ID}/`: a directory per pair, containing the associated chart images and their underlying CSV data

## Data Format

Each entry in `metadata.json` follows:

```json
{
  "id": "00000",
  "chart_A": "00000/00000_A.png",
  "chart_B": "00000/00000_B.png",
  "csv_A": "00000/00000_A.csv",
  "csv_B": "00000/00000_B.csv",
  "annotation": "......",
  "chart_type": "pie",
  "plotting_lib": "plotly"
}
```

### Field Description

| Field        | Description                               |
| ------------ | ----------------------------------------- |
| id           | Unique identifier for each chart pair     |
| chart_A      | Path to chart A image                     |
| chart_B      | Path to chart B image                     |
| csv_A        | Underlying data for chart A               |
| csv_B        | Underlying data for chart B               |
| annotation   | Reference comparison summary              |
| chart_type   | Type of both chart A and chart B          |
| plotting_lib | Library for rendering chart A and chart B |

## Citation

If you use ChartDiff, please cite:

```bibtex
@misc{ye2026chartdiff,
  title={ChartDiff: A Large-Scale Benchmark for Comprehending Pairs of Charts},
  author={Rongtian Ye},
  year={2026},
  eprint={2603.28902},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2603.28902},
}
```
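
As a rough illustration of the `metadata.json` format described above, the following Python sketch reads one split and resolves the image and CSV paths of each pair. It rests on two assumptions that the README does not state: that `metadata.json` holds a single JSON array of entries, and that the `chart_*` and `csv_*` paths are relative to the split directory. The helper name `load_pairs` is hypothetical and not part of the dataset.

```python
import json
from pathlib import Path


def load_pairs(split_dir):
    """Read metadata.json from a split directory and resolve file paths.

    Assumption: metadata.json is a JSON array of entries, and the chart/CSV
    paths in each entry are relative to the split directory.
    """
    split_dir = Path(split_dir)
    with open(split_dir / "metadata.json", encoding="utf-8") as f:
        entries = json.load(f)

    pairs = []
    for entry in entries:
        # Copy the entry so id, annotation, chart_type, plotting_lib are kept.
        pair = dict(entry)
        # Turn the four relative paths into paths under the split directory.
        for key in ("chart_A", "chart_B", "csv_A", "csv_B"):
            pair[key] = split_dir / entry[key]
        pairs.append(pair)
    return pairs
```

With a local copy of the dataset, `load_pairs("ChartDiff/train")` would return one dictionary per chart pair, with `annotation` as the reference summary and the four path fields pointing at the files to open.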

Application scenarios: