khhuang/chartve_dataset
Source: Hugging Face. Updated 2024-02-18; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/khhuang/chartve_dataset
Resource overview:
---
language:
- en
license: apache-2.0
multilinguality:
- monolingual
size_categories:
- 100K<n<1M
tags:
- chart
- plot
- chart-to-text
- vistext
- statista
- pew
- chart-visual-entailment
- chart-understanding
- chart-captioning
- chart-summarization
- document-image
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: dev
    path: data/dev-*
dataset_info:
  features:
  - name: image
    dtype: string
  - name: sentence
    dtype: string
  - name: label
    dtype: string
  - name: manipulation_type
    dtype: string
  - name: dataset
    dtype: string
  splits:
  - name: train
    num_bytes: 118229163.0
    num_examples: 522531
  - name: dev
    num_bytes: 9400046.0
    num_examples: 36002
  download_size: 51634467
  dataset_size: 127629209.0
---
# Dataset Card for ChartVE's Training Data
- [Dataset Description](https://huggingface.co/datasets/khhuang/ChartVE/blob/main/README.md#dataset-description)
- [Paper Information](https://huggingface.co/datasets/khhuang/ChartVE/blob/main/README.md#paper-information)
- [Citation](https://huggingface.co/datasets/khhuang/ChartVE/blob/main/README.md#citation)
## Dataset Description
[ChartVE](https://huggingface.co/khhuang/chartve) (Chart Visual Entailment) is a visual entailment model introduced in the paper "Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning" for evaluating the factuality of a generated caption sentence with regard to the input chart. The model takes in a chart figure and a caption sentence as input, and outputs an entailment probability. This repository hosts the training and validation data for ChartVE.
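As a rough illustration of how the hosted training and validation data might be accessed, the sketch below records the split statistics from this card's metadata and wraps the Hugging Face `datasets` loader. The repository id and the split names (`train`, `dev`) come from this card; the helper names `total` and `load_chartve_split` are hypothetical, and the loader assumes the `datasets` library is installed and the Hub is reachable.

```python
# Sketch only: the repository id and split names come from this card;
# the helper names below are hypothetical.
REPO_ID = "khhuang/chartve_dataset"

# Split statistics copied from the card's metadata.
SPLITS = {
    "train": {"num_bytes": 118229163, "num_examples": 522531},
    "dev": {"num_bytes": 9400046, "num_examples": 36002},
}
DATASET_SIZE = 127629209  # dataset_size from the metadata, in bytes


def total(field: str) -> int:
    """Sum one statistic over all splits."""
    return sum(stats[field] for stats in SPLITS.values())


def load_chartve_split(split: str):
    """Load one split from the Hub (needs network access and `datasets`)."""
    if split not in SPLITS:
        raise ValueError(
            f"unknown split {split!r}; expected one of {sorted(SPLITS)}"
        )
    # Imported lazily so the rest of this sketch runs without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


# The per-split byte counts add up to the reported dataset size.
assert total("num_bytes") == DATASET_SIZE
```

Calling `load_chartve_split("dev")` should return the validation split. Note that the `image` feature is declared as a string, so the chart images themselves are not part of the loaded rows.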
### Fields
The fields in each instance are described below.
- `image`: The path to the chart image. Images can be found in [images.zip](https://huggingface.co/datasets/khhuang/chartve_dataset/blob/main/images.zip).
- `sentence`: The sentence used as the _hypothesis_.
- `label`: An indicator of whether the chart entails the given `sentence`.
- `manipulation_type`: The type of perturbation that alters the original sentence (applicable only to non-entailment instances).
- `dataset`: The source dataset of the chart `image`.
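The field list above can be pictured as a simple record type. The following is a minimal sketch, assuming that `image` is a path relative to the directory where the image archive was extracted (the card does not confirm this). The names `ChartVEInstance`, `resolve_image_path`, and `count_by` are hypothetical, and the sample rows hold placeholder values, not real data from the dataset.

```python
from collections import Counter
from pathlib import Path
from typing import Iterable, TypedDict


class ChartVEInstance(TypedDict):
    """One row, mirroring the five string fields listed in this card."""
    image: str
    sentence: str
    label: str
    manipulation_type: str
    dataset: str


def resolve_image_path(instance: ChartVEInstance, images_root: str) -> Path:
    """Join the row's image path onto the extraction directory.

    Assumption: `image` is relative to where the archive was unpacked.
    """
    return Path(images_root) / instance["image"]


def count_by(instances: Iterable[ChartVEInstance], field: str) -> Counter:
    """Count rows per distinct value of one field (e.g. `label`)."""
    return Counter(instance[field] for instance in instances)


# Placeholder rows for illustration only; these values are NOT real data.
SAMPLE_ROWS = [
    ChartVEInstance(
        image="example_dir/example_chart.png",
        sentence="Placeholder hypothesis sentence.",
        label="PLACEHOLDER_LABEL_A",
        manipulation_type="",
        dataset="PLACEHOLDER_SOURCE",
    ),
    ChartVEInstance(
        image="example_dir/example_chart.png",
        sentence="Placeholder perturbed sentence.",
        label="PLACEHOLDER_LABEL_B",
        manipulation_type="PLACEHOLDER_MANIPULATION",
        dataset="PLACEHOLDER_SOURCE",
    ),
]

label_counts = count_by(SAMPLE_ROWS, "label")
first_image = resolve_image_path(SAMPLE_ROWS[0], "images")
```

Counting by `label` or `manipulation_type` on the real rows is a quick way to see which values those string fields actually take before relying on any particular encoding.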
## Paper Information
- Paper: https://arxiv.org/abs/2312.10160
- Code: https://github.com/khuangaf/CHOCOLATE/
- Project: https://khuangaf.github.io/CHOCOLATE
## Citation
If you use the **ChartVE** dataset/model in your work, please cite the paper using this BibTeX:
```
@misc{huang-etal-2023-do,
title = "Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning",
author = "Huang, Kung-Hsiang and
Zhou, Mingyang and
Chan, Hou Pong and
Fung, Yi R. and
Wang, Zhenhailong and
Zhang, Lingyu and
Chang, Shih-Fu and
Ji, Heng",
year={2023},
eprint={2312.10160},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
Provider:
khhuang



