Chart-to-text
Source: arXiv; updated 2022-04-14; indexed 2024-06-21
Download link:
https://github.com/visnlp/Chart-to-text
Overview:
Chart-to-text is a large-scale benchmark for chart summarization. It consists of two sub-datasets totaling 44,096 charts and covers a wide range of topics and chart types, including bar charts, line charts, and pie charts. The charts were crawled from public sources such as Statista and Pew Research, and the research team classified and annotated them during construction to ensure quality and diversity. The benchmark primarily targets automatic chart summarization, which helps users quickly extract the key information in a chart. It can also help visually impaired readers understand chart content through screen readers, and it can improve information retrieval for documents that contain charts.
Provided by:
York University, Canada; Nanyang Technological University, Singapore
Created:
2022-03-13



