ChartSumm
Source: arXiv | Updated: 2023-06-11 | Indexed: 2024-06-21
Download link:
https://github.com/pranonrahman/ChartSumm
Description:
ChartSumm is a large-scale dataset created by the Department of Computer Science and Engineering at the Islamic University of Technology. It contains 84,363 charts with their metadata and descriptions, spanning a broad range of chart types and topics (including economics, politics, and the social sciences), and supports generating both short and long summaries. Its stated goals are to assist visually impaired readers and to improve the performance of information retrieval algorithms. The data was crawled from the Knoema and Statista websites, then filtered and classified to ensure quality and diversity. The dataset is intended primarily for automatic chart summarization, where it addresses the shortage of resources in existing datasets.
Provider:
Department of Computer Science and Engineering, Islamic University of Technology, Bangladesh
Created:
2023-04-26



