ds4sd/SynthChartNet
Source: Hugging Face · Updated: 2025-07-15 · Indexed: 2025-10-25
Download link:
https://hf-mirror.com/datasets/ds4sd/SynthChartNet
Description:
SynthChartNet is a multimodal dataset designed for training the SmolDocling model on chart-based document understanding tasks. It consists of 1,981,157 synthetically generated samples. Each sample pairs an image of a chart (e.g., line chart, bar chart, pie chart, stacked bar chart) with its ground truth in OTSL format. Charts were rendered at 120 DPI using three visualization libraries (Matplotlib, Seaborn, and Pyecharts), which provides visual variety in layout, style, and color scheme.
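To illustrate what working with OTSL ground truth might look like, here is a minimal sketch that turns an OTSL string into a list of rows. It is not taken from the dataset's documentation. The token names (`<fcel>`, `<ecel>`, `<ched>`, `<rhed>`, `<srow>`, `<nl>`) are assumed from the DocTags convention associated with SmolDocling, the function `parse_otsl` is a hypothetical helper, and the sample string is made up. Check the tokens against real samples before relying on it.

```python
import re

# ASSUMPTION: tokens that open a cell, following the DocTags-style OTSL vocabulary.
# The exact vocabulary used in SynthChartNet should be verified on real samples.
CELL_TOKENS = {"fcel", "ecel", "ched", "rhed", "srow"}

# Matches an opening token such as <fcel> plus the text that follows it,
# up to the next "<". Cell text containing "<" is not handled by this sketch.
TOKEN_RE = re.compile(r"<(\w+)>([^<]*)")


def parse_otsl(otsl: str) -> list[list[str]]:
    """Convert an OTSL string into a list of rows of cell texts (hypothetical helper)."""
    rows: list[list[str]] = []
    current: list[str] = []
    for tag, text in TOKEN_RE.findall(otsl):
        if tag == "nl":
            # <nl> closes the current row.
            rows.append(current)
            current = []
        elif tag in CELL_TOKENS:
            current.append(text.strip())
        # Any other token (e.g. span markers or wrapper tags) is ignored here.
    if current:
        rows.append(current)
    return rows


# Made-up sample, not an actual record from the dataset.
sample = "<ched>Year<ched>Sales<nl><fcel>2020<fcel>10<nl><fcel>2021<ecel><nl>"
table = parse_otsl(sample)
print(table)
```

The sketch deliberately drops span information (merged cells), which is usually enough for recovering the data series behind a simple chart but not for reconstructing complex table layouts.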
Provider:
ds4sd



