docling-project/SynthChartNet
Source: Hugging Face. Updated: 2025-07-15. Indexed: 2025-11-15.
Download link:
https://hf-mirror.com/datasets/docling-project/SynthChartNet
Description:
SynthChartNet is a multimodal dataset designed for training the SmolDocling model on chart-based document understanding tasks. It consists of 1,981,157 synthetically generated samples, each pairing a chart image (e.g., line chart, bar chart, pie chart, stacked bar chart) with its ground truth in OTSL format. Charts were rendered at 120 DPI using several visualization libraries (Matplotlib, Seaborn, and Pyecharts) to ensure visual variability in layout, style, and color schemes.
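To make the "ground truth in OTSL format" idea more concrete, here is a minimal sketch of how a chart's underlying data table might be flattened into an OTSL-style token string. The function name `table_to_otsl` and the sample data are hypothetical, and the token names (`<ched>`, `<fcel>`, `<ecel>`, `<nl>`) are an assumption based on the OTSL-derived table vocabulary; the exact serialization used in SynthChartNet may differ.

```python
def table_to_otsl(header, rows):
    """Serialize a small table into an OTSL-style token string.

    Illustrative only: the token names are assumed, not taken from the dataset.
    """
    def cell(tag, value):
        text = "" if value is None else str(value)
        # Empty cells get a dedicated token and carry no text.
        return "<ecel>" if text == "" else f"<{tag}>{text}"

    # Header cells use a column-header token; each row ends with a newline token.
    parts = ["".join(cell("ched", h) for h in header) + "<nl>"]
    for row in rows:
        parts.append("".join(cell("fcel", v) for v in row) + "<nl>")
    return "".join(parts)


# Hypothetical data behind a two-bar chart, with one missing value.
header = ["Quarter", "Revenue"]
rows = [["Q1", 120], ["Q2", None]]
otsl = table_to_otsl(header, rows)
print(otsl)
```

A flat token sequence like this lets a vision-language model emit the table structure and cell contents of a chart as ordinary text, which is presumably why a table-oriented format is used for chart ground truth.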
Provider:
docling-project



