
docling-project/SynthChartNet

Source: Hugging Face. Updated 2025-07-15. Indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/docling-project/SynthChartNet
Description:

SynthChartNet is a multimodal dataset designed for training the SmolDocling model on chart-based document understanding tasks. It consists of 1,981,157 synthetically generated samples, each depicting a chart (e.g., line chart, bar chart, pie chart, stacked bar chart) along with the associated ground truth in OTSL format. Charts were rendered at 120 DPI using a diverse set of visualization libraries: Matplotlib, Seaborn, and Pyecharts, ensuring visual variability in layout, style, and color schemes.
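To illustrate what an OTSL-style ground truth for a chart's underlying data table might look like, here is a minimal sketch in Python. It is not the dataset's actual generation code: the function name `table_to_otsl` is hypothetical, and the tag spellings (`<fcel>` for a filled cell, `<ecel>` for an empty cell, `<nl>` for end of row) are an assumption based on the OTSL vocabulary used in the Docling project. The exact serialization stored in SynthChartNet may differ.

```python
from typing import Optional, Sequence, Union

CellValue = Optional[Union[str, int, float]]


def table_to_otsl(header: Sequence[str], rows: Sequence[Sequence[CellValue]]) -> str:
    """Serialize a rectangular table into an OTSL-style token string.

    Assumed tags (not confirmed by the dataset card):
      <fcel>  a cell that carries content
      <ecel>  an empty cell
      <nl>    end of a table row
    """

    def cell(value: CellValue) -> str:
        # Treat None and the empty string as an empty cell.
        if value is None or str(value) == "":
            return "<ecel>"
        return f"<fcel>{value}"

    parts = ["".join(cell(h) for h in header) + "<nl>"]
    for row in rows:
        # OTSL describes a rectangular grid, so every row must match the header width.
        if len(row) != len(header):
            raise ValueError("row length does not match header length")
        parts.append("".join(cell(v) for v in row) + "<nl>")
    return "".join(parts)
```

For example, calling `table_to_otsl(["Year", "Sales"], [[2020, 10], [2021, None]])` yields one token sequence per row, with the missing value in the last row encoded as an empty cell.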
Provider:
docling-project