SynSlide
Source: arXiv (indexed 2025-09-30)
Download link:
https://synslidegen.github.io/
Description:
SynSlide is a synthetic slide dataset generated by a dedicated pipeline. It consists of two subsets, SynDet and SynRet. SynDet targets document layout analysis and provides automatically generated bounding-box annotations for 16 element categories. SynRet targets text-based slide retrieval and contains thematically coherent slide images paired with two types of summaries. The dataset contains 2,200 high-quality slide images and is intended to support both document layout analysis and text-based slide retrieval.



