Float Detection Dataset (FLD)
Source: ModelScope community (魔搭社区). Updated: 2026-01-08. Indexed: 2024-10-26.
Download link:
https://modelscope.cn/datasets/irhawks/floating-det
Resource description:
Floats are a common type of page element in formal publications such as academic papers and books. In LaTeX, a float is a container that can hold text, images, tables, code, algorithms, and similar content, and whose position LaTeX can adjust automatically to fit the page layout. To make floats easier to index and read, they usually carry a type, a number, and a caption in addition to the main body (image, table, code block, or algorithm block). Existing Document Layout Analysis (DLA) tasks detect only a very limited number of element kinds, and tables, images, and similar elements are usually processed separately, which makes fine-grained layout analysis difficult. To address this, two tasks are added on top of existing layout analysis: float location detection and float structure analysis. With the DocGenome data as a reference, arXiv documents were collected and annotated separately with X-AnyLabel, producing the Float Detection Dataset (FLD) and the Float Structure Analysis Dataset (FSA), each containing 600 images.
Provider:
maas
Created:
2024-10-19
Collected summary
Dataset introduction

Background and challenges
Background overview
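The Float Detection Dataset (FLD) addresses the limitations of float detection in document layout analysis. It adds a Float Location Detection (FLD) task, which detects the position and type of each float in a document image and covers five float types. The dataset is built from arXiv documents annotated with X-AnyLabel and contains 600 images.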
The summary above was collected and generated by 遇见数据集.
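For readers who want to work with float location annotations, the following is a minimal Python sketch of turning one annotation record into typed bounding boxes. It assumes a LabelMe-style JSON layout (a `shapes` list whose entries have `label`, `points`, and `shape_type`), which is common for annotation tools of this kind; the actual file format of this dataset is not stated in the description and may differ. The `FloatBox` class, the `parse_annotation` function, and the label names in the sample record are hypothetical and used only for illustration.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class FloatBox:
    """One annotated float: its type label and an axis-aligned bounding box."""
    label: str    # float type as written by the annotator (names here are hypothetical)
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_annotation(text: str) -> List[FloatBox]:
    """Extract rectangle annotations from a LabelMe-style JSON string (assumed format)."""
    record = json.loads(text)
    boxes = []
    for shape in record.get("shapes", []):
        # Only rectangles are treated as float locations in this sketch.
        if shape.get("shape_type") != "rectangle":
            continue
        xs = [point[0] for point in shape["points"]]
        ys = [point[1] for point in shape["points"]]
        boxes.append(FloatBox(shape["label"], min(xs), min(ys), max(xs), max(ys)))
    return boxes


# Hypothetical sample record; the label names are placeholders, not the dataset's classes.
sample = json.dumps({
    "imagePath": "page_0001.png",
    "imageWidth": 612,
    "imageHeight": 792,
    "shapes": [
        {"label": "figure", "shape_type": "rectangle",
         "points": [[50.0, 80.0], [400.0, 320.0]]},
        {"label": "note", "shape_type": "polygon",
         "points": [[10.0, 10.0], [20.0, 10.0], [15.0, 30.0]]},
    ],
})

boxes = parse_annotation(sample)
for box in boxes:
    print(box.label, box.x_min, box.y_min, box.x_max, box.y_max)
```

Taking the minimum and maximum of the point coordinates keeps the sketch independent of the order in which the two rectangle corners were recorded.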



