vojtam/BiblioPage
Source: Hugging Face. Updated: 2025-10-22. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/vojtam/BiblioPage
Description:
The BiblioPage dataset is a collection of 2,118 scanned title pages from Czech libraries (1485–21st century), annotated with structured bibliographic metadata. The dataset serves as a benchmark for bibliographic metadata extraction, document understanding, and visual language model evaluation. Annotations include both text and precise bounding boxes, and evaluation results are available for YOLO, DETR, and VLLMs (GPT-4o, LLaMA 3).
Provider:
vojtam



