davanstrien/ency-test
Hugging Face · Updated 2025-10-08 · Indexed 2025-10-25
Download link:
https://hf-mirror.com/datasets/davanstrien/ency-test
Description:
This dataset contains page-level document records. Its fields include the document ID, page number, file identifier, page image, text content, ALTO XML, flags indicating whether a page has an image or ALTO data, document metadata, edition, volume part, publication year, editor, full title, and shelf locator. The dataset provides a training split and can be used for document analysis and processing tasks.
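As a rough illustration of how the fields described above might be used, the sketch below loads the training split with the Hugging Face `datasets` library and keeps only pages that have text and are flagged as having ALTO data. The column names `text` and `has_alto` are assumptions: this page describes the fields but does not list their exact identifiers, so check the real column names before relying on them. The helper names are likewise hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator

REPO_ID = "davanstrien/ency-test"  # dataset name as given on this page

# ASSUMED column names: the page describes the fields but does not give
# their exact identifiers. Inspect the dataset and adjust as needed.
TEXT_FIELD = "text"
HAS_ALTO_FIELD = "has_alto"


def load_train_split():
    """Load the training split (needs `pip install datasets` and network access)."""
    # Imported lazily so the filtering helper below works without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split="train")


def pages_with_text_and_alto(
    records: Iterable[Dict[str, Any]],
) -> Iterator[Dict[str, Any]]:
    """Yield records with non-empty text that are flagged as having ALTO data."""
    for record in records:
        text = record.get(TEXT_FIELD) or ""  # treat a missing or None text as empty
        if text.strip() and record.get(HAS_ALTO_FIELD):
            yield record
```

A typical call would be `pages_with_text_and_alto(load_train_split())`. Printing the `column_names` attribute of the loaded split first is a quick way to confirm or correct the assumed field names.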
Provided by:
davanstrien



