PaddlePaddle/Real5-OmniDocBench
Source: Hugging Face. Updated: 2026-03-11. Indexed: 2026-02-07.
Download link:
https://hf-mirror.com/datasets/PaddlePaddle/Real5-OmniDocBench
Description:
Real5-OmniDocBench is a new benchmark for real-world scenarios, built on the OmniDocBench v1.5 dataset. It comprises five scenarios: Scanning, Warping, Screen-Photography, Illumination, and Skew. Apart from the Scanning category, all images were captured manually with handheld mobile devices to closely simulate real-world conditions. Each subset maintains a one-to-one correspondence with the original OmniDocBench and strictly follows its ground-truth annotations and evaluation protocols. Because the images were captured under real conditions, the dataset serves as a rigorous benchmark for assessing the robustness of document parsing models in practical applications.
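Because each subset corresponds one-to-one with the original OmniDocBench pages, one way to read robustness is as the per-page score drop between the original and a given scenario. The following is a minimal sketch of that idea, not the benchmark's official evaluation code. The function name `robustness_drop`, the page identifiers, and the "higher is better" score are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Scenario names as listed in the description above.
SCENARIOS = ("Scanning", "Warping", "Screen-Photography", "Illumination", "Skew")


def robustness_drop(original_scores, scenario_scores):
    """Mean per-page score drop between the original pages and one subset.

    Both arguments map a page identifier to a score where higher is better.
    The identifiers and the metric are assumptions made for this sketch.
    """
    unpaired = set(original_scores) ^ set(scenario_scores)
    if unpaired:
        # One-to-one correspondence means every page must appear in both maps.
        raise ValueError(f"unpaired pages: {sorted(unpaired)}")
    return mean(original_scores[p] - scenario_scores[p] for p in original_scores)


if __name__ == "__main__":
    # Made-up scores, used only to show the call shape.
    original = {"page_a": 0.9, "page_b": 0.8}
    skew = {"page_a": 0.7, "page_b": 0.8}
    print(f"Skew: mean drop = {robustness_drop(original, skew):.3f}")
```

A positive value means the model scored lower on the captured images than on the originals. Running the same comparison once per entry in `SCENARIOS` would give a per-scenario view of where a model degrades.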
Provided by:
PaddlePaddle



