mapo80/DocCornerDataset
Source: Hugging Face | Updated: 2025-12-17 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/mapo80/DocCornerDataset
Description:
DocCornerDataset is a dataset for document corner detection and quadrilateral localization. It contains 27,860 images, split into 23,496 training samples and 4,364 validation samples, and includes both positive samples (with a document) and negative samples (without a document). Each positive sample is annotated with four corner coordinates in the order TL, TR, BR, BL, normalized to the range [0, 1], and the images cover a variety of document types. The data is stored in Parquet format, with columns such as image_bytes, filename, has_document, and the coordinates of the four corners. The images are aggregated from multiple public sources, including MIDV-500 and AutoCapture, and the dataset ships with code examples for loading and using it.
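Because the corners are stored as normalized values, they have to be scaled by the image size before they can be drawn or used for cropping. The following is a minimal sketch of that step, not the dataset's own loading code. The description does not name the corner columns, so the keys in HYPOTHETICAL_CORNER_KEYS are placeholders; only has_document comes from the description. The sketch also assumes that x is scaled by image width and y by image height.

```python
from typing import List, Mapping, Optional, Sequence, Tuple

Point = Tuple[float, float]

# Placeholder column names (an assumption): the description only says the
# Parquet file stores "the coordinates of the four corners" in TL, TR, BR, BL order.
HYPOTHETICAL_CORNER_KEYS = (
    ("tl_x", "tl_y"),
    ("tr_x", "tr_y"),
    ("br_x", "br_y"),
    ("bl_x", "bl_y"),
)


def corners_to_pixels(
    norm_corners: Sequence[Point], width: int, height: int
) -> List[Point]:
    """Scale four normalized (x, y) corners in [0, 1] to pixel coordinates."""
    if len(norm_corners) != 4:
        raise ValueError("expected exactly four corners (TL, TR, BR, BL)")
    pixels = []
    for x, y in norm_corners:
        if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
            raise ValueError(f"corner ({x}, {y}) is outside [0, 1]")
        # Assumption: x is relative to width, y is relative to height.
        pixels.append((x * width, y * height))
    return pixels


def quad_for_row(
    row: Mapping[str, object],
    width: int,
    height: int,
    corner_keys: Sequence[Tuple[str, str]] = HYPOTHETICAL_CORNER_KEYS,
) -> Optional[List[Point]]:
    """Return the pixel-space quadrilateral for a row, or None for a negative sample."""
    if not row["has_document"]:
        return None  # negative sample: no document, so no corners to scale
    norm = [(float(row[kx]), float(row[ky])) for kx, ky in corner_keys]
    return corners_to_pixels(norm, width, height)


if __name__ == "__main__":
    example_row = {
        "has_document": True,
        "tl_x": 0.25, "tl_y": 0.25,
        "tr_x": 0.75, "tr_y": 0.25,
        "br_x": 0.75, "br_y": 0.75,
        "bl_x": 0.25, "bl_y": 0.75,
    }
    print(quad_for_row(example_row, width=640, height=480))
```

Returning None for negative samples keeps the has_document flag and the corner values from being confused: a row without a document never yields a quadrilateral, whatever its corner columns happen to contain.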
Provider:
mapo80



