Shadowed Document 7K (SD7K)
Source: arXiv. Updated: 2023-09-01. Indexed: 2024-06-21.
Download link:
https://github.com/CXH-Research/DocShadow-SD7K
Description:
SD7K is a large-scale, real-world dataset for document shadow removal. It contains over 7,000 pairs of high-resolution (2462×3699) images of real documents, covering diverse samples and a range of lighting conditions. It is 10 times the size of existing datasets and is intended to support research on document shadow removal. The images were captured with a Canon EOS M6 camera under fixed settings to ensure consistent image quality. SD7K is suited to tasks such as document enhancement and optical character recognition (OCR), where the goal is to improve the readability and visual quality of digitized documents.
Provided by:
University of Macau
Created:
2023-08-28



