Multi-Source Domain Adaptation Text Recognition Dataset
Source: arXiv | Updated: 2021-08-25 | Indexed: 2024-06-21
Download link:
https://bupt-ai-cz.github.io/Meta-SelfLearning
Description:
The Multi-Source Domain Adaptation Text Recognition Dataset was created by Beijing University of Posts and Telecommunications. It contains more than 5.2 million images divided into five domains: synthetic, document, street view, handwritten, and license plate. It is presented as the first multi-domain text recognition dataset and is intended to address the domain adaptation problem in text recognition. The images were collected from multiple sources and filtered so that the dataset covers a wide range of text styles and backgrounds. The dataset is suited to training deep learning models to adapt across domains, particularly where font diversity and complex backgrounds make recognition difficult.
Provider:
Beijing University of Posts and Telecommunications
Created:
2021-08-25



