MSDA (Multi-source domain adaptation dataset for text recognition)
Source: OpenDataLab. Updated 2026-05-24; indexed 2024-05-09.
Download link:
https://opendatalab.org.cn/OpenDataLab/MSDA
Description:
In recent years, deep learning-based methods have shown promising results in computer vision. However, common deep learning models require large amounts of labeled data, and collecting and annotating that data is labor-intensive. More importantly, a model's performance may degrade because of domain shift between the training data and the test data. Text recognition is a widely studied area of computer vision, and it suffers from the same problems because of diverse fonts and complex backgrounds. This paper focuses on text recognition and makes three contributions toward addressing these problems. First, we collect a multi-source domain adaptation dataset for text recognition that covers five distinct domains with over 5 million images; to the best of our knowledge, it is the first multi-domain text recognition dataset. Second, we propose a new method, Meta Self-Learning, which combines self-learning with the meta-learning paradigm and achieves better recognition results in multi-domain adaptation scenarios. Third, we conduct extensive experiments on the dataset to provide benchmark results and demonstrate the effectiveness of our method.
Provider:
OpenDataLab
Created:
2022-06-07
Collected summary
Dataset introduction

Background and challenges
Background overview
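MSDA is a multi-source domain adaptation dataset for text recognition. It covers five domains with more than 5 million images and, to its authors' knowledge, is the first dataset of its kind. It was released in 2021 by Beijing University of Posts and Telecommunications to address the challenges of cross-domain text recognition.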
The summary above was collected and generated by 遇见数据集.



