STR-Fewer-Labels
Source: arXiv | Updated: 2021-06-05 | Indexed: 2024-06-21
Download link:
https://github.com/ku21fan/STR-Fewer-Labels
Description:
The STR-Fewer-Labels dataset was created by The University of Tokyo for the scene text recognition (STR) task. It contains 276,000 labeled real-world samples and targets the problem of training STR models when synthetic data cannot be used, in particular for handwritten or artistic text and for non-English languages. The dataset is publicly available on GitHub at https://github.com/ku21fan/STR-Fewer-Labels. It was built by consolidating multiple public real-world data sources and applying the necessary preprocessing. Its main application is improving the recognition performance of STR models in real-world scenarios, especially when labeled data is scarce.
Provider:
The University of Tokyo
Created:
2021-03-08



