SVT (Street View Text Dataset)
OpenDataLab | Updated 2026-05-24 | Indexed 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/SVT
Description:
The Street View Text (SVT) dataset was harvested from Google Street View. Text in these images is highly variable and often low resolution. Outdoor street-level imagery has two notable characteristics: (1) the text usually comes from business signage, and (2) business names are easy to obtain through geographic business searches. These factors make SVT particularly well suited to word spotting in the wild: given a street view image, the goal is to identify words from nearby businesses. For more details about the dataset, see the paper *Word Spotting in the Wild*. For the latest benchmarks on this data, see the paper *End-to-End Scene Text Recognition*. The dataset has word-level annotations only (no character bounding boxes) and is intended for (A) cropped lexicon-driven word recognition and (B) full-image lexicon-driven word detection and recognition.
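As a rough illustration of what a lexicon brings to task (A), the sketch below snaps a noisy recognizer output to the closest word in a small candidate list by edit distance. This is not the method from the cited papers. The function names, the sample lexicon, and the choice of Levenshtein distance are all assumptions made for the example.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def snap_to_lexicon(raw_prediction: str, lexicon: Iterable[str]) -> str:
    """Return the lexicon word closest to the raw prediction, ignoring case.

    Ties are broken alphabetically so the result is deterministic.
    """
    query = raw_prediction.upper()
    return min(lexicon, key=lambda word: (edit_distance(query, word.upper()), word))


# Hypothetical lexicon of words from nearby businesses (not taken from SVT).
sample_lexicon = ["STARBUCKS", "COFFEE", "PHARMACY"]
corrected = snap_to_lexicon("C0FFEE", sample_lexicon)
```

Because the candidate list is short, even a weak recognizer output can often be resolved to the right word. `snap_to_lexicon` raises `ValueError` if the lexicon is empty.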
Provider:
OpenDataLab
Created:
2022-04-29
Collected Summary
Dataset Introduction

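Background and Challenges
Background Overview
SVT is a dataset of street view text images harvested from Google Street View. The text in the images is highly variable and often low resolution, and it comes mainly from business signage. The dataset provides word-level annotations only, is suited to lexicon-driven word recognition and detection tasks, and is commonly used in research on outdoor scene text recognition.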
The content above was collected and summarized by 遇见数据集.



