MSRA-TD500 (MSRA Text Detection 500 Database)

Name: MSRA-TD500 (MSRA Text Detection 500 Database)
Creator: OpenDataLab
Published: 2026-05-24 03:30:03
License: No description available

OpenDataLab: updated 2026-05-24, indexed 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/MSRA-TD500

Resource description:

The MSRA Text Detection 500 Database (MSRA-TD500) was collected and publicly released as a benchmark for evaluating text detection algorithms. Its purpose is to track recent progress in text detection in natural images, especially advances in detecting text of arbitrary orientations.

The dataset contains 500 natural images taken with a pocket camera in indoor (office and shopping mall) and outdoor (street) scenes. The indoor images mainly show signs, doorplates, and caution plates, while the outdoor images are mostly guide boards and billboards against complex backgrounds. Image resolutions range from 1296x864 to 1920x1280. The dataset is challenging because of both the diversity of the text and the complexity of the backgrounds. Text may vary in language (Chinese, English, or a mixture of both), font, size, color, and orientation. Backgrounds may contain vegetation (e.g., trees and bushes) and repeated patterns (e.g., windows and bricks) that are hard to distinguish from text.

The dataset is divided into a training set and a test set. The training set contains 300 images randomly selected from the full dataset, and the remaining 200 images make up the test set. All images are fully annotated. The basic annotation unit is the text line (see Figure 1) rather than the word, as used in the ICDAR datasets. Chinese text lines are hard to partition into individual words based on spacing, and even for English text lines, word partitioning is non-trivial without high-level information.
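The 300/200 partition described above is fixed by the dataset itself, so it does not need to be recreated. For experiments that need a further reproducible random partition of the same kind (for instance, holding out part of the training images for validation), a minimal sketch might look like the following. The function name `split_dataset`, the seed, and the `img_0001`-style identifiers are illustrative assumptions, not part of the dataset or of any official tooling.

```python
import random


def split_dataset(image_ids, n_train=300, seed=0):
    """Randomly partition image_ids into (train, test) lists.

    Hypothetical helper: it mirrors the 300/200 random split described
    in the text but does NOT reproduce the official split.
    """
    ids = list(image_ids)
    if n_train > len(ids):
        raise ValueError("n_train exceeds the number of images")
    rng = random.Random(seed)  # local RNG so the split is reproducible
    rng.shuffle(ids)
    # Sort each part only to make the output easier to inspect.
    return sorted(ids[:n_train]), sorted(ids[n_train:])


# Placeholder identifiers; the real file names in MSRA-TD500 may differ.
all_ids = [f"img_{i:04d}" for i in range(1, 501)]
train_ids, test_ids = split_dataset(all_ids)
print(len(train_ids), len(test_ids))  # 300 200
```

Using a local `random.Random(seed)` instance rather than the module-level generator keeps the partition stable across runs without affecting other random state in the program.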

Provider:

OpenDataLab

Created:

2022-04-29

Collection summary

Dataset introduction