TextSeg

Name: TextSeg
Creator: OpenDataLab
Published: 2026-05-17 09:30:34
License: No description available

OpenDataLab; updated 2026-05-17; indexed 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/TextSeg

Resource overview:

Text in the real world is extremely diverse, yet current text datasets do not reflect this diversity well. To bridge the gap, we propose TextSeg, a large-scale, finely annotated, multi-purpose text dataset that collects scene and design text with six types of annotations: word-level and character-level bounding polygons, masks, and transcriptions. We also introduce the Text Refinement Network (TexRNet), a novel text segmentation approach that adapts to the unique properties of text, such as non-convex boundaries and diverse textures, which often burden traditional segmentation models. TexRNet refines the results of common segmentation approaches through key feature pooling and attention, so that wrongly activated text regions can be adjusted. We also introduce a trimap loss and a discriminator loss, which show significant improvements in text segmentation.
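The overview mentions a trimap loss without giving its form. As a rough illustration only, the sketch below assumes a trimap is a thin band around the text boundary and that the loss is a binary cross-entropy averaged over that band. The function names `boundary_band` and `band_cross_entropy`, the square neighborhood, and the `radius` parameter are hypothetical choices made for this example; they are not taken from TextSeg or TexRNet.

```python
import math


def boundary_band(mask, radius=1):
    """Mark pixels whose (2*radius+1)-square neighborhood contains both
    text (1) and background (0). This band is an assumed stand-in for a trimap."""
    h, w = len(mask), len(mask[0])
    band = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            seen = set()
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    seen.add(mask[ny][nx])
            band[y][x] = 1 if len(seen) == 2 else 0
    return band


def band_cross_entropy(prob, mask, band, eps=1e-7):
    """Mean binary cross-entropy computed over band pixels only."""
    total, count = 0.0, 0
    for y, row in enumerate(band):
        for x, in_band in enumerate(row):
            if in_band:
                # Clamp predictions so log() never sees 0 or 1 exactly.
                p = min(max(prob[y][x], eps), 1 - eps)
                t = mask[y][x]
                total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
                count += 1
    return total / count if count else 0.0


# A 5x5 toy mask with a 3x3 block of text pixels in the middle.
toy_mask = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)]
            for y in range(5)]
toy_band = boundary_band(toy_mask, radius=1)
```

Restricting the average to the band concentrates the penalty on boundary pixels, where fine text strokes are easiest to get wrong; a real implementation would typically operate on tensors rather than nested lists.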

Provider:

OpenDataLab

Created:

2022-09-01

Collection summary

Dataset introduction