
digitgen Sticker Dataset

Indexed from arXiv on 2025-09-30
Download link:
https://github.com/ACRA-FL/GeoTRNet
Overview:
This dataset is a collection of stickers generated to mimic real-world scene text, created with the Python library 'digitgen'. It comprises 10,000 training images, 2,000 validation images, and 2,000 test images. Various augmentations are applied to replicate the complexity observed in regular real-world scene text images. The dataset is intended for text recognition tasks.
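To make the scale concrete, the short sketch below works out the total image count and each split's share from the figures stated in the description. It does not use the digitgen library or the dataset files; the names `SPLIT_SIZES` and `split_fractions` are illustrative only.

```python
# Split sizes as stated in the dataset description.
SPLIT_SIZES = {"train": 10_000, "validation": 2_000, "test": 2_000}


def split_fractions(sizes):
    """Return each split's share of the total number of images."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total_images = sum(SPLIT_SIZES.values())
fractions = split_fractions(SPLIT_SIZES)

# Print one line per split, then the overall total.
for name, frac in fractions.items():
    print(f"{name}: {SPLIT_SIZES[name]} images ({frac:.1%})")
print(f"total: {total_images} images")
```

Under these figures the training split accounts for roughly 71% of the 14,000 images, with the validation and test splits at about 14% each.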
Provided by:
digitgen library