LASCID: Latin and Arabic Scene Character Image Dataset

IEEE, indexed 2026-04-17
Download link:
https://ieee-dataport.org/documents/lascid-latin-and-arabic-scene-character-image-dataset
Description:
In international contexts, natural scenes may contain text in multiple languages. A Latin and Arabic scene character image dataset is therefore essential for training models to accurately detect and recognize text regions in real-world images. This is crucial for applications such as text translation, image search, content analysis, and autonomous vehicles that need to interpret text in different languages. The proposed dataset comprises 8034 Latin and Arabic scene character images covering a wide variety of text sizes, styles, fonts, brightness levels, resolutions, and orientations commonly encountered in real-world contexts. Of these, 4284 are real scene character regions that were manually cropped and labeled from well-known benchmark datasets, including the ICDAR 2003, 2013, 2015, and 2017 scene text datasets. In addition, the dataset incorporates 1860 synthetic character images from the CharImageDB dataset. Finally, a Generative Adversarial Network (GAN)-based character generator was developed to increase the diversity of the dataset by creating 1890 synthetic Latin and Arabic character images, so that learning models are exposed to a broader range of visual text information from complex real-world environments. The dataset is intended as a resource for advancing research and development in computer vision, OCR, and related fields, helping technology process and understand textual information in diverse scripts and languages.
Provided by:
Drira, Fadoua; Walha, Rim; Harizi, Riadh