Urdu-VQA-Dataset
Source: arXiv | Updated: 2024-05-21 | Indexed: 2024-06-21
Download link:
https://github.com/Hiba-MeiRuan/Urdu-VQA-Dataset-/tree/main
Description:
The Urdu-VQA-Dataset is a multi-task dataset created by Huazhong University of Science and Technology. It contains over 1,000 natural-scene images with Urdu text and supports three tasks: text detection, text recognition, and visual question answering (VQA). The dataset is manually annotated, addresses the limitations of earlier datasets in handling arbitrarily shaped text, and provides the first benchmark for Urdu text-based VQA methods. It covers diverse text layouts, complex shapes, and non-standard orientations, which makes it suitable for developing and evaluating methods that must cope with the varied challenges of real-world scenes. Intended applications include improving the accessibility of digital content, information retrieval, supporting linguistic diversity, and better understanding of and interaction with visual data that contains Urdu.
Provider:
Huazhong University of Science and Technology
Created:
2024-05-21



