
Siluni/sinhala-vqa-dataset

Source: Hugging Face. Updated 2026-04-05; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/Siluni/sinhala-vqa-dataset
Description:
---
license: cc-by-4.0
language:
- si
task_categories:
- visual-question-answering
pretty_name: Sinhala VQA
size_categories:
- 10K<n<100K
tags:
- sinhala
- vqa
- low-resource
- multimodal
- visual-genome
---

# Sinhala VQA Dataset

A Sinhala-language Visual Question Answering dataset of 37,318 QA pairs, constructed by translating [Visual Genome](https://visualgenome.org/) QA annotations into Sinhala using gemini-3-flash-preview.

This dataset was developed as part of research on benchmarking and adapting compact multimodal models for Sinhala VQA under low-resource conditions.

## Dataset Summary

| Split      | Samples |
|------------|---------|
| Train      | 33,409  |
| Validation | 2,909   |
| Test       | 1,000   |
| **Total**  | 37,318  |

## Schema

Each row contains:

| Field      | Type   | Description                                                  |
|------------|--------|--------------------------------------------------------------|
| `qa_id`    | int64  | QA pair ID; corresponds directly to the Visual Genome QA ID  |
| `image_id` | int64  | Image ID; corresponds directly to the Visual Genome image ID |
| `question` | string | Question in Sinhala                                          |
| `answer`   | string | Answer in Sinhala                                            |

## Images

**Images are not included** in this dataset. They must be downloaded separately from Visual Genome:

- **Version**: Visual Genome Version 1.2 (completed August 29, 2016)
- **Download**: https://homes.cs.washington.edu/~ranjay/visualgenome/api.html
- The `image_id` field in each row maps directly to the corresponding image in Visual Genome v1.2.

## Construction

QA pairs from Visual Genome v1.2 were translated from English to Sinhala using the gemini-3-flash-preview API. The source dataset is the Visual Genome QA subset (`question_answers.json`).

## License

The annotations in this dataset (questions and answers) are released under **CC-BY 4.0**. The underlying images are sourced from Visual Genome and remain under [Visual Genome's own license](https://homes.cs.washington.edu/~ranjay/visualgenome/api.html).

## Citation

If you use this dataset, please cite:

```bibtex
@misc{keerthiratne2025sinhalavqa,
  title       = {Benchmarking and Adapting Compact Multimodal Models for Sinhala Visual Question Answering},
  author      = {Keerthiratne, Siluni and Weerasinghe, Ruvan and Sumanathilaka, Deshan},
  year        = {2025},
  institution = {Informatics Institute of Technology / Robert Gordon University},
  note        = {Dataset available at https://huggingface.co/datasets/Siluni/sinhala-vqa-dataset}
}
```

## Contact

Siluni Keerthiratne, Informatics Institute of Technology, Sri Lanka
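
## Example: checking rows and locating images

Because the images ship separately, it may help to check rows against the schema and resolve each `image_id` to a local file. The sketch below is illustrative and not part of the dataset. It assumes a local Visual Genome download extracted into folders named `VG_100K` and `VG_100K_2`, with files named `<image_id>.jpg`. The helpers `validate_row` and `find_image` are hypothetical names introduced here.

```python
from pathlib import Path
from typing import Optional

# Split sizes as stated in the Dataset Summary table.
SPLIT_SIZES = {"train": 33_409, "validation": 2_909, "test": 1_000}

# Field names and Python types matching the Schema table.
SCHEMA = {"qa_id": int, "image_id": int, "question": str, "answer": str}

# Assumption: folder names of a locally extracted Visual Genome download.
IMAGE_DIRS = ("VG_100K", "VG_100K_2")


def validate_row(row: dict) -> bool:
    """Return True if the row has exactly the documented fields and types."""
    return set(row) == set(SCHEMA) and all(
        isinstance(row[name], expected) for name, expected in SCHEMA.items()
    )


def find_image(image_id: int, root: Path) -> Optional[Path]:
    """Look for <image_id>.jpg under each assumed image folder."""
    for folder in IMAGE_DIRS:
        candidate = root / folder / f"{image_id}.jpg"
        if candidate.exists():
            return candidate
    return None  # image not downloaded, or the layout differs from the assumption
```

The rows themselves can likely be loaded with the Hugging Face `datasets` library via `load_dataset("Siluni/sinhala-vqa-dataset")`. The split names used as keys in `SPLIT_SIZES` are an assumption based on the summary table, so confirm them against the loaded object before relying on them.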
Provider:
Siluni