VietTravelVQA
Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/fvxtpnh8mh
Description:
Objective: This dataset is designed to benchmark and improve Visual Question Answering (VQA) systems in the context of Vietnamese tourism and cultural heritage. It addresses the lack of high-quality, regionally specific multimodal data for Southeast Asia.
Data Content: The dataset comprises thousands of images sourced from Wikimedia Commons, each paired with human-verified question-answer pairs in Vietnamese. The questions span five levels of complexity, from basic object identification to deep cultural reasoning.
Methodology:
1. Sourcing: Legally compliant images were filtered from Wikimedia Commons.
2. Annotation: Expert annotators generated QA pairs, focusing on architectural details, historical significance, and spatial reasoning.
3. Validation: Automated scripts cleaned the data to ensure 100% synchronization between the metadata (JSON) and the image files, and Large Multimodal Models (LMMs) were used to audit the factual accuracy of the QA pairs.
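The synchronization check in step 3 can be sketched as follows. This is a minimal illustration, not the authors' script: the function name `find_sync_mismatches`, the `image` key, and the flat image directory are all assumptions, since the listing does not document the JSON schema or the folder layout.

```python
from pathlib import Path


def find_sync_mismatches(records, image_dir, image_key="image"):
    """Compare image names referenced in metadata with files on disk.

    Assumptions (not confirmed by the dataset listing): each record is a
    dict holding the image file name under `image_key`, and all images
    sit directly inside `image_dir`.
    Returns (missing, orphans): names referenced but absent on disk, and
    files on disk that no record references.
    """
    referenced = {record[image_key] for record in records}
    on_disk = {p.name for p in Path(image_dir).iterdir() if p.is_file()}
    missing = sorted(referenced - on_disk)
    orphans = sorted(on_disk - referenced)
    return missing, orphans
```

Both returned lists being empty corresponds to the full synchronization the description claims. Adjust `image_key` once the actual field name in the released files is known.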
Usage: The data is split into train and test sets (.json). It is intended for training, fine-tuning, and evaluating Vision-Language Models (VLMs) on localized cultural contexts.
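As a rough sketch of how the test split might be used for evaluation, the snippet below loads a split and scores predictions by normalized exact match. Everything here is an assumption for illustration: the listing does not specify the file names, the JSON layout (a top-level list is assumed), or an official metric. Exact match is a common stand-in, not the dataset's stated protocol. Unicode NFC normalization is applied because Vietnamese diacritics can be encoded in composed or decomposed form.

```python
import json
import unicodedata


def load_split(path):
    """Load one split file; assumes the JSON top level is a list of records."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON list of records")
    return data


def normalize_answer(text):
    """Normalize to NFC, case-fold, and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", text)
    return " ".join(text.casefold().split())


def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references differ in length")
    if not references:
        return 0.0
    hits = sum(
        normalize_answer(p) == normalize_answer(r)
        for p, r in zip(predictions, references)
    )
    return hits / len(references)
```

Exact match is strict for free-form answers, so a softer metric may suit the higher complexity levels better; that choice is left open here.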
Created:
2026-03-16



