DriVQA: A Gaze-Based Dataset for Visual Question Answering in Driving Scenarios

Source: NIAID Data Ecosystem (indexed 2026-05-02)

Download link:

https://data.mendeley.com/datasets/p25744hwrc

Description:

DriVQA is a dataset that combines gaze plots and heatmaps with visual question answering (VQA) data from participants who were presented with driving scenarios. VQA has been proposed as part of the solution for trustworthiness and interpretability in vehicle autonomy. VQA models have recently been explored in the context of autonomous driving to enhance understanding of the environment through visual inputs and to enable more intelligent decision-making by the autonomous vehicle.

Collected using the Tobii Pro X3-120 eye-tracking device, the DriVQA dataset maps where participants direct their gaze when presented with images of driving scenes, followed by the related questions and the answers from every participant. The dataset contains five key elements for each scenario: images of driving situations, associated questions, participant answers, gaze plots, and heatmaps. Each gaze plot shows the points of focus on the driving image and their sequence, with the size of each point indicating the length of attention, while the heatmaps show the number of gaze points and their durations in various areas of the scene.

DriVQA is being used to study the subjectivity inherent in VQA. Its gaze-tracking data shows how individuals perceive and interpret visual scenes, which makes it a resource both for training VQA models that rely on human-like attention and for investigating human cognition and behaviour in dynamic, real-world scenarios. VQA systems trained on it can learn from human-like attention behaviour when making decisions based on visual input. The gaze data has the potential to guide VQA models in selecting the most relevant regions of an image for answering specific questions, much as a human would focus on key areas of a driving scene.

The dataset has the potential to advance VQA research and development and to improve the safety and intelligence of driving systems through enhanced visual understanding and interaction. It can be reused in various research areas, including the development of advanced VQA models, attention analysis, and human-computer interaction studies. Its gaze plots and heatmaps can also be leveraged in autonomous driving, driver assistance systems, and cognitive science research, for both academic and industrial purposes.
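
The description above says that the heatmaps summarise the number of gaze points and their durations across areas of a scene. As a rough illustration of that idea only, the sketch below accumulates fixation durations into a coarse grid over an image. The fixation record layout `(x, y, duration_ms)`, the function names, and the 100-pixel cell size are assumptions made for this example; they are not taken from the dataset's files or from the Tobii software.

```python
from typing import List, Tuple

# Hypothetical fixation record: (x_pixel, y_pixel, duration_ms).
# The actual DriVQA file layout is not described here.
Fixation = Tuple[float, float, float]


def fixation_heatmap(fixations: List[Fixation], width: int, height: int,
                     cell: int = 100) -> List[List[float]]:
    """Sum fixation durations into a grid of cell x cell pixel bins."""
    cols = -(-width // cell)   # ceiling division
    rows = -(-height // cell)
    grid = [[0.0] * cols for _ in range(rows)]
    for x, y, duration in fixations:
        # Ignore gaze samples that fall outside the image.
        if 0 <= x < width and 0 <= y < height:
            grid[int(y) // cell][int(x) // cell] += duration
    return grid


def hottest_cell(grid: List[List[float]]) -> Tuple[int, int]:
    """Return (row, col) of the bin with the largest total duration."""
    cells = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    return max(cells, key=lambda rc: grid[rc[0]][rc[1]])


# Made-up fixations on a 1280 x 720 image, for illustration only.
example = [(120, 80, 300), (130, 90, 200), (640, 360, 150)]
heat = fixation_heatmap(example, width=1280, height=720)
row, col = hottest_cell(heat)
print(row, col, heat[row][col])
```

A grid like this could serve as a coarse attention prior over image regions; a smoother map would typically apply a Gaussian kernel around each fixation instead of hard bins.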

Created:

2024-12-17
