arpitareva/Indian_Traffic_VQA_Dataset
Hugging Face · Updated 2026-03-26 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/arpitareva/Indian_Traffic_VQA_Dataset
Description:
---
license: mit
task_categories:
- visual-question-answering
language:
- en
tags:
- Traffic_Signal_Recognition
size_categories:
- 1K<n<10K
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
dataset_info:
  features:
  - name: id
    dtype: int64
  - name: image_name
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: ' traffic_sign'
    dtype: string
  - name: image
    dtype: image
  splits:
  - name: train
    num_bytes: 159624173.688
    num_examples: 3472
  - name: test
    num_bytes: 39973129.0
    num_examples: 869
  download_size: 193968286
  dataset_size: 199597302.688
---
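The front matter declares a train and a test split, and a feature whose name is written with a leading space (`' traffic_sign'`). The sketch below shows one way such a column name could be normalized. The helper names `normalize_keys` and `load_split` are hypothetical. `load_split` assumes the Hugging Face `datasets` package is installed and the Hub is reachable, so it is defined here but not called.

```python
def normalize_keys(row):
    """Return a copy of a row dict with whitespace stripped from column names."""
    return {key.strip(): value for key, value in row.items()}


def load_split(split="train"):
    """Load one split and rename the ' traffic_sign' column (not run here).

    Assumption: the `datasets` package is installed and network access works.
    """
    from datasets import load_dataset  # lazy import: optional dependency

    ds = load_dataset("arpitareva/Indian_Traffic_VQA_Dataset", split=split)
    if " traffic_sign" in ds.column_names:
        ds = ds.rename_column(" traffic_sign", "traffic_sign")
    return ds


# Offline illustration on a hand-written row (values are made up).
sample = {"id": 1, "question": "What does this sign show?", " traffic_sign": "Stop"}
cleaned = normalize_keys(sample)
print(sorted(cleaned))
```

Stripping the name once at load time avoids having to remember the leading space in every later column lookup.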
🧭 Overview
Indian Traffic VQA is a real-world Visual Question Answering (VQA) dataset focusing on Indian road traffic signboards.
The dataset is designed for training and evaluating Vision-Language Models (VLMs) and VQA systems in the traffic and transportation domain.
This dataset bridges a gap between real-world Indian traffic conditions and machine understanding, which makes it well suited to research in autonomous driving, smart-city AI, and traffic sign recognition in natural environments.
________________________________________
📦 Dataset Summary
• Images: 1,085 real-world traffic signboard images
• Questions: 4,341 unique questions
• Answers: Short, ground-truth textual responses
• Source: All images were collected using a mobile phone in real Indian road environments
• Format: .csv file with the following columns:
  ◦ image_name: name of the image file
  ◦ question: text-based query
  ◦ answer: corresponding ground-truth answer
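As a rough illustration of the three-column layout described above, the following sketch reads CSV text with Python's standard `csv` module and groups question-answer pairs by image. The sample rows are hypothetical (the third row in particular is made up for illustration), and `group_by_image` is an illustrative helper, not part of the dataset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV content using the three documented columns.
SAMPLE_CSV = """image_name,question,answer
img_00002.jpg,What does this sign show?,Stop
img_00003.jpg,Is U-turn allowed here?,No
img_00003.jpg,What does this sign indicate?,No U-turn
"""


def group_by_image(csv_file):
    """Map each image_name to its list of (question, answer) pairs."""
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        grouped[row["image_name"]].append((row["question"], row["answer"]))
    return dict(grouped)


qa_by_image = group_by_image(io.StringIO(SAMPLE_CSV))
for name, pairs in qa_by_image.items():
    print(name, len(pairs))
```

To read a real file instead, pass an open file handle, for example `open(path, newline="", encoding="utf-8")`, in place of the `io.StringIO` object.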
________________________________________
🧠 Task Definition
Given an image of a traffic signboard and a related question, the model must predict a short text answer.
Example:
image_name       question                         answer
img_00001.jpg    What does this sign indicate?    Speed Limit
img_00002.jpg    What does this sign show?        Stop
img_00003.jpg    Is U-turn allowed here?          No
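The card does not name an evaluation metric. Because the answers are short strings, one common choice is normalized exact-match accuracy, sketched below as an assumption rather than the authors' protocol. The function names and the predictions are hypothetical; the reference answers come from the example rows above.

```python
import string


def normalize(text):
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


references = ["Speed Limit", "Stop", "No"]      # answers from the example rows
predictions = ["speed limit.", "Stop", "Yes"]   # made-up model outputs
accuracy = exact_match_accuracy(predictions, references)
print(f"exact match: {accuracy:.3f}")
```

Exact match is strict: a correct paraphrase such as "Halt" for "Stop" would count as wrong, so a softer metric may be preferable for open-ended questions.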
________________________________________
🧩 Applications
• Visual Question Answering (VQA)
• Vision-Language Model (VLM) Fine-tuning
• Multimodal classification of traffic signs
• Benchmarking model reasoning on domain-specific visual data
________________________________________
🧰 Data Collection Details
• Captured in diverse Indian traffic conditions (urban, rural, highways)
• Includes varying lighting, occlusions, and view angles
• All images are real photographs, not synthetic
________________________________________
⚖️ License
You may distribute, modify, or use this dataset for non-commercial research purposes.
Please give appropriate credit by citing this dataset.
________________________________________
The .zip file contains all 1,085 images at 512x512 resolution.
Two .csv files are attached:
traffic_vqa_1085.csv contains one question and one answer per image;
traffic_vqa_4341.csv contains multiple questions and answers per image.
The first .csv file is suited to low-resource computing environments.
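With 4,341 question-answer pairs over 1,085 images, the larger file averages about four questions per image. The card does not say how the one-question file was produced. The sketch below shows one possible way to derive a similar reduced set by keeping the first row seen for each image. `first_question_per_image` and the sample rows are hypothetical.

```python
def first_question_per_image(rows):
    """Keep the first row seen for each image_name, preserving input order."""
    seen = set()
    subset = []
    for row in rows:
        if row["image_name"] not in seen:
            seen.add(row["image_name"])
            subset.append(row)
    return subset


# Made-up rows in the documented three-column layout.
rows = [
    {"image_name": "img_00002.jpg", "question": "What does this sign show?", "answer": "Stop"},
    {"image_name": "img_00002.jpg", "question": "Must vehicles halt here?", "answer": "Yes"},
    {"image_name": "img_00003.jpg", "question": "Is U-turn allowed here?", "answer": "No"},
]
subset = first_question_per_image(rows)
print(len(rows), "->", len(subset))
```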
________________________________________
🚀 Future Work
Future releases will include:
• Regional signboard subsets (state-specific)
• Video-based question answering
• Multilingual question support (English + Hindi)
________________________________________
👥 Contributors
🧠 Data Curators
🧩 Chandra Mohan Bhuma
🧩 CH.V.M.S.N. Pavan Kumar
🧩 T. Krishna Chaitanya
🧩 Miriyala Suneel
📷 Data Collectors
📸 Perumalla Himasri
📸 Vallapuneni Venkata Siva Kumar
📸 Ratna Seethal Saripalli
📸 Somanapalli Hindu
Provider:
arpitareva



