iisc-aim/BMD-45
Source: Hugging Face. Updated 2026-04-10; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/iisc-aim/BMD-45
Resource description:
---
pretty_name: BMD-45 (Bengaluru Mobility Dataset)
license: cc-by-4.0
tags:
  - computer-vision
  - object-detection
  - traffic
  - vehicles
  - india
  - cctv
  - bengaluru
  - urban
  - intelligent-transportation-systems
task_categories:
  - object-detection
task_ids:
  - vehicle-detection
language:
  - und
annotations_creators:
  - crowd-sourced
source_datasets: []
size_categories:
  - 10K<n<100K
dataset_info:
  features:
    - name: image
      dtype: image
    - name: objects
      sequence:
        - name: bbox
          sequence: float32
        - name: categories
          dtype:
            class_label:
              names:
                - Hatchback
                - Sedan
                - SUV
                - MUV
                - Bus
                - Truck
                - Three-wheeler
                - Two-wheeler
                - LCV
                - Mini-bus
                - Tempo-traveller
                - Bicycle
                - Van
                - Other
  splits:
    - name: train
      num_examples: 35792
    - name: val
      num_examples: 10194
configs:
  - config_name: default
    data_files:
      - split: train
        path: "BMD-45-Train/**"
      - split: val
        path: "BMD-45-Val/**"
---
# BMD-45: Bengaluru Mobility Dataset
**A large-scale CCTV vehicle detection benchmark for Indian urban traffic**
---
## Dataset Summary
**BMD-45** is a large-scale, India-specific vehicle detection dataset captured from **3,679 operational CCTV cameras** across Bengaluru — one of the world's most traffic-congested megacities.
| Statistic | Value |
| ----------------- | ----------------------------------- |
| Total images | **45,986** (1920×1080 RGB) |
| Total annotations | **481,947** bounding boxes          |
| Vehicle classes | **14** fine-grained categories |
| Camera sources | **3,679** Safe City CCTV cameras |
| Train split | 35,792 images / 375,003 annotations |
| Val split | 10,194 images / 106,944 annotations |
| Test split | 5,110 images (annotations withheld) |
| Image resolution | 1920 × 1080 px |
| Annotation format | COCO JSON |
BMD-45 addresses critical gaps in existing traffic datasets: **scale**, **camera-view diversity**, and a **fine-grained Indian vehicle taxonomy**. None of the prior benchmarks compared here (UA-DETRAC, TrafficCAM, IDD) offers all three, and IDD is ego-centric rather than fixed-camera.
The dataset captures dense, heterogeneous, and unstructured traffic characteristic of Indian urban environments, with images sampled from long-duration CCTV recordings using **spatial and temporal diversity criteria** and **difficulty scoring** to prioritize challenging, high-information frames.
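The split figures in the summary table can be cross-checked with a few lines of arithmetic. The sketch below only re-uses the train and val numbers quoted above; the dictionary layout and the `summarize` helper are illustrative names, not part of the dataset tooling.

```python
# Split statistics copied from the summary table above.
SPLITS = {
    "train": {"images": 35_792, "annotations": 375_003},
    "val": {"images": 10_194, "annotations": 106_944},
}


def summarize(splits):
    """Aggregate per-split counts into dataset-level statistics."""
    total_images = sum(s["images"] for s in splits.values())
    total_annotations = sum(s["annotations"] for s in splits.values())
    return {
        "images": total_images,
        "annotations": total_annotations,
        # Mean number of bounding boxes per image across train + val.
        "boxes_per_image": total_annotations / total_images,
        # Share of the released images that belong to the training split.
        "train_fraction": splits["train"]["images"] / total_images,
    }


summary = summarize(SPLITS)
print(summary)
```

The totals match the table (45,986 images and 481,947 boxes), which works out to roughly 10.5 boxes per image and a 78/22 train/val split.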
## Attribution
More technical details about the dataset and models are available in our Technical Report.
If you use these datasets or models, kindly cite the following:
```bibtex
@inproceedings{bmd45_2026,
title = {BMD-45: Bengaluru Mobility Dataset for Large-Scale Vehicle Detection from Urban CCTV},
author = {Akash Sharma and Chinmay Mhatre and Sankalp Gawali and Ruthvik Bokkasam and Brij Kishore and Vishwajeet Pattanaik and Tarun Rambha and Abdul R. Pinjari and Vijay Kovvali and Anirban Chakraborty and Punit Rathore and Raghu Krishnapuram and Yogesh Simmhan},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Findings (CVPRF26)},
year = {2026}
}
```
## Dataset Structure
The dataset follows the folder structure described below.
### **1. BMD-45-Train/**
Contains **35,792 images** (~78% of the dataset) used for training.
- **`images_000/`** through **`images_007/`** – Training images organized into subfolders for convenience.
  - `images_000/*` – Training images (`41.png`, `47.png`, …). Each image filename is unique across the entire dataset.
  - `images_001/*`, etc. – Additional subfolders following the same structure.
- **`_annotations.coco.json`** – Majority-voting consensus annotations for training images in **COCO JSON format**.
- **`metadata.jsonl`** – HuggingFace ImageFolder annotations (one JSON line per image).
### **2. BMD-45-Val/**
Contains **10,194 images** (~22% of the dataset) used for validation.
- **`images_000/`** through **`images_002/`** – Validation images organized into subfolders.
  - `images_000/*` – Validation images. All filenames are globally unique across both training and validation sets.
  - `images_001/*`, etc. – Additional subfolders following the same structure.
- **`_annotations.coco.json`** – Majority-voting consensus annotations for validation images in **COCO JSON format**.
- **`metadata.jsonl`** – HuggingFace ImageFolder annotations (one JSON line per image).
## Annotation JSON Schema
Each `_annotations.coco.json` file follows the standard COCO structure:
- **`images`**: list of image metadata with fields `id`, `file_name`, `width`, `height`
- **`annotations`**: object instances with fields `id`, `image_id`, `category_id`, `bbox` (`[x, y, width, height]`), `area`
- **`categories`**: class taxonomy (IDs and names below)
Each `metadata.jsonl` contains one JSON line per image:
```json
{"file_name": "images_000/41.png", "objects": {"bbox": [[x, y, w, h], ...], "categories": [0, 2, ...]}}
```
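As a minimal sketch of how the COCO structure above might be consumed, the following groups annotations by image file and converts each `bbox` from `[x, y, width, height]` to corner form `[x1, y1, x2, y2]`. The helper names (`xywh_to_xyxy`, `boxes_by_file`) and the tiny inline document are made up for illustration; only the field names come from the schema described above.

```python
import json
from collections import defaultdict


def xywh_to_xyxy(bbox):
    """Convert a COCO [x, y, width, height] box to [x1, y1, x2, y2]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def boxes_by_file(coco):
    """Map each image file name to a list of (class name, xyxy box) pairs."""
    id_to_file = {img["id"]: img["file_name"] for img in coco["images"]}
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        grouped[id_to_file[ann["image_id"]]].append(
            (id_to_name[ann["category_id"]], xywh_to_xyxy(ann["bbox"]))
        )
    return dict(grouped)


# Tiny hand-written document with invented values, standing in for
# the contents of a real _annotations.coco.json file.
example = json.loads("""
{
  "images": [{"id": 1, "file_name": "41.png", "width": 1920, "height": 1080}],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 0, "bbox": [10, 20, 30, 40], "area": 1200},
    {"id": 11, "image_id": 1, "category_id": 2, "bbox": [100, 50, 60, 80], "area": 4800}
  ],
  "categories": [{"id": 0, "name": "Hatchback"}, {"id": 2, "name": "SUV"}]
}
""")

result = boxes_by_file(example)
print(result)
```

For a real file, the `example` dictionary would be replaced by `json.load(open(path))` on the downloaded `_annotations.coco.json`.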
### Annotation Pipeline
- **Source:** frames captured between 06:00 – 18:00 IST during February 2025
- **Pre-annotation:** generated using a fine-tuned **RT-DETR v2-X** model trained on ≈ 3,000 expert-labeled images
- **Crowdsourcing:** > 550 student volunteers corrected or validated predictions through a gamified web interface with leaderboards
- **Consensus:** *majority voting* applied to derive final annotations
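The consensus step can be pictured with a small sketch of majority voting over class labels. This card does not describe how votes are matched to boxes or how ties are handled, so the strict-majority rule and the `majority_label` helper below are assumptions for illustration, not the authors' implementation.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by more than half of the annotators.

    `votes` is a list of class names proposed for one bounding box.
    Returns None when no label reaches a strict majority (an assumed
    tie-handling rule, not one documented for BMD-45).
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


print(majority_label(["Sedan", "Sedan", "Hatchback"]))
print(majority_label(["Sedan", "SUV"]))
```

Under this rule, two of three votes for `Sedan` yield `Sedan`, while an even split yields `None` and would need another round of review.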
## Loading the Dataset
```python
from datasets import load_dataset
# Load from HuggingFace Hub
ds = load_dataset("iisc-aim/BMD-45")
# Access a sample
sample = ds["train"][0]
image = sample["image"]
objects = sample["objects"] # {"bbox": [...], "categories": [...]}
```
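Once a sample is loaded, the integer entries in `objects["categories"]` can be decoded with the class list from the card's taxonomy. The sketch below assumes the IDs follow the order given in the Vehicle Classes table; `CLASS_NAMES`, `count_classes`, and the hand-written `sample_objects` are illustrative stand-ins so the snippet runs without downloading the dataset.

```python
from collections import Counter

# Class names in ID order (0..13), as listed in the Vehicle Classes table.
CLASS_NAMES = [
    "Hatchback", "Sedan", "SUV", "MUV", "Bus", "Truck", "Three-wheeler",
    "Two-wheeler", "LCV", "Mini-bus", "Tempo-traveller", "Bicycle", "Van",
    "Other",
]


def count_classes(objects):
    """Count how many boxes of each class appear in one sample's objects."""
    return Counter(CLASS_NAMES[i] for i in objects["categories"])


# Hand-written stand-in for sample["objects"]; the values are invented.
sample_objects = {
    "bbox": [
        [100.0, 200.0, 50.0, 40.0],
        [300.0, 220.0, 30.0, 60.0],
        [400.0, 210.0, 28.0, 55.0],
    ],
    "categories": [0, 7, 7],
}

counts = count_classes(sample_objects)
print(counts)
```

With a real sample, `count_classes(sample["objects"])` gives the per-image class histogram, which is a quick way to inspect class imbalance.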
## Vehicle Classes
| ID | Class Name | Description |
| --- | --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
| 0 | Hatchback | Small passenger cars without a protruding rear boot ("dickey"). |
| 1 | Sedan | Passenger cars with a low-slung design and a separate protruding rear boot ("dickey"). |
| 2 | SUV | Car-like vehicles with high ground clearance, a sturdy body, and no protruding boot. |
| 3 | MUV | Large vehicles with three seating rows, combining passenger and cargo functionality. |
| 4 | Bus | Large passenger vehicles used for public or private transport, including office shuttles and intercity buses. |
| 5 | Truck | Heavy goods carriers with a front cabin and a rear cargo compartment. |
| 6 | Three-wheeler | Compact vehicles with one front wheel and two rear wheels, featuring a covered passenger cabin. |
| 7 | Two-wheeler | Motorbikes and scooters for single or double riders. Bounding boxes include both vehicle and rider. |
| 8 | LCV | Lightweight goods carriers used for short- to medium-distance transport. |
| 9 | Mini-bus | Shorter, compact buses with fewer seats; larger than a Tempo Traveller, often featuring a flat front. |
| 10 | Tempo-traveller | Medium-sized passenger vans with tall roofs and side windows; larger than vans but smaller than minibuses, with a protruding front. |
| 11 | Bicycle | Non-motorized, manually pedalled vehicles including geared, non-geared, women's, and children's cycles. Bounding boxes include both vehicle and rider. |
| 12 | Van | Medium-sized vehicles for transporting goods or people, typically with a flat front and sliding side doors; smaller than Tempo Travellers. |
| 13 | Other | Vehicles not covered in other classes, including agricultural, specialized, or unconventional designs. |
## Baseline Results
To justify the need for BMD-45, state-of-the-art (SOTA) detectors were trained **on existing datasets**, or applied zero-shot, and evaluated on an expert-annotated reference set of 3,000 Bengaluru CCTV images. All of these baselines fall well short of practical accuracy:
| Model | Training Data | mAP@50:95 |
| -------------- | ------------- | --------- |
| D-FINE X | IDD | 0.46 |
| RT-DETRv2 X | TrafficCAM | 0.39 |
| Grounding DINO | Zero-shot | 0.13 |
> These are **cross-dataset baselines** (models trained on other datasets, *not* BMD-45). Results for BMD-45-trained models are reported in the paper.
The following figure shows per-class AP@50:95 for selected models trained on BMD-45 and evaluated on the BMD-45 validation split:

Models trained on BMD-45 also demonstrate cross-dataset generalization to **UA-DETRAC**, **IDD**, and **TrafficCAM** using a taxonomy-aware class mapping protocol (see paper for full results).
## Cross-Dataset Generalization
BMD-45-trained models are evaluated on:
- **UA-DETRAC** — highway fixed-camera dataset (China, 4 classes)
- **IDD** — Indian Driving Dataset (ego-centric, 9 classes)
- **TrafficCAM** — Indian CCTV dataset (9 classes, 4,400 frames)
A taxonomy mapping protocol merges BMD-45's 14 fine-grained classes into the coarser categories of each target dataset for fair comparison.
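To make the idea of a taxonomy mapping concrete, here is a hedged sketch that collapses the 14 BMD-45 classes into a four-class target taxonomy. The actual protocol is defined in the paper; the target names and every entry of `HYPOTHETICAL_MAPPING` below are assumptions chosen for illustration only.

```python
BMD45_CLASSES = [
    "Hatchback", "Sedan", "SUV", "MUV", "Bus", "Truck", "Three-wheeler",
    "Two-wheeler", "LCV", "Mini-bus", "Tempo-traveller", "Bicycle", "Van",
    "Other",
]

# Hypothetical fine-to-coarse mapping, NOT the mapping used in the paper.
HYPOTHETICAL_MAPPING = {
    "Hatchback": "car", "Sedan": "car", "SUV": "car", "MUV": "car",
    "Bus": "bus", "Mini-bus": "bus",
    "Van": "van", "Tempo-traveller": "van",
    "Truck": "others", "LCV": "others", "Three-wheeler": "others",
    "Two-wheeler": "others", "Bicycle": "others", "Other": "others",
}


def remap(labels, mapping):
    """Translate fine-grained labels into the coarse target taxonomy."""
    return [mapping[label] for label in labels]


coarse = remap(["Sedan", "Mini-bus", "Two-wheeler", "Van"], HYPOTHETICAL_MAPPING)
print(coarse)
```

Applying the same mapping to both predictions and ground truth before computing AP keeps the comparison consistent across datasets with different class granularity.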
The following figures compare per-class AP@50:95 for models trained on BMD-45 versus models trained on IDD, UA-DETRAC, and TrafficCAM, all evaluated on the BMD-45 validation split:



## Comparison with Existing Datasets
| Dataset | Venue | Task | View | Frames | Annotations | Classes | Cameras | Location |
| ----------------- | --------------- | ----- | -------------- | ------- | ----------- | ------- | --------- | -------- |
| IDD | WACV 2019 | D, S | Ego | 10K | 111.3K | 9 | — | IN |
| UA-DETRAC | CVIU 2020 | D, M | Fixed | 140K | 1.21M | 4 | 24 | CN |
| TrafficCAM | T-ITS 2025 | S | Fixed CCTV | 4.3K | 84.2K | 9 | NA | IN |
| **BMD-45 (Ours)** | **CVPR-F 2026** | **D** | **Fixed CCTV** | **45K** | **481.9K** | **14** | **3,679** | **IN** |
*D = Detection, S = Segmentation, M = Tracking; IN = India, CN = China*
## Collection & Processing Details
- **Source:** 3,679 *Safe City* surveillance cameras operated by Bengaluru Police
- **Coverage:** both junction and mid-block perspectives across multiple city zones
- **Time period:** February 2025, daytime hours (06:00–18:00 IST)
- **Resolution:** 1920 × 1080 RGB frames
- **Selection:** images with high vehicle density, occlusion, and diverse viewpoints prioritized
- **Filename obfuscation:** Image filenames are anonymized numeric IDs to prevent location inference
## Intended Uses
- Training and benchmarking **vehicle detection models** for CCTV / fixed-camera deployment
- Research in **Intelligent Transportation Systems (ITS)** for Indian and developing-world cities
- Cross-dataset generalization studies for region-specific detection
- Studying detection under **occlusion, heterogeneous traffic, and diverse viewpoints**
## License
- **Dataset:** [CC BY 4.0 International](https://creativecommons.org/licenses/by/4.0/)
- **Pre-trained Models:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
## Acknowledgements
We thank the **Bengaluru Traffic Police (BTP)** and the **Bengaluru Police** for providing access to the *Safe City* camera data from which the image datasets used for this release were derived.
We thank **Capital One** for sponsoring the prizes for the **Urban Vision Hackathon** competition.
We thank **IISc's AI and Robotics Technology Park (ARTPARK)** and the **Centre for Infrastructure, Sustainable Transportation and Urban Planning (CiSTUP)** for funding the annotation and model-training efforts, and the **Kotak IISc AI-ML Centre (KIAC)** for providing the GPU resources required to train the models.
We acknowledge the outreach support provided by the **ACM India Council** and the **IEEE India Council** to encourage chapter volunteers to participate in the hackathon.
Lastly, we thank the **AI Centers of Excellence (AI COE)** initiative of the **Ministry of Education**, their **Apex Committee members**, and the **AIRAWAT Research Foundation**, whose support helped catalyze these efforts.
Created by the **AI for Integrated Mobility (AIM)** group at the **Indian Institute of Science (IISc)**, Bengaluru.
Provider:
iisc-aim
Collected summary
Dataset introduction

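Construction
In intelligent transportation systems research, regionally representative datasets are essential for improving model performance in complex urban environments. Construction of BMD-45 began with raw video streams from 3,679 "Safe City" surveillance cameras in Bengaluru, restricted to daytime hours in February 2025. Spatial and temporal diversity criteria, together with difficulty scoring, were used to select challenging, high-information key frames from long-duration recordings, which keeps the samples diverse and representative. Annotation was semi-automated: an RT-DETR v2-X model fine-tuned on expert-labeled data generated pre-annotations, more than 550 student volunteers corrected and validated them through a gamified web interface, and majority voting produced the consensus bounding-box annotations.
Features
The dataset is defined by its scale, diversity, and fine-grained class taxonomy. As a large-scale benchmark for Indian urban traffic, it contains 45,986 high-resolution images and about 482,000 bounding boxes, far more than TrafficCAM, the other Indian fixed-CCTV dataset in the comparison table, although UA-DETRAC contains more frames and annotations. Its distinctive value lies in covering 3,679 different camera views, which greatly increases scene diversity and reflects the dense, heterogeneous, and unstructured traffic typical of Indian cities. The dataset also defines 14 fine-grained vehicle classes, ranging from two-wheelers and three-wheelers to trucks and buses, including regionally characteristic vehicle types. This fine-grained taxonomy helps models capture the composition of complex traffic.
Usage
The dataset mainly serves computer vision research, in particular vehicle detection from fixed cameras. It can be loaded from the Hugging Face platform, and its standard COCO JSON annotations can be used for model training and evaluation. It is already divided into training and validation splits of 35,792 and 10,194 images, which suits supervised learning. It is intended for training and evaluating vehicle detection models for urban surveillance cameras, and is especially suited to studying algorithm performance under occlusion, varied viewpoints, and heterogeneous traffic. It also supports cross-dataset generalization studies: with the defined class-mapping protocol, models can be evaluated for transfer to other benchmarks such as UA-DETRAC and IDD.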
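Background and challenges
Background overview
In intelligent transportation systems and computer vision, research on vehicle detection from fixed surveillance views has long been limited by data scale and scene diversity. BMD-45 was created in 2026 by the AI for Integrated Mobility group at the Indian Institute of Science to fill the gap in large-scale, fine-grained vehicle detection benchmarks for Indian urban traffic. The dataset was collected from 3,679 operational Safe City surveillance cameras in Bengaluru and contains 45,986 high-resolution images with about 482,000 bounding-box annotations across 14 fine-grained vehicle classes. Its core research question is how to overcome the shortcomings of earlier datasets in scale, camera-view diversity, and coverage of India-specific vehicle categories. It provides key data for research on intelligent traffic perception in developing-world cities and supports the development of region-specific computer vision models.
Current challenges
The domain challenges BMD-45 addresses stem from the highly heterogeneous and unstructured nature of Indian urban traffic: many vehicle categories, heavy occlusion, varying density, and large differences between camera views, all of which noticeably degrade the performance of general-purpose detectors. Building the dataset posed several challenges. First, selecting high-information key frames from a large volume of surveillance footage required a sampling strategy that balances spatial and temporal diversity with difficulty scoring. Second, accurate annotation of 14 fine-grained vehicle classes relied on a hybrid pipeline that combines model-generated pre-annotations with corrections by more than 550 crowdsourced volunteers, with majority voting used to reach consensus. Third, while keeping the data useful, the team also had to limit the risk of leaking private or location information, for example by anonymizing file names.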
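Common scenarios
Typical use case
In intelligent transportation systems research, BMD-45 provides a large-scale training and evaluation benchmark for vehicle detection models. Its typical use is fine-grained vehicle recognition in the dense, heterogeneous traffic of Bengaluru, India, as seen from urban surveillance cameras. With 45,986 high-resolution images from 3,679 camera views and an annotation scheme of 14 fine-grained vehicle classes, it supports robustness testing of models in complex urban environments.
Research problems addressed
BMD-45 addresses three core problems for computer vision in region-specific traffic scenes: it fills gaps in existing datasets regarding scale, view diversity, and fine-grained vehicle classification; it provides a real-world benchmark for object detection in unstructured traffic; and it supports cross-dataset generalization research, encouraging models that adapt across diverse geographic and cultural settings. By providing large-scale annotated data on Indian urban traffic, it promotes region-specific vision models and is a significant resource for academic research on intelligent transportation systems.
Derived related work
Work derived from BMD-45 centers on developing and evaluating regionally adapted vehicle detection models. The research team trained detectors such as RT-DETR on the dataset and systematically evaluated their performance on cross-dataset generalization tasks. Related work further explores the use of fine-grained vehicle classification in traffic behavior analysis and the effectiveness of transferring models to other traffic datasets such as UA-DETRAC and IDD. These studies advance object detection in culturally specific traffic environments and offer a reference framework for localized deployment of intelligent transportation systems worldwide.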
The above content was collected and summarized by 遇见数据集.



