
IndoTraffic: Indonesian Mixed Traffic Vehicle Detection Dataset

Source: Zenodo. Updated 2025-12-15; indexed 2026-05-26.
Download link:
https://zenodo.org/doi/10.5281/zenodo.17722304
Resource description:
IndoTraffic is a comprehensive computer vision dataset designed for vehicle detection and traffic monitoring in Indonesian mixed traffic conditions. The dataset addresses critical gaps in ground-based traffic surveillance datasets for Southeast Asian contexts, where motorcycle-dominated traffic patterns and diverse environmental conditions present unique challenges for automated detection systems.

Dataset Overview
- Total Images: 4,126 high-resolution images (1920×1080 pixels)
- Total Annotations: 50,710 bounding box annotations
- Geographic Coverage: 3 major Indonesian cities (Denpasar, Yogyakarta, Greater Jakarta)
- Vehicle Classes: 6 classes (Motorcycles, Cars, Trucks, Buses, Pedestrians, Unmotorized Vehicles)
- Temporal Coverage: 24-hour sampling across 5 time periods (morning, midday, evening, night, dusk/dawn)
- Data Splits: Pre-defined train (70%), validation (20%), test (10%) splits with stratification
- Formats: YOLO format (txt) and COCO format (JSON) annotations

Key Features
1. Multi-City Coverage: Three Indonesian urban centers representing diverse traffic characteristics (tourism hub, education city, megacity)
2. Rich Metadata: Comprehensive spatio-temporal metadata including timestamp, city, lighting conditions, and time period classifications
3. Extreme Class Imbalance: Realistic motorcycle-dominated distribution (62.6% motorcycles) reflecting actual Southeast Asian traffic composition
4. 24-Hour Temporal Sampling: Complete diurnal cycle coverage including challenging dusk/dawn transition periods
5. Indonesian Context: Vehicle classes aligned with Indonesian Road Capacity Manual (PKJI 2023) standards

Baseline Performance
YOLOv8m baseline model achieves:
- Overall mAP@0.5: 76.0% on test set
- Per-class performance: Ranges from 97.6% (buses) to 66.5% (motorcycles)
- Temporal variation: Cohen's d = 0.77 (medium-large effect) between daytime and nighttime
- Novel finding: Motorcycle detection paradox - highest class frequency (62.6%) does not correlate with detection performance (66.5% AP)

Research Applications
This dataset enables research in:
- Vehicle detection algorithms for mixed traffic
- Small object detection in dense scenes
- Temporal performance analysis
- Geographic domain adaptation
- Class imbalance mitigation strategies
- Real-world traffic monitoring systems

Data Collection
Data collected from Automated Traffic Counting System (ATCS) cameras at strategic intersections between May-October 2024. Manual annotation performed using Roboflow Annotate with three-stage quality control verification. All annotations follow standardized protocols with inter-annotator agreement checks.

License and Citation
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Citation: If you use this dataset in your research, please cite:
Suartawan, E., et al. (2024). IndoTraffic: A Large-Scale Dataset for Vehicle Detection in Indonesian Mixed Traffic Conditions. Scientific Data (under review).
DOI: [to be assigned]
Dataset DOI: 10.5281/zenodo.[will be assigned upon upload]

Quality Assurance
- Three-stage verification protocol
- Inter-annotator agreement validation
- Stratified splitting with fixed random seed (42) for reproducibility
- Comprehensive documentation and usage examples
- Known limitations transparently documented

DATASET CHARACTERISTICS:

Geographic Diversity:
- Denpasar (Bali): Tourism-focused city, moderate traffic density
- Yogyakarta (Central Java): Education hub, student-heavy traffic
- Jabodetabek (Greater Jakarta): Megacity, extreme density

Temporal Coverage:
- Morning Peak: 06:00-10:00 (41% of dataset)
- Midday: 10:00-14:00 (20%)
- Evening Peak: 14:00-18:00 (16%)
- Night: 18:00-06:00 (11%)
- Dusk/Dawn Transition: 05:00-07:00, 17:00-19:00 (12%)

Class Distribution:
- Motorcycles: 31,736 (62.6%)
- Cars: 14,895 (29.4%)
- Trucks: 2,301 (4.5%)
- Buses: 1,471 (2.9%)
- Pedestrians: 210 (0.4%)
- Unmotorized: 97 (0.2%)

Known Limitations:
1. Class imbalance (stated as a 632:1 ratio between motorcycles and unmotorized vehicles; the counts listed above, 31,736 and 97, work out to roughly 327:1)
2. Minority classes have limited samples (pedestrians n=210, unmotorized n=97)
3. Static camera perspective (no moving vehicle footage)
4. Single camera angle per location
5. Geographic scope limited to Java and Bali islands
6. Weather metadata not validated

Technical Specifications:
- Image Resolution: 1920×1080 pixels
- Image Format: JPEG
- Annotation Format: YOLO (normalized coordinates) and COCO (absolute coordinates)
- Train/Val/Test Split: 70/20/10 (stratified by city and class distribution)
- Random Seed: 42 (for reproducibility)

Baseline Model Details:
- Architecture: YOLOv8m (medium)
- Training: 100 epochs, batch size 16, image size 640×640
- Optimizer: AdamW with cosine annealing
- Data Augmentation: Mosaic, mixup, HSV augmentation, random flip
- Hardware: NVIDIA GPU (training time ~8 hours)

USAGE RECOMMENDATIONS:

For researchers working with this dataset:
1. Use provided train/val/test splits for fair comparison
2. Report per-class metrics separately (overall mAP can be misleading)
3. Consider temporal metadata for domain-specific analysis
4. Apply class balancing techniques for minority classes
5. Validate motorcycle detection separately due to paradox finding
6. Use metadata for stratified sampling in experiments

For production deployment:
1. Apply calibration factors for volume estimation (see paper)
2. Use time-adaptive confidence thresholds (see Supplementary Information)
3. Monitor performance during dusk/dawn periods (known degradation)
4. Implement quality control triggers for low-confidence frames
5. Plan for periodic model updates (seasonal variations)

QUALITY CONTROL MEASURES:

1. Three-stage annotation verification:
   - Initial annotation by trained annotator
   - Random sample review (10% of each batch)
   - Final consistency check across cities
2. Inter-annotator agreement:
   - Tested on 100-image subset
   - IoU threshold: 0.5
   - Agreement rate: >95%
3. Data validation:
   - Duplicate image detection (none found)
   - Annotation format validation (all passed)
   - Bounding box sanity checks (no out-of-bounds)
   - Metadata consistency verification (all validated)

ETHICAL CONSIDERATIONS:
- All images collected from public spaces with appropriate permissions
- No personally identifiable information (license plates blurred if visible)
- Camera locations chosen to minimize privacy concerns
- Dataset use agreement requires ethical research practices
- Commercial use permitted but must respect privacy standards

Contact
For questions, issues, or collaboration inquiries:
- Author: Eka Suartawan
- Institution: Bali Land Transportation Polytechnic, Indonesia
- Email: putu.eka@poltradabali.ac.id

Acknowledgments
This work was conducted independently without institutional or external funding. We thank the traffic management authorities in Denpasar, Yogyakarta, and Jakarta for ATCS camera access.
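The recommendation to apply class balancing techniques can be illustrated with the annotation counts given under Class Distribution. The sketch below is one common option (inverse-frequency weighting), not the authors' method. The class-name keys and both function names are hypothetical, since the description does not state the label names used in the annotation files.

```python
# Annotation counts copied from the Class Distribution list in the description.
# The dictionary keys are assumed names, not the dataset's actual label strings.
CLASS_COUNTS = {
    "motorcycle": 31_736,
    "car": 14_895,
    "truck": 2_301,
    "bus": 1_471,
    "pedestrian": 210,
    "unmotorized": 97,
}


def class_shares(counts):
    """Return each class's fraction of all annotations."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def inverse_frequency_weights(counts):
    """Return weights proportional to 1/count, scaled so they average to 1.0."""
    raw = {name: 1.0 / n for name, n in counts.items()}
    scale = len(raw) / sum(raw.values())
    return {name: w * scale for name, w in raw.items()}


if __name__ == "__main__":
    shares = class_shares(CLASS_COUNTS)
    weights = inverse_frequency_weights(CLASS_COUNTS)
    for name in CLASS_COUNTS:
        print(f"{name:12s} share={shares[name]:6.1%}  weight={weights[name]:.4f}")
```

Weights like these could feed a weighted loss or a weighted sampler. With only 97 and 210 samples in the two smallest classes, very large weights may amplify label noise, so capping or smoothing them is worth considering.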
Keywords: vehicle detection, traffic monitoring, object detection, computer vision dataset, Indonesian traffic, mixed traffic, motorcycle detection, YOLO, deep learning, intelligent transportation systems

Version: 1.0
Release Date: November 2025
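The Technical Specifications state that the YOLO annotations use normalized coordinates and the COCO annotations use absolute coordinates. As a rough sketch of how the two relate for a 1920×1080 frame, the code below assumes the conventional YOLO label layout (class_id x_center y_center width height, all box values in the range 0 to 1) and the conventional COCO box layout ([x_min, y_min, width, height] in pixels). The function names and the sample label line are made up for illustration, and the dataset's own files should be checked before relying on this.

```python
IMG_W, IMG_H = 1920, 1080  # image resolution stated in the dataset description


def parse_yolo_line(line):
    """Parse one label line of the assumed form 'class_id xc yc w h'."""
    parts = line.split()
    class_id = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:5])
    return class_id, (xc, yc, w, h)


def yolo_to_coco_bbox(xc, yc, w, h, img_w=IMG_W, img_h=IMG_H):
    """Convert a normalized center-format box to [x_min, y_min, width, height] in pixels."""
    box_w = w * img_w
    box_h = h * img_h
    x_min = xc * img_w - box_w / 2
    y_min = yc * img_h - box_h / 2
    return [x_min, y_min, box_w, box_h]


def coco_to_yolo_bbox(x_min, y_min, box_w, box_h, img_w=IMG_W, img_h=IMG_H):
    """Inverse conversion: pixel [x_min, y_min, width, height] to normalized center format."""
    xc = (x_min + box_w / 2) / img_w
    yc = (y_min + box_h / 2) / img_h
    return [xc, yc, box_w / img_w, box_h / img_h]


if __name__ == "__main__":
    # Hypothetical label line, not taken from the dataset.
    class_id, box = parse_yolo_line("0 0.5 0.5 0.25 0.5")
    print(class_id, yolo_to_coco_bbox(*box))
```

A round trip through both functions is a quick sanity check when comparing the two annotation files for the same image.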
Provided by: Zenodo
Created: 2025-12-15