Freeway Traffic Incident Detection Dataset
Source: DataCite Commons | Updated: 2026-04-12 | Indexed: 2026-05-03
Download link:
https://ieee-dataport.org/documents/freeway-traffic-incident-detection-dataset
Description:
This dataset is a large-scale, fully labeled benchmark for freeway traffic incident detection and imbalanced learning research, derived from the TraffiDent dataset [1]. It combines raw California PeMS traffic time series (occupancy rate, speed, and volume) with incident logs from 16,972 loop-detector sensors across major California freeways, with the aim of linking traffic dynamics to real-world anomalies.

To support data integrity and experimental reproducibility, the dataset was built with a four-step pipeline: (1) Spatial matching: aligning each incident record to adjacent sensors on the same freeway direction within a 5-mile radius; (2) Time-window extraction: isolating, independently for each traffic signal modality, a 25-step time series at 5-minute intervals spanning ±1 hour around the incident; (3) Historical baseline sampling: collecting normal-condition traffic at the matched sensor, on the corresponding weekday and at the same time of day, as negative references; and (4) Percentile-based feature engineering: computing the statistical position of each sample within its historical distribution to generate percentile rank features.

The resulting dataset comprises approximately 14 million rows across three traffic modalities and covers 8 real-world incident categories (e.g., hazard, no-injury collision, unkninj, fire).

The data is provided in two formats: ML-optimized Parquet files (incident_data.parquet, non_incident_data.parquet), which include pre-extracted temporal attributes (Hour, Month, Day) for immediate model ingestion; and raw CSV files (A_final_output.csv, A_final_common_table.csv), which preserve comprehensive metadata and original text descriptions for exploratory analysis.

The dataset is currently used in our companion study (under review for IEEE OJ-ITS) to evaluate a semi-supervised anomaly detection framework integrating Variational Autoencoders and Multi-Head Attention. To establish baselines for the research community, 11 methods spanning deep learning (e.g., Transformer, DLinear, LSTM), traditional machine learning, and anomaly detection were benchmarked on the ML-ready Parquet version. Evaluations under three configurable imbalance ratios (natural 2:1, 11:1, and 100:1) indicate the dataset's suitability for Intelligent Transportation Systems (ITS) research.

References: [1] X. Gou et al., "TraffiDent: A Dataset for Understanding the Interplay Between Traffic Dynamics and Incidents," in Advances in Neural Information Processing Systems (NeurIPS), 2025. Available: https://www.kaggle.com/datasets/gpxlcj/xtraffic
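The time-window extraction and percentile-based feature steps in the description can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy speed series, the baseline values, and the exact percentile definition (share of historical readings less than or equal to the sample) are all assumptions made for illustration. Only the 25-step window at 5-minute intervals comes from the description.

```python
from bisect import bisect_right

STEP_MINUTES = 5        # sampling interval stated in the description
HALF_WINDOW_STEPS = 12  # 12 steps * 5 min = 1 hour on each side of the incident


def extract_window(series, incident_index, half=HALF_WINDOW_STEPS):
    """Return the 2*half + 1 readings centred on incident_index.

    Returns None when the window would run past either end of the series.
    """
    start = incident_index - half
    stop = incident_index + half + 1
    if start < 0 or stop > len(series):
        return None
    return series[start:stop]


def percentile_rank(value, history):
    """Percentage (0 to 100) of historical readings that are <= value.

    Assumed definition; the dataset's exact formula is not given in the source.
    """
    if not history:
        raise ValueError("history must not be empty")
    ordered = sorted(history)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Toy speed series (mph): free flow, a sharp drop, then recovery.
speeds = [65.0] * 18 + [40.0, 25.0, 20.0, 30.0] + [60.0] * 18
window = extract_window(speeds, incident_index=20)

# Hypothetical normal-condition readings for the same sensor, weekday and time.
baseline = [58.0, 61.0, 63.0, 64.0, 65.0, 66.0, 67.0, 68.0]
rank_at_incident = percentile_rank(window[HALF_WINDOW_STEPS], baseline)
```

In this toy case the reading at the incident step falls below every baseline value, so its percentile rank is 0, which is the kind of signal a percentile feature is meant to expose.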
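The configurable imbalance ratios mentioned in the description (2:1, 11:1, 100:1) could be reproduced by subsampling one class. The sketch below is an assumption-laden illustration, not the benchmark's procedure: the description does not state which class is the majority or how the sampling was done, so the helper `subsample_to_ratio` (a hypothetical name) simply keeps all negatives and draws positives at random until the negative-to-positive ratio is roughly the requested value.

```python
import random


def subsample_to_ratio(negatives, positives, ratio, seed=0):
    """Keep all negatives and draw positives so that
    len(negatives) / len(kept_positives) is approximately `ratio`.

    If the requested ratio is already met or exceeded by the full positive
    set being too small, both lists are returned unchanged.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    target = max(1, round(len(negatives) / ratio))
    if target >= len(positives):
        return list(negatives), list(positives)
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    return list(negatives), rng.sample(positives, target)


# Toy stand-ins for rows of the two classes (natural ratio 2:1 here).
normal_rows = list(range(2000))
incident_rows = list(range(1000))

neg_100, pos_100 = subsample_to_ratio(normal_rows, incident_rows, ratio=100)
neg_11, pos_11 = subsample_to_ratio(normal_rows, incident_rows, ratio=11)
neg_2, pos_2 = subsample_to_ratio(normal_rows, incident_rows, ratio=2)
```

With pandas and a Parquet engine installed, the rows could instead come from `pandas.read_parquet` applied to `non_incident_data.parquet` and `incident_data.parquet`; the column layout of those files is not described in the source, so it is left out here.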
Provider:
IEEE DataPort
Created:
2026-04-12



