
Chicago Traffic Crash Data 2020-2022

IEEE, indexed 2026-04-17
Download link:
https://ieee-dataport.org/documents/chicago-traffic-crash-data-2020-2022
Description:
Crash datasets are inherently imbalanced with respect to injury severity, which poses a significant challenge for machine learning-based classification models. This study examines the efficacy of Generative Adversarial Networks (GANs), specifically Conditional Tabular GAN (CTGAN), for synthesizing minority-class crash data to address this imbalance. Using traffic crash data from Chicago spanning 2020 to 2022, the research compares CTGAN against three traditional data resampling methods and a cost-sensitive learning approach. The methods are evaluated across injury severity classification scenarios (2-class, 3-class, and 4-class) using five commonly applied injury severity classification models. The evaluation covers both the quality of the synthetic data and the improvement in classification model performance. The key findings are: 1) CTGAN markedly outperforms the other data resampling techniques in synthetic data quality, particularly for the least represented injury severity category; 2) CTGAN substantially improves classification model performance over traditional data resampling methods, but this advantage diminishes as the number of injury categories increases; 3) CTGAN's superior data quality does not translate into better classification performance than cost-sensitive learning, especially in the more complex classification scenarios. Cost-sensitive learning combined with LightGBM achieves the best classification performance across all scenarios. Because cost-sensitive learning also requires significantly fewer computational resources, it is recommended for handling imbalanced injury severity data.
Provided by:
Zhou, Bei
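
The description recommends cost-sensitive learning for imbalanced injury severity data but does not say how misclassification costs are set. As a rough illustration only, the sketch below computes inverse-frequency ("balanced") class weights, a common heuristic that is assumed here and not confirmed as the study's scheme. The function names and the toy label counts are hypothetical and are not taken from the Chicago data.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} using weight = N / (K * n_c).

    N is the number of samples, K the number of distinct classes,
    and n_c the number of samples in class c. Rare classes get
    larger weights, so their errors cost more during training.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}


def per_sample_weights(labels):
    """Expand class weights into one weight per training sample."""
    weights = balanced_class_weights(labels)
    return [weights[y] for y in labels]


# Hypothetical 3-class toy labels, NOT drawn from the Chicago dataset.
toy_labels = ["no_injury"] * 90 + ["minor"] * 8 + ["severe"] * 2
weights = balanced_class_weights(toy_labels)

for severity, weight in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{severity}: {weight:.3f}")
```

Per-sample weights of this kind can typically be passed to a classifier's training routine so that errors on rare severity classes are penalized more heavily. Unlike resampling or GAN-based synthesis, no new rows are generated, which is consistent with the lower computational cost noted in the description.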