Modeling Crash Severity and Collision Types Using Machine Learning

Source: NIAID Data Ecosystem (indexed 2026-03-13)

Download link:

https://zenodo.org/record/6615741

Description:

Traffic safety analysis is a fundamental step in reducing the economic, social, and environmental costs of traffic accidents. The essence of traffic safety lies in understanding the factors that affect crash occurrence, injury severity, and collision type, together with their underlying relationships, in order to predict and prevent future crashes. Past crash injury severity studies have used numerous statistical, econometric, Machine Learning (ML), and Artificial Intelligence (AI) tools to extract the underlying relationship between crash causal factors and the resulting severity or collision type. This study explores Multi-Label Classification (MLC), a tool from the AI domain, for classification problems in the traffic safety setting. MLC has primarily been applied to protein function, semantic scene, and music categorization problems. In the real world, multiple heterogeneous, subjective factors determine the extent of damage or severity of a particular crash. Theoretically, collision type and crash severity can be correlated, so it is intuitive to model them simultaneously. The ability of MLC to assign an entity to more than one label, whether correlated or uncorrelated, gives the approach an edge over single-class (binary) and multi-class classification. The MLC-based classification model was calibrated and tested using historical crash data extracted for the state of Texas. The study area was selected using a link-level, unsupervised clustering approach based on principal component analysis. A similar clustering approach was also tested at the county level to understand spatial behavior and thus the transferability of the MLC approach to other key cities in the state. The performance of the proposed approach was tested, quantified, and compared against the conventional binary and multi-class classification tools used in the traffic safety domain.
Inferences from the preliminary numerical analysis indicate that the proposed multi-label classification approach performs promisingly compared with the traditional classification approaches found in the traffic safety literature.
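
To make the multi-label framing concrete, the following minimal Python sketch shows one way a crash record could carry a severity label and a collision-type label at the same time, and how predictions over such label sets might be scored. It is an illustration only, not the study's method: the label names, the toy records, and the helper functions `to_indicator`, `hamming_loss`, and `subset_accuracy` are all assumptions introduced here, and the dataset's actual schema may differ.

```python
# Hypothetical label vocabulary mixing severity levels and collision types.
LABELS = ["injury", "fatal", "rear_end", "angle", "sideswipe"]

def to_indicator(label_set, labels=LABELS):
    """Encode a set of labels as a 0/1 vector over the label vocabulary."""
    return [1 if name in label_set else 0 for name in labels]

def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that were predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for true_row, pred_row in zip(y_true, y_pred)
                for t, p in zip(true_row, pred_row))
    return wrong / total

def subset_accuracy(y_true, y_pred):
    """Fraction of records whose entire label vector was predicted exactly."""
    exact = sum(true_row == pred_row for true_row, pred_row in zip(y_true, y_pred))
    return exact / len(y_true)

# Toy records (invented): each crash has one severity and one collision type.
true_sets = [{"injury", "rear_end"}, {"fatal", "angle"},
             {"sideswipe"}, {"injury", "angle"}]
# Invented predictions from some hypothetical multi-label model.
pred_sets = [{"injury", "rear_end"}, {"injury", "angle"},
             {"sideswipe"}, {"injury", "rear_end"}]

y_true = [to_indicator(s) for s in true_sets]
y_pred = [to_indicator(s) for s in pred_sets]

print("Hamming loss:", hamming_loss(y_true, y_pred))
print("Subset accuracy:", subset_accuracy(y_true, y_pred))
```

Hamming loss gives partial credit when only one of the two labels is wrong, whereas subset accuracy counts a record as correct only if severity and collision type are both right. Reporting both is one common way to compare a joint multi-label model against separate binary or multi-class models.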

Created:

2022-06-06
