Blocked-traffic-sign detection algorithm based on improved RT-DETR

China Scientific Data (中国科学数据) · Updated 2026-01-21 · Indexed 2026-04-25

Download link:

https://www.sciengine.com/AA/doi/10.13374/j.issn2095-9389.2025.06.11.001

Resource description:

Accurate traffic-sign detection is a foundational capability for intelligent transportation systems and autonomous driving; however, it remains a formidable challenge in real-world environments characterized by small object scales, severe occlusions, highly variable lighting, and complex backgrounds. Because of inherent limitations in feature extraction, scale sensitivity, and robustness, traditional convolutional neural network (CNN)-based detectors often struggle to perform reliably when traffic signs appear at long distances or are partially hidden by vehicles, foliage, or roadside infrastructure. To overcome these limitations, this paper presents an enhanced RT-DETR-based approach tailored for occluded-traffic-sign detection under resource-constrained conditions. First, recognizing the scarcity of publicly available data that accurately reflect occlusion scenarios, we curated the traffic sign dataset under occlusion conditions (TSDOC), which comprises 4698 high-resolution images annotated across eight common traffic sign categories (including prohibitory, warning, and indicative signs), with 3572 images allocated for training and 1126 for testing. TSDOC systematically simulates real driving environments by incorporating diverse occlusion types, such as partial masking by other vehicles, foreign object attachment, dynamic shadows, and varying degrees of weather-induced visibility reduction. This enables a rigorous evaluation of detection methods under complex, safety-critical scenarios that closely mirror roadside conditions. Second, to improve the representation of small and occluded objects without incurring excessive computational overhead, we redesigned the RT-DETR backbone by replacing the standard ResNet-18 BasicBlock with a novel composite dilated residual block (CDRB).
Each CDRB integrates a dilated reparameterization block (DRB) into an inverted residual mobile block (iRMB). This design combines two ideas: multi-scale dilated convolutions, which capture the long-range pixel dependencies needed to reconstruct partially visible sign features, and structural reparameterization, which streamlines the inference graph to reduce latency. Consequently, the modified backbone achieves a 26.0% reduction in parameter count and a 12.5% decrease in floating-point operations (GFLOPs) compared to the baseline RT-DETR-R18, while maintaining or improving feature discrimination for occluded targets. Third, for faster convergence and enhanced localization precision, particularly for small and partially occluded signs, we introduce the dynamic scaled IoU loss (DS-IoU), a novel joint loss function that integrates Inner-IoU’s auxiliary bounding box strategy with a dynamically adjustable scaling factor, Ratio, and incorporates the minimal point distance metric from MPDIoU. This adaptive loss formulation emphasizes interior region overlap and geometric consistency during training, replacing the conventional GIoU loss and enabling the model to focus on the most informative spatial regions under challenging conditions. Comprehensive experiments demonstrate the effectiveness of the proposed approach. On the TSDOC, TT100K, and CCTSDB2021 benchmarks, the proposed model achieved a mean average precision (mAP) of 94.2%, 92.8%, and 91.7%, respectively (gains of 4.7%, 3.1%, and 2.4% over RT-DETR). Inference speed reached 112.8 frames per second, an 18.5% improvement over RT-DETR. Ablation studies show that replacing the backbone with CDRB yields a 2.8% mAP increase, while DS-IoU further boosts recall under occlusion by 3.7%. This lightweight architecture and optimized loss function deliver higher detection accuracy and efficiency in occluded-traffic-sign scenarios, making the model well suited for deployment in resource-constrained embedded systems.
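
The abstract describes DS-IoU only in words, so the following is a minimal Python sketch of how such a loss might be assembled, not the authors' formulation. It assumes the commonly cited forms of the two ingredients: Inner-IoU computes IoU on auxiliary boxes obtained by scaling each box about its center by `ratio`, and MPDIoU penalizes the squared distances between matching top-left and bottom-right corners, normalized by the squared image diagonal. The function name `ds_iou_loss`, the way the two terms are summed, and the use of a fixed `ratio` argument (rather than the dynamic schedule the abstract mentions but does not specify) are all illustrative assumptions.

```python
def ds_iou_loss(pred, target, img_w, img_h, ratio=0.8):
    """Illustrative Inner-IoU + MPDIoU style loss for one box pair.

    Boxes are (x1, y1, x2, y2) in pixels. `ratio` is passed in directly;
    the paper adjusts it dynamically, but that rule is not given in the
    abstract, so it is left to the caller here (assumption).
    """

    def scale_about_center(box):
        # Auxiliary box: same center, width and height multiplied by ratio.
        cx = (box[0] + box[2]) / 2.0
        cy = (box[1] + box[3]) / 2.0
        half_w = (box[2] - box[0]) * ratio / 2.0
        half_h = (box[3] - box[1]) * ratio / 2.0
        return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

    def area(box):
        return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])

    p = scale_about_center(pred)
    t = scale_about_center(target)

    # IoU of the auxiliary (inner) boxes.
    inter_w = max(0.0, min(p[2], t[2]) - max(p[0], t[0]))
    inter_h = max(0.0, min(p[3], t[3]) - max(p[1], t[1]))
    inter = inter_w * inter_h
    union = area(p) + area(t) - inter
    inner_iou = inter / union if union > 0 else 0.0

    # MPDIoU-style corner penalties, computed on the original boxes.
    diag_sq = float(img_w) ** 2 + float(img_h) ** 2
    d_top_left = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    d_bottom_right = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2

    return 1.0 - inner_iou + d_top_left / diag_sq + d_bottom_right / diag_sq
```

In this sketch, a `ratio` below 1 shrinks both auxiliary boxes, so a partially overlapping pair scores a lower inner IoU than it would with plain IoU and is penalized more strongly, while the corner-distance terms still give a nonzero signal when the boxes do not overlap at all.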

Created:

2026-01-21
