Code underlying the publication: Comprehensive Training and Evaluation on Deep Reinforcement Learning for Automated Driving in Various Simulated Driving Maneuvers
Source: 4TU.ResearchData | Updated: 2025-02-20 | Indexed: 2026-04-23
Download link:
https://data.4tu.nl/datasets/26e8f131-53f8-44b9-8ecf-249bfedb0154/1
Description:
This is the code and data related to the publication:

Y. Dong, T. Datema, V. Wassenaar, J. Van de Weg, C. T. Kopar and H. Suleman, "Comprehensive Training and Evaluation on Deep Reinforcement Learning for Automated Driving in Various Simulated Driving Maneuvers," 2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC), Bilbao, Spain, 2023, pp. 6165-6170, doi: 10.1109/ITSC57777.2023.10422159. Keywords: Training; Deep learning; Roads; Reinforcement learning; Automobiles; Task analysis; Optimization.

The implementation is based on Python, Stable-Baselines3 (https://stable-baselines3.readthedocs.io/en/master/) and the highway-env simulation environment (https://github.com/Farama-Foundation/HighwayEnv).

Developing and testing automated driving models in the real world can be challenging and even dangerous; simulation helps with this, especially for challenging manoeuvres. Deep reinforcement learning (DRL) can tackle complex decision-making and control tasks by learning through interaction with the environment, which makes it suitable for developing automated driving, yet it has not been explored in detail for this purpose. This study implements, evaluates, and compares two DRL algorithms, Deep Q-networks (DQN) and Trust Region Policy Optimization (TRPO), for training automated driving on the highway-env simulation platform. Customized reward functions were developed, and the implemented algorithms were evaluated in terms of on-lane accuracy (how well the car stays on the road within its lane), efficiency (how fast the car drives), safety (how likely the car is to crash into obstacles), and comfort (how much the car jerks, e.g., by suddenly accelerating or braking). Results show that the TRPO-based models with modified reward functions delivered the best performance in most cases.

Furthermore, to train a uniform driving model that can handle various driving manoeuvres rather than only specific ones, this study extended highway-env with an additional customized training environment, ComplexRoads, which integrates various driving manoeuvres and multiple road scenarios. Models trained on ComplexRoads adapt well to other driving manoeuvres, with promising overall performance. Lastly, several functionalities were added to highway-env to implement this work. The code is openly available on GitHub at https://github.com/alaineman/drlcarsim-paper.
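The four evaluation criteria named in the description (on-lane accuracy, efficiency, safety, comfort) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Step` record, the `evaluate_episode` function, and the specific formulas (fraction of steps on lane, mean speed, a crash flag, and mean absolute jerk as the change in acceleration per time step) are assumptions made only for illustration; the paper and repository define the actual metrics.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Step:
    """One logged simulation step (hypothetical record, not from the repository)."""
    speed: float         # vehicle speed in m/s
    acceleration: float  # longitudinal acceleration in m/s^2
    on_lane: bool        # True if the vehicle is within a lane on the road
    crashed: bool        # True if a collision occurred at this step


def evaluate_episode(steps: List[Step], dt: float = 1.0) -> Dict[str, float]:
    """Summarise one episode along the four criteria (assumed definitions)."""
    if not steps:
        raise ValueError("episode must contain at least one step")
    n = len(steps)
    # On-lane accuracy: fraction of steps spent inside a lane.
    on_lane_rate = sum(s.on_lane for s in steps) / n
    # Efficiency: mean speed over the episode.
    mean_speed = sum(s.speed for s in steps) / n
    # Safety: whether the episode contained any collision.
    crashed = any(s.crashed for s in steps)
    # Comfort: mean absolute jerk, approximated by finite differences.
    jerks = [
        abs(b.acceleration - a.acceleration) / dt
        for a, b in zip(steps, steps[1:])
    ]
    mean_abs_jerk = sum(jerks) / len(jerks) if jerks else 0.0
    return {
        "on_lane_rate": on_lane_rate,
        "mean_speed": mean_speed,
        "crashed": float(crashed),
        "mean_abs_jerk": mean_abs_jerk,
    }


# Example episode with made-up values.
episode = [
    Step(speed=20.0, acceleration=0.0, on_lane=True, crashed=False),
    Step(speed=22.0, acceleration=2.0, on_lane=True, crashed=False),
    Step(speed=22.0, acceleration=0.0, on_lane=False, crashed=False),
    Step(speed=20.0, acceleration=-2.0, on_lane=True, crashed=True),
]
summary = evaluate_episode(episode, dt=1.0)
print(summary)
```

Averaging such per-episode summaries over many evaluation runs would give a crash rate and mean scores that can be compared between trained models, under the assumed definitions above.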
Provided by:
Tolga Kopar, Cahit; Suleman, Harim; Van de Weg, Joris; Datema, Tobias; Wassenaar, Vincent
Created:
2025-02-20



