AoI-Aware Resource Allocation for Platoon-Based C-V2X Networks via Multi-Agent Multi-Task Reinforcement Learning

Source: ieee-dataport.org (indexed 2025-01-21)
Download link:
https://ieee-dataport.org/documents/aoi-aware-resource-allocation-platoon-based-c-v2x-networks-multi-agent-multi-task
Description:
Simulation code for the paper "AoI-Aware Resource Allocation for Platoon-Based C-V2X Networks via Multi-Agent Multi-Task Reinforcement Learning". The overall architecture of the proposed MARL framework is shown in the figure.

Modified MADDPG: Unlike legacy MADDPG, this algorithm trains two kinds of critic. A global critic estimates the global expected reward and steers the agents toward cooperative behavior. In addition, each agent has its own exclusive local critic that estimates that agent's individual reward.

Modified MADDPG with task decomposition: This algorithm is similar to Modified MADDPG. However, the holistic local reward function of each agent is further decomposed into multiple sub-reward functions, one per task the agent has to accomplish, and the task-wise value functions are learned separately.

For both algorithms, the global critic is built upon the twin delayed deep deterministic policy gradient (TD3).
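The critic structure described above can be illustrated with a minimal sketch. This is not the published simulation code: the function names, the discount factor, and in particular the way the global and local critic values are summed into one actor objective are assumptions made for illustration. The only parts taken from the description are that the global critic follows TD3 (represented here by its clipped double-Q target) and that each task has its own separately learned value function.

```python
from typing import List, Sequence


def td3_target(reward: float, q1_next: float, q2_next: float,
               gamma: float = 0.99, done: bool = False) -> float:
    """Clipped double-Q target as used in TD3.

    The smaller of the two twin target-critic estimates is bootstrapped,
    which counteracts value overestimation.
    """
    bootstrap = 0.0 if done else gamma * min(q1_next, q2_next)
    return reward + bootstrap


def taskwise_targets(sub_rewards: Sequence[float],
                     next_task_values: Sequence[float],
                     gamma: float = 0.99, done: bool = False) -> List[float]:
    """One TD target per task, so each task-wise critic is trained separately.

    sub_rewards[k] is the sub-reward of task k; next_task_values[k] is the
    k-th local critic's estimate at the next state (hypothetical inputs).
    """
    if len(sub_rewards) != len(next_task_values):
        raise ValueError("need exactly one next-state value per sub-reward")
    return [r + (0.0 if done else gamma * v)
            for r, v in zip(sub_rewards, next_task_values)]


def actor_objective(global_q: float, local_task_qs: Sequence[float],
                    local_weight: float = 1.0) -> float:
    """Scalar an agent's actor would maximise in this sketch.

    ASSUMPTION: global and local critic values are combined by a weighted
    sum; the description does not state how they are combined.
    """
    return global_q + local_weight * sum(local_task_qs)
```

Passing a single-element list to `taskwise_targets` and `actor_objective` corresponds to the variant without task decomposition (one holistic local critic per agent), while a longer list corresponds to the task-decomposed variant.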

Provider:
IEEE Dataport