Deep Reinforcement Learning Algorithms for Heterogeneous Multiple Knapsack Problems
收藏中国科学数据2026-04-13 更新2026-04-25 收录
下载链接:
https://www.sciengine.com/AA/doi/10.19678/j.issn.1000-3428.0070166
下载链接
链接失效反馈官方服务:
资源简介:
By focusing on the traditional multi-Knapsack Problem (KP) in typical logistics system operations, this study abstracts a Heterogeneous Multiple Knapsack Problem (HMKP) and formulates an improved Deep Deterministic Policy Gradient (DDPG) algorithm to solve it. The DDPG algorithm tends to fall into a local optimum when solving the 0-1 KP. To address this issue, a Dynamic Randomization Mechanism (DRM) and Dynamic Penalty Mechanism (DPM) are adopted and embedded with an improved Transformer module to optimize the algorithm. Then, a Dynamic DDPG (TDP-DDPG) algorithm is proposed based on the improved Transformer module. First, a tabu list is added to prevent repeated searches. The TDP-DDPG algorithm demonstrates efficient search capability across several experimental algorithms, finding the ideal optimum in 39 classical algorithms in test sets 1 and 2 from low to high dimensionality, as well as in higher dimensionality test set 3 and in three out of six algorithms in large-scale test set 4. Experiments show that the TDP-DDPG algorithm has stronger optimization seeking ability after incorporating the improved strategy. Next, the BPD-DDPG algorithm is designed based on the TDP-DDPG algorithm to solve the HMKP with higher complexity and is analyzed and evaluated in high-dimensional arithmetic by combining several classical 0-1 KP examples. The results show that the BPD-DDPG algorithm is more accurate than Gurobi in three low-scale cases; however, the solution time is longer. The BPD-DDPG algorithm can efficiently solve high-dimension, large-scale HMKP at a low computational cost within an acceptable time.
创建时间:
2026-04-13



