A Novel Spark Job Scheduling Algorithm Oriented Deadline-Cost Balance
Source: China Scientific Data (中国科学数据) · Updated: 2026-03-16 · Indexed: 2026-04-25
Download link:
https://www.sciengine.com/AA/doi/10.19678/j.issn.1000-3428.0069968
Description:
Big data computation frameworks such as Apache Spark are increasingly important for large-scale data analysis, but data-intensive jobs are difficult to handle with local computing resources alone. One feasible solution is to rent resources from public cloud service providers and deploy Spark clusters entirely in the cloud; this, however, incurs high deployment costs. To reduce costs, a growing number of users combine local and cloud resources to build hybrid cloud computing clusters. In Spark clusters deployed on hybrid clouds, scheduling jobs so that multiple service-level agreement requirements are met simultaneously (such as minimizing cost while ensuring job deadlines) is challenging. Existing research mainly focuses on either reducing cluster usage costs or improving job deadline satisfaction rates, without considering the balance between the two goals. This paper proposes a Deadline-Cost Aware Ant Colony Optimization (DC-ACO) algorithm to solve the job scheduling problem in hybrid clouds. DC-ACO optimizes cost across the differently priced Virtual Machine (VM) instances in a hybrid cloud cluster while maximizing the percentage of job deadlines met. In extensive simulation experiments, DC-ACO is compared with baseline methods. The results show that the proposed algorithm scales robustly, achieving an approximately 20% increase in the percentage of job deadlines met, together with a 10% reduction in VM usage costs for hybrid clusters.
Created:
2026-03-16



