Mekha: an in-memory, coordinated checkpoint-restart mechanism for a virtual cluster

Name: Mekha: an in-memory, coordinated checkpoint-restart mechanism for a virtual cluster
Creator: Thammasat University
Published: 2024-01-02 06:46:20
License: No description available

DataCite Commons | Updated 2024-01-02 | Indexed 2025-04-16

Download link:

http://doi.nrct.go.th/?page=resolve_doi&resolve_doi=10.14457/TU.the.2023.16

Resource description:

Cloud computing lets users easily create a cluster of virtual machines (VMs), known as a virtual cluster (VC), without investing in expensive hardware. Users can freely choose the guest OS and run distributed applications in the VC without any modifications, which makes running a VC in a cloud environment attractive to high-performance computing users. To handle computing workloads from many users, however, cloud providers need a large number of servers in their data centers to provide sufficient computing power. The VMs in a VC are managed by a cloud management system and run on these servers.

A major issue when running services or applications on a large-scale server cluster is crash failure. When a crash failure occurs, the affected VMs fail or stop executing, and the user must restart them and rerun the application from the beginning. This significantly increases the time and budget required to execute the application. A classic solution is a checkpoint-restart mechanism, which periodically saves the state of the VMs in a VC to persistent storage and, when a crash failure occurs, recovers them from the saved state to resume execution.

Many checkpoint-restart mechanisms implemented at the hypervisor level have been proposed to increase the recoverability of the VC. Their main advantage is that they are highly transparent: the guest OS and applications do not need to be modified to support them. Implementing these mechanisms is challenging, however, for several reasons. First, ensuring the correctness of the created checkpoint is difficult. Second, existing mechanisms suffer from high checkpoint overhead or high checkpoint latency. Third, the immediate time shift when restarting the VC may corrupt the execution of the application. Fourth, suspending and resuming VMs at different times may change the application's behavior. Fifth, saving the state of all VMs to persistent storage can cause I/O contention. Finally, an imbalance in the checkpoint performance of some VMs may reduce overall checkpoint performance.

We introduce Mekha, an in-memory, coordinated checkpoint-restart mechanism designed for a VC to address the challenges that arise when running VCs in a cloud environment. Mekha ensures that every checkpoint it creates can recover the VC accurately. It uses main memory as temporary storage for checkpoints and a new Memory-bound Timed-multiplex Data transfer (MTD) algorithm to minimize checkpoint overhead and downtime. Mekha also implements time virtualization to isolate the time of the VMs from the physical host's time, which prevents the immediate time-shift problem. To reduce I/O contention, Mekha uses a scheduling algorithm to manage the saving of VM state to checkpoint storage, and it employs multi-level checkpoint storage to reduce checkpoint latency. Finally, Mekha proposes global ending conditions that make the MTD mechanism finish simultaneously in all VMs, mitigating the impact of checkpoint performance imbalances.

We developed a prototype VC management system to evaluate Mekha's performance. The system consists of two main components: the checkpoint coordinator and the checkpoint agent. The checkpoint and restart protocols were integrated into the checkpoint coordinator, while the QEMU hypervisor was used to implement the MTD algorithm and related functions. We conducted extensive experiments to evaluate Mekha and compared it with two existing checkpoint mechanisms. We ran several NAS Parallel Benchmark programs on VCs of two different sizes and tested Mekha and the other mechanisms with different types of checkpoint storage.

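The combination of multi-level checkpoint storage and scheduled saving described above can be illustrated with a minimal sketch. It is not Mekha's implementation: the abstract does not specify the scheduling algorithm, so a simple cap on concurrent writers stands in for it, and the names `StagedCheckpointStore`, `slow_write`, and `max_concurrent_flushes` are hypothetical. Checkpoints land in memory first, and background threads drain them to the slow tier a few at a time.

```python
import threading


class StagedCheckpointStore:
    """Two-tier checkpoint store (illustrative assumption, not Mekha's code).

    save() returns as soon as the state is held in memory; a background
    thread later writes it to slow storage. A semaphore caps how many
    writes hit the slow tier at once, standing in for a real scheduler.
    """

    def __init__(self, slow_write, max_concurrent_flushes=1):
        self._fast = {}                    # in-memory tier: vm_id -> bytes
        self._slow_write = slow_write      # hypothetical callable(vm_id, data)
        self._slots = threading.Semaphore(max_concurrent_flushes)
        self._lock = threading.Lock()
        self._threads = []

    def save(self, vm_id, data):
        """Store the checkpoint in memory and schedule a background flush."""
        worker = threading.Thread(target=self._flush, args=(vm_id,))
        with self._lock:
            self._fast[vm_id] = data
            self._threads.append(worker)
        worker.start()

    def _flush(self, vm_id):
        # Wait for a free slot so the slow tier is never oversubscribed.
        with self._slots:
            with self._lock:
                data = self._fast[vm_id]
            self._slow_write(vm_id, data)

    def wait(self):
        """Block until every scheduled flush has finished."""
        with self._lock:
            pending = list(self._threads)
        for worker in pending:
            worker.join()
```

In this sketch the in-memory copy is kept after the flush; a real system would decide when to evict it. Raising `max_concurrent_flushes` trades lower total flush time for more contention on the slow tier.
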
Our experiments showed that, owing to its checkpoint and restart protocols, Mekha created correct checkpoints and restarted the VC from them correctly. Time inside a VM was preserved and progressed correctly when the VM was restarted from a checkpoint. When benchmark programs with a high memory update rate ran in the VC during checkpoint operations, Mekha significantly reduced checkpoint overhead compared with a checkpoint mechanism that used neither the precopy algorithm nor main memory as transient storage. Synchronization using a rendezvous time efficiently reduced the differences among VMs in suspension time and in resumption time. The MTD algorithm significantly reduced VM checkpoint downtime and completely prevented indefinite checkpoint operations when running computation-intensive and memory-intensive applications in the VC. The performance of the MTD algorithm was similar to that of the traditional precopy algorithm, and checkpoint performance imbalance had less impact on Mekha's checkpoint performance because of the global condition for finishing the MTD mechanism in each VM. Multi-level checkpoint storage reduced the time needed to save VM state to low-write-bandwidth checkpoint storage compared with saving the state to that storage directly. In addition, the multi-level checkpoint storage and the proposed scheduling algorithm reduced I/O contention and mitigated I/O crashes.
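To make the comparison with precopy concrete, here is a minimal sketch of a generic precopy loop with a time bound. It is not the MTD algorithm, whose details the abstract does not give. It only shows why a deadline prevents a checkpoint from running indefinitely when the workload dirties memory faster than it can be copied. The function name, the `get_dirty` dirty-page tracker, and all parameters are hypothetical.

```python
import time


def precopy_checkpoint(pages, get_dirty, max_seconds=1.0, stop_threshold=2,
                       clock=time.monotonic):
    """Copy `pages` (dict: page_id -> bytes) into an in-memory checkpoint.

    `get_dirty()` is a hypothetical tracker returning the ids of pages
    written since its previous call. Copy rounds repeat while the VM keeps
    running; the loop ends when few pages remain dirty or when the time
    budget is spent, so it always terminates.
    """
    checkpoint = {}
    deadline = clock() + max_seconds
    to_copy = set(pages)          # first round copies every page
    rounds = 0
    while True:
        for page_id in to_copy:
            checkpoint[page_id] = pages[page_id]
        rounds += 1
        to_copy = set(get_dirty())
        if len(to_copy) <= stop_threshold or clock() >= deadline:
            break
    # Final stop-and-copy: a real hypervisor would pause the VM here, so
    # the number of pages left determines the downtime.
    for page_id in to_copy:
        checkpoint[page_id] = pages[page_id]
    return checkpoint, rounds, len(to_copy)
```

The third return value is the number of pages copied while the VM would be paused. A workload that never drops below `stop_threshold` still finishes once `max_seconds` has elapsed, at the cost of a larger final copy.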

Provider:

Thammasat University

Created:

2024-01-02
