Dataset (2025) for article "Resource Optimization with MPI Process Malleability for Dynamic Workloads in HPC Clusters"

Indexed from NIAID Data Ecosystem on 2026-05-02

Download link:

https://zenodo.org/record/14812021

Description:

This dataset was generated and used in the publication "Resource Optimization with MPI Process Malleability for Dynamic Workloads in HPC Clusters." The dataset is organized into three stages: raw data, preprocessed data, and processed data. Each workload execution includes log files, application code, executables, and launching scripts. Execution times are extracted from log files, specifically from "slurm-dmr_*.out" and "slurm-dmr_*.info", where "*" represents a number corresponding to a specific job execution.

Dataset Structure:

1. raw_data
Contains the output files from executing workloads on the MarenostrumV HPC cluster. This section is divided into three subsections, each corresponding to a different workload type:
   - Static_Workload: Data for the static workload, which does not use malleability.
   - Sync_Workload: Data for the synchronous dynamic workload, including results for both baseline and merge configurations (5 executions each).
   - Async_Workload: Data for the asynchronous dynamic workload, including results for both baseline and merge configurations (5 executions each).

2. preprocessed_data
This section contains the collected raw data in .pkl files, following the same structure as the raw_data folder. For each workload execution, four .pkl files are generated. In the file names, name takes one of the values [baseline, merge, static], while X represents a workload execution number:
   - If X = J, the file contains a compilation of all workloads with the same configuration.
   - If A appears before X, it refers to an asynchronous execution.
The four types of .pkl files are:
   - nameX_data.pkl: Contains application runtime data. A description is available in nameX_data_description.txt.
   - nameX_data_resize.pkl: Contains application resize data. A description is available in nameX_data_resize_description.txt.
   - nameAX_iter_data.pkl: Contains iteration time data for asynchronous (A) workloads. A description is available in nameAX_iter_data_description.txt.
   - nameX_workload.pkl: Contains Slurm workload metrics. A description is available in nameX_workload_description.txt.

3. processed_data
Includes the analyzed results from the preprocessed_data folder. This section contains .xlsx files and images used in the Experimental Setup section of the paper. The Excel files are categorized as follows:
   - Exec_dataX.xlsx: Contains application execution results.
   - Mall_dataX.xlsx: Contains resize time results for dynamic executions.

4. Codes
This folder contains the scripts used to convert raw_data into preprocessed_data, along with a Jupyter Notebook used for data analysis and visualization. To understand or use these codes, please contact the dataset creators.
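
The naming convention for the preprocessed .pkl files can be illustrated with a small parser. This is only a sketch, not part of the dataset's own Codes folder: the helper name parse_pkl_name, the PklName tuple, and the exact pattern (the optional A placed directly before X, and X being either J or a run of digits) are assumptions read off the description above, so real file names in the archive may differ.

```python
import re
from typing import NamedTuple, Optional


class PklName(NamedTuple):
    """Fields recovered from a preprocessed file name (hypothetical helper type)."""
    config: str         # baseline, merge, or static
    asynchronous: bool  # True when an "A" appears before X
    execution: str      # execution number as text, or "J" for the compilation
    kind: str           # data, data_resize, iter_data, or workload


# Assumed pattern: <name>[A]<X>_<kind>.pkl, where X is "J" or digits.
_PATTERN = re.compile(
    r"^(baseline|merge|static)(A?)(J|\d+)_(data_resize|iter_data|data|workload)\.pkl$"
)


def parse_pkl_name(filename: str) -> Optional[PklName]:
    """Return the parsed fields, or None if the name does not fit the assumed pattern."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    config, async_flag, execution, kind = match.groups()
    return PklName(config, async_flag == "A", execution, kind)


if __name__ == "__main__":
    # Example names are made up to fit the convention, not taken from the archive.
    for example in ("baseline3_data.pkl", "mergeA2_iter_data.pkl", "staticJ_workload.pkl"):
        print(example, "->", parse_pkl_name(example))
```

Once a file has been identified this way, it could be opened with Python's standard pickle module. Unpickling executes code embedded in the file, so this should only be done with copies obtained directly from the Zenodo record.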

Created:

2025-02-06
