
Dataset (2025) for article "Resource Optimization with MPI Process Malleability for Dynamic Workloads in HPC Clusters"

Indexed from NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/14812021
Description:
This dataset was generated and used in the publication "Resource Optimization with MPI Process Malleability for Dynamic Workloads in HPC Clusters." The dataset is organized into three stages: raw data, preprocessed data, and processed data. Each workload execution includes log files, application code, executables, and launching scripts. Execution times are extracted from log files, specifically from "slurm-dmr_*.out" and "slurm-dmr_*.info", where "*" represents a number corresponding to a specific job execution.

Dataset Structure:

1. raw_data
Contains the output files from executing workloads on the MarenostrumV HPC cluster. This section is divided into three subsections, each corresponding to a different workload type:
  Static_Workload: Data for the static workload, which does not use malleability.
  Sync_Workload: Data for the synchronous dynamic workload, including results for both baseline and merge configurations (5 executions each).
  Async_Workload: Data for the asynchronous dynamic workload, including results for both baseline and merge configurations (5 executions each).

2. preprocessed_data
This section contains the collected raw data in .pkl files, following the same structure as the raw_data folder. For each workload execution, four .pkl files are generated. The variable name can take values from [baseline, merge, static], while X represents a workload execution number:
  If X = J, the file contains a compilation of all workloads with the same configuration.
  If A appears before X, it refers to an asynchronous execution.
The four types of .pkl files are:
  nameX_data.pkl: Contains application runtime data. A description is available in nameX_data_description.txt.
  nameX_data_resize.pkl: Contains application resize data. A description is available in nameX_data_resize_description.txt.
  nameAX_iter_data.pkl: Contains iteration time data for asynchronous (A) workloads. A description is available in nameAX_iter_data_description.txt.
  nameX_workload.pkl: Contains Slurm workload metrics. A description is available in nameX_workload_description.txt.

3. processed_data
Includes the analyzed results from the preprocessed_data folder. This section contains .xlsx files and images used in the Experimental Setup section of the paper. The Excel files are categorized as follows:
  Exec_dataX.xlsx: Contains application execution results.
  Mall_dataX.xlsx: Contains resize time results for dynamic executions.

4. Codes
This folder contains the scripts used to convert raw_data into preprocessed_data, along with a Jupyter Notebook used for data analysis and visualization. To understand or use these codes, please contact the dataset creators.
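
The naming scheme described for preprocessed_data can be sketched as a small filename parser. This is only an illustration under assumptions: it supposes that "name", the optional "A", and "X" are concatenated exactly as written in the description (for example "mergeA3_iter_data.pkl"), which has not been checked against the actual archive. The names PKL_NAME, PklInfo, and parse_pkl_name are hypothetical helpers and are not part of the dataset's Codes folder.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: <name><optional A><X>_<kind>.pkl, where name is one of
# baseline/merge/static and X is an execution number or the letter J.
PKL_NAME = re.compile(
    r"(baseline|merge|static)(A?)(\d+|J)_(data_resize|iter_data|data|workload)\.pkl"
)


class PklInfo(NamedTuple):
    config: str                # baseline, merge, or static
    asynchronous: bool         # True when "A" precedes X
    execution: Optional[int]   # execution number, None when X = J
    compiled: bool             # True when X = J (all executions combined)
    kind: str                  # data, data_resize, iter_data, or workload


def parse_pkl_name(filename: str) -> Optional[PklInfo]:
    """Split a preprocessed_data file name into its parts, or return None."""
    match = PKL_NAME.fullmatch(filename)
    if match is None:
        return None
    config, a_flag, x, kind = match.groups()
    compiled = x == "J"
    execution = None if compiled else int(x)
    return PklInfo(config, a_flag == "A", execution, compiled, kind)


if __name__ == "__main__":
    for example in ("mergeA3_iter_data.pkl", "baselineJ_data_resize.pkl", "notes.txt"):
        print(example, "->", parse_pkl_name(example))
```

A parser like this could be used to group files by configuration before loading them; the contents of each .pkl file are documented only in the accompanying *_description.txt files, so no loading code is shown here.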
Created:
2025-02-06