PreNeT: Leveraging Computational Features to Predict Deep Neural Network Training Time
Download link:
https://zenodo.org/records/14920437
Description:
We have developed a framework that allows users to predict the training time of various deep learning models and architectures. This is achieved by gathering execution times of different deep learning layer types under diverse configurations.
After collecting this data, which is shown in the figure below, we trained and evaluated multiple machine learning models for the task of predicting training times for individual layers. The best-performing model for each layer type was selected and used to provide accurate predictions of training times. This approach helps users estimate model performance and make informed decisions when designing or optimizing neural networks.
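The selection step described above can be sketched as follows. This is a minimal illustration only, not the framework's code: the two candidate predictors, the single input feature, the function names, and the timing values are all assumptions made up for the example. It fits each candidate on training rows and keeps the one with the lowest validation error, which would be repeated once per layer type.

```python
import statistics

def fit_mean(train):
    """Baseline candidate: always predict the mean observed time."""
    mean_time = statistics.fmean(t for _, t in train)
    return lambda x: mean_time

def fit_linear(train):
    """Candidate: least-squares line through (feature, time) pairs."""
    xs = [x for x, _ in train]
    ts = [t for _, t in train]
    x_bar, t_bar = statistics.fmean(xs), statistics.fmean(ts)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxt = sum((x - x_bar) * (t - t_bar) for x, t in train)
    slope = sxt / sxx
    intercept = t_bar - slope * x_bar
    return lambda x: slope * x + intercept

def mean_abs_error(model, rows):
    """Average absolute gap between predicted and measured time."""
    return statistics.fmean(abs(model(x) - t) for x, t in rows)

def select_best(train, val, candidates):
    """Fit every candidate and return the one with the lowest validation error."""
    fitted = {name: fit(train) for name, fit in candidates.items()}
    errors = {name: mean_abs_error(m, val) for name, m in fitted.items()}
    best = min(errors, key=errors.get)
    return best, fitted[best], errors

# Synthetic (feature, time) rows for one hypothetical layer type.
train_rows = [(x, 0.5 * x + 2.0) for x in range(1, 9)]
val_rows = [(x, 0.5 * x + 2.0) for x in (10, 12)]

candidates = {"mean": fit_mean, "linear": fit_linear}
best_name, best_model, val_errors = select_best(train_rows, val_rows, candidates)
print(best_name, val_errors)
```

In practice the candidates would be richer regressors over several layer configuration features; the selection logic stays the same.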
You can find more details about this framework and the experiments in our paper published in ACM/SPEC ICPE 2025. You can access the paper here: arxiv link
Setup
To get started, follow the steps below:
Clone the repository:
git clone https://github.com/pacslab/PreNet.git
cd PreNet
Create and activate a virtual environment:
python -m virtualenv venv
source ./venv/bin/activate
Install the required dependencies:
python -m pip install -r requirements.txt
Training Data Generation
To gather training data for a specific GPU, navigate to the data_collection directory and run a benchmark. For example, to test RNN models, use the following command:
mkdir data
cd data_collection
python run.py --testRNN --num_val=20000 --repetitions=5 --logdir=../data
This command will run the benchmark for RNN models with 20,000 validation examples and repeat the process 5 times to gather data. The results will be saved in the data directory.
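To illustrate what repeating a measurement means here, the sketch below times a workload several times and summarizes the samples. It is an assumption-laden stand-in, not the logic of run.py: the function names, the warm-up run, and the CPU-only dummy workload are hypothetical. A real GPU benchmark would also need to synchronize the device before reading the clock, since GPU kernels are launched asynchronously.

```python
import statistics
import time

def time_repeated(fn, repetitions=5, warmup=1):
    """Run fn a few times untimed, then return one wall-clock sample per repetition."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples

def dummy_workload():
    # Hypothetical stand-in for one training step of a layer.
    sum(i * i for i in range(10_000))

samples = time_repeated(dummy_workload, repetitions=5)
summary = {
    "runs": len(samples),
    "median_s": statistics.median(samples),
    "min_s": min(samples),
}
print(summary)
```

Reporting the median rather than the mean makes the summary less sensitive to an occasional slow run.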
Gathered Data
Sample Data
The data directory contains a sample (in the sample_dataset folder) of the data we collected for different layer types and GPUs. The sample is one-fourth of the original data collected for the paper; the full dataset is available in the full_dataset folder. The data is stored in CSV files, with each row representing a different configuration.
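A CSV file with one configuration per row could be loaded roughly as shown below. The column names (batch_size, units, time_ms) and the values are invented for illustration and may not match the actual files; an in-memory string stands in for a file from the sample_dataset folder.

```python
import csv
import io

# Made-up rows; the real column names and values may differ.
SAMPLE_CSV = """batch_size,units,time_ms
16,128,1.9
32,128,3.1
64,256,7.4
"""

def load_configurations(handle):
    """Parse each CSV row into a dict of numeric fields."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append({key: float(value) for key, value in record.items()})
    return rows

rows = load_configurations(io.StringIO(SAMPLE_CSV))

# Split each row into input features and the measured time to predict.
features = [(r["batch_size"], r["units"]) for r in rows]
targets = [r["time_ms"] for r in rows]
print(features, targets)
```

To read an actual file, pass an open file handle instead of the io.StringIO object, for example open(path, newline="").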
Created:
2025-02-24



