
Gaussian Process kernels comparison - Datasets and python code

DataCite Commons. Updated 2024-06-24. Indexed 2024-07-13.
Download link:
https://figshare.unimelb.edu.au/articles/dataset/Gaussian_Process_kernels_comparison_-_Datasets_and_python_code/26087719
Description:
Overview

Data used for publication in "<b>Comparing Gaussian Process Kernels Used in LSG Models for Flood Inundation Predictions</b>". We investigate the impact of 13 Gaussian Process (GP) kernels, consisting of five single kernels and eight composite kernels, on the prediction accuracy and computational efficiency of the Low-fidelity, Spatial analysis, and Gaussian process learning (LSG) modelling approach. The GP kernels are compared for three distinct case studies, namely Carlisle (United Kingdom), Chowilla floodplain (Australia), and Burnett River (Australia). The high- and low-fidelity model simulation results are obtained from the data repository Fraehr, N. (2024, January 19). Surrogate flood model comparison - Datasets and python code (Version 1). The University of Melbourne. https://doi.org/10.26188/24312658.v1.

Dataset structure

The dataset is structured in 5 file folders:

- Carlisle
- Chowilla
- BurnettRV
- Comparison_results
- Python_data

The first three folders contain simulation data and analysis codes. The "Comparison_results" folder contains plotting codes, figures and tables for comparison results. The "Python_data" folder contains LSG model functions and Python environment requirements.

Carlisle, Chowilla, and BurnettRV

These folders contain high- and low-fidelity hydrodynamic modelling data for training and validation for each individual case study, as well as specific Python scripts for training and running the LSG model with different GP kernels in each case study.
There are only small differences between the folders, depending on the hydrodynamic model simulation results and EOF analysis results.

Each case study folder has the following contents:

- Geometry_data
  - DEM files
  - <code>.npz</code> files containing the high-fidelity model's grid (XYZ coordinates) and areas (the same data is available for the low-fidelity model used in the LSG model)
  - <code>.shp</code> files indicating the location of boundaries and main flow paths
- XXX_modeldata
  - Folder for storing trained model data for each XXX kernel LSG model. For example, EXP_modeldata contains files used to store the trained LSG model using the exponential Gaussian Process kernel.
  - ME3LIN means ME3 + LIN. ME3mLIN means ME3 x LIN.
  - EXPLow means the inducing points percentage for the sparse GP is 5%.
  - EXPMid means the inducing points percentage for the sparse GP is 15%.
  - EXPHigh means the inducing points percentage for the sparse GP is 35%.
  - EXPFULL means the inducing points percentage for the sparse GP is 100%.
- HD_model_data
  - High-fidelity simulation results for all flood events of that case study
  - Low-fidelity simulation results for all flood events of that case study
  - All boundary input conditions
- HF_EOF_analysis
  - Stores the data used in the EOF analysis for the LSG model.
- Results_data
  - Stores the results of evaluating the LSG models with different GP kernel candidates.
- Train_test_split_data
  - The train-test-validation data split is the same for all LSG models with different GP kernel candidates. The specific split for each cross-validation fold is stored in this folder.
- YYY_event_summary.csv, YYY_Extrap_event_summary.csv
  - Files containing an overview of all events, and which events are connected between the low- and high-fidelity models for each YYY case study.
- EOF_analysis_HFdata_preprocessing.py, EOF_analysis_HFdata.py
  - Preprocessing before the EOF analysis, and the EOF analysis of the high-fidelity data.
- Evaluation.py, Evaluation_extrap.py
  - Scripts for evaluating the LSG model for that case study and saving the results for each cross-validation fold.
- train_test_split.py
  - Script for splitting the flood datasets for each cross-validation fold, so that all LSG models with different GP kernel candidates train on the same data.
- XXX_training.py
  - Script for training each LSG model using the XXX GP kernel. The XXX names follow the same convention as the XXX_modeldata folders (ME3LIN, ME3mLIN, EXPLow, EXPMid, EXPHigh, EXPFULL).
- XXX_training.bat
  - Batch scripts for training all LSG models using different GP kernel candidates.

Comparison_results

Files used for comparing LSG models using different GP kernel candidates and for generating the figures in the paper "Comparing Gaussian Process Kernels Used in LSG Models for Flood Inundation Predictions". Figures are also included.

Python_data

Folder containing Python scripts with utility functions for setting up, training, and running the LSG models, as well as for evaluating the LSG models.

Python environment

This folder also contains two Python environment files with all Python package versions and dependencies. You can install the CPU version or the GPU version of the environment. The GPU version can use the GPU to speed up the GPflow training process. It installs the CUDA and cuDNN packages.

You can choose to install the environment online or offline.
Offline installation reduces dependency issues, but it requires that you use the same Windows 10 operating system as I do.

Online installation

- LSG_CPU_environment.yml: Python environment for running LSG models using the computer's CPU
- LSG_GPU_environment.yml: Python environment for running LSG models using the computer's GPU, mainly to speed up the GPflow training process. It needs the CUDA and cuDNN packages to be installed.

In the directory where the <code>.yml</code> file is located, use the console to enter the following command:
<pre>conda env create -f LSG_CPU_environment.yml -n myenv_name</pre>
or
<pre>conda env create -f LSG_GPU_environment.yml -n myenv_name</pre>

Offline installation

If you use Windows 10 as I do, you can directly unpack the environment packed by conda-pack.

- LSG_CPU.tar.gz: Archive containing all packages in the virtual environment for CPU only
- LSG_GPU.tar.gz: Archive containing all packages in the virtual environment for GPU acceleration

On a <b>Windows</b> system, create a new <code>LSG_CPU</code> or <code>LSG_GPU</code> folder in the Anaconda environment folder and extract the packaged <code>LSG_CPU.tar.gz</code> or <code>LSG_GPU.tar.gz</code> file into that folder:
<pre>tar -xzvf LSG_CPU.tar.gz -C ./LSG_CPU</pre>
or
<pre>tar -xzvf LSG_GPU.tar.gz -C ./LSG_GPU</pre>
Change to the environment directory:
<pre>cd ./LSG_GPU</pre>
Activate the environment:
<pre>.\Scripts\activate.bat</pre>
Remove the prefixes from the activated environment:
<pre>.\Scripts\conda-unpack.exe</pre>
Exit the environment:
<pre>.\Scripts\deactivate.bat</pre>

LSG_mods_and_func

Python scripts for using the LSG model.

Evaluation_metrics.py

Metrics used to evaluate the prediction accuracy and computational efficiency of the LSG models.
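
The folder and script naming described above (ME3LIN for a sum, ME3mLIN for a product, and the EXPLow/EXPMid/EXPHigh/EXPFULL sparse GP variants) can be illustrated with a minimal, dependency-free sketch. It is not taken from the dataset's scripts, which use GPflow. It assumes that ME3 denotes a Matérn 3/2 kernel and LIN a linear kernel, works on scalar inputs only, and uses hypothetical function names. The rounding rule for converting a percentage into a number of inducing points is also an assumption.

```python
import math


def matern32(x1, x2, variance=1.0, lengthscale=1.0):
    """Matern 3/2 kernel on scalar inputs (assumed meaning of 'ME3')."""
    r = abs(x1 - x2) / lengthscale
    return variance * (1.0 + math.sqrt(3.0) * r) * math.exp(-math.sqrt(3.0) * r)


def linear(x1, x2, variance=1.0):
    """Linear kernel on scalar inputs (assumed meaning of 'LIN')."""
    return variance * x1 * x2


def kernel_sum(k1, k2):
    """Composite kernel k(x, x') = k1(x, x') + k2(x, x')."""
    return lambda x1, x2: k1(x1, x2) + k2(x1, x2)


def kernel_product(k1, k2):
    """Composite kernel k(x, x') = k1(x, x') * k2(x, x')."""
    return lambda x1, x2: k1(x1, x2) * k2(x1, x2)


# Naming convention from the dataset description:
# ME3LIN = ME3 + LIN, ME3mLIN = ME3 x LIN.
ME3LIN = kernel_sum(matern32, linear)
ME3mLIN = kernel_product(matern32, linear)

# Inducing points percentage for each sparse GP variant, as listed above.
INDUCING_PERCENT = {"EXPLow": 5, "EXPMid": 15, "EXPHigh": 35, "EXPFULL": 100}


def n_inducing_points(n_train, variant):
    """Number of inducing points for a training set of size n_train.

    Rounding to the nearest integer (at least 1) is an assumption; the
    dataset's own scripts may round differently.
    """
    pct = INDUCING_PERCENT[variant]
    return max(1, round(n_train * pct / 100))


if __name__ == "__main__":
    print(ME3LIN(2.0, 2.0), ME3mLIN(2.0, 2.0))
    for name in INDUCING_PERCENT:
        print(name, n_inducing_points(200, name))
```

In GPflow, kernel objects can be combined directly with the `+` and `*` operators, so the two composite forms above correspond to a sum and a product of kernel instances.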
Provider:
The University of Melbourne
Created:
2024-06-24