PPMLES – Perturbed-Parameter ensemble of MUST Large-Eddy Simulations
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/11394346
Dataset description
This repository contains the PPMLES (Perturbed-Parameter ensemble of MUST Large-Eddy Simulations) dataset, which corresponds to the main outputs of 200 large-eddy simulations (LES) of microscale pollutant dispersion that replicate the MUST field experiment [Biltoft. 2001, Yee and Biltoft. 2004] for varying meteorological forcing parameters.
The goal of the PPMLES dataset is to provide a comprehensive dataset to better understand the complex interactions between the atmospheric boundary layer (ABL), the urban environment, and pollutant dispersion. It was originally used to assess the impact of the meteorological uncertainty on microscale pollutant prediction and to build a surrogate model that can replace the costly LES model [Lumet et al. 2025]. The total computational cost of the PPMLES dataset is estimated to be about 6 million core hours.
For each sample of meteorological forcing parameters (inlet wind direction and friction velocity), the AVBP solver code [Schonfeld and Rudgyard. 1999, Gicquel et al. 2011] was used to perform LES at very high spatio-temporal resolution (1e-3 s time step, 30 cm discretization length) to provide a fine representation of the pollutant concentration and wind velocity statistics within the urban-like canopy.
File list
The data is stored in HDF5 files, which can be efficiently processed in Python using the h5py module.
input_parameters.h5: list of the 200 input parameter samples (alpha_inlet, ustar) obtained using the Halton sequence that defines the PPMLES ensemble.
ave_fields.h5: lists of the main field statistics predicted by each of the 200 LES samples over the 200-s reference window [Yee and Biltoft. 2004], including:
c: the time-averaged pollutant concentration in ppmv (dim = (n_samples, n_nodes) = (200, 1878585)),
(u, v, w): the time-averaged wind velocity components in m/s,
crms: the root mean square concentration fluctuations in ppmv,
tke: the turbulent kinetic energy in m^2/s^2,
(uprim_cprim, vprim_cprim, wprim_cprim): the pollutant turbulent transport components
uncertainty.h5: lists of the estimated aleatory uncertainty induced by the internal variability of the LES (variability_#) [Lumet et al. 2024] for each of the fields in ave_fields.h5. Also includes the stationary bootstrap [Politis and Romano. 1994] parameters (n_replicates, block_length) used to estimate the uncertainty for each field and each sample.
mesh.h5: the tetrahedral mesh on which the fields are discretized, composed of about 1.8 million nodes.
time_series.h5: HDF5 file consisting of 200 groups (Sample_NNN) each containing the time series of the pollutant concentration (c) and wind velocity components (u, v, w) predicted by the LES sample #NNN at 93 locations.
probe_network.dat: provides the location of each of the 93 probes corresponding to the positions of the experimental campaign sensors [Biltoft. 2001].
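The Halton sampling mentioned for input_parameters.h5 can be illustrated with a minimal sketch based on scipy.stats.qmc. The parameter bounds below are placeholders chosen for illustration only, since the actual PPMLES ranges are not stated in this description, and the helper name halton_samples is hypothetical.

```python
import numpy as np
from scipy.stats import qmc

# Placeholder bounds, for illustration only: the actual PPMLES parameter
# ranges are not given in this description.
ALPHA_BOUNDS = (-60.0, 0.0)   # inlet wind direction in degrees (assumed)
USTAR_BOUNDS = (0.1, 1.0)     # friction velocity in m/s (assumed)

def halton_samples(n_samples, bounds):
    """Draw n_samples points from an unscrambled Halton sequence,
    rescaled from the unit hypercube to the given (lower, upper) bounds."""
    lower = [b[0] for b in bounds]
    upper = [b[1] for b in bounds]
    sampler = qmc.Halton(d=len(bounds), scramble=False)
    unit_points = sampler.random(n_samples)       # points in [0, 1)^d
    return qmc.scale(unit_points, lower, upper)   # shape (n_samples, d)

samples = halton_samples(200, [ALPHA_BOUNDS, USTAR_BOUNDS])
print(samples.shape)
```

A low-discrepancy sequence such as Halton fills the parameter space more evenly than independent random draws, which is why it is a common choice for building ensembles of costly simulations.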
Code examples
A) Dataset reading
### Imports
import h5py
import numpy as np
### Load the input parameters list into a numpy array (shape = (200, 2))
inputf = h5py.File('PPMLES/input_parameters.h5', 'r')
input_parameters = np.array((inputf['alpha_inlet'], inputf['friction_velocity'])).T
### Load the domain mesh node coordinates
meshf = h5py.File('PPMLES/mesh.h5', 'r')
mesh_nodes = np.array((meshf['Nodes']['x'], meshf['Nodes']['y'], meshf['Nodes']['z'])).T
### Load the set of time-averaged LES fields and their associated uncertainty
var = 'c' # Can be: 'c', 'u', 'v', 'w', 'crms', 'tke', 'uprim_cprim', 'vprim_cprim', or 'wprim_cprim'
fieldsf = h5py.File('PPMLES/ave_fields.h5', 'r')
fields_list = fieldsf[var]
uncertaintyf = h5py.File('PPMLES/uncertainty_ave_fields.h5', 'r')
uncertainty_list = uncertaintyf[var]
### Time series reading example
timeseriesf = h5py.File('PPMLES/time_series.h5', 'r')
var = 'c' # Can be: 'c', 'u', 'v', or 'w'
probe = 32 # Integer between 0 and 92, see probe_network.dat
time_list = []
time_series_list = []
for i in range(200):
    time_list.append(np.array(timeseriesf[f'Sample_{i+1:03}']['time']))
    time_series_list.append(np.array(timeseriesf[f'Sample_{i+1:03}'][var][probe]))
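Group and dataset names are easiest to confirm by listing what a file actually contains before reading from it. The sketch below is one possible way to do so; the helpers summarize and summarize_file are hypothetical names, and the demo dictionary only mimics the mesh layout suggested by the reading example above.

```python
import numpy as np

def summarize(node, prefix=""):
    """Return sorted (path, shape) pairs for every array-like leaf under node.

    Works on h5py files or groups and on plain nested dicts of arrays:
    containers expose items(), leaves expose a shape attribute.
    """
    entries = []
    for name, child in node.items():
        path = f"{prefix}/{name}"
        if hasattr(child, "items"):      # group-like container: recurse
            entries.extend(summarize(child, path))
        else:                            # dataset-like leaf: record its shape
            entries.append((path, tuple(child.shape)))
    return sorted(entries)

def summarize_file(filename):
    """Open an HDF5 file read-only and summarize its content."""
    import h5py  # imported here so summarize() can be tried without h5py
    with h5py.File(filename, "r") as h5file:
        return summarize(h5file)

# Small stand-in for a file layout (not PPMLES data)
demo = {"Nodes": {"x": np.zeros(5), "y": np.zeros(5), "z": np.zeros(5)}}
for path, shape in summarize(demo):
    print(path, shape)
```

With the dataset downloaded, a call such as summarize_file('PPMLES/time_series.h5') would list every group and dataset with its shape, without loading the arrays into memory.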
B) Interpolation of one field from the unstructured grid to a new structured grid
### Imports
import h5py
import numpy as np
from scipy.interpolate import griddata
### Load the mean concentration field sample #028
fieldsf = h5py.File('PPMLES/ave_fields.h5', 'r')
c = fieldsf['c'][27]
### Load the unstructured grid
meshf = h5py.File('PPMLES/mesh.h5', 'r')
unstructured_nodes = np.array((meshf['Nodes']['x'], meshf['Nodes']['y'], meshf['Nodes']['z'])).T
### Structured grid definition
x0, y0, z0 = -16.9, -115.7, 0.
lx, ly, lz = 205.5, 232.1, 20.
resolution = 0.75
x_grid, y_grid, z_grid = np.meshgrid(np.linspace(x0, x0 + lx, int(lx/resolution)),
                                     np.linspace(y0, y0 + ly, int(ly/resolution)),
                                     np.linspace(z0, z0 + lz, int(lz/resolution)),
                                     indexing='ij')
### Interpolation of the field on the new grid
c_interpolated = griddata(unstructured_nodes, c,
                          (x_grid.flatten(), y_grid.flatten(), z_grid.flatten()),
                          method='nearest')
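The interpolation above returns a flat array, so a typical next step is to restore the grid shape and extract a horizontal plane. The sketch below shows this on a small synthetic point cloud rather than on the PPMLES mesh; the grid sizes and the target height are arbitrary illustration values.

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)
# Synthetic stand-ins for the mesh nodes and a nodal field (not PPMLES data)
nodes = rng.uniform(0.0, 1.0, size=(500, 3))
field = nodes[:, 2]                      # field equal to the z coordinate

# Small structured grid built the same way as in the example above
x = np.linspace(0.0, 1.0, 6)
y = np.linspace(0.0, 1.0, 5)
z = np.linspace(0.0, 1.0, 4)
x_grid, y_grid, z_grid = np.meshgrid(x, y, z, indexing="ij")

flat = griddata(nodes, field,
                (x_grid.flatten(), y_grid.flatten(), z_grid.flatten()),
                method="nearest")

# flatten() and reshape() both use C order, so this restores the
# (nx, ny, nz) layout of the meshgrid arrays
field_grid = flat.reshape(x_grid.shape)

# Horizontal slice at the grid level closest to a target height
target_height = 0.3
k = int(np.argmin(np.abs(z - target_height)))
horizontal_slice = field_grid[:, :, k]   # shape (nx, ny)
print(field_grid.shape, horizontal_slice.shape)
```

The same reshape applies to c_interpolated with x_grid.shape from the example above, since both arrays come from flattening grids created with indexing='ij'.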
C) Expression of all time series over the same time window with the same time discretization
### Imports
import h5py
import numpy as np
from scipy.interpolate import griddata
### Define a common time discretization over the 200-s analysis period
common_time = np.arange(0., 200., 0.05)
u_series_list = np.zeros((200, np.shape(common_time)[0]))
### Interpolate the u-component velocity time series at probe DPID10 over this time discretization
timeseriesf = h5py.File('PPMLES/time_series.h5', 'r')
for i in range(200):
    sample_time = np.array(timeseriesf[f'Sample_{i+1:03}']['time']) - \
        np.array(timeseriesf[f'Sample_{i+1:03}']['Parameters']['t_spinup']) # Offset the spinup time
    u_series_list[i] = griddata(sample_time, timeseriesf[f'Sample_{i+1:03}']['u'][9], common_time, method='linear')
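Time series such as these are also the kind of input on which a stationary bootstrap [Politis and Romano. 1994], the method named for uncertainty.h5, can be applied. The sketch below is a generic illustration of that resampling scheme on a synthetic signal, not the authors' implementation; the function names are hypothetical, and only n_replicates and block_length are named after parameters mentioned in the file list.

```python
import numpy as np

def stationary_bootstrap_indices(n, block_length, rng):
    """Indices of one stationary-bootstrap replicate of a length-n series.

    Blocks start at random positions and have geometrically distributed
    lengths with mean block_length; indexing wraps around circularly.
    """
    p = 1.0 / block_length                    # probability of starting a new block
    new_block = rng.random(n) < p
    new_block[0] = True
    starts = rng.integers(0, n, size=n)       # candidate block start positions
    indices = np.empty(n, dtype=int)
    for t in range(n):
        if new_block[t]:
            indices[t] = starts[t]                  # jump to a random position
        else:
            indices[t] = (indices[t - 1] + 1) % n   # continue the current block
    return indices

def bootstrap_std_of_mean(series, n_replicates, block_length, seed=0):
    """Bootstrap estimate of the standard deviation of the time average."""
    rng = np.random.default_rng(seed)
    series = np.asarray(series, dtype=float)
    means = np.empty(n_replicates)
    for r in range(n_replicates):
        idx = stationary_bootstrap_indices(series.size, block_length, rng)
        means[r] = series[idx].mean()
    return means.std(ddof=1)

# Synthetic AR(1) series standing in for a probe signal (not PPMLES data)
demo_rng = np.random.default_rng(1)
noise = demo_rng.standard_normal(2000)
signal = np.empty_like(noise)
signal[0] = noise[0]
for t in range(1, noise.size):
    signal[t] = 0.8 * signal[t - 1] + noise[t]

std_of_mean = bootstrap_std_of_mean(signal, n_replicates=200, block_length=50)
print(std_of_mean)
```

Resampling whole blocks rather than individual points preserves the short-range correlation of the signal, so the spread of the replicate means reflects the variability of a time average computed over a finite window.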
D) Surrogate model construction example
The training and validation of a POD-GPR surrogate model [Marrel et al. 2015] learning from the PPMLES dataset is given in the following GitHub repository. This surrogate model was successfully used by Lumet et al. 2025 to emulate the LES mean concentration prediction for varying meteorological forcing parameters.
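As a rough idea of what a POD-GPR surrogate involves, the sketch below reduces a set of fields with a proper orthogonal decomposition (POD) and then regresses the POD coefficients on the input parameters with a Gaussian process. It is a simplified illustration under strong assumptions (synthetic data, fixed RBF kernel, no hyperparameter tuning, NumPy only) and is not the implementation from the repository or from Lumet et al. 2025; all function names are hypothetical.

```python
import numpy as np

def fit_pod(snapshots, n_modes):
    """POD of a snapshot matrix of shape (n_samples, n_nodes)."""
    mean = snapshots.mean(axis=0)
    # Thin SVD of the centered snapshots; rows of vt are the spatial modes
    _, _, vt = np.linalg.svd(snapshots - mean, full_matrices=False)
    modes = vt[:n_modes]
    coeffs = (snapshots - mean) @ modes.T     # shape (n_samples, n_modes)
    return mean, modes, coeffs

def rbf_kernel(a, b, length_scale):
    """Squared-exponential kernel between two sets of parameter points."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gpr_predict(x_train, y_train, x_new, length_scale=0.3, noise=1e-8):
    """Zero-mean GP posterior mean with a fixed RBF kernel (assumed settings)."""
    k_train = rbf_kernel(x_train, x_train, length_scale)
    k_train = k_train + noise * np.eye(len(x_train))
    weights = np.linalg.solve(k_train, y_train)
    return rbf_kernel(x_new, x_train, length_scale) @ weights

# Synthetic demo (not PPMLES data): 49 parameter samples, 30 "nodes"
g = np.linspace(0.0, 1.0, 7)
params = np.array([(a, b) for a in g for b in g])
s = np.linspace(0.0, 1.0, 30)

def toy_field(p):
    """Toy field depending smoothly on two normalized parameters."""
    return p[:, :1] * np.sin(2 * np.pi * s) + p[:, 1:] ** 2 * np.cos(2 * np.pi * s)

mean, modes, coeffs = fit_pod(toy_field(params), n_modes=2)

new_params = np.array([[0.35, 0.6], [0.5, 0.25]])
predicted_coeffs = gpr_predict(params, coeffs, new_params)
predicted_fields = mean + predicted_coeffs @ modes   # shape (2, 30)
print(np.max(np.abs(predicted_fields - toy_field(new_params))))
```

In practice the inputs would be normalized, the number of modes chosen from the retained variance, and the kernel hyperparameters fitted to the data, which this sketch leaves out for brevity.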
Acknowledgments
This work was granted access to the HPC resources from GENCI-TGCC/CINES (A0062A10822, project 2020-2022). The authors would like to thank Olivier Vermorel for the preliminary development of the LES model, and Simon Lacroix for his proofreading.
Created:
2024-11-27



