ThousandWorlds
Source: DataCite Commons. Updated 2026-05-07; indexed 2026-05-18.
Download link:
https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/8IEH6Q
Resource description:
<h3>Overview</h3>
ThousandWorlds contains 1760 global climate model (GCM) simulations of rocky
exoplanets in or near the habitable zone. It is released as a benchmark dataset
for exoplanet climate emulation.<br>
<b>Inputs:</b> 8 continuous planet parameters, plus the source GCM label.<br>
<b>Outputs:</b> time-averaged climate fields on a 32×64 latitude-longitude grid.
(Three-dimensional variables are stored as pressure-level channels; two-
dimensional variables are stored as single fields.)<br><br>
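The channel layout described above can be sketched as follows. This is a minimal illustration, not the release's loader: the helper <code>stack_channels</code>, the variable names, and the number of pressure levels are all hypothetical; only the 32×64 grid size comes from the description.

```python
import numpy as np

N_LAT, N_LON = 32, 64  # grid size stated in the overview


def stack_channels(fields):
    """Stack 2-D and 3-D fields into one (channels, lat, lon) array.

    A 2-D field of shape (lat, lon) becomes a single channel; a 3-D field
    of shape (levels, lat, lon) contributes one channel per pressure level.
    Returns the stacked array and a list of channel names.
    """
    channels, names = [], []
    for name, field in fields.items():
        field = np.asarray(field, dtype=float)
        if field.shape[-2:] != (N_LAT, N_LON):
            raise ValueError(f"{name}: unexpected grid {field.shape[-2:]}")
        if field.ndim == 2:
            channels.append(field[None])  # add a leading channel axis
            names.append(name)
        elif field.ndim == 3:
            channels.append(field)
            names.extend(f"{name}_lev{k}" for k in range(field.shape[0]))
        else:
            raise ValueError(f"{name}: expected 2 or 3 dimensions, got {field.ndim}")
    return np.concatenate(channels, axis=0), names


# Hypothetical example: one surface field plus one field on 10 pressure levels.
example = {
    "surface_temperature": np.zeros((N_LAT, N_LON)),
    "air_temperature": np.ones((10, N_LAT, N_LON)),
}
stacked, channel_names = stack_channels(example)
print(stacked.shape, len(channel_names))
```

Keeping a name per channel makes it easier to report per-variable errors later, since the stacked array no longer distinguishes two- from three-dimensional variables.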
<h3>Files</h3>
The release includes:
<ul>
<li><code>dataset.tar.gz</code> -- the ThousandWorlds dataset</li>
<li><code>results-baselines-*.tar.gz</code> -- baseline predictions for the 3 subsets</li>
<li><code>croissant.json</code> -- Croissant metadata</li>
<li><code>*.sha256</code> -- checksum sidecars</li>
</ul>
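A downloaded archive can be checked against its checksum sidecar with the Python standard library. The sketch below assumes the sidecar follows the common <code>sha256sum</code> layout (hex digest first, then the file name); the exact sidecar format and file names in the release are not stated here, so treat both as assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sidecar(data_path, sidecar_path):
    """Compare a file's digest with the first token of its sidecar.

    Assumption: the sidecar is laid out as "<hex digest>  <filename>".
    """
    expected = Path(sidecar_path).read_text().split()[0].lower()
    return sha256_of(data_path) == expected
```

A call might look like <code>verify_sidecar("dataset.tar.gz", "dataset.tar.gz.sha256")</code>, where the sidecar name is a guess based on the <code>*.sha256</code> pattern listed above.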
<h3>Dataset contents</h3>
The dataset contains gridded fields (NumPy), input metadata (CSV), and predefined train/test splits. For the convenience of candidate methods, it also includes normalization statistics, plus spherical harmonic coefficients and inverse-SHT weights for spectral methods.
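Normalization statistics of this kind are typically applied per channel. The sketch below assumes fields shaped (samples, channels, lat, lon) and one mean and standard deviation per channel; the actual layout of the statistics shipped in the release is not described here, and the function names are hypothetical.

```python
import numpy as np


def standardize(fields, mean, std, eps=1e-8):
    """Standardise (samples, channels, lat, lon) fields per channel.

    NaN entries (missing fields) stay NaN after the transformation.
    """
    mean = np.asarray(mean, dtype=float).reshape(1, -1, 1, 1)
    std = np.asarray(std, dtype=float).reshape(1, -1, 1, 1)
    return (np.asarray(fields, dtype=float) - mean) / (std + eps)


def destandardize(z, mean, std, eps=1e-8):
    """Invert standardize() to recover fields in physical units."""
    mean = np.asarray(mean, dtype=float).reshape(1, -1, 1, 1)
    std = np.asarray(std, dtype=float).reshape(1, -1, 1, 1)
    return np.asarray(z, dtype=float) * (std + eps) + mean
```

The small <code>eps</code> guards against division by zero for a channel that happens to be constant; it is an illustrative choice, not something the dataset prescribes.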
<h3>Subsets</h3>
The dataset is organised into three subsets of increasing complexity and realism:
<ol>
<li><code>single-complete</code>: smaller subset, simulations from a single GCM, complete observations only (no missing fields)</li>
<li><code>multi-complete</code>: all 5 GCMs, but still no missing fields</li>
<li><code>multi-partial</code>: the full dataset -- all 5 GCMs, simulations contain missing fields (represented as NaNs)</li>
</ol>
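Because missing fields in <code>multi-partial</code> are stored as NaNs, error metrics and losses need to skip those entries. A minimal sketch of a NaN-aware RMSE follows; the function name is hypothetical, and the metric is unweighted (no latitude-area weighting), which may differ from the benchmark's own scoring.

```python
import numpy as np


def masked_rmse(pred, target):
    """RMSE over the entries where the target is observed (not NaN).

    Returns NaN when the target has no observed entries at all.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    observed = ~np.isnan(target)
    if not observed.any():
        return float("nan")
    diff = pred[observed] - target[observed]
    return float(np.sqrt(np.mean(diff ** 2)))
```

Masking on the target rather than replacing NaNs with zeros avoids rewarding or penalising a model for fields that were never simulated.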
<h3>Evaluation protocols</h3>
Two evaluation protocols are provided:
<ol>
<li><b>Standard:</b> larger, ideal for ML model comparison.</li>
<li><b>Shared-planets:</b> smaller, includes only planets simulated by both of two high-fidelity GCMs; this protocol is used for assessing model performance relative to inter-GCM error, i.e., how close the model gets to the epistemic uncertainty floor of the problem.</li>
</ol>
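One way to express performance relative to inter-GCM error is as a ratio: the emulator's error against one GCM divided by the disagreement between the two GCMs on the same planet. The sketch below is an illustration under that assumption; the benchmark's actual definition is not given here, the function names are hypothetical, and the fields are assumed complete (no NaNs).

```python
import numpy as np


def rmse(a, b):
    """Unweighted root-mean-square difference between two fields."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))


def error_relative_to_inter_gcm(pred, gcm_a, gcm_b):
    """Ratio of emulator RMSE (against GCM A) to the RMSE between GCM A and GCM B.

    A value near 1 means the emulator disagrees with GCM A about as much
    as GCM B does; values below 1 mean it is closer than the other GCM.
    """
    floor = rmse(gcm_a, gcm_b)
    if floor == 0.0:
        raise ValueError("the two GCM fields are identical; ratio is undefined")
    return rmse(pred, gcm_a) / floor
```

Treating GCM A as the reference is an arbitrary choice in this sketch; swapping the roles of the two GCMs generally gives a different ratio.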
Provider:
Harvard Dataverse
Created:
2026-04-17



