Structured-Noise Deconvolution Benchmark — Representative Sample

Source: DataCite Commons. Updated 2026-05-07; indexed 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.20044224
Description:
Anonymous deposit for the NeurIPS 2026 Evaluations & Datasets Track double-blind review. This deposit provides a representative ~76 MB sample of the full benchmark (16.7 GB), per the OpenReview "Dataset Large URL" requirement that large datasets include a sample for reviewer inspection without a full download. Open access; no request needed. The complete benchmark is hosted at the companion restricted-access deposit: DOI 10.5281/zenodo.20044006.

CONTENTS

samples.zip is 76 MB compressed and ~102 MB uncompressed.
SHA-256: ae7bf6f59f02bf89a225982d796b1cf5393e4e7e52b2ce86cf9f6ec7a3112d1f

real_amsre_modis_sample_1000patches.nc (~41 MB): 1,000 patches randomly drawn from the canonical real AMSR-E / MODIS day-test set (17,487 patches total). Sampling: numpy default_rng(seed=0), without replacement, sorted by index. NetCDF4 schema identical to the full release. Variables (each of shape (1000, 32, 32) on a 10 km grid):
    L2eqa_AMSR_E_SST: real AMSR-E observation, deconvolution input
    L2eqa_MODIS_SST: real MODIS reference, deconvolution target
    L2eqa_MODIS_conv: H-convolved real MODIS, forward-model comparator
    L2eqa_lat and L2eqa_lon: per-pixel coordinates
    amsre_range: per-patch dynamic range scalar

simulated_subset_sample_one_file.npz (~52 MB): one LLC4320-derived simulator file from the simulated test split (the alphabetically-first file), ~14,998 patches. NPZ schema identical to the full release. Arrays:
    simulated: simulated AMSR-E, equal to H ⊛ target at zero noise by construction
    target: LLC4320 high-resolution reference signal
    legacy_ok: per-patch validity mask

arch_se_attention_seed42_best_model.pth (~7 MB): trained SE-Attention U-Net checkpoint achieving the headline 0.5497 °C RMSE on the canonical 25,971-patch test set (best multi-seed model in Table 1 of the paper).

figure_data/ (~2.5 MB): seven pre-aggregated bundles consumed by the figure-generation notebooks:
    amsre_nsr_esr.npz: per-patch NSR/ESR scatter values
    psd_sim_real.npz: sim/real PSD comparison
    figures_bundle.json: main figure-data aggregate
    partial_correlation.json: NSR partial-r controlling for σ(x_p)
    rescale_bootstrap_cis.json: bootstrap CIs on the rescaling curve
    linear_baseline_sweep.json: extended linear-baseline sweep
    biosr_within_level_r.json: BioSR within-photon-level correlations

README.md: in-archive copy of these contents.

SAMPLING METHOD

Deterministic. Re-running the sampling script with seed=0 reproduces the exact 1,000 patches and the same alphabetically-first simulator file. The script ships with the supplementary code at https://anonymous.4open.science/r/structured-noise-deconvolution-benchmark-84DB/. Relevant entry points: prepare_zenodo_data.py, prepare_simulated_subset.py, and build_zenodo_bundles.sh (sampling step in the build pipeline).

HOW TO USE THE SAMPLE

The sample exposes the same Python loader API as the full release. The supplementary code includes notebooks/load_data_example.py, which demonstrates loading both data sources via src/data/real_dataset.py (PatchDataset, real .nc) and src/data/simulated_dataset.py (SSTDeconvDataset, simulated .npz). Both return float32 tensors of shape (1, 32, 32) under the canonical normalization (PatchNorm + GLOBAL_STD = 0.895249), regardless of the underlying file format.

Reviewers can also evaluate the bundled SE-Attention checkpoint on the 1,000-patch sample by running:

    python src/eval_canonical.py --checkpoint arch_se_attention_seed42_best_model.pth --config configs/arch/se_attention.yaml

Expected RMSE on the 1,000-patch sample is within 0.01 °C of the full-test 0.5497 °C reported in Table 1.

METADATA

Protocol version: canonical-2026-04-20 (frozen).
License: CC-BY-4.0 for derivatives. Users must respect the upstream licenses of source assets: AMSR-E (Remote Sensing Systems), MODIS-Aqua (NASA OB.DAAC), and LLC4320 (NASA PO.DAAC).
Code, evaluation scripts, and reproduction notebooks: see the Code URL on OpenReview.
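
The deterministic draw described under SAMPLING METHOD (numpy default_rng(seed=0), without replacement, sorted by index) can be sketched as follows. This is an illustration only, not the deposit's prepare_zenodo_data.py: the function name draw_sample_indices is hypothetical, and the use of Generator.choice is an assumption. If the real script draws with a different call (for example a permutation), the resulting indices will differ even with the same seed.

```python
import numpy as np

N_TOTAL = 17_487   # patches in the real AMSR-E / MODIS day-test set (from the description)
N_SAMPLE = 1_000   # patches kept in the sample file (from the description)
SEED = 0           # seed stated in the description


def draw_sample_indices(n_total=N_TOTAL, n_sample=N_SAMPLE, seed=SEED):
    """Hypothetical helper: draw n_sample distinct patch indices, sorted ascending."""
    rng = np.random.default_rng(seed)
    # ASSUMPTION: the original script may use a different Generator method.
    idx = rng.choice(n_total, size=n_sample, replace=False)
    return np.sort(idx)


indices = draw_sample_indices()
print(indices.shape, indices[:5])
```

Because the generator is re-seeded on every call, repeated calls return the same index array, which is the property the deposit relies on for reproducibility.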
Provider:
Zenodo
Created:
2026-05-05
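
After downloading, the archive can be checked against the SHA-256 digest listed under CONTENTS. The sketch below assumes that digest refers to samples.zip (the description lists it directly after that file) and that the archive sits in the current directory. The helper names sha256_of_file and verify_archive are illustrative and not part of the deposit's code.

```python
import hashlib
from pathlib import Path

# Digest copied from the CONTENTS section of the description.
# ASSUMPTION: it is the SHA-256 of samples.zip.
EXPECTED_SHA256 = "ae7bf6f59f02bf89a225982d796b1cf5393e4e7e52b2ce86cf9f6ec7a3112d1f"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA-256 matches the expected hex digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("samples.zip")  # assumed local filename
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```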