
MGO GCM (T42L25) OpenMP Modernization: Measurement and Verification Suite for Roofline, Energy, and Bitwise Reproducibility on AMD Zen 5 and Intel Ivy Bridge-EP

Source: DataCite Commons. Updated: 2026-05-04. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19653299
Resource description:
Complete replication package for an independent-researcher study of OpenMP thread-safety modernization and empirical Memory-Wall diagnosis of the Main Geophysical Observatory (MGO) spectral Global Circulation Model at T42L25 resolution (Gaussian 64 x 128 grid, 25 vertical levels, triangular truncation M=42). The package supports the accompanying manuscript (title and venue to be confirmed at preprint or publication; the paper DOI will be added to this record's Related Identifiers field upon publication).

Measurement platforms:
- AMD Ryzen 9 9950X (Zen 5, 16 cores, DDR5-5600): primary target
- 2x Intel Xeon E5-2697 v2 (Ivy Bridge-EP, 24 cores dual-socket, DDR3-1866): cross-platform validation (host D30)

Headline results (all bit-traceable to raw data in this package):
- 4.73x total application speedup at 16 threads vs legacy sequential
- 4.36x OpenMP strong scaling (Zen 5, 16T); 5.94x on Ivy Bridge (24T)
- Algorithmic serial fraction f_alg = 16.3 % (perf callgraph, 22,554 samples); 4.36x attains 93.9 % of the corresponding Amdahl envelope
- Residual 6.1 % gap = DDR5 bandwidth saturation at 100.8 % of STREAM Triad (53.4 / 53.0 GB/s); operational intensity 0.86 FLOP/Byte in the memory-bound region of the Roofline (ridge point OI* = 10.9 FLOP/Byte)
- Bitwise-identical tabm output across all thread counts on each platform (tabm MD5 = 84991763... on Zen 5; 1322f6da... on IVB)
- Governor-sensitivity experiment on IVB: 2.25x CPU frequency change yields median |Delta| = 0.35 % wall-time effect
- 49.0 % peak Energy-to-Solution reduction (N = 4 threads)

Contents (124 files, ~2.3 MB):
- 01_environment/: platform captures (CPU topology, kernel version, compiler info, governor state, perf_event_paranoid level, binary MD5); STREAM source; run script; 5 + 3 + 3 uninstrumented wall-time replicates for the 4.73x / 4.36x / 1.09x numbers
- 02_roofline/: Roofline ceilings; OI / achieved-bandwidth / FLOP summaries; Python plot-regeneration code for Figures 1-3, 5 of the companion manuscript; canonical roofline_data.csv with Wall_s / Wall_RAPL_s / full PMC instrumentation; serial-fraction profiling report (serial_fraction_measurement.md)
- 03_energy/: RAPL-derived Energy-to-Solution analysis; raw energy traces per thread count; Russia / EU / global carbon footprint derivation referencing Ember GER 2024
- 04_vectorization/: compiler vectorization audit (-fopt-info-vec output, objdump disassembly statistics)
- 05_bwr/: bitwise reproducibility verification script and MD5 checksums for tabm binary output across 1-16 threads (Zen 5) and 1-24 threads (IVB)
- 06_ivb_crossplatform/: Ivy Bridge governor-sensitivity experiment (schedutil vs performance), numactl experiment, dual-socket STREAM, bitwise verification (Lenovo ThinkStation D30 host)
- data/: raw performance-counter logs (perf stat, likwid, AMDuProf) for all thread counts

The package does NOT include the MGO GCM source code itself, which is subject to an institutional NDA. A Rospatent-registered OpenMP implementation of MGO GCM T42L25 (certificate No. 2026660623, 14 April 2026) documents the specific refactoring; source access may be requested from the author, subject to appropriate agreements.

All included scripts are self-contained (standard Linux tools, Python 3 with matplotlib / numpy, gfortran 13). A pre-flight check script (verify_environment.sh) confirms prerequisites before reproduction.
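
The headline percentages in the description can be cross-checked with a few lines of arithmetic. The sketch below is not part of the package and reads none of its files. It copies the quoted constants from the description and recomputes the Amdahl envelope, the STREAM Triad ratio, and a simple Roofline bound. The function names are hypothetical, and the peak-compute ceiling is an assumption: it is taken as ridge point times STREAM bandwidth, which the description does not state explicitly.

```python
# Constants copied from the record description (Zen 5, 16 threads).
F_ALG = 0.163            # algorithmic serial fraction
THREADS = 16
MEASURED_SPEEDUP = 4.36  # OpenMP strong scaling at 16T
ACHIEVED_BW = 53.4       # GB/s, measured during the run
STREAM_TRIAD_BW = 53.0   # GB/s, STREAM Triad reference
OI = 0.86                # FLOP/Byte, operational intensity
RIDGE_OI = 10.9          # FLOP/Byte, Roofline ridge point


def amdahl_speedup(serial_fraction: float, n_threads: int) -> float:
    """Upper bound on speedup from Amdahl's law."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_threads)


def roofline_bound(oi: float, bandwidth: float, peak: float) -> float:
    """Attainable GFLOP/s under the basic two-ceiling Roofline model."""
    return min(peak, oi * bandwidth)


envelope = amdahl_speedup(F_ALG, THREADS)
attained = MEASURED_SPEEDUP / envelope
bw_ratio = ACHIEVED_BW / STREAM_TRIAD_BW

# ASSUMPTION: peak compute is inferred as ridge point x STREAM bandwidth.
implied_peak = RIDGE_OI * STREAM_TRIAD_BW
bound = roofline_bound(OI, STREAM_TRIAD_BW, implied_peak)

print(f"Amdahl envelope at {THREADS}T: {envelope:.2f}x")
print(f"Fraction of envelope attained: {attained * 100:.1f} %")
print(f"Residual gap: {(1 - attained) * 100:.1f} %")
print(f"Achieved / STREAM Triad bandwidth: {bw_ratio * 100:.1f} %")
print(f"Memory-bound: {OI < RIDGE_OI}, bound = {bound:.2f} GFLOP/s")
```

With the quoted inputs, the attained fraction and the bandwidth ratio round to the 93.9 % and 100.8 % given in the description. The Roofline bound depends on the assumed peak and is indicative only.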
Provider:
Zenodo
Created:
2026-05-04