five

Advanced Techniques for High-Performance Fock Matrix Construction on GPU Clusters

收藏
NIAID Data Ecosystem2026-05-02 收录
下载链接:
https://figshare.com/articles/dataset/Advanced_Techniques_for_High-Performance_Fock_Matrix_Construction_on_GPU_Clusters/27905300
下载链接
链接失效反馈
官方服务:
资源简介:
This Article presents two optimized multi-GPU algorithms for Fock matrix construction, building on the work of Ufimtsev and Martinez [J. Chem. Theory Comput. 2009, 5, 1004–1015] and Barca et al. [J. Chem. Theory Comput. 2021, 17, 7486–7503]. The novel algorithms, opt-UM and opt-Brc, introduce significant enhancements, including improved integral screening, exploitation of sparsity and symmetry, a linear scaling exchange matrix assembly algorithm, and extended capabilities for Hartree–Fock caculations up to f-type angular momentum functions. Opt-Brc excels for smaller systems and for highly contracted triple-ζ basis sets, while opt-UM is advantageous for large molecular systems. Performance benchmarks on NVIDIA A100 GPUs show that our algorithms in the EXtreme-scale Electronic Structure System (EXESS), when combined, outperform all current GPU and CPU Fock build implementations in TeraChem, QUICK, GPU4PySCF, LibIntX, ORCA, and Q-Chem. The implementations were benchmarked on linear and globular systems and average speed ups across three double-ζ basis sets of 1.4×, 8.4×, and 9.4× were observed compared to TeraChem, QUICK, and GPU4PySCF respectively. An increased average speedup of 2.1× over TeraChem is observed when using four A100 GPUs. Strong scaling analysis reveals over 91% parallel efficiency on four GPUs for opt-Brc, making it typically faster for multi-GPU execution. Single-compute-node comparisons with CPU-based software like ORCA and Q-Chem show speedups of up to 42× and 31×, respectively, enhancing power efficiency by up to 18×.
创建时间:
2024-11-25
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作