Massively parallel implementation of iterative eigensolvers in large-scale plane-wave density functional theory
Source: doi.org | Updated: 2024-02-28 | Indexed: 2025-03-23
Download link:
http://doi.org/10.17632/c8v2mx5vn4.1
Description:
Kohn-Sham density functional theory (DFT) is a powerful method for describing the electronic structure of molecules and solids in condensed matter physics, computational chemistry, and materials science. However, large and accurate plane-wave DFT calculations have cubic-scaling computational complexity and are usually limited by expensive computation and communication costs. The rapid development of high-performance computing (HPC) on leadership supercomputers brings new opportunities for extending plane-wave DFT calculations to large-scale systems. Here, we implement parallel iterative eigensolvers for large-scale plane-wave DFT calculations, including the Davidson, locally optimal block preconditioned conjugate gradient (LOBPCG), projected preconditioned conjugate gradient (PPCG), and Chebyshev-filtered subspace iteration (CheFSI) algorithms, and analyze their performance in massively parallel plane-wave computing tasks. We adopt a two-level parallelization strategy that combines the Message Passing Interface (MPI) with Open Multi-Processing (OpenMP) to handle the data exchange and matrix operations involved in constructing and diagonalizing the large-scale Hamiltonian matrix in a plane-wave basis. Numerical results show that these iterative eigensolvers scale up to 42,592 processing cores on leadership supercomputers, reaching 30% of peak performance, in electronic structure calculations of bulk silicon systems containing 10,648 atoms.
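To make the Chebyshev filtering idea behind CheFSI concrete, here is a minimal single-vector sketch in plain Python. It is not the implementation in this dataset: real CheFSI filters a block of vectors and follows each filter with orthonormalization and a Rayleigh-Ritz step, and it runs in parallel. The test operator (a 1D finite-difference Laplacian), the function names, and the assumption that the filter bounds `a` and `b` are known in advance are all hypothetical choices made for illustration.

```python
import math


def laplacian_1d(x):
    """Apply the 1D finite-difference Laplacian (2 on the diagonal, -1 beside it)."""
    n = len(x)
    y = [2.0 * v for v in x]
    for i in range(n - 1):
        y[i] -= x[i + 1]
        y[i + 1] -= x[i]
    return y


def chebyshev_filter(apply_h, x, degree, a, b):
    """Return T_degree((H - c) / e) x for degree >= 1.

    Eigencomponents with eigenvalues inside [a, b] stay bounded by 1 in
    magnitude, while components below a are amplified.
    """
    c = 0.5 * (a + b)  # center of the damped interval
    e = 0.5 * (b - a)  # half-width of the damped interval

    def scaled(v):
        hv = apply_h(v)
        return [(hv[i] - c * v[i]) / e for i in range(len(v))]

    prev, cur = x, scaled(x)  # T_0 x and T_1 x
    for _ in range(2, degree + 1):
        s = scaled(cur)
        # Three-term recurrence: T_{k+1} = 2 t T_k - T_{k-1}
        prev, cur = cur, [2.0 * s[i] - prev[i] for i in range(len(x))]
    return cur


def rayleigh_quotient(apply_h, x):
    """Return x^T H x / x^T x."""
    hx = apply_h(x)
    return sum(p * q for p, q in zip(x, hx)) / sum(p * p for p in x)


def lowest_eigenpair(apply_h, n, a, b, degree=8, iterations=60):
    """Filter and normalize repeatedly, then return (eigenvalue estimate, vector).

    Assumes a lies between the lowest and second-lowest eigenvalues and
    b is an upper bound on the spectrum.
    """
    x = [1.0 + 0.1 * i for i in range(n)]  # deterministic start vector
    for _ in range(iterations):
        x = chebyshev_filter(apply_h, x, degree, a, b)
        norm = math.sqrt(sum(v * v for v in x))
        x = [v / norm for v in x]
    return rayleigh_quotient(apply_h, x), x


if __name__ == "__main__":
    n = 20
    value, _ = lowest_eigenpair(laplacian_1d, n, a=0.05, b=4.0)
    exact = 2.0 - 2.0 * math.cos(math.pi / (n + 1))
    print(value, exact)
```

The operator is passed as a function rather than a stored matrix, mirroring the matrix-free way plane-wave codes usually apply the Hamiltonian. In a block version, the lower bound `a` is typically updated from the Ritz values of the previous iteration instead of being fixed.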
Provider:
doi.org



