Density-fitted singles and doubles coupled cluster on graphics processing units

Taylor & Francis Group; updated 2016-01-18; indexed 2026-04-16
Download link:
https://tandf.figshare.com/articles/dataset/Density_fitted_singles_and_doubles_coupled_cluster_on_graphics_processing_units/963378/2
Resource description:
We adapt an algorithm for singles and doubles coupled cluster (CCSD) that uses density fitting or Cholesky decomposition (CD) in the construction and contraction of all electron repulsion integrals (ERIs) for use on heterogeneous compute nodes consisting of a multicore central processing unit (CPU) and at least one graphics processing unit (GPU). The use of approximate three-index ERIs ameliorates two of the major difficulties in designing scientific algorithms for GPUs: (1) the extremely limited global memory on the devices and (2) the overhead associated with data motion across the bus. For the benzene trimer described by an aug-cc-pVDZ basis set, the use of a single NVIDIA Tesla C2070 (Fermi) GPU accelerates a CD-CCSD computation by a factor of 2.1, relative to the multicore CPU-only algorithm that uses six highly efficient Intel Core i7-3930K CPU cores. The use of two Fermi GPUs provides an acceleration of 2.89, which is comparable to that observed when using a single NVIDIA Kepler K20c GPU (2.73).
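
To illustrate how a Cholesky decomposition yields the approximate three-index quantities mentioned above, here is a minimal sketch in plain Python. It is not the authors' implementation. It applies a textbook pivoted Cholesky factorization to a small symmetric positive semidefinite matrix, which stands in for the ERI matrix (pq|rs) with compound row index pq and compound column index rs. The names `pivoted_cholesky`, `rebuild`, `toy`, and the threshold `tol` are hypothetical, and the numbers are made up.

```python
import math


def pivoted_cholesky(matrix, tol=1e-10):
    """Return vectors L[Q] with matrix[i][j] ~= sum_Q L[Q][i] * L[Q][j].

    `matrix` must be symmetric positive semidefinite (a list of lists).
    """
    n = len(matrix)
    # Residual diagonal: what the vectors found so far fail to reproduce.
    diag = [matrix[i][i] for i in range(n)]
    vectors = []
    while len(vectors) < n:
        pivot = max(range(n), key=lambda i: diag[i])
        if diag[pivot] <= tol:
            break  # remaining diagonal error is below the threshold
        scale = math.sqrt(diag[pivot])
        # Residual of the pivot column, normalised by the pivot's square root.
        vec = [
            (matrix[i][pivot] - sum(v[i] * v[pivot] for v in vectors)) / scale
            for i in range(n)
        ]
        vectors.append(vec)
        for i in range(n):
            diag[i] -= vec[i] * vec[i]
    return vectors


def rebuild(vectors, i, j):
    """Reassemble one matrix element from the factors, on demand."""
    return sum(v[i] * v[j] for v in vectors)


# Toy rank-2 matrix (made-up numbers, not real integrals).
b1 = [1.0, 2.0, 0.0, 1.0]
b2 = [0.0, 1.0, 1.0, -1.0]
toy = [[b1[i] * b1[j] + b2[i] * b2[j] for j in range(4)] for i in range(4)]

factors = pivoted_cholesky(toy)
max_error = max(
    abs(rebuild(factors, i, j) - toy[i][j]) for i in range(4) for j in range(4)
)
print(len(factors), "vectors, max reconstruction error", max_error)
```

In this toy case the factors need fewer stored numbers than the full matrix, and any element can be rebuilt when it is needed. The same idea, at far larger scale, is what lets three-index ERIs reduce both the memory footprint on the device and the volume of data moved across the bus.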

Provided by:
C. David Sherrill
Created:
2014-05-19