TY - JOUR

T1 - Highly efficient GPU eigensolver for three-dimensional photonic crystal band structures with any Bravais lattice

AU - Huang, Tsung Ming

AU - Lin, Wen Wei

AU - Tsai, Hsinhan

AU - Wang, Weichung

N1 - Publisher Copyright: © 2019 Elsevier B.V.

PY - 2019/12

Y1 - 2019/12

N2 - A null-space free method with FFT-based matrix–vector multiplications was previously proposed to solve the Maxwell equations that model three-dimensional photonic crystals. The most time-consuming parts of this method are the FFT-based matrix–vector multiplications. In this article, we propose new mathematical formulas for computing these multiplications and derive highly efficient algorithms for the NVIDIA GPU architecture. The resulting algorithms are approximately two to three times faster than the previous algorithms. Using a single NVIDIA Tesla P100 GPU, we solved a set of generalized eigenvalue problems of dimension 5,184,000 in 17 to 22 seconds per problem. Furthermore, we ported the codes to a GPU cluster and achieved near-linear scalability. To our knowledge, these GPU implementations of the proposed algorithms are the fastest available. The schemes can be applied to simulate three-dimensional photonic crystals with any of the 14 Bravais lattices. These highly efficient schemes and codes open up possibilities for large-scale and near real-time numerical simulations for novel physical discoveries and engineering applications of photonic crystals.

KW - Fast Fourier transform

KW - Null-space free method

KW - Parallel GPU computing

KW - The Maxwell equations

KW - Three-dimensional photonic crystals

UR - http://www.scopus.com/inward/record.url?scp=85070670210&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85070670210&partnerID=8YFLogxK

U2 - 10.1016/j.cpc.2019.07.007

DO - 10.1016/j.cpc.2019.07.007

M3 - Article

AN - SCOPUS:85070670210

SN - 0010-4655

VL - 245

JO - Computer Physics Communications

JF - Computer Physics Communications

M1 - 106841

ER -