TY - JOUR
T1 - Highly efficient GPU eigensolver for three-dimensional photonic crystal band structures with any Bravais lattice
AU - Huang, Tsung Ming
AU - Lin, Wen Wei
AU - Tsai, Hsinhan
AU - Wang, Weichung
N1 - Publisher Copyright: © 2019 Elsevier B.V.
PY - 2019/12
Y1 - 2019/12
N2 - A null-space free method with the FFT-based matrix–vector multiplications was proposed to solve the Maxwell equations that model the three-dimensional photonic crystals. The most time-consuming parts of this method were the FFT-based matrix–vector multiplications. In this article, we propose new mathematical formulas to compute the FFT-based matrix–vector multiplications and derive highly efficient algorithms on top of the NVIDIA GPU architecture. The resulting algorithms are approximately two- to threefold faster than the previous algorithms. We have successfully used a single NVIDIA Tesla P100 GPU to solve a set of generalized eigenvalue problems of 5,184,000 dimensions in 17 to 22 seconds for each problem. Furthermore, we ported the codes to a GPU cluster and achieved near linear scalability. To our knowledge, these GPU implementations of the proposed algorithms are the fastest implementations. The schemes can be applied to simulate a three-dimensional photonic crystal with all 14 Bravais lattices. These highly efficient schemes and codes raise possibilities for large-scale and near real-time numerical simulations for novel physical discoveries and engineering applications of photonic crystals.
AB - A null-space free method with the FFT-based matrix–vector multiplications was proposed to solve the Maxwell equations that model the three-dimensional photonic crystals. The most time-consuming parts of this method were the FFT-based matrix–vector multiplications. In this article, we propose new mathematical formulas to compute the FFT-based matrix–vector multiplications and derive highly efficient algorithms on top of the NVIDIA GPU architecture. The resulting algorithms are approximately two- to threefold faster than the previous algorithms. We have successfully used a single NVIDIA Tesla P100 GPU to solve a set of generalized eigenvalue problems of 5,184,000 dimensions in 17 to 22 seconds for each problem. Furthermore, we ported the codes to a GPU cluster and achieved near linear scalability. To our knowledge, these GPU implementations of the proposed algorithms are the fastest implementations. The schemes can be applied to simulate a three-dimensional photonic crystal with all 14 Bravais lattices. These highly efficient schemes and codes raise possibilities for large-scale and near real-time numerical simulations for novel physical discoveries and engineering applications of photonic crystals.
KW - Fast Fourier transform
KW - Null-space free method
KW - Parallel GPU computing
KW - The Maxwell equations
KW - Three-dimensional photonic crystals
UR - http://www.scopus.com/inward/record.url?scp=85070670210&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85070670210&partnerID=8YFLogxK
U2 - 10.1016/j.cpc.2019.07.007
DO - 10.1016/j.cpc.2019.07.007
M3 - Article
AN - SCOPUS:85070670210
SN - 0010-4655
VL - 245
JO - Computer Physics Communications
JF - Computer Physics Communications
M1 - 106841
ER -