TY - JOUR
T1 - An efficient algorithm for communication set generation of data parallel programs with block-cyclic distribution
AU - Hwang, Gwan Hwan
N1 - Funding Information:
Thanks are due to the Computer Center of the National Normal University, Taiwan, for providing access to the SUN E10K workstation. G.-H. Hwang's work was supported in part by the ROC National Science Council under grant 91-2213-E -003-001 and ROC MOE/NSC program for promoting academic excellence of universities under grant 89-E-FA04-1-4.
PY - 2004/4
Y1 - 2004/4
N2 - Data parallel programming languages, such as High Performance Fortran, are widely regarded as a promising means for writing portable programs for distributed-memory machines. In this paper, we present a new algorithm for computing the communication sets in array section movements with block-cyclic (cyclic (k) in HPF) distribution. Our framework can handle multi-level alignments, multi-dimensional arrays, array intrinsic functions, affine indices and axis exchanges in the array subscript. Instead of employing the linear diophantine equation solver, a new algorithm which does not rely on the linear diophantine equation solver to calculate communication sets is proposed. We use formal proof and experimental results to show that it is more efficient than previous solutions to the same problem. Another important contribution of this paper is that we prove that the compiler is able to compute efficiently the communication sets of block-cyclic distribution as long as the block sizes of the arrays are set to be identical or the lowest common multiple (LCM) of block sizes is not a huge integer. We demonstrate it by thorough complexity analyses and extensive experimental results.
AB - Data parallel programming languages, such as High Performance Fortran, are widely regarded as a promising means for writing portable programs for distributed-memory machines. In this paper, we present a new algorithm for computing the communication sets in array section movements with block-cyclic (cyclic (k) in HPF) distribution. Our framework can handle multi-level alignments, multi-dimensional arrays, array intrinsic functions, affine indices and axis exchanges in the array subscript. Instead of employing the linear diophantine equation solver, a new algorithm which does not rely on the linear diophantine equation solver to calculate communication sets is proposed. We use formal proof and experimental results to show that it is more efficient than previous solutions to the same problem. Another important contribution of this paper is that we prove that the compiler is able to compute efficiently the communication sets of block-cyclic distribution as long as the block sizes of the arrays are set to be identical or the lowest common multiple (LCM) of block sizes is not a huge integer. We demonstrate it by thorough complexity analyses and extensive experimental results.
KW - Block-cyclic distributions
KW - Data parallel programs
KW - Distributed memory machines
KW - HPF compiler
KW - Parallelizing compiler
UR - http://www.scopus.com/inward/record.url?scp=2442694369&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=2442694369&partnerID=8YFLogxK
U2 - 10.1016/j.parco.2004.02.001
DO - 10.1016/j.parco.2004.02.001
M3 - Article
AN - SCOPUS:2442694369
SN - 0167-8191
VL - 30
SP - 473
EP - 501
JO - Parallel Computing
JF - Parallel Computing
IS - 4
ER -