TY - GEN
T1 - High-performance parallel location-aware algorithms for approximate string matching on GPUs
AU - Lin, Cheng Hung
AU - Huang, Chun Cheng
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2016/1/15
Y1 - 2016/1/15
N2 - Approximate string matching has been widely used in many applications, including deoxyribonucleic acid sequence searching, spell checking, text mining, and spam filters. The method is designed to find all locations of strings that approximately match a pattern in accordance with the number of insertion, deletion, and substitution operations. Among the proposed algorithms, the bit-parallel algorithms are considered to be the best and highly efficient algorithms. However, the traditional bit-parallel algorithms lacks the ability of identifying the start and end positions of a matched pattern. Furthermore, acceleration of the bit-parallel algorithms has become a crucial issue for processing big data nowadays. In this paper, we propose two kinds of parallel location-aware algorithms called data-segmented parallelism and high-degree parallelism as means to accelerate approximate string matching using graphic processing units. Experimental results show that the high-degree parallelism on GPUs achieves significant improvement in system and kernel throughputs compared to CPU counterparts. Compared to state-of-the-art approaches, the proposed high-degree parallelism achieves 11 to 105 times improvement.
AB - Approximate string matching has been widely used in many applications, including deoxyribonucleic acid sequence searching, spell checking, text mining, and spam filters. The method is designed to find all locations of strings that approximately match a pattern in accordance with the number of insertion, deletion, and substitution operations. Among the proposed algorithms, the bit-parallel algorithms are considered to be the best and highly efficient algorithms. However, the traditional bit-parallel algorithms lacks the ability of identifying the start and end positions of a matched pattern. Furthermore, acceleration of the bit-parallel algorithms has become a crucial issue for processing big data nowadays. In this paper, we propose two kinds of parallel location-aware algorithms called data-segmented parallelism and high-degree parallelism as means to accelerate approximate string matching using graphic processing units. Experimental results show that the high-degree parallelism on GPUs achieves significant improvement in system and kernel throughputs compared to CPU counterparts. Compared to state-of-the-art approaches, the proposed high-degree parallelism achieves 11 to 105 times improvement.
KW - Approximate string matching
KW - Bit-parallel algorithm
KW - Graphic processing units
KW - Levenshtein distance
KW - Nondeterministic finite automaton
KW - Parallel algorithm
UR - http://www.scopus.com/inward/record.url?scp=84964619916&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84964619916&partnerID=8YFLogxK
U2 - 10.1109/ICPADS.2015.77
DO - 10.1109/ICPADS.2015.77
M3 - Conference contribution
AN - SCOPUS:84964619916
T3 - Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
SP - 570
EP - 575
BT - Proceedings - 2015 IEEE 21st International Conference on Parallel and Distributed Systems, ICPADS 2015
PB - IEEE Computer Society
T2 - 21st IEEE International Conference on Parallel and Distributed Systems, ICPADS 2015
Y2 - 14 December 2015 through 17 December 2015
ER -