Accelerating pattern matching using a novel parallel algorithm on gpus

Cheng Hung Lin, Chen Hsiung Liu, Lung Sheng Chien, Shih Chieh Chang

Research output: Contribution to journalArticle

51 Citations (Scopus)

Abstract

Graphics processing units (GPUs) have attracted a lot of attention due to their cost-effective and enormous power for massive data parallel computing. In this paper, we propose a novel parallel algorithm for exact pattern matching on GPUs. A traditional exact pattern matching algorithm matches multiple patterns simultaneously by traversing a special state machine called an Aho-Corasick machine. Considering the particular parallel architecture of GPUs, in this paper, we first propose an efficient state machine on which we perform very efficient parallel algorithms. Also, several techniques are introduced to do optimization on GPUs, including reducing global memory transactions of input buffer, reducing latency of transition table lookup, eliminating output table accesses, avoiding bank-conflict of shared memory, coalescing writes to global memory, and enhancing data transmission via peripheral component interconnect express. We evaluate the performance of the proposed algorithm using attack patterns from Snort V2.8 and input streams from DEFCON. The experimental results show that the proposed algorithm performed on NVIDIA GPUs achieves up to 143.16-Gbps throughput, 14.74 times faster than the Aho-Corasick algorithm implemented on a 3.06-GHz quad-core CPU with the OpenMP. The library of the proposed algorithm is publically accessible through Google Code.

Original languageEnglish
Article number12
Pages (from-to)1906-1916
Number of pages11
JournalIEEE Transactions on Computers
Volume62
Issue number10
DOIs
Publication statusPublished - 2013 Sep 9

Fingerprint

Pattern matching
Graphics Processing Unit
Pattern Matching
Parallel algorithms
Parallel Algorithms
State Machine
Data storage equipment
Computer peripheral equipment
Table lookup
Parallel architectures
Parallel Architectures
OpenMP
Look-up Table
Interconnect
Parallel processing systems
Parallel Computing
Matching Algorithm
Shared Memory
Data Transmission
Data communication systems

Keywords

  • Aho-Corasick
  • Graphics processing units
  • Parallel algorithm
  • Pattern matching

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics

Cite this

Accelerating pattern matching using a novel parallel algorithm on gpus. / Lin, Cheng Hung; Liu, Chen Hsiung; Chien, Lung Sheng; Chang, Shih Chieh.

In: IEEE Transactions on Computers, Vol. 62, No. 10, 12, 09.09.2013, p. 1906-1916.

Research output: Contribution to journalArticle

Lin, Cheng Hung ; Liu, Chen Hsiung ; Chien, Lung Sheng ; Chang, Shih Chieh. / Accelerating pattern matching using a novel parallel algorithm on gpus. In: IEEE Transactions on Computers. 2013 ; Vol. 62, No. 10. pp. 1906-1916.
@article{3a3c7448b0fc4210ba342bcf3f6d5c5e,
title = "Accelerating pattern matching using a novel parallel algorithm on gpus",
abstract = "Graphics processing units (GPUs) have attracted a lot of attention due to their cost-effective and enormous power for massive data parallel computing. In this paper, we propose a novel parallel algorithm for exact pattern matching on GPUs. A traditional exact pattern matching algorithm matches multiple patterns simultaneously by traversing a special state machine called an Aho-Corasick machine. Considering the particular parallel architecture of GPUs, in this paper, we first propose an efficient state machine on which we perform very efficient parallel algorithms. Also, several techniques are introduced to do optimization on GPUs, including reducing global memory transactions of input buffer, reducing latency of transition table lookup, eliminating output table accesses, avoiding bank-conflict of shared memory, coalescing writes to global memory, and enhancing data transmission via peripheral component interconnect express. We evaluate the performance of the proposed algorithm using attack patterns from Snort V2.8 and input streams from DEFCON. The experimental results show that the proposed algorithm performed on NVIDIA GPUs achieves up to 143.16-Gbps throughput, 14.74 times faster than the Aho-Corasick algorithm implemented on a 3.06-GHz quad-core CPU with the OpenMP. The library of the proposed algorithm is publically accessible through Google Code.",
keywords = "Aho-Corasick, Graphics processing units, Parallel algorithm, Pattern matching",
author = "Lin, {Cheng Hung} and Liu, {Chen Hsiung} and Chien, {Lung Sheng} and Chang, {Shih Chieh}",
year = "2013",
month = "9",
day = "9",
doi = "10.1109/TC.2012.254",
language = "English",
volume = "62",
pages = "1906--1916",
journal = "IEEE Transactions on Computers",
issn = "0018-9340",
publisher = "IEEE Computer Society",
number = "10",

}

TY - JOUR

T1 - Accelerating pattern matching using a novel parallel algorithm on gpus

AU - Lin, Cheng Hung

AU - Liu, Chen Hsiung

AU - Chien, Lung Sheng

AU - Chang, Shih Chieh

PY - 2013/9/9

Y1 - 2013/9/9

N2 - Graphics processing units (GPUs) have attracted a lot of attention due to their cost-effective and enormous power for massive data parallel computing. In this paper, we propose a novel parallel algorithm for exact pattern matching on GPUs. A traditional exact pattern matching algorithm matches multiple patterns simultaneously by traversing a special state machine called an Aho-Corasick machine. Considering the particular parallel architecture of GPUs, in this paper, we first propose an efficient state machine on which we perform very efficient parallel algorithms. Also, several techniques are introduced to do optimization on GPUs, including reducing global memory transactions of input buffer, reducing latency of transition table lookup, eliminating output table accesses, avoiding bank-conflict of shared memory, coalescing writes to global memory, and enhancing data transmission via peripheral component interconnect express. We evaluate the performance of the proposed algorithm using attack patterns from Snort V2.8 and input streams from DEFCON. The experimental results show that the proposed algorithm performed on NVIDIA GPUs achieves up to 143.16-Gbps throughput, 14.74 times faster than the Aho-Corasick algorithm implemented on a 3.06-GHz quad-core CPU with the OpenMP. The library of the proposed algorithm is publically accessible through Google Code.

AB - Graphics processing units (GPUs) have attracted a lot of attention due to their cost-effective and enormous power for massive data parallel computing. In this paper, we propose a novel parallel algorithm for exact pattern matching on GPUs. A traditional exact pattern matching algorithm matches multiple patterns simultaneously by traversing a special state machine called an Aho-Corasick machine. Considering the particular parallel architecture of GPUs, in this paper, we first propose an efficient state machine on which we perform very efficient parallel algorithms. Also, several techniques are introduced to do optimization on GPUs, including reducing global memory transactions of input buffer, reducing latency of transition table lookup, eliminating output table accesses, avoiding bank-conflict of shared memory, coalescing writes to global memory, and enhancing data transmission via peripheral component interconnect express. We evaluate the performance of the proposed algorithm using attack patterns from Snort V2.8 and input streams from DEFCON. The experimental results show that the proposed algorithm performed on NVIDIA GPUs achieves up to 143.16-Gbps throughput, 14.74 times faster than the Aho-Corasick algorithm implemented on a 3.06-GHz quad-core CPU with the OpenMP. The library of the proposed algorithm is publically accessible through Google Code.

KW - Aho-Corasick

KW - Graphics processing units

KW - Parallel algorithm

KW - Pattern matching

UR - http://www.scopus.com/inward/record.url?scp=84883364091&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84883364091&partnerID=8YFLogxK

U2 - 10.1109/TC.2012.254

DO - 10.1109/TC.2012.254

M3 - Article

AN - SCOPUS:84883364091

VL - 62

SP - 1906

EP - 1916

JO - IEEE Transactions on Computers

JF - IEEE Transactions on Computers

SN - 0018-9340

IS - 10

M1 - 12

ER -