Abstract
Deoxyribonucleic acid (DNA) sequences are difficult to analyze similarity due to their length and complexity. The challenge lies in being able to use digital signal processing (DSP) to solve highly relevant problems in DNA sequences. Here, we transfer a one-dimensional (ID) DNA sequence into a two-dimensional (2D) pattern by using the Peano scan algorithm. Four complex values are assigned to the characters "A", "C", "T", and "G", respectively. Then, Fourier transform is employed to obtain far-field amplitude distribution of the 2D pattern. Hereto, a ID DNA sequence becomes a 2D image pattern. Features are extracted from the 2D image pattern with the Principle Component Analysis (PCA) method. Therefore, the DNA sequence database can be established. Unfortunately, comparing features may take a long time when the database is large since multi-dimensional features are often available. This problem is solved by building indexing structure like a filter to filter-out non-relevant items and select a subset of candidate DNA sequences. Clustering algorithms can organize the multi-dimensional feature data into the indexing structure for effective retrieval. Accordingly, the query sequence can be only compared against candidate ones rather than all sequences in database. In fact, our algorithm provides a pre-processing method to accelerate the DNA sequence search process. Finally, experimental results further demonstrate the efficiency of our proposed algorithm for DNA sequences similarity retrieval.
Original language | English |
---|---|
Pages (from-to) | 635-645 |
Number of pages | 11 |
Journal | Proceedings of SPIE - The International Society for Optical Engineering |
Volume | 5096 |
DOIs | |
Publication status | Published - 2003 |
Externally published | Yes |
Event | PROCEEDINGS OF SPIE SPIE - The International Society for Optical Engineering: Signal Processing, Sensor Fusion, and Target Recognition XII - Orlando, FL, United States Duration: 2003 Apr 21 → 2003 Apr 23 |
Keywords
- Clustering algorithm
- DNA
- Fourier transformation
- Indexing structure
- Peano scan
- Principle Component Analysis
- Similarity retrieval
ASJC Scopus subject areas
- Electronic, Optical and Magnetic Materials
- Condensed Matter Physics
- Computer Science Applications
- Applied Mathematics
- Electrical and Electronic Engineering