DNA Sequence Similarity Search through Content-Based Retrieval Technique

  • Chia H. Yeh*
  • Po Y. Sung
  • Hsuan T. Chang
  • Chung J. Kuo
  • *Corresponding author for this work

Research output: Contribution to journal, Conference article, peer-reviewed

Abstract

Deoxyribonucleic acid (DNA) sequences are difficult to analyze for similarity because of their length and complexity. The challenge lies in applying digital signal processing (DSP) to highly relevant problems in DNA sequence analysis. Here, we map a one-dimensional (1D) DNA sequence onto a two-dimensional (2D) pattern by using the Peano scan algorithm. Four complex values are assigned to the characters "A", "C", "T", and "G", respectively. The Fourier transform is then applied to obtain the far-field amplitude distribution of the 2D pattern. In this way, a 1D DNA sequence becomes a 2D image pattern. Features are extracted from the 2D image pattern with Principal Component Analysis (PCA), and a DNA sequence database can then be established. However, comparing features may take a long time when the database is large, because the features are multi-dimensional. This problem is solved by building an indexing structure that acts as a filter, screening out non-relevant items and selecting a subset of candidate DNA sequences. Clustering algorithms organize the multi-dimensional feature data into the indexing structure for effective retrieval. Accordingly, the query sequence needs to be compared only against the candidates rather than against every sequence in the database. Our algorithm thus provides a pre-processing method that accelerates the DNA sequence search process. Experimental results demonstrate the efficiency of the proposed algorithm for DNA sequence similarity retrieval.
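
The feature-extraction steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the four complex codes in `BASE_CODES` are hypothetical (the abstract does not give the actual values), a Hilbert curve stands in for the Peano scan as the space-filling path, and PCA is computed with a plain SVD. The function names are illustrative only, and the clustering-based index is not covered.

```python
import numpy as np

# Hypothetical complex codes; the abstract does not state the actual values.
BASE_CODES = {"A": 1 + 0j, "C": -1 + 0j, "T": 0 + 1j, "G": 0 - 1j}


def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order grid.

    Assumption: a Hilbert curve is used here in place of the Peano scan.
    """
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the current sub-square.
                x = s - 1 - x
                y = s - 1 - y
            # Transpose the current sub-square.
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def sequence_to_image(seq, order):
    """Lay a DNA string along the curve; unused cells stay zero."""
    n = 1 << order
    img = np.zeros((n, n), dtype=complex)
    for d, base in enumerate(seq[: n * n]):
        x, y = hilbert_d2xy(order, d)
        img[y, x] = BASE_CODES[base]
    return img


def amplitude_spectrum(seq, order):
    """Centered magnitude of the 2D Fourier transform of the pattern."""
    img = sequence_to_image(seq, order)
    return np.abs(np.fft.fftshift(np.fft.fft2(img)))


def pca_features(spectra, k):
    """Project flattened spectra onto their top-k principal components."""
    X = np.stack([s.ravel() for s in spectra])
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]
    return Xc @ components.T, mean, components
```

With a small database of strings, `pca_features([amplitude_spectrum(s, order) for s in database], k)` returns one `k`-dimensional feature vector per sequence. A query can be projected with the returned `mean` and `components` and compared by, for example, Euclidean distance; that distance measure is also an assumption here.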

Original language: English
Pages (from-to): 635-645
Number of pages: 11
Journal: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 5096
DOIs
Publication status: Published - 2003
Externally published
Event: Proceedings of SPIE - The International Society for Optical Engineering: Signal Processing, Sensor Fusion, and Target Recognition XII - Orlando, FL, United States
Duration: 21 Apr 2003 - 23 Apr 2003

ASJC Scopus subject areas

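  • Electronic, Optical and Magnetic Materials
  • Condensed Matter Physics
  • Computer Science Applications
  • Applied Mathematics
  • Electrical and Electronic Engineering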
