摘要
We proposed a fast and unsupervised clustering method, minimum span clustering (MSC), for analyzing the sequence-structure-function relationship of biological networks, and demonstrated its validity in clustering the sequence/structure similarity networks (SSN) of 682 membrane protein (MP) chains. The MSC clustering of MPs based on their sequence information was found to be consistent with their tertiary structures and functions. For the largest seven clusters predicted by MSC, the consistency in chain function within the same cluster is found to be 100%. From analyzing the edge distribution of SSN for MPs, we found a characteristic threshold distance for the boundary between clusters, over which SSN of MPs could be properly clustered by an unsupervised sparsification of the network distance matrix. The clustering results of MPs from both MSC and the unsupervised sparsification methods are consistent with each other, and have high intracluster similarity and low intercluster similarity in sequence, structure, and function. Our study showed a strong sequence-structure-function relationship of MPs. We discussed evidence of convergent evolution of MPs and suggested applications in finding structural similarities and predicting biological functions of MP chains based on their sequence information.
原文 | 英語 |
---|---|
頁(從 - 到) | 1450-1461 |
頁數 | 12 |
期刊 | Proteins: Structure, Function and Bioinformatics |
卷 | 83 |
發行號 | 8 |
DOIs | |
出版狀態 | 已發佈 - 2015 8月 1 |
ASJC Scopus subject areas
- 結構生物學
- 生物化學
- 分子生物學