Parameterized BLOSUM Matrices for Protein Alignment

Dandan Song, Jiaxing Chen, Guang Chen, Ning Li, Jin Li, Jun Fan, Dongbo Bu, Shuai Cheng Li

Research output: Contribution to journal › Article › peer-review

Abstract

Protein alignment is a basic step in much molecular biology research. The BLOSUM matrices, especially BLOSUM62, are the de facto standard matrices for protein alignment. However, after the matrices had been widely used for 15 years, programming errors were found in the initial version of the source code used to generate them. Surprisingly, after the bugs were corrected, the “intended” BLOSUM62 matrix performed consistently worse than the “miscalculated” one. In this paper, we find linear relationships among the eigenvalues of the matrices and propose an algorithm to find optimal unified eigenvectors. With these, we can parameterize a matrix BLOSUMx for any value of x, where x can vary continuously. We compare the effectiveness of our parameterized isentropic matrix with that of BLOSUM62. Furthermore, we propose an iterative alignment and matrix selection process that adaptively finds the best parameter and globally aligns two sequences. Experiments on aligning 13,667 families from the Pfam database and on clustering MHC II protein sequences show improved accuracy, demonstrating the effectiveness of the proposed method.
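The parameterization idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes, purely for illustration, that a set of shared orthonormal eigenvectors is already known and that each eigenvalue varies linearly with x. The 2x2 toy data and the names `fit_line`, `reconstruct`, and `parameterized_matrix` are hypothetical; real BLOSUM matrices are 20x20, and the procedure for finding the unified eigenvectors is not reproduced here.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def reconstruct(vectors, eigenvalues):
    """Build sum_k eigenvalues[k] * v_k v_k^T from orthonormal vectors v_k."""
    n = len(vectors[0])
    matrix = [[0.0] * n for _ in range(n)]
    for lam, v in zip(eigenvalues, vectors):
        for i in range(n):
            for j in range(n):
                matrix[i][j] += lam * v[i] * v[j]
    return matrix

def parameterized_matrix(x, vectors, lines):
    """Evaluate each fitted eigenvalue line at x, then rebuild the matrix."""
    eigenvalues = [slope * x + intercept for slope, intercept in lines]
    return reconstruct(vectors, eigenvalues)

# Toy data (assumption): two shared eigenvectors and the eigenvalues
# observed for three hypothetical matrices indexed by x = 50, 60, 70.
s = 1 / math.sqrt(2)
VECTORS = [[s, s], [s, -s]]
XS = [50, 60, 70]
EIGENVALUES_BY_VECTOR = [[1.0, 2.0, 3.0], [5.0, 5.5, 6.0]]

# Fit one line per eigenvalue, then evaluate at an x between the known ones.
LINES = [fit_line(XS, ys) for ys in EIGENVALUES_BY_VECTOR]
matrix_65 = parameterized_matrix(65, VECTORS, LINES)
```

Because the eigenvectors are held fixed and only the eigenvalues depend on x, the sketch yields a symmetric matrix for any real x, not only for the values at which a matrix was originally tabulated.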
Original language: English
Pages (from-to): 686-694
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 12
Issue number: 3
DOIs
Publication status: Published - 31 Oct 2014
