Book of Abstracts: Albany 2007

Conversation 15
June 19-23 2007

Identification of Functional Patterns in Protein Families using Sequence Correlation Entropy

Protein functional site prediction methods are important for the classification of uncharacterized proteins. The availability of such methods might allow for the identification of classes of proteins that could be involved in different diseases. Here we present a method that probes covariations in different classes of residues occurring along the primary sequences of protein families. The method computes the Sequence Correlation Entropy (SCE) [1] using a quenched probability $P_{S_k}(i,j)$ of finding a given type of residue pair $(i,j)$ at a sequence separation $S_k$, and allows us to classify protein families based on their SCE values. In the current study, we classify 1022 families from the PFAM database using the SCE approach. For fast and reliable identification of functional motifs, the SCE values of a combination of chemical patterns across protein families are further grouped using clustering algorithms. The method is benchmarked on its ability to identify the correct functional classification in test cases. Predictions on a number of uncharacterized proteins will be presented and discussed.
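
As a rough illustration of the kind of quantity involved, the sketch below estimates the probability of each residue-class pair at a fixed separation k across the sequences of a family, then scores how far those pair probabilities deviate from what independent residues would give. It is a minimal sketch under stated assumptions, not the method of [1]: the four-class residue grouping, the function names, and the relative-entropy form of the score are all hypothetical choices made here, and the exact SCE definition should be taken from [1].

```python
import math
from collections import Counter

# Hypothetical four-class residue grouping (an assumption, not from the abstract).
RESIDUE_CLASS = {}
for aa in "ACFILMVW":
    RESIDUE_CLASS[aa] = "H"   # hydrophobic
for aa in "GHNPQSTY":
    RESIDUE_CLASS[aa] = "P"   # polar
for aa in "KR":
    RESIDUE_CLASS[aa] = "+"   # positively charged
for aa in "DE":
    RESIDUE_CLASS[aa] = "-"   # negatively charged


def pair_probabilities(sequences, k):
    """Estimate class-pair probabilities at separation k, pooled over sequences.

    Returns (p_pair, p_single): probabilities of ordered class pairs (a, b)
    found k positions apart, and of single residue classes.
    """
    pair_counts = Counter()
    single_counts = Counter()
    for seq in sequences:
        # Unknown symbols (gaps, X, ...) map to None and are skipped below,
        # so they do not shift the separation between real residues.
        classes = [RESIDUE_CLASS.get(aa) for aa in seq.upper()]
        single_counts.update(c for c in classes if c is not None)
        for a, b in zip(classes, classes[k:]):
            if a is not None and b is not None:
                pair_counts[(a, b)] += 1
    n_pairs = sum(pair_counts.values())
    n_single = sum(single_counts.values())
    p_pair = {pair: n / n_pairs for pair, n in pair_counts.items()}
    p_single = {c: n / n_single for c, n in single_counts.items()}
    return p_pair, p_single


def correlation_entropy(sequences, k):
    """Relative entropy between observed pair probabilities and the
    independent-residue expectation (an assumed, illustrative score)."""
    p_pair, p_single = pair_probabilities(sequences, k)
    return sum(
        p * math.log(p / (p_single[a] * p_single[b]))
        for (a, b), p in p_pair.items()
    )


if __name__ == "__main__":
    family = ["KDKDKDKD", "AKDEKDAL"]  # toy sequences, not real PFAM data
    for k in (1, 2, 3):
        print(k, correlation_entropy(family, k))
```

With this form of the score, a family whose residue classes are uncorrelated at separation k gives a value near zero, while strongly alternating patterns such as the charged repeats in the toy input give larger values. Per-family scores of this kind, computed for several class pairs and separations, are the sort of feature vector that could then be passed to a clustering algorithm.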

References and Footnotes
  1. R. I. Dima and D. Thirumalai, "Proteins associated with diseases show enhanced sequence correlation between charged residues", Bioinformatics, 20, p. 2345 (2004).

Dina Wassaf (1)
Kenneth Marx (1)
Ruxandra I. Dima (2)

(1) Dept. of Chemistry,
Univ. of Massachusetts,
Lowell, MA 01854
(2) Dept. of Chemistry,
Univ. of Cincinnati,
Cincinnati, OH 45221

Phone: 978 934 3658
Fax: 978 394 3013
E-mail: Dina_Wassaf@student.uml.edu