Book of Abstracts: Albany 2007
June 19-23, 2007
Identification of Functional Patterns in Protein Families using Sequence Correlation Entropy
Methods for predicting protein functional sites are important for classifying uncharacterized proteins. Such methods could help identify classes of proteins involved in different diseases. Here we present a method that probes covariations among different classes of residues along the primary sequences of protein families. The method computes the Sequence Correlation Entropy (SCE) using a quenched probability P_{S_k}(i,j) of finding a given type of residue pair at a separation S_k, and allows us to classify protein families by their SCE values. In the current study, we classify 1022 families from the PFAM database using the SCE approach. For fast and reliable identification of functional motifs, the SCE values for combinations of chemical patterns across protein families are further grouped using clustering algorithms. The method is benchmarked on its ability to identify the correct functional classification in test cases. Predictions for a number of uncharacterized proteins will be presented and discussed.
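The abstract does not give the SCE formula, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that P_{S_k}(i,j) can be estimated as the relative frequency of residue-class pairs (i,j) found S_k positions apart across a family's sequences, and that the entropy is the Shannon entropy of that distribution. The residue classes in RESIDUE_CLASS and the function names are hypothetical choices made for illustration.

```python
import math
from collections import Counter

# Hypothetical grouping of amino acids into chemical classes
# (an assumption; the abstract does not list the classes used).
RESIDUE_CLASS = {}
for aa in "AVLIMFWC":
    RESIDUE_CLASS[aa] = "hydrophobic"
for aa in "STNQYGP":
    RESIDUE_CLASS[aa] = "polar"
for aa in "DE":
    RESIDUE_CLASS[aa] = "negative"
for aa in "KRH":
    RESIDUE_CLASS[aa] = "positive"


def pair_probabilities(sequences, separation):
    """Estimate the probability of each ordered residue-class pair
    found `separation` positions apart, pooled over all sequences."""
    counts = Counter()
    for seq in sequences:
        for pos in range(len(seq) - separation):
            first = RESIDUE_CLASS.get(seq[pos])
            second = RESIDUE_CLASS.get(seq[pos + separation])
            if first is None or second is None:
                continue  # skip gaps and non-standard symbols
            counts[(first, second)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


def correlation_entropy(sequences, separation):
    """Shannon entropy (in bits) of the pair distribution at one separation.
    Assumed form; the abstract does not define the SCE explicitly."""
    probs = pair_probabilities(sequences, separation)
    return sum(-p * math.log2(p) for p in probs.values())


def sce_profile(sequences, separations):
    """Entropy at each separation, usable as a feature vector per family."""
    return [correlation_entropy(sequences, s) for s in separations]
```

Under these assumptions, a call such as sce_profile(family_sequences, range(1, 11)) would give one entropy value per separation for a family. Vectors of this kind could then be passed to a standard clustering algorithm, in the spirit of the grouping step described above.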
References and Footnotes
1 Dept. of Chemistry,