Albany 2015: Book of Abstracts

Albany 2015
Conversation 19
June 9-13 2015
©Adenine Press (2012)

Influence of Homo-Repeats on Functions and Aggregation Propensities of Protein Chains

Single amino acid repeats, or homo-repeats, represent an unexplored area with many connections to genome function and evolution. Which homo-repeats are expected to occur at a given length in a proteome? In a large-scale analysis, we calculated amino acid frequencies for ∼1.5 million proteins, measuring occurrences at various sequence lengths across 122 eukaryotic and bacterial proteomes. We found that the number of proteins with homo-repeats is significantly larger than expected from theoretical estimates (Fig. 1). Because long homo-repeats are present in proteins with a high number of interactions, we hypothesize that strong positive selection acts on their evolution. Is there a link between diseases and the occurrence of specific homo-repeats? Using the MIM database of human disease, we found that homo-repeats longer than four residues of leucine, serine, alanine, glycine, and proline have a higher propensity to be associated with disease. Indeed, developmental diseases are known to be associated with homo-repeat expansions such as poly-A (alanine): synpolydactyly type II (HOXD13), blepharophimosis (FOXL2), oculopharyngeal muscular dystrophy (PABPN1), infantile spasm syndrome (ARX), and holoprosencephaly (ZIC2). Expansion of poly-Q is also implicated in several neurodegenerative diseases, including Huntington's disease and several spinocerebellar ataxias. For each amino acid, we determined the homo-repeat length that can affect aggregation properties and compared it with that in random proteomes. The longer the homo-repeats in a protein, the stronger the aggregation ability observed for its sequence. The ability to regulate protein aggregation could be a general tool for drug development.
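
The comparison between observed and expected homo-repeat counts can be illustrated with a minimal sketch. It is not the procedure used in this work: the null model here assumes that residues are drawn independently with a fixed frequency p, and the function names (longest_runs, proteins_with_repeat, prob_run_at_least, expected_count) are hypothetical, introduced only for illustration.

```python
import re


def longest_runs(seq):
    """Return {residue: length of its longest homo-repeat} for one sequence."""
    runs = {}
    # "(.)\1*" matches each maximal run of one repeated character.
    for m in re.finditer(r"(.)\1*", seq):
        aa = m.group(1)
        runs[aa] = max(runs.get(aa, 0), len(m.group(0)))
    return runs


def proteins_with_repeat(seqs, aa, n):
    """Observed count: sequences containing a run of residue aa of length >= n."""
    return sum(1 for s in seqs if longest_runs(s).get(aa, 0) >= n)


def prob_run_at_least(length, p, n):
    """Probability that a random sequence of the given length contains a run
    of >= n copies of a residue with frequency p (assumed i.i.d. null model)."""
    if n <= 0:
        return 1.0
    # state[k]: probability that no run of length n has occurred so far
    # and the sequence currently ends in exactly k copies of the residue.
    state = [0.0] * n
    state[0] = 1.0
    for _ in range(length):
        new = [0.0] * n
        new[0] = sum(state) * (1.0 - p)   # any other residue resets the run
        for k in range(1, n):
            new[k] = state[k - 1] * p     # the run grows by one residue
        state = new
    return 1.0 - sum(state)


def expected_count(seqs, aa, n):
    """Expected number of sequences with a run >= n under the null model,
    with p estimated as the pooled frequency of aa over all sequences."""
    total = sum(len(s) for s in seqs)
    p = sum(s.count(aa) for s in seqs) / total
    return sum(prob_run_at_least(len(s), p, n) for s in seqs)
```

Comparing proteins_with_repeat(seqs, "Q", n) with expected_count(seqs, "Q", n) over a range of n gives, for one amino acid, the kind of observed-versus-expected comparison described above, under the stated independence assumption.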


Figure 1. Number of proteins containing homo-repeats as a function of repeat length, for each of the 20 amino acids, in the Dictyostelium discoideum proteome.

This research has been supported by the Russian Science Foundation Grant 14-14-00536.

Oxana V. Galzitskaya
Mikhail Lobanov

Group of Bioinformatics of Institute of Protein Research
Moscow region, Russia, 142290

Ph: (+74967) 318275
Fx: (+74967) 418435