Book of Abstracts: Albany 2007
June 19-23 2007
How protein physics shapes biological evolution
One of the key unsolved problems in biology today is understanding the impact of classical evolution, at the organismal and population levels, on the molecular evolution of genes and proteins. Here we present a microscopic physical model of early, pre-Darwinian biological evolution, in which phenotype (organism life expectancy) is directly related to genotype (the stability of the organism's proteins, which can be determined exactly in the model). Simulating the model on a computer, we consistently observe a "Big Bang" scenario in which exponential population growth ensues as favorable sequence-structure combinations (precursors of stable proteins) are discovered. After that, the random diversity of the structural space abruptly collapses into a small set of preferred proteins. The key feature of evolved proteins is their remarkable robustness with respect to point mutations. We observe that protein folds remain stable and abundant in the population on time scales much longer than the mutation time or the organism lifetime, and that the distribution of the lifetimes of dominant folds in a population approximately follows a power law. The separation of evolutionary time scales between the discovery of new folds and the generation of new sequences gives rise to protein families and superfamilies whose sizes are power-law distributed, closely matching the same distributions for real proteins. At the population level we observe the emergence of quasispecies: subpopulations whose genomes are similar within a species but distinct between species. Further, the model predicts that ancient protein domains represent a highly connected and clustered subset of all protein domains, in agreement with what is observed for real protein domains.
Further, we note that the evolution of populations of organisms, each carrying M genes, is isomorphic to the problem of M-dimensional random walkers in the space of protein sequences with two absorbing boundary conditions: one at high energies of the protein native conformations, where proteins become unviable and the organisms carrying their genes die, and one at low energies, where proteins are depleted of possible stable sequences. This problem can be solved exactly, and we obtain the relation between mutation rate, duplication rate, and the stability range of the proteins for which populations remain viable. This formula predicts the effect of mutational meltdown and provides important insights into the genomic organization and evolution of viruses and simple organisms. It predicts that the genomes of RNA viruses are much shorter than those of dsDNA viruses and that the genomes of thermophilic bacteria are shorter than those of their mesophilic counterparts. All these predictions are verified in bioinformatics analyses and in mutational experiments on RNA viruses. Together, these results provide a complete, microscopic, first-principles picture of the early stages of evolution, prior to the emergence of error-correction mechanisms, during which most structural domains were discovered.
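As an illustration only, the random-walk picture described above can be sketched numerically. The code below is not the exact solution referred to in the abstract. It is a toy simulation under assumptions that are not in the abstract: each gene's stability is an integer coordinate between two absorbing boundaries at 0 and `width`, each mutation shifts it by one unit up or down with equal probability, genes mutate independently with a hypothetical per-gene probability `mu` per generation, and duplication is ignored. All function and parameter names are hypothetical.

```python
import random


def survival_time(m_genes, start, width, mu, rng, max_generations=100_000):
    """Generations until any of the M coordinates reaches 0 or `width`.

    Toy assumption: each gene is a symmetric unit-step walker that moves
    with probability `mu` per generation; both boundaries are absorbing.
    """
    pos = [start] * m_genes
    for generation in range(1, max_generations + 1):
        for i in range(m_genes):
            if rng.random() < mu:              # this gene mutates
                pos[i] += rng.choice((-1, 1))  # stability shifts by one unit
                if pos[i] <= 0 or pos[i] >= width:
                    return generation          # lineage is absorbed
    return max_generations                     # censored: never absorbed


def mean_survival_time(m_genes, start, width, mu=0.5, trials=2000, seed=0):
    """Average survival time over independent lineages (seeded for repeatability)."""
    rng = random.Random(seed)
    total = sum(survival_time(m_genes, start, width, mu, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for m in (1, 5, 20):
        print(m, mean_survival_time(m, start=5, width=10))
```

For a single gene with `mu=1.0`, the estimate can be checked against the textbook result for a symmetric walk started at k between absorbing boundaries at 0 and N, whose mean absorption time is k(N - k). In this toy setting, increasing the number of genes M at a fixed per-gene mutation rate shortens the mean survival time, which is the qualitative link between genome length and mutation rate that the abstract describes.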
Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street,