Albany 2001

Biomolecular
SUNY at Albany
June 19-23, 2001

Evolution of the Triplet Code and the Earliest Proteins.

A complete chronology of amino acids is derived using 44 different criteria and hypotheses about the order in which amino acids appeared in early evolution. The result is summarized in a consensus ranking, earliest first (1): G, A, V, D, S, E, P, L, T, N, R, I, K, Q, C, F, H, M, Y, W. The established order allows the chronology of codons to be reconstructed as well. Three striking features are revealed: (i) the codons providing the most stable codon-anticodon interactions appear first, (ii) new codons appear simultaneously with their complementary counterparts, and (iii) new codons are the lowest-cost derivatives of codons acquired earlier, arising by mutations in the third position and by complementary copying.

This evolutionary scheme suggests two essentially independent amino-acid alphabets for the two complementary coding strands of the earliest small genes. The Glycine family includes the amino acids encoded by triplets with purines in the central position: G, D, S, E, Q, N, R, K, C, H, Y and W. The Alanine family consists of the amino acids A, V, P, S, L, T, I, M and F, with pyrimidines in the central positions of their codons.

After the earliest genes were fused to form longer molecules, the encoded protein sequences presumably contained a mosaic of short patches of residues from the two different alphabets (2). This expectation is confirmed by large-scale analysis of protein sequences from complete bacterial genomes. The detected mosaic unit is 6 amino-acid residues long. It represents the first stage of protein evolution. The second stage is reflected in the sequences as well: a preferred distance of 25-30 residues is observed between the hydrophobic residues V, A, G, L and I. This size corresponds to that of closed loops, the basic elements of protein structure (3). The loop closure stage of protein evolution involved the formation of van der Waals locks at the loop ends (4). Fusion of the early genes encoding the loops resulted in modern domains (folds), all of which appear as loop-n-lock structures (4).
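
The split into two alphabets can be reproduced from the standard genetic code alone. The following is a minimal sketch, not the analysis used in the cited work: it assumes the standard codon table, writes codons in the DNA alphabet (T in place of U), and ignores stop codons. All names (CODON_TABLE, split_by_central_base, reverse_complement) are hypothetical and chosen for illustration only. It groups amino acids by whether the central base of their codons is a purine or a pyrimidine, and checks that copying a codon onto the complementary strand always moves it from one group to the other.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed table).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

PURINES = set("AG")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def split_by_central_base(table=CODON_TABLE):
    """Return (purine-centred, pyrimidine-centred) sets of amino acids."""
    purine_family, pyrimidine_family = set(), set()
    for codon, aa in table.items():
        if aa == "*":  # stop codons encode no amino acid
            continue
        if codon[1] in PURINES:
            purine_family.add(aa)
        else:
            pyrimidine_family.add(aa)
    return purine_family, pyrimidine_family


def reverse_complement(codon):
    """Codon read 5'->3' on the complementary strand."""
    return codon.translate(COMPLEMENT)[::-1]


glycine_family, alanine_family = split_by_central_base()

if __name__ == "__main__":
    print("Glycine family:", "".join(sorted(glycine_family)))
    print("Alanine family:", "".join(sorted(alanine_family)))
    print("In both:", "".join(sorted(glycine_family & alanine_family)))
    # A glycine codon and the alanine codon on the opposite strand.
    print("GGC ->", reverse_complement("GGC"))
```

Serine is the only amino acid that lands in both sets, because its codons include both AGY (purine centre) and TCN (pyrimidine centre), which matches its appearance in both family lists above.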

    References and Footnotes
  1. E. N. Trifonov, Gene 261, 139-151 (2000)
  2. E. N. Trifonov, A. Kirzhner, V. M. Kirzhner, I. N. Berezovsky, J. Mol. Evol., in press
  3. I. N. Berezovsky, A. Y. Grosberg, E. N. Trifonov, FEBS Letters 466, 283-286 (2000)
  4. I. N. Berezovsky, E. N. Trifonov, J. Mol. Biol., in press

E. N. Trifonov

Department of Structural Biology, The Weizmann Institute of Science, Rehovot 76100, Israel
phone/FAX +972 8 934 2653 edward.trifonov@weizmann.ac.il