Book of Abstracts: Albany 2003

Conversation 13
Abstract Book
June 17-21 2003

Conservation of Genetic Information Between Regulatory Proteins and their Cognate DNA/RNA Binding Sites: A Basis for Site-specific Protein/Nucleotide Recognition and the Origin of the Genetic Code

Conservation of information between prokaryotic and eukaryotic regulatory proteins and their cognate binding sites on DNA and RNA (operators and response elements) has been observed and is proposed as a basis for site-specific recognition. In support of our hypothesis describing the underlying mechanism of site-specific protein/nucleotide recognition, we present analyses of protein/nucleotide interactions based on genetic sequence comparisons, secondary structure prediction, molecular model building, molecular dynamics simulations, X-ray crystallography, NMR and point mutation studies. We have examined several cognate protein/nucleotide complexes ranging in origin from viruses, bacteria and yeast to rodents and humans. The results indicate that amino acids within the nucleotide binding domains of regulatory proteins interact specifically with their own codon/anticodon nucleotides within the cognate DNA or RNA binding sites (1-6).

Codon recognition by arginine has been observed in the Tetrahymena group I self-splicing intronic RNA (7). The arginine side chain shows stereoselective binding to its codons AGA, CGA and AGG, which are conserved at the catalytic site in 66 group I sequences (8). These observations of specific amino acid-codon interactions are consistent with our earlier findings (1-6) as well as those we report herein. Two lines of evidence suggest that these structures may have been template dependent in their evolution (i.e., peptides acting as templates for nucleotide polymerization or vice versa) (see figure 1). The first is the nucleotide sequence similarity between the cDNA encoding glucocorticoid receptor (GR) exon splice junction sites and the cognate glucocorticoid response element (GRE) with its flanking nucleotides. The second is the spatial alignment of amino acids of the exon 3, 4 and 5 encoded structures of the GR DNA binding domain (DBD) with trinucleotides identical to their cognate codons/anticodons within the GRE and its flanks. A similar recognition pattern has also been observed in the Ala tRNA synthetase system, in which the synthetase regulates its own transcription by binding at an operator upstream of the Ala tRNA synthetase gene. Nucleotide sequences similar to this operator are found within the Ala tRNA synthetase protein binding regions of the Ala tRNA and within the cDNA encoding the DNA binding domain of the Ala tRNA synthetase protein.
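As a minimal sketch of the kind of codon/anticodon scan described above (not the authors' actual procedure), the Python below lists every trinucleotide in a DNA sequence that matches a codon of a given amino acid or the reverse complement of one. Treating the anticodon as the reverse complement of the codon, reading a single strand only, and all function names are assumptions made for illustration. The example input is the GRE half-site TGTTCT.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def codons_for(amino_acid):
    """Return the set of DNA codons for a one-letter amino acid code."""
    return {codon for codon, aa in CODON_TABLE.items() if aa == amino_acid}


def find_cognate_triplets(seq, amino_acid):
    """List (offset, triplet, kind) for each codon or anticodon match.

    Assumption: 'anticodon' means the reverse complement of a codon,
    written 5'->3' in the DNA alphabet. RNA input (U) is mapped to T.
    """
    seq = seq.upper().replace("U", "T")
    codons = codons_for(amino_acid)
    anticodons = {reverse_complement(c) for c in codons}
    hits = []
    for i in range(len(seq) - 2):
        triplet = seq[i:i + 3]
        if triplet in codons:
            hits.append((i, triplet, "codon"))
        if triplet in anticodons:
            hits.append((i, triplet, "anticodon"))
    return hits


print(find_cognate_triplets("TGTTCT", "R"))  # [(3, 'TCT', 'anticodon')]
```

In this half-site the scan reports TCT at offset 3, which is the reverse complement of the arginine codon AGA. A match of this kind says nothing by itself about whether an arginine residue actually contacts that position in a complex.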

Figure 1: a) A schematic of local nucleotide sequence alignments for mouse mammary tumor virus 5' long terminal repeat (GenBank locus MMTPRGR1) nucleotides ranging from -312 to -40 upstream from the MMTV transcription start site vs. exon 3: 1318 to 1485 bp, exon 4: 1486 to 1602 bp and exon 5: 1603 to 1626 bp of the GR DBD (GenBank locus HUMGCRA). Panels b-d show computer models of the GR DBD, from NMR structure determination with putative flanking alpha helices attached, docked at a 39 bp sequence of MMTPRGR1. Composites of amino acid and codon alignments are highlighted on the protein and DNA respectively. All highlighted residues have a dot surface indicating the van der Waals surfaces of each atom in that residue. The DNA nucleotides are color-coded: Ade = green, Thy = red, Gua = yellow and Cyt = blue. Color-coding of amino acids is based on polarity: positively charged side chains = blue, negatively charged side chains = red, uncharged polar side chains = yellow and nonpolar side chains = purple. The protein is docked at a distance of about 10 angstroms from the DNA for visual clarity. b) Exon 3 encoded DNA recognition helix alignments. c) Exon 4 encoded beta strand alignments. d) Exon 5 encoded putative alpha helix alignments.
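The local alignments in Figure 1a can be illustrated with a short Smith-Waterman sketch. The abstract does not name the alignment method or scoring scheme that was used, so the choice of algorithm, the match/mismatch/gap scores and the toy sequences below are assumptions for illustration only.

```python
def local_align(a, b, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment with a linear gap penalty.

    Returns (best_score, aligned_a, aligned_b) for one best-scoring
    local alignment. Scores are illustrative defaults, not the source's.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; cells never drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Toy sequences that share the half-site TGTTCT.
print(local_align("AAATGTTCTGG", "CCTGTTCTAA"))  # (6, 'TGTTCT', 'TGTTCT')
```

With these scores the shared hexamer is recovered as the single best local alignment. Real comparisons of the MMTPRGR1 and HUMGCRA sequences would need the GenBank records and a scoring scheme chosen for the purpose.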

Our findings indicate that the nucleotide recognition helices of regulatory proteins and their cognate DNA (or RNA) binding sites are conserved remnants of primordial structures capable of molecular recognition. We therefore propose that prebiotic, template-directed autocatalytic synthesis of mutually cognate peptides and polynucleotides resulted in their amplification and evolutionary conservation in contemporary prokaryotic and eukaryotic organisms as a genetic regulatory apparatus. These observations are consistent with the idea that the origin of the genetic code and site-specific protein/nucleotide recognition have the same underlying mechanism: direct interaction between an amino acid and its cognate codon/anticodon (9-11). The basic mechanism of this recognition appears to be stereochemical complementarity between the amino acids of a protein's nucleotide recognition alpha helix and their cognate codon/anticodon nucleotides within the protein's specific nucleotide binding site. In addition, three other areas of research support this hypothesis and complement our findings: i) correlations between the physicochemical characteristics of amino acid side chains and the nucleotides of their cognate codons (12-14); ii) stereochemical complementarity and structural relationships between amino acids and their cognate codon and/or anticodon nucleotides (15-17); iii) direct in vitro binding preference for codon nucleotides by cognate amino acids (8, 18-20). Furthermore, our earlier work has demonstrated that our approach, applied to genetic sequence analysis, secondary structure prediction and molecular model building, can be used as a predictive tool for determining sites on DNA regulatory proteins that recognize cognate DNA binding sites and vice versa (see figure 2).

Figure 2: a) A schematic of the nucleotide sequence of a mouse mammary tumor virus 5' long terminal repeat containing a well characterized GRE (MMTPRGR1 -312 to -40 bp) compared to the cDNA encoding the human glucocorticoid receptor DNA binding domain (HUMGCRA 1291 to 1749 bp). b) The maximally similar nucleotide subsequences from the schematic in part a are shown. GR binding sites have been detected within the MMTPRGR1 nucleotide sequence by others in nuclease footprinting studies and are shown as large boxes (21) and dashed underlines and overlines (22-23). Small boxes contain the two glucocorticoid receptor binding half-sites, GTTACA and TGTTCT respectively. Nucleotide base pair matches between MMTPRGR1 and HUMGCRA cDNA sequences are starred. c) The amino acid sequence corresponding to the HUMGCRA subsequence from part b is shown in one-letter code with the amino acids numbered as in the rat GR (24). A predicted GR DNA recognition helix is underlined. d) A computer model of the predicted GR DNA recognition helix from part c. e) RMS comparison of computer models of our predicted (1989) GR DNA recognition helix to subsequent (1990 and 1991 respectively) NMR and X-ray crystallographic structural determinations of a GR DNA recognition helix.
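The RMS comparison in part e can be sketched as a root-mean-square deviation over paired atom coordinates. This minimal version assumes the two models have already been superimposed and list equivalent atoms in the same order. The function name and the toy coordinates are hypothetical, and a real comparison would normally include an optimal superposition step first.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumption: the structures are already superimposed and atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates in angstroms (not taken from any real structure).
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
observed = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd(predicted, observed))
```

For these toy points one atom coincides and the other is displaced by 2 angstroms, so the result is the square root of 2.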

Lester F. Harris
Michael R. Sullivan

Abbott Northwestern Hospital
David F. Hickok Memorial Cancer Research Laboratory
800 E. 28th Street
Minneapolis, MN 55407

References and Footnotes
  1. Harris, L., Sullivan, M. and Hickok, D., Computers and Mathematics with Applications 20, 1-23 (1990 - accepted for publication 1989).
  2. Harris, L., Sullivan, M. and Hickok, D., Computers and Mathematics with Applications 20, 25-48 (1990 - accepted for publication 1989).
  3. Harris, L., Sullivan, M. and Hickok, D., Proceedings of the National Academy of Sciences 90, 5534-5538 (1993).
  4. Harris, L., Sullivan, M., Popken-Harris, P. and Hickok, D., Biological Structure and Dynamics, Proceedings of the Ninth Conversation, State University of New York, Albany, NY 1995 Adenine Press 1996, pages 61-82.
  5. Harris, L., Sullivan, M., Popken-Harris, P. and Hickok, D., The World Wide Web Journal of Biology (www.epress.com/w3jbio), Volume 3, article #1 (1998).
  6. Harris, L., Sullivan, M. and Hatfield, Directed Molecular Evolution, Origins of Life and Evolution of the Biosphere 29, 425-35 (1999).
  7. Yarus, M. and Christian, E., Nature (London) 342, 349-350 (1989).
  8. Yarus, M., New Biol. 3, 183-189 (1991).
  9. Nelsestuen, G., Journal of Molecular Evolution 11, 109-120 (1978).
  10. Nelsestuen, G., Biochemistry 18, 2843-2846 (1979).
  11. Woese, C., J. Mol. Biol. 43, 235-240 (1969).
  12. Jungck, J., J. Mol. Evol. 11, 211-224 (1978).
  13. Pieber, M., and Toha, J., Origins Life 13, 139-146 (1983).
  14. Sjostrom, M. and Wold, S., J. Mol. Evol. 22, 272-277 (1985).
  15. Hendry, L., Bransome Jr., E., Hutson, M. and Campbell, L., Perspect. Biol. Med. 27, 623-651 (1984).
  16. Lacey, J., Mullins Jr., D., and Khaled, M., Origins Life 14, 505-511 (1984).
  17. Al'tshtein, A. and Efimov, A., Mol. Biol. (Moscow) 22, 1411-1429 (1988); Mol. Biol. Engl. Transl. 22, 1133-1149 (1988).
  18. Saxinger, C. and Ponnamperuma, C., Origins Life 5, 189-200 (1974).
  19. Lacey, J. and Mullins, D., Origins Life 13, 3-42 (1983).
  20. Lacey Jr., J., Wickramasinghe, N. and Cook, G., Orig. Life Evol. Biosph. 22, 243-275 (1992) (review).
  21. Payvar, F., DeFranco, D., Firestone, G., Edgar, B., Wrange, O., Okret, S., Gustafsson, J. and Yamamoto, K., Cell 35, 381-392 (1983).
  22. Scheidereit, C., Geisse, S., Westphal, H., and Beato, M., Nature (London) 304, 749-752 (1983).
  23. Scheidereit, C. and Beato, M., Proc. Natl. Acad. Sci. USA 81, 3029-3034 (1984).
  24. Miesfeld, R., Godowski, P., Maler, B. and Yamamoto, K., Science 236, 423-426 (1987).