Book of Abstracts: Albany 2003
June 17-21 2003
Conservation of Genetic Information Between Regulatory Proteins and their Cognate DNA/RNA Binding Sites: A Basis for Site-specific Protein/Nucleotide Recognition and the Origin of the Genetic Code
Conservation of information between prokaryotic and eukaryotic regulatory proteins and their cognate binding sites on DNA and RNA (operators and response elements) has been observed and is proposed as a basis for site-specific recognition. In support of our hypothesis describing the underlying mechanism of site-specific protein/nucleotide recognition, we present analyses of protein/nucleotide interactions based on genetic sequence comparisons, secondary structure prediction, molecular model building, molecular dynamics simulations, X-ray crystallography, NMR and point mutation studies. We have examined several cognate protein/nucleotide complexes ranging in origin from viruses, bacteria and yeast to rodents and humans. The results indicate that amino acids within nucleotide binding domains of regulatory proteins specifically interact with their codon/anticodon nucleotides within their cognate DNA or RNA binding sites (1-6).
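The kind of sequence comparison described above can be illustrated with a minimal Python sketch. It scans a nucleotide binding site for trinucleotides identical to the codons, or to the reverse complements of the codons, of each residue in a peptide. The function names, the toy input sequence and the definition of "anticodon" as the reverse complement of a codon read on the same strand are assumptions made for illustration; this is not the procedure or the data used in the work reported here.

```python
from itertools import product

# Standard genetic code, written as DNA codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def codons_for(aa):
    """Return the set of DNA codons for a one-letter amino acid code."""
    return {codon for codon, residue in CODON_TABLE.items() if residue == aa}


def find_cognate_triplets(peptide, site):
    """List triplets in `site` that match a codon or anticodon of a residue.

    Assumption: "anticodon" means the reverse complement of a codon,
    read on the same strand as `site`.
    Each hit is (residue_index, residue, site_position, triplet, kind).
    """
    site = site.upper()
    hits = []
    for i, aa in enumerate(peptide.upper()):
        codons = codons_for(aa)
        anticodons = {reverse_complement(c) for c in codons}
        for pos in range(len(site) - 2):
            triplet = site[pos:pos + 3]
            if triplet in codons:
                hits.append((i, aa, pos, triplet, "codon"))
            if triplet in anticodons:
                hits.append((i, aa, pos, triplet, "anticodon"))
    return hits


# Hypothetical toy input, not a sequence from this study.
print(find_cognate_triplets("R", "TTAGATT"))
```

A scan of this kind only reports identical trinucleotides. Whether a match is spatially aligned with its amino acid in the complex has to be judged from structural models.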
Codon recognition by arginine has been observed in Tetrahymena group I self-splicing intronic RNA (7). The arginine side chain shows stereoselective binding for its codons AGA, CGA and AGG, which are conserved at the catalytic site in 66 group I sequences (8). These observations of specific amino acid-codon interactions are consistent with our earlier findings (1-6) as well as those we report herein. Two observations suggest that the glucocorticoid receptor (GR) DNA binding domain (DBD) and its cognate glucocorticoid response element (GRE) may have been template dependent in their evolution (i.e., peptides acting as templates for nucleotide polymerization or vice versa) (see figure 1). First, there is nucleotide sequence similarity between the cDNA encoding the GR exon splice junction sites and the GRE and its flanking nucleotides. Second, amino acids of the exon 3, 4 and 5 encoded structures of the GR DBD align spatially with trinucleotides identical to their cognate codons/anticodons within the GRE and its flanks. A similar recognition pattern has also been observed in the Ala tRNA synthetase system, which regulates its own transcription by binding at an operator upstream of the Ala tRNA synthetase gene. Nucleotide sequences similar to this operator are found within the Ala tRNA synthetase protein binding regions of the Ala tRNA and within the cDNA encoding the DNA binding domain of the Ala tRNA synthetase protein.
Figure 1: a) A schematic of local nucleotide sequence alignments for mouse mammary tumor virus (MMTV) 5' long terminal repeat (GenBank locus MMTPRGR1) nucleotides ranging from -312 to -40 upstream from the MMTV transcription start site vs. exon 3: 1318 to 1485 bp, exon 4: 1486 to 1602 bp and exon 5: 1603 to 1626 bp of the GR DBD (GenBank locus HUMGCRA). Below (b-d) are computer models of the GR DBD, from NMR structure determination with putative flanking alpha helices attached, docked at a 39 bp sequence of MMTPRGR1. Composites of amino acid and codon alignments are highlighted on the protein and DNA respectively. All highlighted residues have a dot surface indicating the van der Waals surfaces of each atom in that residue. The DNA nucleotides are color-coded: Ade = green, Thy = red, Gua = yellow and Cyt = blue. Color-coding of amino acids is based on polarity: positively charged side chains = blue, negatively charged side chains = red, uncharged polar side chains = yellow and nonpolar side chains = purple. The protein is docked at a distance of about 10 angstroms from the DNA for visual clarity. b) Exon 3 encoded DNA recognition helix alignments. c) Exon 4 encoded beta strand alignments. d) Exon 5 encoded putative alpha helix alignments.
Our findings indicate that the nucleotide recognition helices of regulatory proteins and their cognate DNA (or RNA) binding sites are conserved remnants of primordial structures capable of molecular recognition. Therefore we propose that prebiotic, template-directed autocatalytic synthesis of mutually cognate peptides and polynucleotides resulted in their amplification and evolutionary conservation in contemporary prokaryotic and eukaryotic organisms as a genetic regulatory apparatus. These observations are consistent with the idea that the origin of the genetic code and site-specific protein/nucleotide recognition have the same underlying mechanism: direct interaction between amino acids and their cognate codons/anticodons (9-11). The basic mechanism of this recognition appears to be stereochemical complementarity between the amino acids of the proteins' nucleotide recognition alpha helices and their cognate codon/anticodon nucleotides within their specific nucleotide binding sites. In addition, three other areas of research support this hypothesis and complement our findings: i) Correlations between the physicochemical characteristics of amino acid side chains and the nucleotides of their cognate codons (12-14). ii) Stereochemical complementarity and structural relationships between amino acids and their cognate codon and/or anticodon nucleotides (15-17). iii) Direct in vitro binding preference for codon nucleotides by cognate amino acids (8, 18-20). Furthermore, our earlier work has demonstrated that our approach, applied to genetic sequence analysis, secondary structure prediction and molecular model building, can be used as a predictive tool for determining sites on DNA regulatory proteins that recognize cognate DNA binding sites and vice versa (see figure 2).
Lester F. Harris
Abbott Northwestern Hospital