Book of Abstracts: Albany 2003
June 17-21 2003
The Difference Between Coding and Non-coding DNA Thermal Stability: A Large Scale Survey of 28 Eukaryotic Organisms of Varying (G+C) Composition Studied by Melting Simulation, Base Shuffling and DNA Frequency Analysis
The melting of coding and non-coding classes of natural DNA sequences was investigated using MELTSIM, a program that simulates DNA melting based on a nearest neighbor thermodynamic model (1-3). We calculated melting temperatures (Tm) for 8,144 natural sequences from 28 eukaryotic organisms of varying FGC (mole fraction of G and C), and for 3,775 coding and 3,297 non-coding sequences derived from those natural sequences. These data demonstrated that the Tm vs. FGC relationships for coding and non-coding DNA are both linear, but that their slopes differ by a statistically significant 6.6%. Both relationships also differ significantly from the Tm vs. FGC relationship embodied in the classical Marmur-Schildkraut-Doty (MSD) equation for intact natural sequences. By analyzing the simulation results for various base shufflings of the original DNAs, together with the average nearest neighbor frequencies of the natural sequences across the FGC range, we showed that these differences in the Tm vs. FGC relationships are a direct result of systematic, FGC-dependent biases in nearest neighbor frequencies in the two DNA classes. The same kinds of differences in the Tm vs. FGC relationships and biases in nearest neighbor frequencies also appear between sequences from multicellular and unicellular organisms within the same coding or non-coding class, albeit with smaller, though still significant, magnitudes.
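The quantities named in the abstract (FGC, nearest neighbor frequencies, and base shuffling) can be illustrated with a minimal Python sketch. This is not MELTSIM and does not compute Tm. The function names and the toy sequence are hypothetical, and the shuffle shown is a simple random permutation of bases, which is only one of the possible shuffling schemes the abstract may refer to. Nearest neighbor steps are counted on a single strand, without merging complementary steps.

```python
import random
from collections import Counter


def gc_fraction(seq):
    """Return FGC, the mole fraction of G and C in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def nn_frequencies(seq):
    """Return relative frequencies of overlapping nearest neighbor
    (dinucleotide) steps, counted along a single strand."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def shuffle_bases(seq, seed=None):
    """Randomly permute the bases of a sequence (assumed scheme).

    Base composition, and therefore FGC, is preserved exactly, while
    the nearest neighbor frequencies are randomized.
    """
    rng = random.Random(seed)
    bases = list(seq.upper())
    rng.shuffle(bases)
    return "".join(bases)


# Hypothetical toy sequence, for illustration only.
toy = "ATGCGCGATATTAGCCGGAT"
shuffled = shuffle_bases(toy, seed=0)

print("FGC original:", gc_fraction(toy))
print("FGC shuffled:", gc_fraction(shuffled))
print("NN original :", nn_frequencies(toy))
print("NN shuffled :", nn_frequencies(shuffled))
```

Because a shuffle of this kind keeps FGC fixed, any change in a simulated Tm between an original sequence and its shuffled version would be attributable to nearest neighbor order rather than to base composition, which is the kind of comparison the abstract describes.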
Dang D. Long (1)
(1) Center for Intelligent Biomaterials