Albany 2013: Book of Abstracts
June 11-15 2013
©Adenine Press (2012)
Electrostatic properties of bacterial DNA and promoter predictions
The electrostatic potential distribution (EPD) around a DNA molecule seems to be the only physical property that other molecules can recognize from a distance. More than 10 years ago, a simple and fast method for calculating the EPD was proposed in our laboratory (Polozov et al., 1999). It is based on the Coulomb formula and allows the main EPD patterns to be estimated for DNA sequences as long as a whole prokaryotic genome.
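To illustrate what a Coulomb-based EPD calculation involves, the following is a minimal sketch and not the published method. The toy helix geometry (one unit negative charge per phosphate, radius 9 Å, rise 3.4 Å, twist 36°), the uniform dielectric constant, the observation distance, and all function names are assumptions made for illustration. A real calculation would use sequence-dependent atomic partial charges, which this toy model omits.

```python
import math

def coulomb_potential(point, charges, dielectric=80.0):
    """Potential at `point` (x, y, z in Å) from point charges [(x, y, z, q)].

    Returned in units of e/Å; the uniform dielectric is an assumption.
    """
    total = 0.0
    for x, y, z, q in charges:
        r = math.dist(point, (x, y, z))
        total += q / (dielectric * r)
    return total

def toy_helix_charges(n_bp, rise=3.4, radius=9.0, twist_deg=36.0,
                      strand_offset_deg=180.0):
    """Hypothetical charge model: one -1 charge per phosphate on two strands."""
    charges = []
    for i in range(n_bp):
        for offset in (0.0, strand_offset_deg):
            angle = math.radians(i * twist_deg + offset)
            charges.append((radius * math.cos(angle),
                            radius * math.sin(angle),
                            i * rise,
                            -1.0))
    return charges

def epd_profile(charges, length, distance=15.0, step=1.0):
    """Potential sampled every `step` Å along a line parallel to the helix axis."""
    n_points = int(length / step) + 1
    return [coulomb_potential((distance, 0.0, i * step), charges)
            for i in range(n_points)]
```

Sampling at `step=1.0` mirrors the 1 Å resolution mentioned in the abstract; every other parameter value here is a placeholder.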
We applied projection to latent structures discriminant analysis (PLS-DA) (M. Sarker & W. Rayens, 2003) to build three types of models that discriminate promoter from non-promoter sequences of the E. coli K-12 genome on the basis of their EPD profiles. Randomized, coding and promoter-like regions were used as the non-promoter sequences to train the models. Information about promoters and promoter-like regions was taken from RegulonDB 6.0 (S. Gama-Castro et al., 2008) and from K.S. Shavkunov et al. (2009), respectively.
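The idea behind PLS-DA can be sketched with a single latent component: the class label (1 for promoter, 0 for non-promoter) is regressed on the projection of the centered EPD profile onto the direction of maximal covariance with the label. The sketch below is a simplified illustration under that assumption and is not the models used in this work. The number of components, the 0.5 threshold, and the function names are hypothetical, and the code assumes the training profiles are not degenerate (nonzero covariance with the labels).

```python
import math

def fit_pls_da(profiles, labels):
    """Fit a one-component PLS-DA model on equal-length EPD profiles."""
    n, p = len(profiles), len(profiles[0])
    x_mean = [sum(row[j] for row in profiles) / n for j in range(p)]
    y_mean = sum(labels) / n
    xc = [[row[j] - x_mean[j] for j in range(p)] for row in profiles]
    yc = [y - y_mean for y in labels]
    # Weight vector: normalised covariance between each variable and the label.
    w = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Latent scores and the regression coefficient of the label on them.
    t = [sum(xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(ti * ti for ti in t)
    return {"x_mean": x_mean, "y_mean": y_mean, "w": w, "q": q}

def score(model, profile):
    """Predicted label value for one profile (near 1 suggests a promoter)."""
    t = sum((profile[j] - model["x_mean"][j]) * model["w"][j]
            for j in range(len(profile)))
    return model["y_mean"] + model["q"] * t

def classify(model, profile, threshold=0.5):
    """True if the profile is assigned to the promoter class."""
    return score(model, profile) > threshold
```

In practice several components would be extracted and their number chosen by cross-validation; a library implementation such as scikit-learn's `PLSRegression` could replace this hand-written version.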
Using these models, we evaluated the probability that each position is a transcription start site (TSS) along the whole-genome EPD of E. coli K-12 with a 1 Å step. More than 2500 real promoters had a TSS predicted within the region [-50, +10] Å around the annotated +1 position and could therefore be classified as recognized. No additional information about the nucleotide sequence, such as the location and direction of nearby ORFs or the positions of annotated start codons, was taken into account in these predictions. This makes EPD analysis methods good candidates for the development of multi-step promoter search algorithms.
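The recognition criterion above can be expressed as a simple window test. The sketch below counts annotated TSS positions that have at least one predicted TSS within [-50, +10] Å of them. The function names are hypothetical, and the code assumes all positions are given in Å along the EPD profile and already oriented so that negative offsets lie upstream; strand handling is omitted.

```python
def is_recognized(annotated, predicted, window=(-50, 10)):
    """True if any predicted position falls inside the window around `annotated`."""
    low, high = window
    return any(low <= p - annotated <= high for p in predicted)

def count_recognized(annotated_sites, predicted, window=(-50, 10)):
    """Number of annotated TSS with a predicted TSS inside the allowed window."""
    return sum(is_recognized(a, predicted, window) for a in annotated_sites)
```

Varying `window` is one way to examine how prediction accuracy depends on the allowed region around the annotated TSS.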
Left panel – Comparison of the lengths of neighboring intervals of predicted TSS (positive) and non-TSS (negative) positions; Right panel – Prediction accuracy of a model for different allowed regions around the annotated promoter TSS.
M. Sarker & W. Rayens (2003) Partial least squares for discrimination. J. Chemom. 17, 166.
S. Gama-Castro et al. (2008) RegulonDB (version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation. NAR 36, D120-D124.
K.S. Shavkunov et al. (2009) Gains and unexpected lessons from genome-scale promoter mapping. NAR 37, 4919-4931.
Evgenia A. Temlyakova
Mechanism of Cell Functioning Group