Book of Abstracts: Albany 2007
June 19-23 2007
Deciphering Antisense Regulation in Bacteria
Untranslated RNAs play diverse roles in the regulation of gene expression. Almost a hundred small regulatory RNAs have been discovered in E. coli, and more than one thousand have been predicted by various computational approaches. Most known regulatory RNAs are encoded by independent genes (sRNAs), while 15 transcripts are partly generated from the antisense strands of coding sequences (aRNAs) and may affect translation or stability of the overlapping mRNAs. Since the strategies used to discover regulatory RNAs focused primarily on intergenic regions, most aRNAs were found by chance. At the same time, recent genome-wide expression studies identified many transcripts generated from the antisense strand of annotated genes (1); computational search predicted many potential promoters for antisense transcription (2); and ChIP-chip technology revealed intragenic RNA polymerase binding sites (3-5). It is therefore becoming evident that antisense transcription within bacterial genes is far more prevalent than currently assumed. In this study we exploit intragenic transcription signals as primary indicators of alternative transcription; test the RNA polymerase binding capacity of representative promoters in vitro; compare the genomic coordinates of promoters found in silico with polymerase binding sites registered by ChIP-chip in vivo; classify promoters with respect to their coding potential; and analyze the physico-chemical features of the expected RNA products.
A total of 1192 potential start points for antisense transcription within coding sequences were predicted by the promoter-search software PlatProm (group A). In addition to these and to the expected promoters located upstream of coding regions, we identified 709 unexpected promoter-like sites with the potential to initiate synthesis of shortened RNA products from the sense strand (co-directed intragenic promoters, group C) and 341 very similar signals between convergent genes or between genes transcribed from the opposite strand (intergenic promoters, group I) (2). Since promoters of the latter two groups may also initiate transcripts antisense to the mRNAs of neighboring genes, they were retained for further analysis. The RNA polymerase binding capacity of twenty representative promoters from these three categories was tested in vitro by gel-shift assays and was confirmed in 17 cases (exemplified in Fig. 1). The availability of ChIP-chip data for the whole genome (3) provided another way to verify the predicted promoters. Only those RNA polymerase binding sites that were registered in all three ChIP-chip experiments with p<0.01 were taken into account. A very high degree of correspondence between in silico and in vivo data was observed for groups C and I: polymerase binding sites were registered near 572 (80%) of the intragenic and 275 (76%) of the intergenic promoters. This percentage is much smaller (39%) for the set of antisense promoters (group A), which might be explained by interference between antisense transcription and mRNA synthesis. In any case, the two methods for genome-wide mapping of RNA polymerase binding sites complement each other. Their combination helps to determine the direction of expected transcription and to select the most reliable candidates for further analysis.
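The comparison of predicted promoters with ChIP-chip binding sites can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the data layout (one dictionary of probe p-values per experiment, binding sites as coordinate intervals), the function names, and the 100 bp tolerance window are all assumptions made for illustration. Only the p<0.01 cutoff and the requirement that a site appear in all three experiments come from the text.

```python
def reproducible_probes(experiments, p_cutoff=0.01):
    """Return probe IDs whose p-value is below the cutoff in every experiment.

    experiments: list of dicts mapping probe ID -> p-value (assumed layout).
    """
    common = set(experiments[0])
    for exp in experiments[1:]:
        common &= set(exp)
    return {probe for probe in common
            if all(exp[probe] < p_cutoff for exp in experiments)}


def fraction_supported(promoters, binding_sites, window=100):
    """Count predicted promoters lying within `window` bp of a binding site.

    promoters: list of genomic positions of predicted start points.
    binding_sites: list of (start, end) intervals from ChIP-chip.
    window: tolerance in bp (hypothetical value, not taken from the abstract).
    Returns (number supported, fraction supported).
    """
    supported = sum(
        any(start - window <= pos <= end + window for start, end in binding_sites)
        for pos in promoters
    )
    return supported, supported / len(promoters)


# Toy data, invented for illustration only.
experiments = [
    {"probe_a": 0.001, "probe_b": 0.020},
    {"probe_a": 0.005, "probe_b": 0.001},
    {"probe_a": 0.009, "probe_b": 0.001},
]
kept = reproducible_probes(experiments)
count, fraction = fraction_supported([100, 500, 2000], [(450, 550), (1000, 1100)], window=50)
```

A per-group call of `fraction_supported` is the kind of calculation that would yield percentages such as those quoted for groups A, C, and I.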
In the next step, ORF Finder software (NCBI) was used to filter out promoters that may initiate synthesis of novel mRNAs rather than aRNAs or sRNAs. A leader sequence of up to 500 bp and up to three mismatches in the ribosome binding site (AGGAGGT) were allowed. The highest percentage of promoters followed by putative ORFs (56%) was found in group I, and the lowest (26%) in group A, suggesting that most RNAs potentially produced in the antisense direction within coding regions may function as untranslated RNAs.
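The filtering criterion above can be sketched in Python as follows. This is a simplified stand-in, not ORF Finder itself: the spacer range between the ribosome binding site and the start codon, the set of start codons, and the minimum ORF length are hypothetical parameters chosen for illustration. Only the 500 bp leader limit, the AGGAGGT motif, and the three-mismatch tolerance are taken from the text.

```python
RBS = "AGGAGGT"                         # ribosome binding site given in the text
START_CODONS = {"ATG", "GTG", "TTG"}    # assumed set of bacterial start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}


def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def orf_length_codons(seq, start):
    """Codons from the start codon up to the first in-frame stop; 0 if no stop."""
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return (i - start) // 3
    return 0


def has_putative_orf(downstream, max_leader=500, max_mismatch=3,
                     spacer=(4, 12), min_codons=30):
    """Check whether the sequence downstream of a promoter contains an
    RBS-like motif followed by a start codon and an open reading frame.

    spacer and min_codons are hypothetical values, not from the abstract.
    """
    seq = downstream.upper()
    last = min(max_leader, len(seq) - len(RBS) + 1)
    for i in range(0, last):
        if mismatches(seq[i:i + len(RBS)], RBS) > max_mismatch:
            continue
        rbs_end = i + len(RBS)
        for gap in range(spacer[0], spacer[1] + 1):
            s = rbs_end + gap
            if seq[s:s + 3] in START_CODONS and orf_length_codons(seq, s) >= min_codons:
                return True
    return False


# Toy sequences, invented for illustration only.
coding_like = "CCCC" + RBS + "AAAAAA" + "ATG" + "GCT" * 30 + "TAA"
noncoding_like = "C" * 200
```

Promoters for which `has_putative_orf` returns `False` would be the candidates for untranslated RNA products under these assumptions.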
Figure 1: [A] Results of promoter mapping by PlatProm in the genetic locus of the dps gene, which encodes a nucleoid protein. Bars represent promoters predicted on both strands. Only signals with p<0.0004 are shown. The directions of transcription from the normal promoter Pdps and the predicted antisense promoter Pa are shown by arrows. [B] Gel-shift experiment verifying RNA polymerase (RNAP) binding activity for Pa (PCR-generated with primers 1 and 2). Polymerase interaction with this genomic region in vivo has been registered in ChIP-chip experiments (3).
Physico-chemical properties of the predicted RNA products were compared with those of known sRNAs and mRNAs of comparable size. For this purpose, strong ρ-independent terminators were searched for downstream of the promoters (up to 1000 bp). Stop signals were found for 61, 60, and 67% of the selected promoters from groups A, I, and C, respectively. The average lengths of the expected RNA products in these categories were 429, 431, and 403 nt. Although these values fall within the size range of known sRNAs (34-465 nt), many predicted RNAs are longer than sRNAs. The stability of the predicted RNAs was evaluated using RNA Structure software, which is supplied with a thermodynamic scoring system (http://rna.chem.rochester.edu). We observed that the first-order polynomial regression lines, which quite accurately represent the relationship between size and free energy of folding for all categories of compared RNA molecules, do not overlap or intersect. By this criterion, sRNAs demonstrated the highest folding propensity, while the known (15 species) and predicted aRNAs showed the lowest. Given that the stability of mRNAs of comparable size, transcribed as monocistronic units from known promoters, appeared to be intermediate between that of sRNAs and aRNAs, the low stability of aRNAs may be considered their typical feature, one that should be taken into account when designing alternative methods of aRNA search based on RNA structural properties.
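The regression comparison described above can be sketched as an ordinary least-squares fit of folding free energy against RNA length, followed by a check of whether two fitted lines cross within a size range of interest. The choice of length as the independent variable, the function names, and the toy numbers below are assumptions for illustration; they are not data or code from the study.

```python
def fit_line(lengths, energies):
    """Ordinary least-squares fit: energy = slope * length + intercept."""
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(energies) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, energies))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def lines_cross_within(line_a, line_b, lo, hi):
    """True if two (slope, intercept) lines intersect for lo <= length <= hi."""
    (m1, b1), (m2, b2) = line_a, line_b
    if m1 == m2:
        return b1 == b2          # parallel lines meet only if they coincide
    x = (b2 - b1) / (m1 - m2)    # length at which the two lines are equal
    return lo <= x <= hi


# Toy values (length in nt, free energy in kcal/mol), invented for illustration.
class_one = fit_line([100, 200, 300], [-30.0, -60.0, -90.0])
class_two = fit_line([100, 200, 300], [-20.0, -40.0, -60.0])
crossing = lines_cross_within(class_one, class_two, 34, 465)
```

Under these assumptions, a more negative free energy at a given length corresponds to a more stable predicted fold, so lines that do not cross over the examined size range give a consistent ordering of the RNA classes.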
This research was supported by the Russian Foundation for Basic Research.
References and footnotes