Book of Abstracts: Albany 2011
June 14-18 2011
©Adenine Press (2010)
Cis-Regulatory Modules: Identification in silico and Understanding of Gene Regulatory Networks
Cis-regulatory modules (CRMs) are segments of DNA responsible for tissue- and time-specific regulation of gene expression (1). The length of a CRM is difficult to estimate directly, but it is believed to vary from several hundred to several thousand base pairs. In multicellular eukaryotes, CRMs may be located not only immediately upstream of the transcription start sites of their target genes but also tens of thousands of nucleotides upstream or downstream of them. CRMs contain multiple binding sites for protein factors that regulate transcription. Identifying CRMs in silico and predicting their regulatory function makes it possible to suggest new regulatory inputs controlling the expression of particular genes, which is a useful preliminary step before modeling cell signaling processes. CRMs are an important component of meaningful non-coding DNA. Genetic variations that overlap CRMs contribute to functional disorders associated with non-coding DNA.
Detailed investigation of CRM sequences shows that transcription factor binding sites (TFBS) form complex arrangements, probably corresponding to a yet unknown regulatory code of gene expression. The simplest feature found in CRMs is clustering of binding sites. Eukaryotic gene expression is usually controlled by many inputs from different regulatory circuits, so a typical CRM contains many densely packed binding sites for different transcription factors (TFs), both activators and repressors, and thereby integrates different regulatory contributions. Such an arrangement is called a heterotypic cluster (2). Sites for different TFs in such clusters are often found at specific distances from each other (3), probably facilitating the correct positioning of proteins on DNA that is necessary for protein-protein interaction, either direct or via adapter proteins (4). In addition, many CRMs contain multiple occurrences of binding sites for the same TF, forming so-called homotypic clusters (5,6). The function of homotypic clustering is still unclear; it probably serves to provide a specific dependence of TF binding on TF concentration (7). An alternative explanation is that this type of arrangement facilitates lateral diffusion of the binding factor along DNA to its functional binding position (8). An extreme form of a homotypic cluster is a tandem repeat made of sequences specifically bound by TFs. This type of arrangement is characteristic of some CRMs in Drosophila (9) and might serve as an origin of some CRMs in evolution. Studying the DNA sequences of CRMs often helps to decipher yet unknown regulatory circuits. Two examples are touched upon: Drosophila development and the human hypoxia response cascade. I will also discuss how CRMs are predicted in silico and the consequences of site turnover for the prediction of CRMs and TFBS.
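As a rough illustration of the homotypic clusters mentioned above, the following minimal Python sketch scans a sequence for occurrences of one motif and reports regions where at least a given number of sites fall within a sliding window. It is not the prediction method discussed in the talk. The function names, the toy WGATAR-like pattern, the window size and the site threshold are all assumptions chosen for the example; practical tools typically score sites with position weight matrices and scan both DNA strands.

```python
import re


def find_sites(sequence, motif_regex):
    """Return 0-based start positions of all motif matches, including overlapping ones."""
    # A lookahead makes the regex engine report overlapping occurrences.
    pattern = re.compile("(?=(" + motif_regex + "))")
    return [match.start() for match in pattern.finditer(sequence)]


def homotypic_clusters(sequence, motif_regex, window=200, min_sites=3):
    """Return (first_site_start, last_site_start) for regions in which at least
    `min_sites` site starts lie less than `window` bp apart.

    `window` and `min_sites` are illustrative defaults, not values from the abstract.
    """
    positions = find_sites(sequence, motif_regex)
    clusters = []
    left = 0
    for right in range(len(positions)):
        # Shrink the window from the left until it spans fewer than `window` bp.
        while positions[right] - positions[left] >= window:
            left += 1
        if right - left + 1 >= min_sites:
            start, end = positions[left], positions[right]
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous cluster: extend it instead of adding a new one.
                clusters[-1] = (clusters[-1][0], end)
            else:
                clusters.append((start, end))
    return clusters


# Toy input (hypothetical): three closely spaced sites and one distant site.
toy_motif = "[AT]GATA[AG]"
toy_sequence = ("CCCC" + "AGATAA" + "CC" + "TGATAG" + "CCC" + "AGATAG"
                + "C" * 300 + "AGATAA" + "CCCC")

sites = find_sites(toy_sequence, toy_motif)
clusters = homotypic_clusters(toy_sequence, toy_motif, window=100, min_sites=3)
print(sites)
print(clusters)
```

In the toy input, the three closely spaced sites form one cluster, while the isolated site 300 bp downstream is not reported. A heterotypic cluster could be sketched in the same way by pooling the positions of several different motifs before applying the window.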
This research has been supported by the Presidium of the Russian Academy of Sciences Program in Molecular and Cell Biology.
Ivan V. Kulakovskiy1,3
1Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia