Book of Abstracts: Albany 2007
June 19-23, 2007
Structural properties of GC-rich sequences may hinder nucleosome positioning in CpG islands
Highly GC-rich sequences (CpG islands) encompass the transcription start sites of many genes in the human genome. Unlike most of the genome, the CpG islands remain unmethylated and are associated with the open, transcriptionally competent form of chromatin rather than the closed, inactive form. We hypothesize that the sequence-dependent structural properties of DNA in the CpG islands are crucial for maintaining the open state of chromatin.
To investigate the sequence-directed positioning of nucleosomes near the transcription starts of human genes, we have developed a new numerical algorithm to map the potential positions of nucleosomes on DNA sequences. Since there are very few (if any) direct sequence-specific interactions between the histones and DNA bases, the affinity of the histone core for DNA is determined primarily by the energy needed to wrap DNA on the surface of the nucleosome. Hence, our algorithm is based on the calculation of the deformation energy required for a duplex of a given sequence to follow the nucleosomal DNA trajectory (threading). A novel function has been introduced to calculate the nucleosome-positioning scores. This function estimates how much the DNA deformation energy calculated for each position of the histone octamer on the sequence deviates from the energies calculated for the neighboring positions. Such an approach allows the mapping of both nucleosome-attracting sites (characterized by a lower-than-average DNA deformation energy) and nucleosome-repelling sites (characterized by a higher-than-average DNA deformation energy) in genomic sequences.
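The scoring idea described above can be illustrated with a minimal Python sketch. It is not the scoring function used in this work, whose exact form is not given here. The sketch assumes that a deformation-energy profile (one value per octamer position) has already been computed, and it scores each position by its difference from the mean energy of its neighbors. The names `positioning_scores`, `classify_sites`, `half_window`, and `threshold` are hypothetical, as are the numerical values.

```python
from statistics import mean


def positioning_scores(energies, half_window=2):
    """Score each position by its deviation from the neighborhood mean.

    Assumption: `energies` holds precomputed deformation energies, one per
    octamer position. The score is E(i) minus the mean energy of up to
    `half_window` positions on each side (position i itself is excluded).
    Negative scores mean lower-than-neighbors energy, positive scores mean
    higher-than-neighbors energy.
    """
    energies = list(energies)
    n = len(energies)
    scores = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        neighbors = energies[lo:i] + energies[i + 1:hi]
        if not neighbors:
            # A single-position profile has nothing to compare against.
            scores.append(0.0)
        else:
            scores.append(energies[i] - mean(neighbors))
    return scores


def classify_sites(scores, threshold):
    """Return (attracting, repelling) position indices.

    Assumption: a hypothetical symmetric cutoff; positions scoring at or
    below -threshold are called attracting, at or above +threshold repelling.
    """
    attracting = [i for i, s in enumerate(scores) if s <= -threshold]
    repelling = [i for i, s in enumerate(scores) if s >= threshold]
    return attracting, repelling


# Toy profile with one energy dip (index 2) and one energy peak (index 6).
toy_energies = [10, 10, 4, 10, 10, 10, 16, 10, 10]
toy_scores = positioning_scores(toy_energies, half_window=2)
toy_attracting, toy_repelling = classify_sites(toy_scores, threshold=4)
```

Because the score is relative to the local neighborhood rather than to a genome-wide average, a position is flagged only when it stands out from its immediate sequence context.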
The sequences ±10 kb (kilobases) around the transcription starts of human genes have been analyzed. All genes were split into two groups: (i) genes with CpG islands and (ii) genes without CpG islands encompassing their transcription starts. Our results, presented in Figure 1, show that the "concentration" of nucleosome-attracting sites is noticeably lower and the "concentration" of nucleosome-repelling sites is noticeably higher near the transcription starts of genes with CpG islands compared to the "bulk" genome. By contrast, there is very little change in the "concentration" of nucleosome-positioning sites of both types near the transcription starts of genes without CpG islands.
Michael Y. Tolstorukov* 1
1Department of Biological and Medical Physics, V. Karazin Kharkov National University, Kharkov, 61024, Ukraine
Figure 1. Distributions of nucleosome-attracting (A) and nucleosome-repelling (B) sites near gene starts. Data points represent the average numbers <N_ATT> and <N_REP> of such sites occurring in a 0.5-kb running window as a function of the distance, d_TSS, between the window center and the transcription start site (denoted by hooked green arrows). Results for two groups of aligned genes are shown: red, 10,773 genes with CpG islands; blue, 15,642 genes without CpG islands.
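The running-window averaging described in the caption can be sketched as follows. This is an illustration under stated assumptions, not the code used to produce Figure 1: site positions are assumed to be given in base pairs relative to each gene's transcription start site, and the function name `windowed_site_density`, the `step` parameter, and the toy data are hypothetical.

```python
def windowed_site_density(site_positions_per_gene, span=10000, window=500, step=100):
    """Average number of sites per gene in a running window around the TSS.

    Assumptions: each inner list holds site positions (bp) relative to the
    TSS of one gene, with all genes aligned at position 0. A window centered
    at c covers the half-open interval [c - window/2, c + window/2).
    Returns a list of (window_center, average_count) pairs.
    """
    n_genes = len(site_positions_per_gene)
    if n_genes == 0:
        raise ValueError("at least one gene is required")
    half = window // 2
    profile = []
    for center in range(-span + half, span - half + 1, step):
        lo, hi = center - half, center + half
        total = sum(
            1
            for sites in site_positions_per_gene
            for pos in sites
            if lo <= pos < hi
        )
        profile.append((center, total / n_genes))
    return profile


# Toy input: two aligned genes with a handful of sites each.
toy_genes = [[-100, 50, 300], [0, 900]]
toy_profile = dict(windowed_site_density(toy_genes, span=1000, window=500, step=250))
```

Running the same function separately on the genes with and without CpG islands would give the two curves that the figure compares.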
The observed non-trivial distribution of nucleosome-positioning sites near the starts of genes with CpG islands can be rationalized as follows. Nucleosomal DNA is deformed non-uniformly, and the presence of flexible (such as pyrimidine-purine) or stiff (such as purine-pyrimidine) dimers at the critical positions of nucleosome distortion usually results in the appearance of nucleosome-attracting or nucleosome-repelling sites, respectively. Since GC-rich sequences are in general more deformable (e.g., bendable) than AT-rich sequences, the presence of flexible dimers at the critical positions of a GC-rich sequence often does not lower the deformation energy enough to create a nucleosome-attracting site. By contrast, the appearance of stiff dimers at the critical positions increases the energy enough to create a nucleosome-repelling site. In other words, GC-rich regions constitute an unfavorable background for distinctive nucleosome-attracting sites and a favorable background for distinctive nucleosome-repelling sites. Thus, our results suggest that the pronounced deformability of GC-rich sequences may hinder exact nucleosome positioning in CpG islands, providing new insight into the molecular mechanisms underlying the experimentally observed nucleosome depletion in CpG islands.