Albany 2013: Book of Abstracts

Conversation 18
June 11-15 2013
©Adenine Press (2012)

Text segmentation approach reveals simple repeat "fossils" in genomic sequences

A novel concept of the mechanisms of gene and genome evolution, formulated recently (Frenkel and Trifonov, 2012; Koren and Trifonov, 2011) and suggested by earlier work going back to Ohno (1972), is that sequences evolve largely through local expansions of tandem repeats followed by mutational changes within the repeats. In this view, frequently occurring segments of tandemly repeated codons are a direct record of recent expansion events. The main arguments in favor of this hypothesis are:

  • Codons of the GCC, GCA and GAA expanding families, known from triplet expansion diseases, are dominant: they make up 46.6% of all repeats observed in mRNA.
  • Codons that occur more frequently in tandem are also generally more frequent in regions without repeats.
  • Sequence segments of up to 300 nucleotides that start and end with the same triplet have a substantially elevated content of that border triplet and of its point-mutation derivatives.
  • More than 40% of natural sequences have both the dominant codon and one of its first derivatives at the top of the codon frequency list (against the 15% expected in the random case).
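
To make the notion of a "first derivative" concrete, here is a minimal Python sketch. It assumes that a first derivative of a codon is any triplet differing from it by exactly one base, and it reads "at the top of the codon frequency list" as "the two most frequent codons"; both readings are assumptions, and the function names are illustrative rather than taken from the cited papers.

```python
from collections import Counter

BASES = "ACGT"


def first_derivatives(codon):
    """Return the 9 triplets that differ from `codon` at exactly one position."""
    return {
        codon[:i] + base + codon[i + 1:]
        for i in range(3)
        for base in BASES
        if base != codon[i]
    }


def codon_counts(seq, frame=0):
    """Count non-overlapping triplets of `seq` in the given reading frame."""
    return Counter(seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))


def top_pair_is_related(seq, frame=0):
    """True if the two most frequent codons differ by one point mutation.

    Assumption: this is only one possible reading of the criterion
    stated in the abstract.
    """
    ranked = codon_counts(seq, frame).most_common(2)
    if len(ranked) < 2:
        return False
    (top, _), (second, _) = ranked
    return second in first_derivatives(top)
```

For example, `top_pair_is_related("GCCGCCGCCGCAGCAGAA")` compares GCC (three copies) with GCA (two copies), which differ only in the third base.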

We applied a clustering (text segmentation) algorithm to map rigorously the original, now hidden, triplet expansions in genomic sequences. Natural sequences differ significantly from the corresponding shuffled sequences: the natural fragments are longer and more similar to the putative repeat sequences. More than 35% of bacterial genomic sequences detectably “remember” their ancient expansion history. Segmentations of genomic sequences from different taxonomic classes also differ significantly. The developed tool opens the possibility of investigating the influence of ancient triplet expansion events on modern protein 3D structure.
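
The abstract does not describe the segmentation algorithm itself, so the following Python sketch is only a simplified stand-in for the natural-versus-shuffled comparison. It detects runs of identical adjacent codons, not diverged "fossil" repeats, and it assumes a mononucleotide shuffle as the control; the shuffling scheme and all function names are assumptions, not the authors' method.

```python
import random


def split_codons(seq, frame=0):
    """Split `seq` into non-overlapping triplets in the given reading frame."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def tandem_runs(seq, frame=0, min_copies=2):
    """Yield (codon, start, copies) for each run of identical adjacent codons."""
    codons = split_codons(seq, frame)
    i = 0
    while i < len(codons):
        j = i
        while j + 1 < len(codons) and codons[j + 1] == codons[i]:
            j += 1
        copies = j - i + 1
        if copies >= min_copies:
            yield codons[i], frame + 3 * i, copies
        i = j + 1


def tandem_fraction(seq, frame=0, min_copies=2):
    """Fraction of codons that lie inside a tandem run."""
    total = len(split_codons(seq, frame))
    if total == 0:
        return 0.0
    in_runs = sum(copies for _, _, copies in tandem_runs(seq, frame, min_copies))
    return in_runs / total


def shuffled(seq, rng):
    """Return a copy of `seq` with its bases permuted (composition preserved)."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)
```

A comparison would then contrast `tandem_fraction(seq)` with the same statistic averaged over many `shuffled(seq, random.Random(seed))` controls; detecting mutated repeats, as in the work described above, would require a similarity-based segmentation rather than exact matching.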


      Frenkel, Z.M. & Trifonov, E.N. (2012). Origin and evolution of genes and genomes. Crucial role of triplet expansions. Journal of Biomolecular Structure & Dynamics, 30, 201-210.

      Koren, Z. & Trifonov, E.N. (2011). Role of everlasting triplet expansions in protein evolution. Journal of Molecular Evolution, 72, 232-239.

      Ohno, S. (1972). In Smith, H.H. (ed.), Evolution of Genetic Systems.

    Zakharia M. Frenkel 1, 2
    Edward N. Trifonov 2

    1 Department of Software Engineering
    ORT Braude College
    Karmiel, Israel
    2 Genome Diversity Center
    Institute of Evolution
    University of Haifa
    Haifa, Israel

    Ph/Fx: (972)-4-8288096