Book of Abstracts: Albany 2005
Evolution of Spliceosomal Introns: Insights From Comparative Genomics
The availability of multiple, complete eukaryotic genome sequences and the development of the comprehensive evolutionary classification of eukaryotic genes embodied in the collection of clusters of probable orthologs (euKaryotic Orthologous Groups, or KOGs) allow one to address many fundamental evolutionary questions on a genome scale. In an attempt to reveal the major trends in the evolution of eukaryotic gene structure, intron positions were compared for 684 KOGs from 8 complete genomes of animals, plants, fungi, and protists, and parsimonious scenarios were constructed for the evolution of exon-intron structure of the respective genes. Remarkable conservation of intron positions through >1.5 billion years of evolution was revealed, with one third of the introns in the malaria parasite Plasmodium falciparum shared with at least one crown-group eukaryote. Paradoxically, humans share many more introns with the plant Arabidopsis thaliana than with fly or nematode. The evolutionary scenario (Figure 1) inferred from these data holds that the common ancestor of Plasmodium and the crown group, and especially the common ancestor of animals, plants, and fungi, had numerous introns. Most of these ancestral introns, which are retained in the genomes of vertebrates and plants, have been lost in fungi, nematodes, and arthropods, and probably in Plasmodium. A strong positive correlation was observed between the loss and gain of genes and the loss and gain of introns in highly conserved genes in different eukaryotic lineages, pointing to the existence of distinct, lineage-specific trends toward genome shrinkage or expansion. Comparison of various features of ancient and younger introns is beginning to shed light on probable mechanisms of intron insertion, indicating that propagation of old introns is unlikely to be a major mechanism for the origin of new ones.
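The parsimony reconstruction described above can be illustrated with a minimal sketch. The text speaks only of "parsimonious scenarios"; the sketch assumes Dollo parsimony (each intron position is gained once and may then be lost independently in several lineages), which is an assumption here, not a statement of the method actually used. The tree topology, the node names (root, n1, n2), the leaf labels A to D, and the presence patterns are all hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical rooted toy tree (parent -> children); not the tree of Figure 1.
TREE: Dict[str, List[str]] = {
    "root": ["n1", "D"],
    "n1": ["n2", "C"],
    "n2": ["A", "B"],
}


def leaves_under(node: str, tree: Dict[str, List[str]]) -> Set[str]:
    """Return the set of leaf names in the subtree rooted at `node`."""
    children = tree.get(node)
    if not children:
        return {node}
    found: Set[str] = set()
    for child in children:
        found |= leaves_under(child, tree)
    return found


def dollo(
    tree: Dict[str, List[str]], root: str, present: Set[str]
) -> Tuple[Optional[str], Set[str], List[Tuple[str, str]]]:
    """Dollo-style scenario for one intron position (assumed model).

    `present` is the set of leaves that carry the intron.
    Returns (gain_node, nodes_inferred_to_carry_it, loss_branches).
    """
    present = set(present)
    if not present:
        return None, set(), []

    # The single gain is placed at the last common ancestor of all carriers.
    gain = root
    while True:
        step = None
        for child in tree.get(gain, []):
            if present <= leaves_under(child, tree):
                step = child
                break
        if step is None:
            break
        gain = step

    # Below the gain, a branch leading to a subtree with no carrier is a loss.
    carriers: Set[str] = set()
    losses: List[Tuple[str, str]] = []
    stack = [gain]
    while stack:
        node = stack.pop()
        carriers.add(node)
        for child in tree.get(node, []):
            if leaves_under(child, tree) & present:
                stack.append(child)
            else:
                losses.append((node, child))
    return gain, carriers, losses


if __name__ == "__main__":
    gain, carriers, losses = dollo(TREE, "root", {"A", "C"})
    print("gain at:", gain)
    print("inferred carriers:", sorted(carriers))
    print("loss branches:", losses)
```

For an intron found in the hypothetical leaves A and C, the sketch places the gain at n1 and infers one loss on the branch leading to B. Summing such gains and losses over all intron positions would give per-branch counts of the kind mapped in Figure 1, and counting carriers at each internal node would give a minimal ancestral intron number.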
The existence and structure of ancestral protosplice sites were addressed by examining the context of introns inserted within codons that encode amino acids conserved in all eukaryotes and, accordingly, are not subject to selection for splicing efficiency. We show that introns indeed predominantly insert into specific protosplice sites, which have the consensus sequence (A/C)AG|Gt.
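As a rough illustration of what matching the consensus (A/C)AG|G involves, the following sketch tests whether an intron insertion point in an intron-less coding sequence lies in such a context and reports the intron phase. The function names, the coordinate convention (the intron lies between cds[pos - 1] and cds[pos], with the sequence starting in frame), and the example sequences are hypothetical. The lowercase t of the consensus is not enforced, on the assumption that it marks a weaker preference.

```python
def is_protosplice(cds: str, pos: int) -> bool:
    """True if an intron between cds[pos - 1] and cds[pos] lies in (A/C)AG|G.

    `cds` is the coding sequence with introns removed; `pos` is the number of
    coding nucleotides upstream of the intron (hypothetical convention).
    """
    cds = cds.upper()
    if pos < 3 or pos >= len(cds):
        return False  # not enough flanking sequence to test the context
    upstream = cds[pos - 3:pos]   # three bases before the insertion point
    downstream = cds[pos]         # first base after the insertion point
    return upstream in ("AAG", "CAG") and downstream == "G"


def intron_phase(pos: int) -> int:
    """Phase 0: between codons; phase 1 or 2: within a codon."""
    return pos % 3


if __name__ == "__main__":
    # Made-up sequence; the intron would sit after the 7th coding nucleotide.
    seq = "ATGGCAGGTT"
    print(is_protosplice(seq, 7), intron_phase(7))
```

Restricting such a test to phase 1 and phase 2 introns in codons for invariant amino acids corresponds to the filtering step described in the text; the fraction of insertion points that pass could then be compared with the frequency of the motif in the coding sequences as a whole.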
Figure 1: The parsimonious evolutionary scenario of intron gain/loss for the most likely topology of the eukaryotic phylogenetic tree. Intron gains and losses are mapped to each species and each internal branch; dashes show branches for which losses could not be inferred from the available data. The (minimal) number of introns inferred to have existed in the analyzed set of genes in the respective ancestral forms is indicated in a box next to each internal node of the tree. Species abbreviations: At, Arabidopsis thaliana; Ce, Caenorhabditis elegans; Dm, Drosophila melanogaster; Hs, Homo sapiens; Ag, Anopheles gambiae; Pf, Plasmodium falciparum; Sc, Saccharomyces cerevisiae; Sp, Schizosaccharomyces pombe.