Abstracts (first author)
Polymorphisms-aware phylogenetic models
Comparative analysis of genomes from related species, and from different individuals of the same species, can reveal adaptive trends in the history of the taxa considered, as well as the intensity of evolutionary processes and how they vary across the genome. However, such intra- and interspecific data also bring new challenges, such as incomplete lineage sorting and shared ancestral polymorphisms.
We propose a new POlymorphisms-aware phylogenetic MOdel (PoMo) that relaxes the assumption, made by standard phylogenetic approaches, that substitutions are instantaneous. Here a substitution is modeled as a mutational event followed by gradual fixation. Our model uses both divergence and polymorphism data from different species or populations. By allowing polymorphisms at internal phylogenetic nodes, it also naturally accounts for incomplete lineage sorting and shared ancestral polymorphisms. PoMo can accurately and time-efficiently estimate phylogenetic trees of any shape and size, e.g. species trees, population trees, or any combination of the two. It can also disentangle the contributions of mutation and fixation biases to substitution patterns.
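To make "a mutational event followed by gradual fixation" concrete, here is a minimal sketch of one way such a process could be encoded as a continuous-time Markov chain. It assumes a small virtual population of n individuals, at most two alleles segregating at a site, and Moran-type allele frequency changes with an optional fixation bias. The function names, the `mu` and `fitness` parameters, and the exact rate parameterization are illustrative assumptions, not necessarily the formulation used in PoMo.

```python
from itertools import combinations

ALLELES = "ACGT"


def canonical(a, i, b, n):
    """Return the state holding i copies of allele a and n - i copies of b."""
    if i == n:
        return (a,)          # allele a is fixed
    if i == 0:
        return (b,)          # allele b is fixed
    if ALLELES.index(a) > ALLELES.index(b):
        a, i, b = b, n - i, a  # store each allele pair in a single order
    return (a, i, b)


def pomo_states(n):
    """Four fixed states plus n - 1 polymorphic states per allele pair."""
    states = [(a,) for a in ALLELES]
    for a, b in combinations(ALLELES, 2):
        states.extend((a, i, b) for i in range(1, n))
    return states


def pomo_rate_matrix(n, mu, fitness):
    """Build an illustrative rate matrix (assumed parameterization).

    mu[(a, b)] is the rate at which allele b arises in a population fixed
    for a; fitness[a] scales frequency increases of a (fixation bias).
    """
    states = pomo_states(n)
    index = {s: k for k, s in enumerate(states)}
    q = [[0.0] * len(states) for _ in states]
    for s in states:
        row = q[index[s]]
        if len(s) == 1:
            # Mutation: a single copy of a new allele enters a fixed population.
            a = s[0]
            for b in ALLELES:
                if b != a:
                    row[index[canonical(a, n - 1, b, n)]] = mu[(a, b)]
        else:
            # Drift (plus bias): the count of a moves up or down by one.
            a, i, b = s
            drift = i * (n - i) / n
            row[index[canonical(a, i + 1, b, n)]] = drift * fitness[a]
            row[index[canonical(a, i - 1, b, n)]] = drift * fitness[b]
        row[index[s]] = -sum(row)  # each row of a rate matrix sums to zero
    return states, q
```

With n = 10 this gives 4 fixed and 6 × 9 polymorphic states, 58 in total. A substitution from A to C then corresponds to a path that leaves the fixed state A through a mutation and walks through the A/C polymorphic states until C is fixed, which is how a chain of this kind can carry polymorphism along the branches and at internal nodes of a tree.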
We analyzed synonymous sites in genome-wide alignments of human, chimpanzee, and two orangutan species. Using PoMo, we obtained accurate estimates of mutation rates and GC-biased gene conversion (gBGC) in great apes. We found that both mutation rates and gBGC vary with GC content, and that together they determine the well-known differences in substitution rates. Our results are also consistent with directional selection at synonymous sites related to exonic splicing enhancers.
Lastly, we show with simulations that PoMo accurately estimates phylogenetic branch lengths, whereas standard substitution models show large biases due to ancestral polymorphisms. Furthermore, our methods are more computationally efficient than coalescent-based approaches.