Projects


MGRA

Project lead: Pavel Avdeyev

MGRA (Multiple Genome Rearrangements and Ancestors) is a tool for reconstructing ancestral genomes and the evolutionary history of extant genomes.

GitHub repository: https://github.com/ablab/mgra
Web-server: http://mgra.cblab.org


Genome Scaffolder

Project lead: Sergey Aganezov

GOS-ASM is a tool for simultaneous co-scaffolding of multiple genomes. Our framework aims to improve existing unfinished annotated assemblies by performing multi-genome gene order analysis, optionally incorporating the phylogenetic relations between the observed genomes and the flanking repeats at the ends of unfinished fragments.

Full project description: cblab.org/scaffolder
Dedicated software page: cblab.org/gos-asm
GitHub repository: github.com/aganezov/gos-asm
Issue tracker: youtrack.cblab.org/issues/GOSASM


Estimation of True Evolutionary Distance

Project lead: Nikita Alexeev

The ability to estimate the evolutionary distance between extant genomes plays a crucial role in many phylogenomic studies. Often such estimation is based on the parsimony assumption, implying that the distance between two genomes can be estimated as the minimal number of genome rearrangements required to transform one genome into the other. In reality, however, the parsimony assumption may not always hold, which calls for estimation that does not rely on the minimal number of genome rearrangements. A method for such estimation exists, but it assumes that rearrangements can break genomes with equal probability at any position in the course of evolution. This assumption, known as the random breakage model, has recently been refuted in favor of the more rigorous fragile breakage model, which postulates that only certain “fragile” genomic regions are prone to rearrangements. We propose a new method for estimating the evolutionary distance between two genomes with high accuracy under the fragile breakage model.
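
To make the parsimony assumption described above concrete, here is a minimal sketch of one common parsimony distance. It is an illustration only and not the estimator proposed in this project. It assumes the double-cut-and-join (DCJ) rearrangement model, which the text above does not name, and genomes made of circular chromosomes over the same set of genes. Under those assumptions the minimal number of rearrangements equals the number of genes minus the number of alternating cycles in the breakpoint graph. The function names adjacency_map and dcj_distance are hypothetical.

```python
def adjacency_map(genome):
    """Map each gene extremity to its neighbor in a genome.

    A genome is a list of circular chromosomes; a chromosome is a list of
    signed integers (the sign gives the gene orientation). Each gene g has
    two extremities: (g, 't') for its tail and (g, 'h') for its head.
    """
    adjacent = {}
    for chromosome in genome:
        size = len(chromosome)
        for i in range(size):
            left_gene = chromosome[i]
            right_gene = chromosome[(i + 1) % size]  # wrap around: circular
            # A gene read in the forward direction ends with its head;
            # a reversed gene ends with its tail.
            end_of_left = (abs(left_gene), 'h' if left_gene > 0 else 't')
            start_of_right = (abs(right_gene), 't' if right_gene > 0 else 'h')
            adjacent[end_of_left] = start_of_right
            adjacent[start_of_right] = end_of_left
    return adjacent


def dcj_distance(genome_a, genome_b):
    """Minimal number of DCJ operations between two circular genomes.

    Assumption: both genomes contain exactly the same genes, each once.
    """
    adj_a = adjacency_map(genome_a)
    adj_b = adjacency_map(genome_b)
    if adj_a.keys() != adj_b.keys():
        raise ValueError("genomes must contain the same genes")

    # Count alternating cycles: follow an adjacency of A, then one of B,
    # until the walk returns to an extremity that was already visited.
    seen = set()
    cycles = 0
    for start in adj_a:
        if start in seen:
            continue
        cycles += 1
        current = start
        while current not in seen:
            seen.add(current)
            via_a = adj_a[current]
            seen.add(via_a)
            current = adj_b[via_a]

    gene_count = len(adj_a) // 2  # two extremities per gene
    return gene_count - cycles


if __name__ == "__main__":
    ancestor = [[1, 2, 3, 4]]
    descendant = [[1, -3, -2, 4]]  # genes 2 and 3 reversed as one segment
    print(dcj_distance(ancestor, descendant))
```

A distance of this kind is a lower bound on the number of rearrangements that actually occurred, which is why an estimate of the true evolutionary distance can exceed it.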
Publications:

Combinatorial Scoring of Phylogenetic Networks

Project lead: Nikita Alexeev

Construction of phylogenetic trees and networks for extant species from their characters is one of the key problems in phylogenomics. Since the solution to this problem is not always unique and multiple methods exist for tree/network construction, it is important to measure how well the constructed networks capture the given character relationships across the species.
In the current study, we propose a novel method for measuring the specificity of a given phylogenetic network in terms of the total number of distributions of character states at the leaves that the network may impose. For binary phylogenetic trees, this number has an exact formula and depends only on the number of leaves and character states, not on the tree topology; the situation is much more complicated for non-binary trees and networks. Nevertheless, we develop an algorithm for combinatorial enumeration of such distributions, which is applicable to arbitrary trees and networks under some reasonable assumptions.

Sage Code: cactus_network
Publications:

  • N. Alexeev and M. A. Alekseyev. “Combinatorial Scoring of Phylogenetic Networks”. Lecture Notes in Computer Science 9797 (2016), 560–572. doi:10.1007/978-3-319-42634-1_45 arXiv:1602.02841