×

Best match graphs. (English) Zbl 1415.92133

J. Math. Biol. 78, No. 7, 2015-2057 (2019); corrigendum ibid. 82, No. 6, Paper No. 47, 9 p. (2021).
Summary: Best match graphs arise naturally as the first processing intermediate in algorithms for orthology detection. Let \(T\) be a phylogenetic (gene) tree \(T\) and \(\sigma \) an assignment of leaves of \(T\) to species. The best match graph \((G,\sigma )\) is a digraph that contains an arc from \(x\) to \(y\) if the genes \(x\) and \(y\) reside in different species and \(y\) is one of possibly many (evolutionary) closest relatives of \(x\) compared to all other genes contained in the species \(\sigma (y)\). Here, we characterize best match graphs and show that it can be decided in cubic time and quadratic space whether \((G,\sigma )\) derived from a tree in this manner. If the answer is affirmative, there is a unique least resolved tree that explains \((G,\sigma )\), which can also be constructed in cubic time.

MSC:

92D15 Problems related to evolution
92D10 Genetics and epigenetics
05C90 Applications of graph theory
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Aho A, Sagiv Y, Szymanski T, Ullman J (1981) Inferring a tree from lowest common ancestors with an application to the optimization of relational expressions. SIAM J Comput 10:405-421 · Zbl 0462.68086
[2] Aho AV, Garey MR, Ullman JD (1972) The transitive reduction of a directed graph. SIAM J Comput 1:131-137 · Zbl 0247.05128
[3] Altenhoff AM, Boeckmann B, Capella-Gutierrez S, Dalquen DA, DeLuca T, Forslund K, Jaime HC, Linard B, Pereira C, Pryszcz LP, Schreiber F, da Silva AS, Szklarczyk D, Train CM, Bork P, Lecompte O, von Mering C, Xenarios I, Sjölander K, Jensen LJ, Martin MJ, Muffato M, Gabaldón T, Lewis SE, Thomas PD, Sonnhammer E, Dessimoz C (2016) Standardized benchmarking in the quest for orthologs. Nat Methods 13:425-430
[4] Altenhoff AM, Dessimoz C (2009) Phylogenetic and functional assessment of orthologs inference projects and methods. PLoS Comput Biol 5:e1000262
[5] Bork P, Dandekar T, Diaz-Lazcoz Y, Eisenhaber F, Huynen M, Yuan Y (1998) Predicting function: from genes to genomes and back. J Mol Biol 283:707-725
[6] Bryant D, Steel M (1995) Extension operations on sets of leaf-labeled trees. Adv Appl Math 16:425-453 · Zbl 0863.92012
[7] Bull JJ, Pease CM (1989) Combinatorics and variety of mating-type systems. Evolution 43:667-671
[8] Crespelle C, Paul C (2006) Fully dynamic recognition algorithm and certificate for directed cographs. Discrete Appl Math 154:1722-1741 · Zbl 1110.68096
[9] Dalquen DA, Dessimoz C (2013) Bidirectional best hits miss many orthologs in duplication-rich clades such as plants and animals. Genome Biol Evol 5:1800-1806
[10] Deng Y, Fernández-Baca D (2018) Fast compatibility testing for rooted phylogenetic trees. Algorithmica 80:2453-2477 · Zbl 1392.68451
[11] Dondi R, Lafond M, El-Mabrouk N (2017) Approximating the correction of weighted and unweighted orthology and paralogy relations. Algorithms Mol Biol 12:4 · Zbl 1383.92054
[12] Elmasry A (2010) The subset partial order: computing and combinatorics. In: Sedgewick R, Golin M (eds) Proceedings of the seventh workshop on analytic algorithmics and combinatorics (ANALCO). Society for Industrial and Applied Mathematics, Philadelphia, pp 27-33 · Zbl 1430.68198
[13] Force A, Lynch M, Pickett FB, Amores A, Yl Yan, Postlethwait J (1999) Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531-1545
[14] Geiß M, Anders J, Stadler PF, Wieseke N, Hellmuth M (2018) Reconstructing gene trees from Fitch’s xenology relation. J Math Biol 77:1459-1491 · Zbl 1396.05025
[15] Gries D, Martin AJ, van de Snepscheut JLA, Udding JT (1989) An algorithm for transitive reduction of an acyclic graph. Sci Comput Prog 12:151-155 · Zbl 0678.68061
[16] Grünewald S, Steel M, Swenson MS (2007) Closure operations in phylogenetics. Math Biosci 208:521-537 · Zbl 1119.92048
[17] Harel D, Tarjan RE (1984) Fast algorithms for finding nearest common ancestors. SIAM J Comput 13:338-355 · Zbl 0535.68022
[18] Hellmuth M (2017) Biologically feasible gene trees, reconciliation maps and informative triples. Algorithm Mol Biol 12:23
[19] Hellmuth M, Hernandez-Rosales M, Huber KT, Moulton V, Stadler PF, Wieseke N (2013) Orthology relations, symbolic ultrametrics, and cographs. J Math Biol 66:399-420 · Zbl 1408.05038
[20] Hellmuth M, Marc T (2015) On the Cartesian skeleton and the factorization of the strong product of digraphs. Theor Comput Sci 565:16-29 · Zbl 1315.05135
[21] Hellmuth M, Stadler PF, Wieseke N (2017) The mathematics of xenology: di-cographs, symbolic ultrametrics, 2-structures and tree-representable systems of binary relations. J Math Biol 75:199-237 · Zbl 1368.05023
[22] Hellmuth, M.; Wieseke, N.; Pontarotti, P. (ed.), From sequence data incl. orthologs, paralogs, and xenologs to gene and species trees, 373-392 (2016), Cham
[23] Hellmuth M, Wieseke N (2018) On tree representations of relations and graphs: symbolic ultrametrics and cograph edge decompositions. J Comb Opt 36:591-616 · Zbl 1402.90147
[24] Hellmuth M, Wieseke N, Lechner M, Lenhof HP, Middendorf M, Stadler PF (2015) Phylogenetics from paralogs. Proc Natl Acad Sci USA 112:2058-2063
[25] Hernandez-Rosales M, Hellmuth M, Wieseke N, Huber KT, Moulton V, Stadler PF (2012) From event-labeled gene trees to species trees. BMC Bioinf 13:S6
[26] Jahangiri-Tazehkand S, Wong L, Eslahchi C (2017) OrthoGNC: a software for accurate identification of orthologs based on gene neighborhood conservation. Genomics Proteomics Bioinf 15:361-370
[27] Kumar S (2005) Molecular clocks: four decades of evolution. Nat Rev Genet 6:654-662
[28] Lafond M, Dondi R, El-Mabrouk N (2016) The link between orthology relations and gene trees: a correction perspective. Algorithms Mol Biol 11:4 · Zbl 1383.92054
[29] Lafond M, El-Mabrouk N (2014) Orthology and paralogy constraints: satisfiability and consistency. BMC Genomics 15:S12
[30] Lechner M, Findeiß S, Steiner L, Marz M, Stadler PF, Prohaska SJ (2011) Proteinortho: detection of (co-)orthologs in large-scale analysis. BMC Bioinf 12:124
[31] Lechner M, Hernandez-Rosales M, Doerr D, Wieseke N, Thévenin A, Stoye J, Hartmann RK, Prohaska SJ, Stadler PF (2014) Orthology detection combining clustering and synteny for very large datasets. PLoS ONE 9:e105015
[32] McKenzie R (1971) Cardinal multiplication of structures with a reflexive relation. Fund Math 70:59-101 · Zbl 0228.08002
[33] Moreno-Hagelsieb G, Latimer K (2008) Choosing BLAST options for better detection of orthologs as reciprocal best hits. Bioinformatics 24:319-324
[34] Nieselt-Struwe (2001) Quartet-mapping, a generalization of the likelihood-mapping procedure. Mol Biol Evol 18:1204-1219
[35] Overbeek R, Fonstein M, D’Souza M, Pusch GD, Maltsev N (1999) The use of gene clusters to infer functional coupling. Proc Natl Acad Sci USA 96:2896-2901
[36] Pritchard P (1995) A simple sub-quadratic algorithm for computing the subset partial order. Inf Process Let 56:337-341 · Zbl 0875.68463
[37] Rauch Henzinger M, King V, Warnow T (1999) Constructing a tree from homeomorphic subtrees, with applications to computational evolutionary biology. Algorithmica 24:1-13 · Zbl 0936.68026
[38] Schieber B, Vishkin U (1988) On finding lowest common ancestors: simplification and parallelization. SIAM J Comput 17:1253-1262 · Zbl 0669.68049
[39] Semple C (2003) Reconstructing minimal rooted trees. Discrete Appl Math 127:489-503 · Zbl 1038.05014
[40] Semple C, Steel M (2003) Phylogenetics. Oxford University Press, Oxford · Zbl 1043.92026
[41] Setubal, JC; Stadler, PF; Setubal, JC (ed.); Stadler, PF (ed.); Stoye, J. (ed.), Gene phyologenies and orthologous groups, No. 1704, 1-28 (2018), Heidelberg
[42] Sumner DP (1973) Point determination in graphs. Discrete Math 5:179-187 · Zbl 0265.05124
[43] Tatusov RL, Koonin EV, Lipman DJ (1997) A genomic perspective on protein families. Science 278:631-637
[44] Train CM, Glover NM, Gonnet GH, Altenhoff AM, Dessimoz C (2017) Orthologous matrix (OMA) algorithm 2.0: more robust to asymmetric evolutionary rates and more scalable hierarchical orthologous group inference. Bioinformatics 33:i75-i82
[45] Wall DP, Fraser HB, Hirsh AE (2003) Detecting putative orthologs. Bioinformatics 19:1710-1711
[46] Wolf YI, Koonin EV (2012) A tight link between orthologs and bidirectional best hits in bacterial and archaeal genomes. Genome Biol Evol 4:1286-1294
[47] Yu C, Zavaljevski N, Desai V, Reifman J (2011) QuartetS: a fast and accurate algorithm for large-scale orthology detection. Nucleic Acids Res 39:e88
[48] Zuckerkandl, E.; Pauling, LB; Kasha, M. (ed.); Pullman, B. (ed.), Molecular disease, evolution, and genic heterogeneity, 189-225 (1962), New York
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.