
Yule-generated trees constrained by node imbalance. (English) Zbl 1281.92052

Summary: The Yule process generates a class of binary trees which is fundamental to population genetic models and other applications in evolutionary biology. We introduce a family of sub-classes of ranked trees, called \({\Omega}\)-trees, which are characterized by the imbalance of their internal nodes. The degree of imbalance is specified by an integer \({\omega}\geq 0\). For caterpillars, the extreme case of unbalanced trees, \({\omega}=0\). Under models of neutral evolution, for instance the Yule model, trees with small \({\omega}\) are unlikely to occur by chance. Indeed, imbalance can be a signature of permanent selection pressure, as can be observed in the genealogies of certain pathogens. From a mathematical point of view, it is interesting to observe that the space of \({\Omega}\)-trees maintains several statistical invariants although it is drastically reduced in size compared to the space of unconstrained Yule trees. Using generating functions, we study some basic combinatorial properties of \({\Omega}\)-trees. We focus on the distribution of the number of subtrees with two leaves (cherries). We show that the expectation and variance of this distribution already match those for unconstrained trees for very small values of \({\omega}\).
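To make the unconstrained baseline concrete, the following is a minimal sketch that computes the exact distribution of the number of two-leaf subtrees in an unconstrained Yule tree. It does not implement the \({\Omega}\)-constraint, whose precise definition is not given in this summary, and it uses a leaf-splitting recursion rather than the generating-function approach of the paper. The function names `cherry_distribution` and `mean_and_variance` are hypothetical. The recursion assumes the standard Yule growth step: a uniformly chosen leaf is split, so with \(n\) leaves and \(c\) two-leaf subtrees the count stays at \(c\) with probability \(2c/n\) and rises to \(c+1\) otherwise.

```python
from fractions import Fraction


def cherry_distribution(n):
    """Exact distribution of the number of two-leaf subtrees (cherries)
    in an unconstrained Yule tree with n >= 2 leaves.

    Returns a dict mapping cherry count -> probability (as Fraction).
    """
    dist = {1: Fraction(1)}  # a tree with 2 leaves is a single cherry
    for m in range(2, n):  # grow the tree from m to m + 1 leaves
        nxt = {}
        for c, p in dist.items():
            # Splitting one of the 2c leaves that sit in a cherry destroys
            # that cherry and creates a new one: the count is unchanged.
            stay = Fraction(2 * c, m)
            nxt[c] = nxt.get(c, 0) + p * stay
            # Splitting any other leaf creates one additional cherry.
            if stay < 1:
                nxt[c + 1] = nxt.get(c + 1, 0) + p * (1 - stay)
        dist = nxt
    return dist


def mean_and_variance(dist):
    """Mean and variance of a finite distribution given as {value: prob}."""
    mean = sum(c * p for c, p in dist.items())
    second_moment = sum(c * c * p for c, p in dist.items())
    return mean, second_moment - mean ** 2


if __name__ == "__main__":
    for n in (5, 10, 20):
        mean, var = mean_and_variance(cherry_distribution(n))
        print(n, mean, var)
```

For \(n \geq 5\) the computed values agree with the known closed forms for unconstrained Yule trees, mean \(n/3\) and variance \(2n/45\); these are the reference values that, according to the summary, \({\Omega}\)-trees already reproduce for very small \({\omega}\).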

MSC:

92D15 Problems related to evolution
92D10 Genetics and epigenetics
05C05 Trees
92-08 Computational methods for problems pertaining to biology
