an:06668617
Zbl 1431.05084
Kelk, Steven; Stamoulis, Georgios
A note on convex characters, Fibonacci numbers and exponential-time algorithms
EN
Adv. Appl. Math. 84, 34-46 (2017).
0196-8858
2017
j
05C30 05A15 05C85 05C90 11B39
Fibonacci numbers; phylogenetics; trees; convexity; counting; algorithms
Summary: Phylogenetic trees are used to model evolution: leaves are labelled to represent contemporary species (``taxa'') and interior vertices represent extinct ancestors. Informally, convex characters are measurements on the contemporary species in which the subset of species (both contemporary and extinct) that share a given state forms a connected subtree. Given an unrooted, binary phylogenetic tree \(\mathcal{T}\) on a set of \(n\geq 2\) taxa, a closed (but fairly opaque) expression for the number of convex characters on \(\mathcal{T}\) has been known since 1992 [\textit{M. Steel}, J. Classif. 9, No. 1, 91--116 (1992; Zbl 0766.92002)], and this number is independent of the exact topology of \(\mathcal{T}\). In this note we prove that this number is actually equal to the \((2n-1)\)th Fibonacci number.
Next, we define \(g_k(\mathcal{T})\) to be the number of convex characters on \(\mathcal{T}\) in which each state appears on at least \(k\) taxa. We show that, somewhat curiously, \(g_2(\mathcal{T})\) is also independent of the topology of \(\mathcal{T}\), and is equal to the \((n-1)\)th Fibonacci number. As we demonstrate, this topological neutrality breaks down for \(k \geq 3\). However, we show that for each fixed \(k \geq 1\), \(g_k(\mathcal{T})\) can be computed in \(O(n)\) time and the set of characters thus counted can be efficiently listed and sampled. We use these insights to give a simple but effective exact algorithm for the NP-hard \textit{maximum parsimony distance} problem that runs in time \(\Theta(\phi^n \cdot n^2)\), where \(\phi \approx 1.618\) is the golden ratio, and an exact algorithm which computes the \textit{tree bisection and reconnection} distance (equivalently, a \textit{maximum agreement forest}) in time \(\Theta(\phi^{2 n} \cdot \operatorname{poly}(n))\), where \(\phi^2 \approx 2.618\).
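The two Fibonacci identities stated in the summary can be checked by brute force on small trees. The following is a minimal sketch, not the linear-time method of the paper: it assumes a convex character can be treated as a partition of the taxa whose blocks span pairwise vertex-disjoint subtrees, and it uses the convention \(F_1 = F_2 = 1\). All function names and the two example trees (QUARTET, CATERPILLAR5) are hypothetical and chosen only for illustration.

```python
def fib(m):
    """m-th Fibonacci number, assuming the convention F(1) = F(2) = 1."""
    a, b = 0, 1
    for _ in range(m):
        a, b = b, a + b
    return a


def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + part


def spanning_vertices(adj, block):
    """Vertices of the minimal subtree of the tree `adj` connecting `block`."""
    root = block[0]
    parent = {root: None}
    stack = [root]
    while stack:  # depth-first search to record parent pointers
        v = stack.pop()
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                stack.append(w)
    used = set()
    for leaf in block:  # union of the paths from each block member to `root`
        v = leaf
        while v is not None and v not in used:
            used.add(v)
            v = parent[v]
    return used


def is_convex(adj, partition):
    """True if the blocks span pairwise vertex-disjoint subtrees."""
    seen = set()
    for block in partition:
        verts = spanning_vertices(adj, block)
        if verts & seen:
            return False
        seen |= verts
    return True


def count_convex(adj, k=1):
    """Brute-force count of convex partitions with every block of size >= k."""
    leaves = sorted(v for v in adj if len(adj[v]) == 1)
    return sum(
        1
        for p in set_partitions(leaves)
        if all(len(b) >= k for b in p) and is_convex(adj, p)
    )


# Hypothetical example trees, given as adjacency lists.
# Quartet ab|cd: internal vertices u, v.
QUARTET = {
    "a": ["u"], "b": ["u"], "c": ["v"], "d": ["v"],
    "u": ["a", "b", "v"], "v": ["c", "d", "u"],
}
# Caterpillar on five taxa: internal path u - v - w.
CATERPILLAR5 = {
    "a": ["u"], "b": ["u"], "c": ["v"], "d": ["w"], "e": ["w"],
    "u": ["a", "b", "v"], "v": ["u", "c", "w"], "w": ["v", "d", "e"],
}

for tree, n in ((QUARTET, 4), (CATERPILLAR5, 5)):
    print(n, count_convex(tree), fib(2 * n - 1), count_convex(tree, k=2), fib(n - 1))
```

The enumeration runs over all set partitions of the taxa, so it is only practical for very small \(n\); it is meant as a sanity check of the counting statements, not as a substitute for the \(O(n)\) computation mentioned in the summary.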
0766.92002