Partitioning, duality, and linkage disequilibria in the Moran model with recombination.

*(English)* Zbl 1359.92080

This is a very comprehensive and unifying paper that studies the genetic evolution of a population via a multilocus Moran model with recombination. Each unit in the population is a chromosome, treated as a haploid individual, and the total population size remains constant; when an individual dies after an exponentially distributed lifetime, it is replaced at random either by a full copy of a single parent or by a recombinant of two parents resulting from a single crossover. The Moran model is a forward-in-time, continuous-time Markov process that identifies the population at time \(t\) with a random counting measure \(Z_t\) on the type space (the space of all possible chromosomal types).
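The replacement mechanism described above can be illustrated with a minimal simulation sketch for two sites. It is not the paper's construction: the function names, the parameter `rho` (probability that a replacement is a recombinant), the choice of parents uniformly from the whole population (including the dying individual), and the unit death rate per individual are all assumptions made for illustration.

```python
import random

def moran_step(pop, rho, rng):
    """Apply one death-and-replacement event in place (two sites per individual).

    Assumption: with probability rho the newcomer takes site 1 from one
    uniformly chosen parent and site 2 from another (single crossover);
    otherwise it is a full copy of one uniformly chosen parent.
    """
    dying = rng.randrange(len(pop))
    if rng.random() < rho:
        first, second = rng.choice(pop), rng.choice(pop)
        child = (first[0], second[1])
    else:
        child = rng.choice(pop)
    pop[dying] = child

def simulate(pop, rho, t_end, seed=0):
    """Run the process up to time t_end and return the final population.

    Each of the N individuals dies at rate 1 (assumed), so events occur
    at total rate N and waiting times are Exp(N).
    """
    rng = random.Random(seed)
    pop = list(pop)
    n = len(pop)
    t = 0.0
    while True:
        t += rng.expovariate(n)
        if t > t_end:
            return pop
        moran_step(pop, rho, rng)
```

Counting how often each type occurs in the returned list gives the analogue of the counting measure \(Z_t\) for this toy version; the population size stays fixed by construction.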

The paper builds an important bridge between this forward approach and a genealogical (backward in time) approach using sampling formulae. Although the forward approach traditionally resorts to asymptotic approximations, and the paper considers them where appropriate, the bridge is built on the original Moran model by proving a duality between this model and a marginalized version (each locus being considered in one individual only) of the ancestral recombination process (ARP). The duality functions are what the authors call sampling functions, and the result is obtained by extending the recombinator formalism to the stochastic setting. Thanks to the above marginalization, this leads to an explicit closed ODE system for the expected sampling functions, which are building blocks for the linkage disequilibria that the paper also analyses.

Several tools are used to follow this program. One is the use of Möbius functions (and Möbius inversion) to turn sampling without replacement into sampling with replacement and to turn type frequencies into linkage disequilibria. Another is the partitioning process, a Markov process consisting of a mixture of splitting and coalescence events that describe how the sites are partitioned into different individuals backward in time. The paper studies its limiting behaviour as the population size \(N \rightarrow +\infty\) (deterministic limit) and the limiting behaviour of a sequence of such processes with time sped up by a factor of \(N\) and recombination probabilities also rescaled (diffusion limit). Finally, the extended recombinator formalism, together with Möbius functions, allows the introduction of the sampling functions.
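As a generic illustration of Möbius inversion (not the specific lattice or functions used in the paper), the following sketch computes the Möbius function of the subset lattice of \(\{1,2,3\}\) from its defining recursion and checks that it inverts summation over smaller elements. The choice of poset and all names here are assumptions made for the example.

```python
from functools import lru_cache
from itertools import combinations

def powerset(items):
    """All subsets of `items` as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Illustrative poset: subsets of {1, 2, 3} ordered by inclusion.
ELEMENTS = powerset({1, 2, 3})

@lru_cache(maxsize=None)
def mobius(x, y):
    """Mobius function: mu(x, x) = 1 and, for x < y,
    mu(x, y) = -sum of mu(x, z) over all z with x <= z < y."""
    if x == y:
        return 1
    if not x < y:
        return 0
    return -sum(mobius(x, z) for z in ELEMENTS if x <= z < y)

def sum_below(f):
    """g(x) = sum of f(y) over all y <= x."""
    return {x: sum(f[y] for y in ELEMENTS if y <= x) for x in ELEMENTS}

def mobius_inversion(g):
    """Recover f from g via f(x) = sum of mu(y, x) * g(y) over y <= x."""
    return {x: sum(mobius(y, x) * g[y] for y in ELEMENTS if y <= x)
            for x in ELEMENTS}
```

For any function `f` on `ELEMENTS`, `mobius_inversion(sum_below(f))` returns `f` again; on the subset lattice this is ordinary inclusion-exclusion.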

At the end, there are applications to two or three sites; in particular, it is shown that linkage disequilibria decay exponentially. In the case of two sites, the time evolution of the expected composition of the population is obtained, as well as the fixation probabilities.
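
As a hedged illustration of the exponential decay in the simplest setting, consider two sites in the deterministic limit; this is the standard textbook computation, not the paper's finite-\(N\) result. The symbols \(p_{AB}\) (frequency of type \(AB\)), \(p_A\), \(p_B\) (marginal allele frequencies) and \(\varrho\) (recombination rate) are notation assumed here. With \(D = p_{AB} - p_A p_B\), recombination leaves the marginals constant and gives

\[
\frac{\mathrm{d}p_{AB}}{\mathrm{d}t} = \varrho\,(p_A p_B - p_{AB})
\quad\Longrightarrow\quad
\frac{\mathrm{d}D}{\mathrm{d}t} = -\varrho\, D ,
\qquad
D(t) = D(0)\, e^{-\varrho t} .
\]

In a finite population, resampling can be expected to contribute to the decay of the expected linkage disequilibrium as well, so the rate there also depends on \(N\); the precise expressions are given in the paper.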

Reviewer: Carlos A. dos Santos Braumann (Évora)

##### MSC:

92D10 | Genetics and epigenetics |

92D15 | Problems related to evolution |

60J28 | Applications of continuous-time Markov processes on discrete state spaces |

##### Keywords:

Moran model with recombination; ancestral recombination process; linkage disequilibria; Möbius inversion; duality