×

Nonparametric maximum likelihood approach to multiple change-point problems. (English) Zbl 1305.62158

Summary: In multiple change-point problems, different data segments often follow different distributions, for which the changes may occur in the mean, scale or the entire distribution from one segment to another. Without the need to know the number of change-points in advance, we propose a nonparametric maximum likelihood approach to detecting multiple change-points. Our method does not impose any parametric assumption on the underlying distributions of the data sequence, which is thus suitable for detection of any changes in the distributions. The number of change-points is determined by the Bayesian information criterion and the locations of the change-points can be estimated via the dynamic programming algorithm and the use of the intrinsic order structure of the likelihood function. Under some mild conditions, we show that the new method provides consistent estimation with an optimal rate. We also suggest a prescreening procedure to exclude most of the irrelevant points prior to the implementation of the nonparametric likelihood method. Simulation studies show that the proposed method has satisfactory performance of identifying multiple change-points in terms of estimation accuracy and computation time.

MSC:

62G05 Nonparametric estimation
62G20 Asymptotic properties of nonparametric inference
PDFBibTeX XMLCite
Full Text: DOI arXiv Euclid

References:

[1] Arlot, S., Celisse, A. and Harchaoui, Z. (2012). Kernel change-point detection. Available at .
[2] Bai, J. and Perron, P. (1998). Estimating and testing linear models with multiple structural changes. Econometrica 66 47-78. · Zbl 1056.62523 · doi:10.2307/2998540
[3] Bai, J. and Perron, P. (2003). Computation and analysis of multiple structural change models. J. Appl. Econometrics 18 10-22.
[4] Bellman, R. E. and Dreyfus, S. E. (1962). Applied Dynamic Programming . Princeton Univ. Press, Princeton, NJ. · Zbl 0106.34901
[5] Boysen, L., Kempe, A., Liebscher, V., Munk, A. and Wittich, O. (2009). Consistencies and rates of convergence of jump-penalized least squares estimators. Ann. Statist. 37 157-183. · Zbl 1155.62034 · doi:10.1214/07-AOS558
[6] Braun, J. V., Braun, R. K. and Müller, H.-G. (2000). Multiple changepoint fitting via quasilikelihood, with application to DNA sequence segmentation. Biometrika 87 301-314. · Zbl 0963.62067 · doi:10.1093/biomet/87.2.301
[7] Carlstein, E. (1988). Nonparametric change-point estimation. Ann. Statist. 16 188-197. · Zbl 0637.62041 · doi:10.1214/aos/1176350699
[8] Chen, J. and Gupta, A. K. (1997). Testing and locating variance changepoints with application to stock prices. J. Amer. Statist. Assoc. 92 739-747. · Zbl 1090.62565 · doi:10.2307/2965722
[9] Chen, H. and Zhang, N. R. (2012). Graph-based change-point detection. Available at . · Zbl 1308.62090
[10] Csörgő, M. and Horváth, L. (1997). Limit Theorems in Change-Point Analysis . Wiley, Chichester. · Zbl 0884.62023
[11] Darkhovskh, B. S. (1976). A nonparametric method for the a posteriori detection of the “disorder” time of a sequence of independent random variables. Theory Probab. Appl. 21 178-183. · Zbl 0397.62024 · doi:10.1137/1121019
[12] Donoho, D. L. and Johnstone, I. M. (1995). Adapting to unknown smoothness via wavelet shrinkage. J. Amer. Statist. Assoc. 90 1200-1224. · Zbl 0869.62024 · doi:10.2307/2291512
[13] Dümbgen, L. (1991). The asymptotic behavior of some nonparametric change-point estimators. Ann. Statist. 19 1471-1495. · Zbl 0776.62032 · doi:10.1214/aos/1176348257
[14] Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression. Ann. Statist. 32 407-499. · Zbl 1091.62054 · doi:10.1214/009053604000000067
[15] Einmahl, J. H. J. and McKeague, I. W. (2003). Empirical likelihood based hypothesis testing. Bernoulli 9 267-290. · Zbl 1015.62048 · doi:10.3150/bj/1068128978
[16] Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348-1360. · Zbl 1073.62547 · doi:10.1198/016214501753382273
[17] Fearnhead, P. and Vasileiou, D. (2009). Bayesian analysis of isochores. J. Amer. Statist. Assoc. 104 132-141. · Zbl 1388.62319 · doi:10.1198/jasa.2009.0009
[18] Fowlkes, E. B. and Mallows, C. L. (1983). A method for comparing two hierarchical clusterings. J. Amer. Statist. Assoc. 78 553-569. · Zbl 0545.62042 · doi:10.2307/2288117
[19] Guan, Z. (2004). A semiparametric changepoint model. Biometrika 91 849-862. · Zbl 1055.62029 · doi:10.1093/biomet/91.4.849
[20] Hao, N., Niu, Y. and Zhang, H. (2013). Multiple change-point detection via a screening and ranking algorithm. Statist. Sinica 23 1553-1572. · Zbl 1417.62236
[21] Harchaoui, Z. and Lévy-Leduc, C. (2010). Multiple change-point estimation with a total variation penalty. J. Amer. Statist. Assoc. 105 1480-1493. · Zbl 1388.62211 · doi:10.1198/jasa.2010.tm09181
[22] Hawkins, D. M. (2001). Fitting multiple change-point models to data. Comput. Statist. Data Anal. 37 323-341. · Zbl 0990.62019 · doi:10.1016/S0167-9473(00)00068-2
[23] Jager, L. and Wellner, J. A. (2007). Goodness-of-fit tests via phi-divergences. Ann. Statist. 35 2018-2053. · Zbl 1126.62030 · doi:10.1214/0009053607000000244
[24] Killick, R., Fearnhead, P. and Eckley, I. A. (2012). Optimal detection of changepoints with a linear computational cost. J. Amer. Statist. Assoc. 107 1590-1598. · Zbl 1258.62091 · doi:10.1080/01621459.2012.737745
[25] Lavielle, M. (2005). Using penalized contrasts for the change-points problems. Signal Process. 85 1501-1510. · Zbl 1160.94341 · doi:10.1016/j.sigpro.2005.01.012
[26] Lee, C.-B. (1996). Nonparametric multiple change-point estimators. Statist. Probab. Lett. 27 295-304. · Zbl 0847.62022 · doi:10.1016/0167-7152(95)00089-5
[27] Matteson, D. S. and James, N. A. (2014). A nonparametric approach for multiple change point analysis of multivariate data. J. Amer. Statist. Assoc. 109 334-345. · Zbl 1367.62260 · doi:10.1080/01621459.2013.849605
[28] Niu, Y. S. and Zhang, H. (2012). The screening and ranking algorithm to detect DNA copy number variations. Ann. Appl. Stat. 6 1306-1326. · Zbl 1401.92145 · doi:10.1214/12-AOAS539
[29] Oliver, J. L., Carpena, P., Hackenberg, M. and Bernaola-Galván, P. (2004). IsoFinder: Computational prediction of isochores in genome sequences. Nucleic Acids Res. 32 W287-W292.
[30] Rigaill, G. (2010). Pruned dynamic programming for optimal multiple change-point detection. Available at .
[31] Shen, X. and Ye, J. (2002). Adaptive model selection. J. Amer. Statist. Assoc. 97 210-221. · Zbl 1073.62509 · doi:10.1198/016214502753479356
[32] Shorack, G. R. and Wellner, J. A. (1986). Empirical Processes with Applications to Statistics . Wiley, New York. · Zbl 1170.62365
[33] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B 58 267-288. · Zbl 0850.62538
[34] Wellner, J. A. (1978). Limit theorems for the ratio of the empirical distribution function to the true distribution function. Z. Wahrsch. Verw. Gebiete 45 73-88. · Zbl 0382.60031 · doi:10.1007/BF00635964
[35] Yao, Y.-C. (1988). Estimating the number of change-points via Schwarz’ criterion. Statist. Probab. Lett. 6 181-189. · Zbl 0642.62016 · doi:10.1016/0167-7152(88)90118-6
[36] Yao, Y.-C. and Au, S. T. (1989). Least-squares estimation of a step function. Sankhyā Ser. A 51 370-381. · Zbl 0711.62031
[37] Zhang, J. (2002). Powerful goodness-of-fit tests based on the likelihood ratio. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 281-294. · Zbl 1067.62046 · doi:10.1111/1467-9868.00337
[38] Zhang, J. (2006). Powerful two-sample tests based on the likelihood ratio. Technometrics 48 95-103. · doi:10.1198/004017005000000328
[39] Zou, C., Yin, G., Feng, L. and Wang, Z. Supplement to “Nonparametric maximum likelihood approach to multiple change-point problems.” . · Zbl 1305.62158
[40] Zou, C., Liu, Y., Qin, P. and Wang, Z. (2007). Empirical likelihood ratio test for the change-point problem. Statist. Probab. Lett. 77 374-382. · Zbl 1108.62045 · doi:10.1016/j.spl.2006.08.003
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.