
A similarity measure for second order properties of non-stationary functional time series with applications to clustering and testing. (English) Zbl 1477.62255

This paper develops a similarity measure for spectral density operators of a collection of functional time series, based on the aggregation of Hilbert-Schmidt differences of the individual time-varying spectral density operators. Under fairly general conditions, the asymptotic properties of the corresponding estimator are derived and asymptotic normality is established. Applications are given to clustering and testing.
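As a rough illustration of the kind of quantity described above, the sketch below approximates an integrated squared Hilbert-Schmidt distance between two time-varying spectral density operators that have already been evaluated on a grid. It is a minimal sketch under stated assumptions, not the estimator studied in the paper: the function names `trapezoid_last_axis` and `integrated_hs_distance`, the array layout (rescaled time, frequency, then a finite-dimensional matrix representation of each operator), and the use of the trapezoidal rule are all hypothetical choices made for this example.

```python
import numpy as np


def trapezoid_last_axis(y, x):
    """Trapezoidal rule along the last axis of y on the grid x."""
    dx = np.diff(x)
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * dx, axis=-1)


def integrated_hs_distance(F_a, F_b, u_grid, omega_grid):
    """Approximate the double integral over u and omega of
    ||F_a(u, omega) - F_b(u, omega)||_HS^2.

    F_a, F_b : complex arrays of shape (len(u_grid), len(omega_grid), d, d),
        hypothetical discretised spectral density operators, i.e. d x d
        matrices of basis coefficients (an assumption of this sketch).
    u_grid : increasing grid of rescaled time points, e.g. in [0, 1].
    omega_grid : increasing grid of frequencies, e.g. in [-pi, pi].
    """
    diff = F_a - F_b
    # Squared Hilbert-Schmidt (Frobenius) norm of each d x d difference.
    hs_sq = np.sum(np.abs(diff) ** 2, axis=(-2, -1))
    # Integrate over frequency first, then over rescaled time.
    over_omega = trapezoid_last_axis(hs_sq, omega_grid)
    return float(trapezoid_last_axis(over_omega, u_grid))


# Toy usage: a constant identity operator compared with the zero operator.
u_grid = np.linspace(0.0, 1.0, 11)
omega_grid = np.linspace(-np.pi, np.pi, 21)
d = 2
F_identity = np.broadcast_to(
    np.eye(d, dtype=complex), (u_grid.size, omega_grid.size, d, d)
)
F_zero = np.zeros((u_grid.size, omega_grid.size, d, d), dtype=complex)
distance = integrated_hs_distance(F_identity, F_zero, u_grid, omega_grid)
```

A value of zero indicates that the two discretised operators agree at every grid point. For a collection of series, such pairwise values could be arranged into a dissimilarity matrix and passed to a clustering routine; that step is likewise an assumption about usage rather than a description of the paper's procedure.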

MSC:

62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
62M15 Inference from stochastic processes and spectral analysis
62R10 Functional data analysis
62G10 Nonparametric hypothesis testing
62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

clusfind; NbClust; SLEX
