×

Spectral clustering for non-reversible Markov chains. (English) Zbl 1413.62082

Summary: Spectral clustering methods are based on solving eigenvalue problems for the identification of clusters, e.g., the identification of metastable subsets of a Markov chain. Usually, real-valued eigenvectors are mandatory for this type of algorithms. The Perron Cluster Analysis (PCCA+) is a well-known spectral clustering method of Markov chains. It is applicable for reversible Markov chains, because reversibility implies a real-valued spectrum. We also extend this spectral clustering method to non-reversible Markov chains and give some illustrative examples. The main idea is to replace the eigenvalue problem by a real-valued Schur decomposition. By this extension non-reversible Markov chains can be analyzed. Furthermore, the chains do not need to have a positive stationary distribution. In addition to metastabilities, dominant cycles and sinks can also be identified. This novel method is called GenPCCA (i.e., generalized PCCA), since it includes the case of non-reversible processes. We also apply the method to real-world eye-tracking data.

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
60J10 Markov chains (discrete-time Markov processes on discrete state spaces)
62M15 Inference from stochastic processes and spectral analysis

Software:

Matlab
PDF BibTeX XML Cite
Full Text: DOI

References:

[1] Bowman GR, Pande VS, Noé F (2014) An introduction to Markov State Models and their application to long timescale molecular simulation. Springer, New York · Zbl 1290.92004
[2] Brandts, JH, Matlab code for sorted real Schur forms, Numer Linear Algebra Appl, 9, 249-261, (2002) · Zbl 1071.65500
[3] Courtois, PJ, Error analysis in nearly-completely decomposable stochastic systems, Econometrica J Econom Soc, 43, 691-709, (1975) · Zbl 0333.62070
[4] Deuflhard, P.; Weber, M., Robust Perron cluster analysis in conformation dynamics, Linear Algebra Appl, 398, 161-184, (2005) · Zbl 1070.15019
[5] Deuflhard, P.; Huisinga, W.; Fischer, A.; Schuette, Ch, Identification of almost invariant aggregates in reversible nearly uncoupled Markov chains, Linear Algebra Appl, 315, 39-59, (2000) · Zbl 0963.65008
[6] Fackeldey K, Weber M (2017) GenPCCA-Markov state models for non-equilibrium steady states. In Big data clustering: data preprocessing, variable selection, and dimension reduction, pp 70-80
[7] Fill, JA, Eigenvalue bounds on convergence to stationarity for nonreversible Markov chains, with an application to the exclusion process, Ann Appl Probab, 1, 62-87, (1991) · Zbl 0726.60069
[8] Fritzsche, D.; Mehrmann, V.; Szyld, DB; Virnik, E., An SVD approach to identifying metastable states of Markov chains, ETNA, 29, 46-69, (2007) · Zbl 1171.15008
[9] Froyland, G.; Santitissadeekorn, N.; Monahan, A., Transport in time-dependent dynamical systems: finite-time coherent sets, Chaos Interdiscip J Nonlinear Sci, 20, 043116, (2010) · Zbl 1311.37008
[10] Froyland G, Padberg-Gehle K (2014) Almost-invariant and finite-time coherent sets: directionality, duration, and diffusion. In: Froyland G, Bahsoun W, Bose C (eds) Ergodic theory, open dynamics, and coherent structures, vol 70 of proceedings in mathematics and statistics, pp 171-216 · Zbl 1352.37078
[11] Golub G, van Loan CF (1985) Matrix computation. John Hopkins University Press, Baltimore · Zbl 0592.65011
[12] Hartfiel, DJ; Meyer Carl, D., On the structure of stochastic matrices with a subdominant eigenvalue near 1, Linear Algebra Appl, 272, 193-203, (1998) · Zbl 0892.15017
[13] Huisinga W (2001) Metastability of Markovian systems. PhD thesis, Freie Universität Berlin
[14] Huisinga, W.; Meyn, S.; Schütte, C., Phase transitions and metastability in Markovian and molecular systems, Ann Appl Probab, 14, 419-458, (2004) · Zbl 1041.60026
[15] Jacobi, MN, A robust spectral method for finding lumpings and meta stable states of non-reversible Markov chains, ETNA, 37, 296-306, (2010) · Zbl 1206.15009
[16] Kliegl R, Laubrock J, Köstler A (2015) Augenblicke bei der Bildbetrachtung. Eine kognitionswissenschaftliche Spekulation zum Links und Rechts im Bild. In: Lepper VM, Deuflhard P, Markschies C (eds) Räume-Bilder-Kulturen. de Gruyter, Berlin, pp 77-90
[17] Konstantinov, MM; Petkov, PH; Christov, ND, Nonlocal perturbation analysis of the Schur system of a matrix, SIAM J Matrix Anal Appl, 15, 383-392, (1994) · Zbl 0798.15010
[18] Kube, S.; Weber, M., A coarse graining method for the identification of transition rates between molecular conformations, J Chem Phys, 126, 024103, (2007)
[19] Langville Amy, N.; Meyer Carl, D., A survey of eigenvector methods for web information retrieval, SIAM Rev, 47, 135-161, (2005) · Zbl 1075.65053
[20] Langville Amy N, Meyer Carl D (2012) Google’s PageRank and beyond the science of search engine rankings. Princeton University Press, Princeton · Zbl 1270.68005
[21] Li, C.; Biswas, G.; Dale, M.; Dale, P.; Hoffmann, F. (ed.), Building models of ecological dynamics using hmm based temporal data clustering-a preliminary study, No. 2189, 53-62, (2001), Berlin · Zbl 1029.68794
[22] MATLAB (2010) Version 7.10.0 (R2010a). The MathWorks Inc., Natick
[23] Möller-Levet CS, Klawonn F, Cho K-H, Wolkenhauer O (2003) Fuzzy clustering of short time-series and unevenly distributed sampling points. In: LNCS, proceedings of the IDA2003, Springer, pp 28-30
[24] Reuter, B.; Weber, M.; Fackeldey, K.; Röblitz, S.; Garcia, M., A generalized markov state modeling method for non-equilibrium biomolecular dynamics exemplified on peptide conformational dynamics driven by an oscillating electric field, J Chem Theory Comput, 14, 3579, (2018)
[25] Röblitz, S.; Weber, M., Fuzzy spectral clustering by PCCA+: application to Markov state models and data classification, Adv Data Anal Classif, 7, 147-179, (2013) · Zbl 1273.62145
[26] Runolfsson T, Ma Y (2007) Model reduction of nonreversible Markov chains. In: Decision and control, 2007 46th IEEE conference on, pp 3739-3744
[27] Sarich M, Schütte C (2014) Utilizing hitting times for finding metastable sets in non-reversible Markov chains. Technical Report 14-32, ZIB, Takustr.7, 14195 Berlin
[28] Schütte Ch (1999) Conformational dynamics: modelling, theory, algorithm and application to biomolecules. Habilitation thesis, Freie Universität Berlin
[29] Schütte C, Sarich M (2012) Metastability and Markov state models in molecular dynamics: modeling, analysis, algorithmic approaches, courant lecture notes, vol 24. AMS, Washington, DC · Zbl 1305.60004
[30] Shumway, RH, Time-frequency clustering and discriminant analysis, Stat Probab Lett, 63, 307-314, (2003) · Zbl 1116.62364
[31] Simon, HA; Ando, A., Aggregation of variables in dynamic systems, Econometrica, 29, 111-138, (1961) · Zbl 0121.15103
[32] Stewart WJ (1994) Introduction to the numerical solution of Markov chains. Princeton University Press, Princeton · Zbl 0821.65099
[33] Tifenbach, R., On an SVD-based algorithm for identifying meta-stable states of Markov chains, ETNA, 38, 17-33, (2011) · Zbl 1321.65056
[34] Tjakra, JD; Bao, J.; Hudon, N.; Yang, R., Analysis of collective dynamics of particulate systems modeled by Markov chains, Powder Technol, 235, 228-237, (2013)
[35] Vlachos M, Lin J, Keogh E, Gunopulos D (2003) A wavelet-based anytime algorithm for k-means clustering of time series. In: In proc. workshop on clustering high dimensionality data and its applications, pp 23-30
[36] Vogt W (2004) Zur numerik großdimensionaler eigenwertprobleme, preprint 22-24
[37] Warren Liao, T., A clustering procedure for exploratory mining of vector time series, Pattern Recognit., 40, 2550-2562, (2007) · Zbl 1118.68632
[38] Weber M (2002) Clustering by using a simplex structure. ZIB Report 04-03, Zuse Institut Berlin
[39] Weber M (2006) Meshless methods in conformation dynamics. PhD thesis, Freie Universität Berlin · Zbl 1202.62088
[40] Weber M (2011) A subspace approach to molecular Markov state models via a new infinitesimal generator. Habilitation thesis
[41] Weber, M.; Rungsarityotin, W.; Schliep, A., An indicator for the number of clusters using a linear map to simplex structure, 103-110, (2006), Heidelberg
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.