Bayesian unsupervised classification framework based on stochastic partitions of data and a parallel search strategy. (English) Zbl 1231.62031

Summary: Advantages of statistical model-based unsupervised classification over heuristic alternatives have been widely demonstrated in the scientific literature. However, the existing model-based approaches are often both conceptually and numerically instable for large and complex data sets. We consider a Bayesian model-based method for unsupervised classification of discrete valued vectors, that has certain advantages over standard solutions based on latent class models. Our theoretical formulation defines a posterior probability measure on the space of classification solutions corresponding to stochastic partitions of the observed data. To efficiently explore the classification space we use a parallel search strategy based on non-reversible stochastic processes. A decision-theoretic approach is utilized to formalize the inferential process in the context of unsupervised classification. Both real and simulated data sets are used for the illustration of the discussed methods.


62F15 Bayesian inference
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62M99 Inference from stochastic processes
68T99 Artificial intelligence
65C40 Numerical analysis or methods applied to Markov chains
62C99 Statistical decision theory
65C60 Computational problems in statistics (MSC2010)


Full Text: DOI Link


[1] Aarts EHL, Korst J (1989) Simulated annealing and Boltzmann machines. Wiley, Chichester · Zbl 0674.90059
[2] Bernardo JM, Smith AFM (1994) Bayesian theory. Wiley, Chichester
[3] Bock H-H (1996) Probabilistic models in cluster analysis. Comput Stat Data Anal 23: 5–28 · Zbl 0900.62324
[4] Cerquides J, De Mántaras RL (2005) TAN classifiers based on decomposable distributions. Mach Learn 59: 323–354 · Zbl 1105.68091
[5] Corander J, Tang J (2007) Bayesian analysis of population structure based on linked molecular information. Math Biosci 205: 19–31 · Zbl 1107.62115
[6] Corander J, Ekdahl M, Koski T (2008) Parallell interacting MCMC for learning of topologies of graphical models. Data Mining Knowl Discovery (in press)
[7] Corander J, Gyllenberg M, Koski T (2006) Bayesian model learning based on a parallel MCMC strategy. Stat Comput 16: 355–362
[8] Corander J, Gyllenberg M, Koski T (2007) Random partition models and exchangeability for Bayesian identification of population structure. Bull Math Biol 69: 797–815 · Zbl 1141.62310
[9] Corander J, Waldmann P, Marttinen P, Sillanpää MJ (2004) BAPS 2: enhanced possibilities for the analysis of genetic population structure. Bioinformatics 20: 2363–2369
[10] Dawson KJ, Belkhir K (2001) A Bayesian approach to the identification of panmictic populations and the assignment of individuals. Genet Res Camb 78: 59–77
[11] Duda RO, Hart PE, Stork DG (2000) Pattern classification, 2nd edn. Wiley, New York
[12] Friedman N, Geiger D, Goldszmidt M (1997) Bayesian network classifiers. Mach Learn 29: 131–163 · Zbl 0892.68077
[13] Gidas B (1985) Nonstationary Markov chains and convergence of the annealing algorithm. J Stat Phys 39: 73–130 · Zbl 0642.60049
[14] Green P (1995) Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82: 711–732 · Zbl 0861.62023
[15] Gyllenberg HG, Gyllenberg M, Koski T, Lund T, Schindler J, Verlaan J (1997) Classification of Enterobacteriaceae by minimization of stochastic complexity. Microbiology 143: 721–732
[16] Gyllenberg M, Koski T, Verlaan M (1997) Classification of binary vectors by stochastic complexity. J Multivar Anal 63: 47–72 · Zbl 1090.62542
[17] Gyllenberg M, Koski T (2001) Probabilistic models for bacterial taxonomy. Int Stat Rev 69: 249–276 · Zbl 1180.92004
[18] Gyllenberg M, Koski T (2002) Bayesian predictiveness, exchangeability and sufficientness in bacterial taxonomy. Math Biosci 177, 178:161–184 · Zbl 1003.62023
[19] Hand DJ, Yu K (2001) Idiot’s Bayes: not so stupid after all. Int Stat Rev 69: 385–398 · Zbl 1213.62010
[20] Hubert L, Arabie P (1985) Comparing partitions. J Classif 2: 193–218 · Zbl 0587.62128
[21] Häggström O (2002) Finite Markov chains and algorithmic applications. Cambridge University Press, Cambridge · Zbl 0999.60001
[22] Isaacson DL, Madsen RW (1976) Markov chains: theory and applications. Wiley, New York
[23] Mardia KV, Kent JT, Bibby JM (1979) Multivariate analysis. Academic Press, London
[24] Marttinen P, Corander J, Törönen P, Holm L (2006) Bayesian search of functionally divergent protein subgroups and their function specific residues. Bioinformatics 22: 2466–2474 · Zbl 05325250
[25] Perks W. (1947) Some observations on inverse probability including a new indifference rule. J Inst Actuaries 73: 285–334 · Zbl 0031.06001
[26] Rand WM (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc 66: 846–850
[27] Rissanen J (1987) Stochastic complexity. J Roy Stat Soc B 49: 223–239 · Zbl 0654.62008
[28] Rissanen J (1995) Fisher information and stochastic complexity. IEEE Trans Inf Theory 42: 40–47 · Zbl 0856.94006
[29] Robert CP, Casella G (1999) Monte Carlo statistical methods. Springer, New York · Zbl 0935.62005
[30] Rosenberg NA, Pritchard JK, Weber JL, Cann HM, Kidd KK, Zhivotovsky LA, Feldmann MW (2002) Genetic structure of human populations. Science 298: 2381–2385
[31] Schervish MJ (1995) Theory of statistics. Springer, New York
[32] Sisson SA (2005) Transdimensional Markov chains: a decade of progress and future perspectives. J Am Stat Assoc 100: 1077–1089 · Zbl 1117.62428
[33] Van Cutsem B (1996) Combinatorial structures and structures for classification. Comput Stat Data Anal 23: 165–188 · Zbl 0900.62320
[34] Zabell SL (1982) W.E. Johnson’s ’Sufficientness’ postulate. Ann Stat 10: 1091–1099 · Zbl 0512.62007
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.