
A variational Bayesian methodology for hidden Markov models utilizing Student’s-\(t\) mixtures. (English) Zbl 1211.68349

Summary: The Student’s \(t\) hidden Markov model (SHMM) has recently been proposed as an outlier-robust alternative to conventional continuous-density hidden Markov models, which are trained by means of the expectation-maximization (EM) algorithm. In this paper, we derive a tractable variational Bayesian inference algorithm for this model. Our approach provides an efficient and more robust alternative to EM-based methods, tackling their proneness to singularities and overfitting while allowing the optimal model size to be determined automatically, without cross-validation. We demonstrate the superiority of the proposed model over competing methods on synthetic and real data, and we illustrate the merits of our methodology in applications from diverse research fields, such as human-computer interaction, robotics, and semantic audio analysis.
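The outlier robustness referred to above stems from the heavy tails of the Student's \(t\) density: a gross outlier incurs a far smaller log-likelihood penalty than under a Gaussian, and in the EM view of the \(t\) model (cf. Liu and Rubin [24]) each point receives a latent scale weight \(w_i = (\nu+1)/(\nu+z_i^2)\) that automatically downweights outliers. A minimal stdlib-only Python sketch of this mechanism (illustrative only, not the authors' implementation; the data and parameter values are invented for the example):

```python
import math

def t_logpdf(x, mu, sigma, nu):
    """Log-density of a univariate Student's-t distribution with
    location mu, scale sigma and nu degrees of freedom."""
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1.0) / 2.0 * math.log1p(z * z / nu))

def norm_logpdf(x, mu, sigma):
    """Log-density of a univariate Gaussian."""
    z = (x - mu) / sigma
    return -0.5 * (z * z + math.log(2.0 * math.pi)) - math.log(sigma)

# Toy data: tight cluster around 0 plus one gross outlier at 8.
data = [0.1, -0.2, 0.05, 0.3, -0.1, 8.0]
nu = 3.0  # low degrees of freedom -> heavy tails

# The Gaussian penalizes the outlier quadratically (~ -32.9 here),
# the t density only logarithmically (~ -7.2), so a single outlier
# cannot dominate the fit.
gauss_pen = norm_logpdf(8.0, 0.0, 1.0)
t_pen = t_logpdf(8.0, 0.0, 1.0, nu)

# Latent-scale posterior means from the EM formulation of the t model:
# inliers get weights near (nu+1)/nu, the outlier a weight near 0.
weights = [(nu + 1.0) / (nu + ((x - 0.0) / 1.0) ** 2) for x in data]
```

These weights are exactly what makes EM (and, in the paper, variational Bayes) updates for the \(t\)-mixture emission densities robust: parameter estimates become weighted averages in which atypical observations contribute little.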

MSC:

68T10 Pattern recognition, speech recognition
62F15 Bayesian inference
65C40 Numerical analysis or methods applied to Markov chains

Software:

UCI-ml; PRMLT

References:

[1] O. Cappé, E. Moulines, T. Rydén, Inference in Hidden Markov Models, Springer Series in Statistics, Springer, New York, 2005.
[2] Rabiner, L. R., A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, 77, 245-255 (1989)
[3] Dempster, A.; Laird, N.; Rubin, D., Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, B, 39, 1, 1-38 (1977) · Zbl 0364.62022
[4] Chatzis, S.; Kosmopoulos, D.; Varvarigou, T., Robust sequential data modeling using an outlier tolerant hidden Markov model, IEEE Transactions on Pattern Analysis and Machine Intelligence, 31, 9, 1657-1669 (2009)
[5] Yamazaki, K.; Watanabe, S., Singularities in mixture models and upper bounds of stochastic complexity, Neural Networks, 16, 7, 1029-1038 (2003) · Zbl 1255.68130
[6] C. Archambeau, J.A. Lee, M. Verleysen, On the convergence problems of the EM algorithm for finite Gaussian mixtures, in: Eleventh European Symposium on Artificial Neural Networks, 2003, pp. 99-106.
[7] G. McLachlan, D. Peel, Finite Mixture Models, Wiley Series in Probability and Statistics, Wiley, New York, 2000. · Zbl 0963.62061
[8] Bishop, C. M., Pattern Recognition and Machine Learning (2006), Springer: Springer New York · Zbl 1107.68072
[9] Diebolt, J.; Robert, C. P., Estimation of finite mixture distributions through Bayesian sampling, Journal of the Royal Statistical Society Series B, 56, 363-375 (1994) · Zbl 0796.62028
[10] Richardson, S.; Green, P. J., On Bayesian analysis of mixtures with unknown number of components, Journal of the Royal Statistical Society Series B, 59, 731-792 (1997) · Zbl 0891.62020
[11] Jordan, M. I.; Ghahramani, Z.; Jaakkola, T. S.; Saul, L. K., An introduction to variational methods for graphical models, (Jordan, M. I., Learning in Graphical Models (1998), Kluwer: Kluwer Dordrecht), 105-162 · Zbl 0910.68175
[12] C.M. Bishop, M.E. Tipping, Variational relevance vector machines, in: Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence, 2000, pp. 46-53.
[13] Constantinopoulos, C.; Likas, A., Unsupervised learning of Gaussian mixtures based on variational component splitting, IEEE Transactions on Neural Networks, 18, 745-755 (2007)
[14] Roberts, S. J.; Penny, W. D., Variational Bayes for generalized autoregressive models, IEEE Transactions on Signal Processing, 50, 2245-2257 (2002) · Zbl 1369.94269
[15] Smidl, V.; Quinn, A., Mixture-based extension of the AR model and its recursive Bayesian identification, IEEE Transactions on Signal Processing, 53, 3530-3542 (2005) · Zbl 1373.62441
[16] Archambeau, C.; Verleysen, M., Robust Bayesian clustering, Neural Networks, 20, 129-138 (2007) · Zbl 1158.68440
[17] Svensén, M.; Bishop, C. M., Robust Bayesian mixture modelling, Neurocomputing, 64, 235-252 (2005)
[18] Ghahramani, Z.; Beal, M., Variational inference for Bayesian mixture of factor analysers, Advances Neural Information Processing Systems, 12, 449-455 (1999)
[19] M.J. Beal, Variational algorithms for approximate Bayesian inference, Ph.D. Thesis, Gatsby Computational Neuroscience Unit, University College London, 2003.
[20] Chatzis, S.; Kosmopoulos, D.; Varvarigou, T., Signal modeling and classification using a robust latent space model based on \(t\) distributions, IEEE Transactions on Signal Processing, 56, 3, 949-963 (2008) · Zbl 1390.94123
[21] D. MacKay, Ensemble learning for hidden Markov models, Technical Report, Department of Physics, University of Cambridge, 1997.
[22] Ji, S.; Krishnapuram, B.; Carin, L., Variational Bayes for continuous hidden Markov models and its application to active learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 4, 522-532 (2006)
[23] Rezek, I.; Roberts, S. J., Ensemble hidden Markov models with extended observation densities for biosignal analysis, (Husmeier, D.; Dybowski, R.; Roberts, S., Probabilistic Modeling in Biomedicine and Medical Bioinformatics (2005), Springer-Verlag)
[24] Liu, C.; Rubin, D., ML estimation of the \(t\) distribution using EM and its extensions, ECM and ECME, Statistica Sinica, 5, 1, 19-39 (1995) · Zbl 0824.62047
[25] H. Attias, A variational Bayesian framework for graphical models, in: Proceedings of the Annual Conference on Neural Information Processing Systems, 2000.
[26] Jaakkola, T.; Jordan, M. I., Bayesian parameter estimation via variational methods, Statistics and Computing, 10, 25-37 (2000)
[27] Winn, J.; Bishop, C. M., Variational message passing, Journal of Machine Learning Research, 6, 661-694 (2005) · Zbl 1222.68332
[28] Chandler, D., Introduction to Modern Statistical Mechanics (1987), Oxford University Press: Oxford University Press New York
[29] Kudo, M.; Toyama, J.; Shimbo, M., Multidimensional curve classification using passing-through regions, Pattern Recognition Letters, 20, 11-13, 1103-1111 (1999)
[30] A. Asuncion, D.J. Newman, UCI machine learning repository, 2007. URL: http://www.ics.uci.edu/~mlearn/MLRepository.html
[31] T. Giannakopoulos, D. Kosmopoulos, A. Aristidou, S. Theodoridis, Violence content classification using audio features, in: Proceedings of the Advances in Artificial Intelligence, 2006, pp. 502-507.
[32] Bartsch, M. A.; Wakefield, G. H., Audio thumbnailing of popular music using chroma-based representations, IEEE Transactions on Multimedia, 7, 1, 96-104 (2005)
[33] Camarinha-Matos, L. M.; Lopes, L. S.; Barata, J., Integration and learning in supervision of flexible assembly systems, IEEE Transactions on Robotics and Automation, 12, 202-219 (1996)