×

Fixed points of the EM algorithm and nonnegative rank boundaries. (English) Zbl 1308.62035

Summary: Mixtures of \(r\) independent distributions for two discrete random variables can be represented by matrices of nonnegative rank \(r\). Likelihood inference for the model of such joint distributions leads to problems in real algebraic geometry that are addressed here for the first time. We characterize the set of fixed points of the Expectation-Maximization algorithm, and we study the boundary of the space of matrices with nonnegative rank at most \(3\). Both of these sets correspond to algebraic varieties with many irreducible components.

MSC:

62F10 Point estimation
13P25 Applications of commutative algebra (e.g., to statistics, control theory, optimization, etc.)

Software:

Magma; Macaulay2
PDFBibTeX XMLCite
Full Text: DOI arXiv Euclid

References:

[1] Aggarwal, A., Booth, H., O’Rourke, J., Suri, S. and Yap, C.-K. (1989). Finding minimal convex nested polygons. Inform. and Comput. 83 98-110. · Zbl 0679.68068 · doi:10.1016/0890-5401(89)90049-7
[2] Allman, E. S., Matias, C. and Rhodes, J. A. (2009). Identifiability of parameters in latent structure models with many observed variables. Ann. Statist. 37 3099-3132. · Zbl 1191.62003 · doi:10.1214/09-AOS689
[3] Allman, E. S., Rhodes, J., Sturmfels, B. and Zwiernik, P. (2014). Tensors of nonnegative rank two. Linear Algebra Appl. Special Issue on Statistics . · Zbl 1312.15033 · doi:10.1016/j.laa.2013.10.046
[4] Allman, E. S., Rhodes, J. A. and Taylor, A. (2014). A semialgebraic description of the general Markov model on phylogenetic trees. SIAM J. Discrete Math. 28 736-755. · Zbl 1300.60085 · doi:10.1137/120901568
[5] Basu, S., Pollack, R. and Roy, M.-F. (2003). Algorithms in Real Algebraic Geometry. Algorithms and Computation in Mathematics 10 . Springer, Berlin. · Zbl 1031.14028
[6] Beale, E. (1977). Discussion of “Maximum likelihood from incomplete data via the EM algorithm” by A. Dempster, N. Laird and D. Rubin. J. Roy. Statist. Soc. Ser. B 39 22-23.
[7] Bocci, C., Carlini, E. and Rapallo, F. (2011). Perturbation of matrices and nonnegative rank with a view toward statistical models. SIAM J. Matrix Anal. Appl. 32 1500-1512. · Zbl 1242.15031 · doi:10.1137/110825455
[8] Bosma, W., Cannon, J. and Playoust, C. (1997). The Magma algebra system. I. The user language. J. Symbolic Comput. 24 235-265. · Zbl 0898.68039 · doi:10.1006/jsco.1996.0125
[9] Brion, M. and Kumar, S. (2005). Frobenius Splitting Methods in Geometry and Representation Theory. Progress in Mathematics 231 . Birkhäuser, Boston. · Zbl 1072.14066
[10] Cohen, J. E. and Rothblum, U. G. (1993). Nonnegative ranks, decompositions, and factorizations of nonnegative matrices. Linear Algebra Appl. 190 149-168. · Zbl 0784.15001 · doi:10.1016/0024-3795(93)90224-C
[11] Cox, D., Little, J. and O’Shea, D. (1992). Ideals , Varieties , and Algorithms : An Introduction to Computational Algebraic Geometry and Commutative Algebra . Springer, New York. · Zbl 0756.13017
[12] Critch, A. (2013). Binary hidden Markov models and varieties. J. Algebr. Stat. 4 1-30. · Zbl 1346.13057 · doi:10.18409/jas.v4i1.17
[13] Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. Ser. B 39 1-22. · Zbl 0364.62022
[14] Drton, M., Sturmfels, B. and Sullivant, S. (2009). Lectures on Algebraic Statistics. Oberwolfach Seminars 39 . Birkhäuser, Basel. · Zbl 1166.13001
[15] Eggermont, R., Horobeţ, E. and Kubjas, K. Algebraic boundary of matrices of nonnegative rank at most three. Available at . · Zbl 1350.13009 · doi:10.1016/j.laa.2016.06.036
[16] Eisenbud, D. and Sturmfels, B. (1996). Binomial ideals. Duke Math. J. 84 1-45. · Zbl 0873.13021 · doi:10.1215/S0012-7094-96-08401-X
[17] Fawzi, H. and Parrilo, P. Self-scaled bounds for atomic cone ranks: Applications to nonnegative rank and cp-rank. Available at . · Zbl 1346.90662 · doi:10.1007/s10107-015-0937-7
[18] Fienberg, S. (1977). Discussion of “Maximum likelihood from incomplete data via the EM algorithm” by A. Dempster, N. Laird and D. Rubin. J. Roy. Statist. Soc. Ser. B 39 29-30.
[19] Fienberg, S. E., Hersh, P., Rinaldo, A. and Zhou, Y. (2010). Maximum likelihood estimation in latent class models for contingency table data. In Algebraic and Geometric Methods in Statistics 27-62. Cambridge Univ. Press, Cambridge.
[20] Garcia, L. D., Stillman, M. and Sturmfels, B. (2005). Algebraic geometry of Bayesian networks. J. Symbolic Comput. 39 331-355. · Zbl 1126.68102 · doi:10.1016/j.jsc.2004.11.007
[21] Georgi, B. and Schliep, A. (2006). Context-specific independence mixture modeling for positional weight matrices. Bioinformatics 22 166-173.
[22] Grayson, D. and Stillman, M. Macaulay2, a software system for research in algebraic geometry. Available at .
[23] Gross, E. and Rodriguez, J. Maximum likelihood geometry in the presence of data zeros. In Proceedings of the 39 th International Symposium on Symbolic and Algebraic Computation 232-239. ACM, New York. · Zbl 1325.68285 · doi:10.1145/2608628.2608659
[24] Hauenstein, J., Rodriguez, J. and Sturmfels, B. (2014). Maximum likelihood for matrices with rank constraints. J. Algebr. Stat. 5 18-38. · Zbl 1345.62043 · doi:10.18409/jas.v5i1.23
[25] Huh, J. and Sturmfels, B. (2014). Likelihood geometry. In Combinatorial Algebraic Geometry. Lecture Notes in Math. 2108 63-117. Springer, Cham. · Zbl 1328.14004 · doi:10.1007/978-3-319-04870-3_3arxiv:1305.7462
[26] Lee, D. and Seung, H. (1999). Learning the parts of objects by nonnegative matrix factorization. Nature 401 788-791. · Zbl 1369.68285
[27] Miller, E. and Sturmfels, B. (2005). Combinatorial Commutative Algebra. Graduate Texts in Mathematics 227 . Springer, New York. · Zbl 1090.13001
[28] Moitra, A. (2012). An almost optimal algorithm for computing nonnegative rank. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms 1454-1464. SIAM, Philadelphia. · Zbl 1334.68102 · doi:10.1137/140990139
[29] Mond, D., Smith, J. and van Straten, D. (2003). Stochastic factorizations, sandwiched simplices and the topology of the space of explanations. R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. Sci. 459 2821-2845. · Zbl 1051.60076 · doi:10.1098/rspa.2003.1150
[30] Murray, G. (1977). Discussion of “Maximum likelihood from incomplete data via the EM algorithm” by A. Dempster, N. Laird and D. Rubin. J. Roy. Statist. Soc. Ser. B 39 27-28.
[31] Pachter, L. and Sturmfels, B. (2005). Algebraic Statistics for Computational Biology . Cambridge Univ. Press, New York. · Zbl 1108.62118 · doi:10.1017/CBO9780511610684
[32] Sturmfels, B. (2002). Solving Systems of Polynomial Equations. CBMS Regional Conference Series in Mathematics 97 . AMS, Washington, DC. · Zbl 1101.13040
[33] Vavasis, S. A. (2009). On the complexity of nonnegative matrix factorization. SIAM J. Optim. 20 1364-1377. · Zbl 1206.65130 · doi:10.1137/070709967
[34] White, N. L. (1997). Geometric applications of the Grassmann-Cayley algebra. In Handbook of Discrete and Computational Geometry 881-892. CRC, Boca Raton. · Zbl 0902.15024
[35] Zhu, M., Jiang, G. and Gao, S. (2011). Solving the 100 Swiss francs problem. Math. Comput. Sci. 5 195-207. · Zbl 06112129 · doi:10.1007/s11786-011-0068-3
[36] Zwiernik, P. and Smith, J. Q. (2011). Implicit inequality constraints in a binary tree model. Electron. J. Stat. 5 1276-1312. · Zbl 1274.62355 · doi:10.1214/11-EJS640
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.