×

An alternating direction and projection algorithm for structure-enforced matrix factorization. (English) Zbl 1387.90244

Summary: Structure-enforced matrix factorization (SeMF) represents a large class of mathematical models appearing in various forms of principal component analysis, sparse coding, dictionary learning and other machine learning techniques useful in many applications including neuroscience and signal processing. In this paper, we present a unified algorithm framework, based on the classic alternating direction method of multipliers (ADMM), for solving a wide range of SeMF problems whose constraint sets permit low-complexity projections. We propose a strategy to adaptively adjust the penalty parameters which is the key to achieving good performance for ADMM. We conduct extensive numerical experiments to compare the proposed algorithm with a number of state-of-the-art special-purpose algorithms on test problems including dictionary learning for sparse representation and sparse nonnegative matrix factorization. Results show that our unified SeMF algorithm can solve different types of factorization problems as reliably and as efficiently as special-purpose algorithms. In particular, our SeMF algorithm provides the ability to explicitly enforce various combinatorial sparsity patterns that, to our knowledge, has not been considered in existing approaches.

MSC:

90C30 Nonlinear programming

Software:

Yall1; PDCO
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Paatero, P., Tapper, U.: Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values. Environmetrics 5(2), 111-126 (1994) · doi:10.1002/env.3170050203
[2] Lee, D.D., Seung, H.S.: Algorithms for non-negative matrix factorization. In: Conference on Neural Information Processing Systems (NIPS), pp. 556-562. MIT Press, MIT (2001)
[3] Gonzalez, E., Zhang, Y.: Accelerating the lee-seung algorithm for nonnegative matrix factorization. Technical Report TR05-02, CAAM, Rice University (2005)
[4] Lin, C.J.: On the convergence of multiplicative update algorithms for nonnegative matrix factorization. IEEE Trans. Neural Netw. 18(6), 1589-1596 (2007) · doi:10.1109/TNN.2007.895831
[5] Albright, R., Cox, J., Duling, D., Langville, A.N., Meyer, C.D.: Algorithms, initializations, and convergence for the nonnegative matrix factorization. NCSU Technical Report, Math 81706 (2006) · Zbl 0984.94010
[6] Lin, C.J.: Projected gradient methods for non-negative matrix factorization. Neural Comput. 19(10), 2756-2779 (2007) · Zbl 1173.90583 · doi:10.1162/neco.2007.19.10.2756
[7] Lee, D.D., Seung, H.S.: Learning the parts of objects by non-negative matrix factorization. Nature 401, 788-791 (1999) · Zbl 1369.68285 · doi:10.1038/44565
[8] Hoyer, P.O.: Non-negative sparse coding. Neural Networks for Signal Processing XII (Proc. IEEE Workshop on Neural Networks for Signal Processing), pp. 557-565. Martigny, Switzerland (2002)
[9] Feng, T., Li, S.Z., Shum, H.Y., Zhang, H.J.: Local non-negative matrix factorization as a visual representation. In: Proceedings of the 2nd International Conference on Development and Learning, pp. 178-183 (2002) · Zbl 0254.90045
[10] Hoyer, P.O., Dayan, P.: Non-negative matrix factorization with sparseness constraints. J. Mach. Learn. Res. 5, 1457-1469 (2004) · Zbl 1222.68218
[11] Montano, A.P., Carazo, J.M., Kochi, K., Lehmann, D., Pascual-Marqui, R.D.: Nonsmooth nonnegative matrix factorization (nsnmf). IEEE Trans. Pattern Anal. 28(3), 403-415 (2006) · doi:10.1109/TPAMI.2006.60
[12] Jenatton, R., Obozinski, G., Bach, F.: Structured sparse principal component analysis. International Conference on Artificial Intellgence and Statistics (AISTATS) (2010) · Zbl 1177.65088
[13] Peharz, R., Pernkopf, F.: Sparse nonnegative matrix factorization with \[\ell_0\] ℓ0-constraints. Neurocomputing 80, 38-46 (2012) · doi:10.1016/j.neucom.2011.09.024
[14] Zheng, W.S., Lai, J.H., Liao, S.C., He, R.: Extracting non-negative basis images using pixel dispersion penalty. Pattern Recogn. 45(8), 2912-2926 (2012) · doi:10.1016/j.patcog.2012.01.022
[15] Aharon, M., Elad, M., Bruckstein, A.: K-svd: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans. Signal Process. 54(11), 4311-4322 (2006) · Zbl 1375.94040 · doi:10.1109/TSP.2006.881199
[16] Mairal, J., Bach, F., Ponce, J., Sapiro, G.: Online dictionary learning for sparse coding. In: Proceedings of the 26th Annual International Conference on Machine Learning 8, pp. 689-696 (2009) · Zbl 1242.62087
[17] Jenatton, R., Audibert, J.Y., Bach, F.: Structured variable selection with sparsity-inducing norms. J. Mach. Learn. Res. 12, 2777-2824 (2011) · Zbl 1280.68170
[18] Szabo, Z., Poczos, B., Lorincz, A.: Online group-structured dictionary learning. In: Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 11, pp. 2865-2872, Washington, DC (2011) · Zbl 1173.90583
[19] Glowinski, R., Marroco, A.: Sur lapproximation, par elements finis dordre un, et la resolution, par penalisation-dualite dune classe de problemes de dirichlet non lineaires. Revue francaise dautomatique, informatique, recherche operationnelle. Analyse numerique 9(2), 41-76 (1975) · Zbl 0368.65053 · doi:10.1051/m2an/197509R200411
[20] Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Comput. Math. Appl. 2(1), 17-40 (1976) · Zbl 0352.65034 · doi:10.1016/0898-1221(76)90003-1
[21] Yang, J., Zhang, Y.: Alternating direction algorithms for L1-problems in compressive sensing. SIAM J. Sci. Comput. 33(1), 250-278 (2011) · Zbl 1256.65060 · doi:10.1137/090777761
[22] Goldstein, T., Osher, S.: The split Bregman method for L1 regularized problems. SIAM J. Imaging Sci. 2(2), 323-343 (2009) · Zbl 1177.65088 · doi:10.1137/080725891
[23] Glowinski, R., Le Tallec, P.: Augmented Lagrangian and Operator Splitting Methods in Nonlinear Mechanics. SIAM, Philadelphia (1989) · Zbl 0698.73001 · doi:10.1137/1.9781611970838
[24] He, B., Liao, L., Han, D., Yang, H.: A new inexact alternating directions method for monotone variational inequalities. Math. Program. A(92), 103-118 (2002) · Zbl 1009.90108 · doi:10.1007/s101070100280
[25] Deng, W., Yin, W.: On the global and linear convergence of the generalized alternating direction method of multipliers. Technical Report TR12-14, CAAM, Rice University (2012) · Zbl 1379.65036
[26] Powell, M.J.D.: A Method for Nonlinear Constraints in Minimization Problems. Optimization. Academic Press, London (1969) · Zbl 0194.47701
[27] Hestenes, M.R.: Multiplier and gradient methods. J. Optimiz. Theory Appl. 4, 303-320 (1969) · Zbl 0174.20705 · doi:10.1007/BF00927673
[28] Rockafellar, R.T.: The multiplier method of hestenes and powell applied to convex programming. J. Optimiz. Theory Appl. 12, 555-562 (1973) · Zbl 0254.90045 · doi:10.1007/BF00934777
[29] Fortin, M., Glowinski, R.: Augmented Lagrangian Methods. North-Holland, Amsterdam (1983) · Zbl 0525.65045
[30] Zhang, Y.: An alternating direction algorithm for nonnegative matrix factorization. Technical Report TR10-03, CAAM, Rice University (2010)
[31] He, B.S., Yang, H., Wang, S.L.: Alternating direction method with self adaptive penalty parameters for monotone variational inequalities. J. Optimiz. Theory Appl. 106(2), 337-356 (2000) · Zbl 0997.49008 · doi:10.1023/A:1004603514434
[32] Wen, Z., Goldfarb, D., Yin, W.: Alternating direction augmented Lagrangian methods for semidefinite programming. Math. Progam. Comput. 2, 203-230 (2010) · Zbl 1206.90088 · doi:10.1007/s12532-010-0017-1
[33] Lin, Z., Liu, R., Su, Z.: Linearized alternating direction method with adaptive penalty for low rank representation. Advances in Neural Information Processing Systems(NIPS) 24 (2011) · Zbl 1280.68170
[34] Samaria, F.S., Harter, A.C.: Parameterisation of a stochastic model for human face identification. In: Proceedings of the 2nd IEEE Workshop on Applications of Computer Vision, pp. 138-142 (1994)
[35] Donoho, D., Stodden, V.: When does non-negative matrix factorization give a correct decomposition into Parts? In: Proceedings of 17th Annual Conferance Neural Information Processing Systems NIPS (2003) · Zbl 1009.90108
[36] Mallat, S., Zhang, Z.F.: Matching pursuit with time-frequency dictionaries. IEEE Trans. Signal Process. 41, 3397-3415 (1993) · Zbl 0842.94004 · doi:10.1109/78.258082
[37] Chen, S., Billings, S.A., Luo, W.: Orthogonal least squares methods and their application to non-linear system identification. Int. J. Control 50, 1873-1896 (1989) · Zbl 0686.93093 · doi:10.1080/00207178908953472
[38] Davis, G.M., Mallat, S.G., Zhang, Z.F.: Adaptive time-frequency decompositions. Opt. Eng. 33(7), 2183-2191 (1994) · doi:10.1117/12.173207
[39] Pati, Y.C., Rezaiifar, R., Krishnaprasad, P.S.: Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition. In: 1993 Conference Record of The Twenty-Seventh Asilomar Conference on Signals, Systems and Computers 1, pp. 40-44 (1993)
[40] Tropp, J.A.: Greed is good: algorithmic results for sparse approximation. IEEE Trans. Inf. Theory 50(10), 2231-2242 (2004) · Zbl 1288.94019 · doi:10.1109/TIT.2004.834793
[41] Chen, S., Donoho, D., Saunders, M.: Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20, 33-61 (1998) · Zbl 0919.94002 · doi:10.1137/S1064827596304010
[42] Gorodnitsky, I.F., Rao, B.D.: Sparse signal reconstruction from limited data using focuss: A re-weighted norm minimization algorithm. IEEE Trans. Signal Process. 45, 600-616 (1997) · doi:10.1109/78.558475
[43] Rao, B.D., Kreutz-Delgado, K.: An affine scaling methodology for best basis selection. IEEE Trans. Signal Process. 47(1), 187-200 (1999) · Zbl 0984.94010 · doi:10.1109/78.738251
[44] Rao, B.D., Kreutz-Delgado, K.: Deriving algorithms for computing sparse solutions to linear inverse problems. In: IEEE Conference Record of the Thirty-First Asilomar Conference on Signals, Systems and Computers 1, pp. 955-959 (1998) · Zbl 1375.94040
[45] Rao, B.D., Engan, K., Cotter, S.F., Palmer, J., Kreutz-Delgado, K.: Subset selection in noise based on diversity measure minimization. IEEE Trans. Signal Process. 51(3), 760-770 (2003) · doi:10.1109/TSP.2002.808076
[46] Gersho, A., Gray, R.M.: Vector Quantization and Signal Compression. Kluwer Academic Publishers, Norwell (1991) · Zbl 0782.94001
[47] Lewicki, M.S., Olshausen, B.A.: A probabilistic framework for the adaptation and comparison of image codes. J. Opt. Soc. Am. A Opt. Image Sci. Vis. 16(7), 1587-1601 (1999) · Zbl 1099.93511 · doi:10.1364/JOSAA.16.001587
[48] Olshausen, B.A., Field, D.J.: Natural image statistics and efficient coding. Netw. Comput. Neural 7(2), 333-339 (1996) · doi:10.1088/0954-898X_7_2_014
[49] Olshausen, B.A., Field, B.J.: Sparse coding with an overcomplete basis set: a strategy employed by v1? Vis. Res. 37, 3311-3325 (1997) · doi:10.1016/S0042-6989(97)00169-7
[50] Lewicki, M.S., Sejnowski, T.J.: Learning overcomplete representations. Neural Comput. 12, 337-365 (2000) · doi:10.1162/089976600300015826
[51] Engan, K., Aase, S.O., Husøy, J.H.: Multi-frame compression: theory and design. EURASIP Signal Process. 80(10), 2121-2140 (2000) · Zbl 1098.94538 · doi:10.1016/S0165-1684(00)00072-4
[52] Engan, K., Aase, S.O., Hakon-Husoy, J.H.: Method of optimal directions for frame design. In: IEEE International Conferance on Acoustics, Speech, and Signal Processing 5, pp. 2443-2446 (1999)
[53] Engan, K., Rao, B.D., Kreutz-Delgado, K.: Frame design using focuss with method of optimal directions (MOD). Norwegian Signal Processing Symposium, pp. 65-69 (1999)
[54] Kreutz-Delgado, K., Murray, J.F., Rao, B.D., Engan, K., Lee, T., Sejnowski, T.J.: Dictionary learning algorithms for sparse representation. Neural Comput. 15(2), 349-396 (2003) · Zbl 1047.68101 · doi:10.1162/089976603762552951
[55] Murray, J.F., Kreutz-Delgado, K.: An improved focuss-based learning algorithm for solving sparse linear inverse problems. In: IEEE International Conference on Signals, Systems and Computers 1, pp. 347-351 (2001)
[56] Kreutz-Delgado, K., Rao, B.D.: Focuss-based dictionary learning algorithms. Wavel. Appl. Signal Image Process. VIII 4119, 1-53 (2000)
[57] Li, S.Z., Hou, X.W., Zhang, H.J., Cheng, Q.S.: Learning spatially localized, parts-based representation. In: Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR2001) 1, pp. 207-212 (2001)
[58] Zheng, W.S., Li, S.Z., Lai, J.H., Liao, S.C.: On constrained sparse matrix factorization. In: IEEE 11th International Conference on Computer Vision (ICCV2007), pp. 1-8 (2007) · Zbl 0919.94002
[59] Bittorf, V., Recht, B., Ré, E., Tropp, J.A.: Factoring nonnegative matrices with linear programs. In: Advances in Neural Information Processing Systems (NIPS ’12), pp. 1223-1231 (2012)
[60] Araújo, U.M.C., Saldanha, B.T.C., Galvao, R.K.H., Yoneyama, T., Chame, H.C., Visani, V.: The successive projections algorithm for variable selection in spectroscopic multicomponent analysis. Chemom. Intell. Lab. Syst. 57(2), 65-73 (2001) · doi:10.1016/S0169-7439(01)00119-8
[61] Kumar, A., Sindhwani, V., Kambadur, P.: Fast conical hull algorithms for near-separable non-negative matrix factorization. In Interantional Conferance on Machine Learning (ICML ’13) 28, pp. 231-239 (2013) · Zbl 1369.68285
[62] Gillis, N., Luce, R.: Robust near-separable nonnegative matrix factorization using linear optimization. J. Mach. Learn. Res. 15, 1249-1280 (2014) · Zbl 1319.90044
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.