
zbMATH — the first resource for mathematics

Sparse permutation invariant covariance estimation. (English) Zbl 1320.62135
Summary: The paper proposes a method for constructing a sparse estimator of the inverse covariance (concentration) matrix in high-dimensional settings. The estimator maximizes a penalized normal likelihood, with a lasso-type penalty enforcing sparsity. We establish a rate of convergence in the Frobenius norm as both the data dimension \(p\) and the sample size \(n\) are allowed to grow, and show that the rate depends explicitly on how sparse the true concentration matrix is. We further show that a correlation-based version of the method attains a better rate in the operator norm. We also derive a fast iterative algorithm for computing the estimator, which relies on the popular Cholesky decomposition of the inverse yet produces a permutation-invariant estimator. The method is compared with other estimators on simulated data and on a real example of classifying tumor tissue from gene expression data.
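The penalized-likelihood idea described above can be sketched with the closely related graphical lasso, which minimizes \(\operatorname{tr}(S\Omega) - \log\det\Omega + \lambda\|\Omega\|_1\) over positive-definite \(\Omega\). The sketch below uses scikit-learn's `GraphicalLasso` as a stand-in; this is an assumption of convenience, not the authors' Cholesky-based algorithm, and the choice of `alpha` here is illustrative only.

```python
# Minimal sketch: lasso-penalized estimation of a sparse concentration
# matrix via scikit-learn's GraphicalLasso (not the paper's algorithm).
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)

# Simulate n samples from a p-variate normal whose true concentration
# matrix is sparse (tridiagonal, diagonally dominant, hence positive definite).
p, n = 10, 200
omega = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
sigma = np.linalg.inv(omega)
X = rng.multivariate_normal(np.zeros(p), sigma, size=n)

# Fit: larger alpha drives more off-diagonal entries of the estimate to zero.
model = GraphicalLasso(alpha=0.1).fit(X)
omega_hat = model.precision_  # estimated inverse covariance

print(omega_hat.shape)  # (10, 10)
```

Because the penalty acts symmetrically on all off-diagonal entries, relabeling the variables merely permutes the rows and columns of the estimate, which is the permutation invariance the summary refers to.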

MSC:
62H20 Measures of association (correlation, canonical correlation, etc.)
62H12 Estimation in multivariate analysis
Software:
glasso; pcalg
References:
[1] Alon, U., Barkai, N., Notterman, D. A., Gish, K., Ybarra, S., Mack, D., and Levine, A. J. (1999). Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Natl. Acad. Sci. USA, 96(12):6745-6750.
[2] Bazaraa, M. S., Sherali, H. D., and Shetty, C. M. (2006). Nonlinear Programming: Theory and Algorithms. Wiley, New Jersey, 3rd edition. · Zbl 1140.90040
[3] Bickel, P. J. and Levina, E. (2004). Some theory for Fisher’s linear discriminant function, “naive Bayes”, and some alternatives when there are many more variables than observations. Bernoulli, 10(6):989-1010. · Zbl 1064.62073 · doi:10.3150/bj/1106314847
[4] Bickel, P. J. and Levina, E. (2007). Covariance regularization by thresholding. Ann. Statist. · Zbl 1196.62062 · doi:10.1214/08-AOS600
[5] Bickel, P. J. and Levina, E. (2008). Regularized estimation of large covariance matrices. Ann. Statist., 36(1):199-227. · Zbl 1132.62040 · doi:10.1214/009053607000000758 · euclid:aos/1201877299
[6] Chaudhuri, S., Drton, M., and Richardson, T. S. (2007). Estimation of a covariance matrix with zeros. Biometrika, 94(1):199-216. · Zbl 1143.62032 · doi:10.1093/biomet/asm007
[7] d’Aspremont, A., Banerjee, O., and El Ghaoui, L. (2008). First-order methods for sparse covariance selection. SIAM Journal on Matrix Analysis and Applications, 30(1):56-66. · Zbl 1156.90423 · doi:10.1137/060670985
[8] Dey, D. K. and Srinivasan, C. (1985). Estimation of a covariance matrix under Stein’s loss. Ann. Statist., 13(4):1581-1591. · Zbl 0582.62042 · doi:10.1214/aos/1176349756
[9] Drton, M. and Perlman, M. D. (2008). A SINful approach to Gaussian graphical model selection. J. Statist. Plann. Inference, 138(4):1179-1200. · Zbl 1130.62068 · doi:10.1016/j.jspi.2007.05.035
[10] El Karoui, N. (2007). Operator norm consistent estimation of large dimensional sparse covariance matrices. Ann. Statist. · Zbl 1196.62064 · doi:10.1214/07-AOS559
[11] Fan, J., Fan, Y., and Lv, J. (2008). High dimensional covariance matrix estimation using a factor model. Journal of Econometrics. · Zbl 1429.62185 · doi:10.1016/j.jeconom.2008.09.017
[12] Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc., 96(456):1348-1360. · Zbl 1073.62547 · doi:10.1198/016214501753382273
[13] Friedman, J., Hastie, T., and Tibshirani, R. (2007). Pathwise coordinate optimization. Annals of Applied Statistics, 1(2):302-332. · Zbl 1378.90064 · doi:10.1214/07-AOAS131
[14] Friedman, J., Hastie, T., and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics. Pre-published online, DOI 10.1093/biostatistics/kxm045. · Zbl 1143.62076
[15] Fu, W. (1998). Penalized regressions: the bridge versus the lasso. Journal of Computational and Graphical Statistics, 7(3):397-416.
[16] Furrer, R. and Bengtsson, T. (2007). Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants. Journal of Multivariate Analysis, 98(2):227-255. · Zbl 1105.62091 · doi:10.1016/j.jmva.2006.08.003
[17] Golub, G. H. and Van Loan, C. F. (1989). Matrix Computations. The Johns Hopkins University Press, Baltimore, Maryland, 2nd edition. · Zbl 0733.65016
[18] Haff, L. R. (1980). Empirical Bayes estimation of the multivariate normal covariance matrix. Ann. Statist., 8(3):586-597. · Zbl 0441.62045 · doi:10.1214/aos/1176345010
[19] Huang, J., Liu, N., Pourahmadi, M., and Liu, L. (2006). Covariance matrix selection and estimation via penalised normal likelihood. Biometrika, 93(1):85-98. · Zbl 1152.62346 · doi:10.1093/biomet/93.1.85
[20] Hunter, D. R. and Li, R. (2005). Variable selection using MM algorithms. Ann. Statist., 33(4):1617-1642. · Zbl 1078.62028 · doi:10.1214/009053605000000200
[21] Johnstone, I. M. (2001). On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist., 29(2):295-327. · Zbl 1016.62078 · doi:10.1214/aos/1009210544
[22] Johnstone, I. M. and Lu, A. Y. (2004). Sparse principal components analysis. Unpublished manuscript.
[23] Kalisch, M. and Bühlmann, P. (2007). Estimating high-dimensional directed acyclic graphs with the PC-algorithm. J. Mach. Learn. Res., 8:613-636. · Zbl 1222.68229 · www.jmlr.org
[24] Lam, C. and Fan, J. (2007). Sparsistency and rates of convergence in large covariance matrices estimation. Manuscript. · Zbl 1191.62101
[25] Ledoit, O. and Wolf, M. (2003). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88:365-411. · Zbl 1032.62050 · doi:10.1016/S0047-259X(03)00096-4
[26] Levina, E., Rothman, A. J., and Zhu, J. (2008). Sparse estimation of large covariance matrices via a nested Lasso penalty. Annals of Applied Statistics, 2(1):245-263. · Zbl 1137.62338 · doi:10.1214/07-AOAS139
[27] Lin, S. P. and Perlman, M. D. (1985). A Monte Carlo comparison of four estimators for a covariance matrix. In Krishnaiah, P. R., editor, Multivariate Analysis, volume 6, pages 411-429. Elsevier Science Publishers. · Zbl 0593.62051
[28] Mardia, K. V., Kent, J. T., and Bibby, J. M. (1979). Multivariate Analysis. Academic Press, New York. · Zbl 0432.62029
[29] Meinshausen, N. and Bühlmann, P. (2006). High dimensional graphs and variable selection with the Lasso. Ann. Statist., 34(3):1436-1462. · Zbl 1113.62082 · doi:10.1214/009053606000000281
[30] Paul, D. (2007). Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Stat. Sinica, 17(4):1617-1642. · Zbl 1134.62029
[31] Saulis, L. and Statulevičius, V. A. (1991). Limit Theorems for Large Deviations. Kluwer Academic Publishers, Dordrecht. · Zbl 0744.60028
[32] Smith, M. and Kohn, R. (2002). Parsimonious covariance matrix estimation for longitudinal data. J. Amer. Statist. Assoc., 97(460):1141-1153. · Zbl 1041.62044 · doi:10.1198/016214502388618942
[33] Wang, L., Zhu, J., and Zou, H. (2007). Hybrid huberized support vector machines for microarray classification. In ICML ’07: Proceedings of the 24th International Conference on Machine Learning, pages 983-990, New York, NY, USA. ACM Press.
[34] Wong, F., Carter, C., and Kohn, R. (2003). Efficient estimation of covariance selection models. Biometrika, 90:809-830. · Zbl 1436.62346 · doi:10.1093/biomet/90.4.809
[35] Wu, W. B. and Pourahmadi, M. (2003). Nonparametric estimation of large covariance matrices of longitudinal data. Biometrika, 90:831-844. · Zbl 1436.62347 · doi:10.1093/biomet/90.4.831
[36] Yuan, M. and Lin, Y. (2007). Model selection and estimation in the Gaussian graphical model. Biometrika, 94(1):19-35. · Zbl 1142.62408 · doi:10.1093/biomet/asm018
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.