×

Maximum likelihood estimation of a multi-dimensional log-concave density. With discussion and authors’ reply. (English) Zbl 1411.62055

Summary: Let \(X_1,\ldots,X_n\) be independent and identically distributed random vectors with a (Lebesgue) density \(f\). We first prove that, with probability 1, there is a unique log-concave maximum likelihood estimator \(\hat{f}_n\) of \(f\). The use of this estimator is attractive because, unlike kernel density estimation, the method is fully automatic, with no smoothing parameters to choose. Although the existence proof is non-constructive, we can reformulate the issue of computing \(\hat{f}_n\) in terms of a non-differentiable convex optimization problem, and thus combine techniques of computational geometry with Shor’s \(r\)-algorithm to produce a sequence that converges to \(\hat{f}_n\). An R version of the algorithm is available in the package LogConcDEAD – log-concave density estimation in arbitrary dimensions. We demonstrate that the estimator has attractive theoretical properties both when the true density is log-concave and when this model is misspecified. For the moderate or large sample sizes in our simulations, \(\hat{f}_n\) is shown to have smaller mean integrated squared error compared with kernel-based methods, even when we allow the use of a theoretical, optimal fixed bandwidth for the kernel estimator that would not be available in practice. We also present a real data clustering example, which shows that our methodology can be used in conjunction with the expectation-maximization algorithm to fit finite mixtures of log-concave densities.

MSC:

62F10 Point estimation
62G07 Density estimation
90C25 Convex programming
62-02 Research exposition (monographs, survey articles) pertaining to statistics
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Abramson, I. ( 1982) On variable bandwidth in kernel estimates—a square root law. Ann. Statist., 10, 1217– 1223. · Zbl 0507.62040
[2] An, M. Y. ( 1998) Logconcavity versus logconvexity: a complete characterization. J. Econ. Theor., 80, 350– 369. · Zbl 0911.90071
[3] Asuncion, A. and Newman, D. J. ( 2007) UCI Machine Learning Repository. University of California, Irvine. (Available from http://www.ics.uci.edu/ mlearn/MLRepository.html.)
[4] Bagnoli, M. and Bergstrom, T. ( 2005) Log‐concave probability and its applications. Econ. Theor., 26, 445– 469. · Zbl 1077.60012
[5] Balabdaoui, F., Rufibach, K. and Wellner, J. A. ( 2009) Limit distribution theory for maximum likelihood estimation of a log‐concave density. Ann. Statist., 37, 1299– 1331. · Zbl 1160.62008
[6] Barber, C. B., Dobkin, D. P. and Huhdanpaa, H. ( 1996) The Quickhull algorithm for convex hulls. ACM Trans. Math. Softwr., 22, 469– 483. · Zbl 0884.65145
[7] Barndorff‐Nielsen, O. ( 1978) Information and Exponential Families in Statistical Theory. New York: Wiley. · Zbl 0387.62011
[8] Boyd, S. and Vandenberghe, L. ( 2004) Convex Optimization. Cambridge: Cambridge University Press. · Zbl 1058.90049
[9] Bozdogan, H. ( 1994) Choosing the number of clusters, subset selection of variables, and outlier detection on the standard mixture‐model cluster analysis. In New Approaches in Classification and Data Analysis (eds E. Diday, Y. Lechevallier, M. Schader, P. Bertrand and B. Burtschy), pp. 169– 177. New York: Springer.
[10] Breiman, L., Meisel, W. and Purcell, E. ( 1977) Variable kernel estimates of multivariate densities. Technometrics, 19, 135– 144. · Zbl 0379.62023
[11] Brooks, S. P. ( 1998) MCMC convergence diagnosis via multivariate bounds on log‐concave densities. Ann. Statist., 26, 398– 433. · Zbl 0961.65002
[12] Caplin, A. and Nalebuff, B. ( 1991a) Aggregation and social choice: a mean voter theorem. Econometrica, 59, 1– 23. · Zbl 0743.90006
[13] Caplin, A. and Nalebuff, B. ( 1991b) Aggregation and imperfect competition: on the existence of equilibrium. Econometrica, 59, 25– 59. · Zbl 0738.90012
[14] Chacón, J. E. ( 2009) Data‐driven choice of the smoothing parametrization for kernel density estimators. Can. J. Statist., 34, 249– 265.
[15] Chacón, J. E., Duong, T. and Wand, M. P. ( 2010) Asymptotics for general multivariate kernel density derivative estimators. Statist. Sin.,to be published.
[16] Chang, G. and Walther, G. ( 2007) Clustering with mixtures of log‐concave distributions. Computnl Statist. Data Anal., 51, 6242– 6251. · Zbl 1445.62141
[17] Chiu, S.‐T. ( 1992) An automatic bandwidth selector for kernel density estimation. Biometrika, 79, 771– 782. · Zbl 0764.62034
[18] Cule, M. L. ( 2009) Maximum likelihood estimation of a multivariate log‐concave density. PhD Thesis. University of Cambridge, Cambridge.
[19] Cule, M. L. and Dümbgen, L. ( 2008) On an auxiliary function for log‐density estimation. Technical Report 71. Universität Bern, Bern.
[20] Cule, M. L., Gramacy, R. B. and Samworth, R. J. ( 2007) LogConcDEAD: Maximum Likelihood Estimation of a Log‐Concave Density. Statistical Laboratory, Cambridge. (Available from http://CRAN.R‐project.org/package=LogConcDEAD.)
[21] Cule, M. L., Gramacy, R. B. and Samworth, R. J. ( 2009) LogConcDEAD: an R package for maximum likelihood estimation of a multivariate log‐concave density. J. Statist. Softwr., 29, issue 2.
[22] Cule, M. L. and Samworth, R. J. ( 2010), Theoretical properties of the log‐concave maximum likelihood estimator of a multidimensional density. Electron. J. Statist., 4, 254– 270. · Zbl 1329.62183
[23] Cule, M. L., Samworth, R. J. and Stewart, M. I. ( 2010) Maximum likelihood estimation of a multidimensional log‐concave density (long version). Statistical Laboratory, Cambridge. (Available from http://www.statslab.cam.ac.uk/ rjs57/Research.html.) · Zbl 1329.62183
[24] Ćwik, J. and Koronacki, J. ( 1997) Multivariate density estimation: a comparative study. Neur. Computn Appl., 6, 173– 185.
[25] Deheuvels, P. ( 1977) Estimation non parametrique de la densité par histogrammes generalisés II. Publ. Inst. Statist. Univ. Paris, 22, 1– 23. · Zbl 0375.62038
[26] Dempster, A. P., Laird, N. M. and Rubin, D. B. ( 1977) Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Statist. Soc. B, 39, 1– 38. · Zbl 0364.62022
[27] Devroye, L., Györfi, L. and Lugosi, G. ( 1996) A Probabilistic Theory of Pattern Recognition. New York: Springer.
[28] Donoho, D. L., Johnstone, I. M., Kerkyacharian, G. and Picard, D. ( 1996) Density estimation by wavelet thresholding. Ann. Statist., 24, 508– 539. · Zbl 0860.62032
[29] Dümbgen, L., Hüsler, A. and Rufibach, K. ( 2007) Active set and EM algorithms for log‐concave densities based on complete and censored data. Technical Report 61. Universität Bern, Bern. (Available from http://arxiv.org/abs/0707.4643/.)
[30] Dümbgen, L. and Rufibach, K. ( 2009) Maximum likelihood estimation of a log‐concave density and its distribution function: basic properties and uniform consistency. Bernoulli, 15, 40– 68. · Zbl 1200.62030
[31] Dümbgen, L., Samworth, R. J. and Schuhmacher, D. ( 2010) Approximation by log‐concave distributions with applications to regression. Technical Report 75. Universität Bern, Bern. (Available from http://arxiv.org/abs/1002.3448/.)
[32] Duong, T. ( 2004) Bandwidth selectors for multivariate kernel density estimation. PhD Thesis. University of Western Australia, Perth. · Zbl 1060.62042
[33] Duong, T. and Hazelton, M. L. ( 2003) Plug‐in bandwidth matrices for bivariate kernel density estimation. J. Nonparam. Statist., 15, 17– 30. · Zbl 1019.62032
[34] Duong, T. and Hazelton, M. L. ( 2005) Convergence rates for unconstrained bandwidth matrix selectors in multivariate kernel density estimation. J. Multiv. Anal., 93, 417– 433. · Zbl 1066.62059
[35] Eggermont, P. P. B. and LaRiccia, V. ( 2001) Maximum Penalized Likelihood Estimation, vol. 1, Density Estimation. New York: Springer.
[36] Eubank, R. L. ( 1988) Spline Smoothing and Nonparametric Regression. New York: Dekker. · Zbl 0702.62036
[37] Fix, E. and Hodges, J. L. ( 1951) Discriminatory analysis—nonparametric discrimination: consistency properties. Technical Report 4, project 21‐29‐004. US Air Force School of Aviation Medicine, Randolph Field. · Zbl 0715.62080
[38] Fix, E. and Hodges, J. L. ( 1989) Discriminatory analysis—nonparametric discrimination: consistency properties. Int. Statist. Rev., 57, 238– 247. · Zbl 0715.62080
[39] Fraley, C. F. and Raftery, A. E. ( 2002) Model‐based clustering, discriminant analysis, and density estimation. J. Am. Statist. Ass., 97, 611– 631. · Zbl 1073.62545
[40] Gordon, A. D. ( 1981) Classification. London: Chapman and Hall.
[41] Grenander, U. ( 1956) On the theory of mortality measurement II. Skand. Akt., 39, 125– 153. · Zbl 0077.33715
[42] Groeneboom, P., Jongbloed, G. and Wellner, J. A. ( 2001) Estimation of a convex function: characterizations and asymptotic theory. Ann. Statist., 29, 1653– 1698. · Zbl 1043.62027
[43] Groeneboom, P., Jongbloed, G. and Wellner, J. A. ( 2008) The support reduction algorithm for computing nonparametric function estimates in mixture models. Scand. J. Statist., 35, 385– 399. · Zbl 1199.65017
[44] Groeneboom, P. and Wellner, J. A. ( 1992) Information Bounds and Nonparametric Maximum Likelihood Estimation. Basel: Birkhäuser. · Zbl 0757.62017
[45] Hall, P., Marron, J. S. and Park, B. U. ( 1992) Smoothed cross‐validation. Probab. Theor. Reltd Flds, 92, 1– 20.
[46] Hall, P., Park, B. U. and Samworth, R. J. ( 2008) Choice of neighbour order in nearest‐neighbour classification. Ann. Statist., 36, 2135– 2152. · Zbl 1274.62421
[47] Hand, D. J. ( 1981) Discrimination and Classification. New York: Wiley. · Zbl 0587.62119
[48] Hyndman, R. J. ( 1996) Computing and graphing highest density regions. Am. Statistn, 50, 120– 126.
[49] Ibragimov, A. I. ( 1956) On the composition of unimodal distributions. Theor. Probab. Appl., 1, 255– 260.
[50] Jongbloed, G. ( 1998) The iterative convex minorant algorithm for nonparametric estimation. J. Computnl Graph. Statist., 7, 310– 321.
[51] Kappel, F. and Kuntsevich, A. ( 2000) An implementation of Shor’s r‐algorithm. Computnl Optimizn Appl., 15, 193– 205. · Zbl 0947.90112
[52] Koenker, R. and Mizera, I. ( 2010) Quasi‐concave density estimation. Ann. Statist., to be published. · Zbl 1200.62031
[53] Lee, C. W. ( 2004) Subdivisions and triangulations of polytopes. In Handbook of Discrete and Computational Geometry (eds J. E. Goodman and J. O’Rourke), 2nd edn, pp. 383– 406. New York: CRC Press.
[54] McLachlan, G. J. and Basford, K. E. ( 1988) Mixture Models: Inference and Applications to Clustering. New York: Dekker. · Zbl 0697.62050
[55] McLachlan, G. J. and Krishnan, T. ( 1997) The EM Algorithm and Extensions. New York: Wiley. · Zbl 0882.62012
[56] Mengersen, K. L. and Tweedie, R. L. ( 1996) Rates of convergence of the Hastings and Metropolis algorithms. Ann. Statist., 24, 101– 121. · Zbl 0854.60065
[57] Pal, J. K., Woodroofe, M. and Meyer, M. ( 2007) Estimating a Polya frequency function. In Complex Datasets and Inverse Problems: Tomography, Networks and Beyond, pp. 239– 249. Ohio: Institute of Mathematical Statistics.
[58] Parzen, E. ( 1962) On the estimation of a probability density function and the mode. Ann. Math. Statist., 33, 1065– 1076. · Zbl 0116.11302
[59] Prékopa, A. ( 1973) On logarithmically concave measures and functions. Acta Sci. Math., 34, 335– 343.
[60] R Development Core Team ( 2009) R: a Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing.
[61] Rockafellar, R. T. ( 1997) Convex Analysis. Princeton: Princeton University Press. · Zbl 0932.90001
[62] Rosenblatt, M. ( 1956) Remarks on some nonparametric estimates of a density function. Ann. Math. Statist., 27, 832– 837. · Zbl 0073.14602
[63] Rufibach, K. ( 2007) Computing maximum likelihood estimators of a log‐concave density function. J. Statist. Computn Simuln, 77, 561– 574. · Zbl 1146.62027
[64] Rufibach, K. and Dümbgen, L. ( 2006) logcondens: estimate a log‐concave probability density from i.i.d. observations. Universität Bern, Bern. (Available from http://CRAN.R‐project.org/package=logcondens.)
[65] Sain, S. R. ( 2002) Multivariate locally adaptive density estimation. Computnl Statist. Data Anal., 39, 165– 186. · Zbl 1132.62329
[66] Sain, S. R. and Scott, D. W. ( 1996) On locally adaptive density estimation. J. Am. Statist. Ass., 91, 1525– 1534. · Zbl 0882.62035
[67] Schuhmacher, D. and Dümbgen, L. ( 2010) Consistency of multivariate log‐concave density estimators. Statist. Probab. Lett., 80, 376– 380. · Zbl 1181.62048
[68] Schuhmacher, D., Hüsler, A. and Dümbgen, L. ( 2009) Multivariate log‐concave distributions as a nearly parametric model. Technical Report 74. Universität Bern, Bern. (Available from http://arxiv.org/pdf/0907.0250v2.)
[69] Scott, D. W. and Sain, S. R. ( 2004) Multi‐dimensional density estimation. In Handbook of Statistics (eds C. R. Rao and E. J. Wegman), vol. 23, Data Mining and Computational Statistics. Amsterdam: Elsevier.
[70] Seregin, A. and Wellner, J. A. ( 2010) Nonparametric estimation of convex‐transformed densitiesAnn. Statist., to be published. · Zbl 1204.62058
[71] Shor, N. Z. ( 1985) Minimization Methods for Non‐differentiable Functions. Berlin: Springer. · Zbl 0561.90058
[72] Street, W. M., Wolberg, W. H. and Mangasarian, O. L. ( 1993) Nuclear feature extraction for breast tumor diagnosis. In Proc. Int. Symp. Electronic Imaging: Science and Technology, San Jose, pp. 861– 870.
[73] Swales, J. D. (ed.) ( 1985) Platt vs. Pickering: an Episode in Recent Medical History. Cambridge: Keynes.
[74] Titterington, D. M., Smith, A. F. M. and Makov, U. E. ( 1985) Statistical Analysis of Finite Mixture Distributions. Chichester: Wiley. · Zbl 0646.62013
[75] Vapnik, V. N. and Mukherjee, S. ( 2000) Support vector method for multivariate density estimation. In Advances in Neural Information Processing Systems, pp. 659– 665. Cambridge: MIT Press.
[76] Wahba, G. ( 1990) Spline Models for Observational Data. Philadelphia: Society for Industrial and Applied Mathematics. · Zbl 0813.62001
[77] Walther, G. ( 2002) Detecting the presence of mixing with multiscale maximum likelihood. J. Am. Statist. Ass., 97, 508– 513. · Zbl 1073.62533
[78] Walther, G. ( 2009) Inference and modeling with log‐concave distributions. Statist. Sci., 24, 319– 327. · Zbl 1329.62192
[79] Wand, M. P. and Jones, M. C. ( 1995) Kernel Smoothing. Boca Raton: Chapman and Hall–CRC Press. · Zbl 0854.62043
[80] Zhang, X., King, M. L. and Hyndman, R. J. ( 2006) Bandwidth selection for multivariate kernel density estimation using MCMC. Computnl Statist. Data Anal., 50, 3009– 3031. · Zbl 1445.62077
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.