zbMATH — the first resource for mathematics

Equivalence of distance-based and RKHS-based statistics in hypothesis testing. (English) Zbl 1281.62117
Summary: We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, maximum mean discrepancies (MMD), that is, distances between embeddings of distributions into reproducing kernel Hilbert spaces (RKHS), as established in machine learning. In the case where the energy distance is computed with a semimetric of negative type, a positive definite kernel, termed a distance kernel, may be defined such that the MMD corresponds exactly to the energy distance. Conversely, for any positive definite kernel, we can interpret the MMD as an energy distance with respect to some negative-type semimetric. This equivalence readily extends to distance covariance using kernels on the product space.
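The correspondence described above can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes the semimetric rho(z, z') = ||z - z'||^q on Euclidean space (of negative type for 0 < q <= 2) and the distance kernel k(z, z') = (rho(z, z0) + rho(z', z0) - rho(z, z')) / 2 centred at an arbitrary point z0. With plug-in (V-statistic) estimates, the energy distance should equal twice the squared MMD, and the squared distance covariance should equal four times the biased HSIC estimate. All function names are illustrative.

```python
import numpy as np

def rho(a, b, q=1.0):
    """Semimetric rho(z, z') = ||z - z'||^q between rows of a and rows of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)) ** q

def distance_kernel(a, b, z0, q=1.0):
    """Distance kernel induced by rho and centred at the point z0."""
    to_centre_a = rho(a, z0[None, :], q)        # shape (n, 1)
    to_centre_b = rho(b, z0[None, :], q).T      # shape (1, m)
    return 0.5 * (to_centre_a + to_centre_b - rho(a, b, q))

def energy_distance(x, y, q=1.0):
    """Plug-in estimate of 2 E rho(X, Y) - E rho(X, X') - E rho(Y, Y')."""
    return 2.0 * rho(x, y, q).mean() - rho(x, x, q).mean() - rho(y, y, q).mean()

def mmd_squared(x, y, z0, q=1.0):
    """Plug-in estimate of the squared MMD with the distance kernel."""
    return (distance_kernel(x, x, z0, q).mean()
            + distance_kernel(y, y, z0, q).mean()
            - 2.0 * distance_kernel(x, y, z0, q).mean())

def centre(m):
    """Double-centre a square matrix: subtract row and column means, add grand mean."""
    return (m - m.mean(axis=0, keepdims=True)
              - m.mean(axis=1, keepdims=True) + m.mean())

def distance_covariance_sq(x, y, q=1.0):
    """Plug-in squared distance covariance of paired samples x and y."""
    return (centre(rho(x, x, q)) * centre(rho(y, y, q))).mean()

def hsic(x, y, x0, y0, q=1.0):
    """Biased HSIC estimate using distance kernels on each domain."""
    k = distance_kernel(x, x, x0, q)
    l = distance_kernel(y, y, y0, q)
    return (centre(k) * centre(l)).mean()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 3))
    y = rng.normal(loc=0.5, size=(150, 3))
    z0 = rng.normal(size=3)  # arbitrary centre; the MMD does not depend on it
    print(energy_distance(x, y), 2.0 * mmd_squared(x, y, z0))

    w = x[:, :2] + 0.1 * rng.normal(size=(200, 2))  # paired with x, dependent
    print(distance_covariance_sq(x, w), 4.0 * hsic(x, w, z0, np.zeros(2)))
```

Each printed pair should agree up to floating-point error, whatever centre point is chosen, because the terms involving z0 cancel in both statistics.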
We determine the class of probability distributions for which the test statistics are consistent against all alternatives. Finally, we investigate the performance of the family of distance kernels in two-sample and independence tests: we show in particular that the energy distance most commonly employed in statistics is just one member of a parametric family of kernels, and that other choices from this family can yield more powerful tests.
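To illustrate how the exponent of the distance indexes a family of two-sample tests, here is a minimal sketch under stated assumptions. It uses the semimetric ||z - z'||^q with 0 < q < 2, where q = 1 gives the usual energy distance, and it calibrates the statistic with a generic permutation test. The permutation scheme is chosen for simplicity and is not claimed to be the procedure used in the paper. The function names and default parameters are hypothetical.

```python
import numpy as np

def rho(a, b, q):
    """Semimetric rho(z, z') = ||z - z'||^q between rows of a and rows of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)) ** q

def energy_stat(d, n):
    """Energy distance from a pooled distance matrix; the first n rows are sample 1."""
    return 2.0 * d[:n, n:].mean() - d[:n, :n].mean() - d[n:, n:].mean()

def permutation_test(x, y, q=1.0, n_perm=200, seed=0):
    """Return the observed statistic and a permutation p-value."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    d = rho(pooled, pooled, q)       # computed once, reindexed per permutation
    n = len(x)
    observed = energy_stat(d, n)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        if energy_stat(d[np.ix_(idx, idx)], n) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.normal(size=(60, 2))
    y = rng.normal(loc=0.4, size=(60, 2))
    for q in (0.5, 1.0, 1.5):
        stat, p = permutation_test(x, y, q=q)
        print(f"q={q}: statistic={stat:.4f}, p-value={p:.3f}")
```

Which exponent gives the most powerful test depends on the alternative, so a comparison like this would need to be repeated over many simulated data sets before drawing conclusions.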

MSC:
62G10 Nonparametric hypothesis testing
46N30 Applications of functional analysis in probability theory and statistics
68T05 Learning and adaptive systems in artificial intelligence
68Q32 Computational learning theory