Jackknife bias correction of the AIC for selecting variables in canonical correlation analysis under model misspecification. (English) Zbl 1288.62097

Summary: We consider a bias correction of Akaike’s information criterion (AIC) for selecting variables in canonical correlation analysis when the goodness of fit of the model is assessed by a risk function consisting of the expected Kullback-Leibler loss function under a normality assumption. Although the bias of the AIC as an estimator of the risk function is \(O(n^{-1})\) when the model is correctly specified, its order becomes \(O(1)\) when the model is misspecified, where \(n\) is the sample size.
By using the leave-two-out jackknife method with a constant adjustment, we propose a new criterion that reduces the AIC’s bias to \(O(n^{-2})\) even when the model is misspecified, and that is an exactly unbiased estimator of the risk function when the data are generated from the normal distribution. Additionally, by applying basic theorems of linear algebra to our problem, e.g., the formula for the inverse of a sum of matrices and a simple property of the inverse matrix, we obtain strict conditions that guarantee the validity of the bias correction, as well as another expression of the proposed criterion that contains no jackknife estimators and therefore greatly reduces computational time.
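
The summary credits the formula for the inverse of a sum of matrices with removing the need for explicit jackknife estimators. The following is a minimal sketch of that general idea only, not the paper's criterion: it assumes an uncentered scatter matrix \(S = X^\top X\) (the paper's estimators may be centered and scaled differently) and uses the Woodbury identity to obtain the inverse after deleting two observations without refitting. The function name `leave_two_out_inverse` and the simulated data are hypothetical.

```python
import numpy as np


def leave_two_out_inverse(S_inv, x_i, x_j):
    """Inverse of S - x_i x_i^T - x_j x_j^T, given S^{-1} (assumed symmetric).

    Woodbury identity with U = [x_i, x_j]:
        (S - U U^T)^{-1} = S^{-1} + S^{-1} U (I_2 - U^T S^{-1} U)^{-1} U^T S^{-1}
    Only a 2 x 2 system is solved, instead of inverting a p x p matrix again.
    """
    U = np.column_stack([x_i, x_j])      # p x 2 matrix of the deleted rows
    SU = S_inv @ U                       # p x 2
    core = np.eye(2) - U.T @ SU          # 2 x 2, must be nonsingular
    return S_inv + SU @ np.linalg.solve(core, SU.T)


# Illustrative data (an assumption, not taken from the paper).
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 4))         # n = 30 observations, p = 4 variables
S = X.T @ X                              # uncentered scatter matrix
S_inv = np.linalg.inv(S)

i, j = 3, 17                             # the two observations left out
fast = leave_two_out_inverse(S_inv, X[i], X[j])

# Direct recomputation for comparison.
X_del = np.delete(X, [i, j], axis=0)
direct = np.linalg.inv(X_del.T @ X_del)

max_abs_diff = np.max(np.abs(fast - direct))
```

A leave-two-out scheme visits on the order of \(n^2\) pairs, so replacing each \(p \times p\) inversion with a \(2 \times 2\) solve is where savings of this kind would come from. The condition that the \(2 \times 2\) matrix be nonsingular is only an analogue, in this simplified setting, of the validity conditions the summary mentions.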

MSC:

62H20 Measures of association (correlation, canonical correlation, etc.)
62F40 Bootstrap, jackknife and other resampling methods
15A09 Theory of matrix inversion and generalized inverses
