Estimating large-scale general linear and seemingly unrelated regressions models after deleting observations. (English) Zbl 1505.62174

Summary: A new numerical method is investigated for the downdating problem and its variants, namely removing the effect of some observations from the generalized least squares (GLS) estimator of the general linear model (GLM) after the model has been estimated. It is verified that the solution of the downdated least squares problem can be obtained by estimating an equivalent GLM in which the original model is updated with the imaginary deleted observations. This updated GLM has a non-positive-definite dispersion matrix that contains complex covariance values, and it is proved to yield the same normal equations as the downdated model. The problem of deleting observations from the seemingly unrelated regressions model is also addressed, demonstrating that the method applies directly to other multivariate linear models. The algorithms implementing the downdating method make efficient use of the computations already performed when estimating the original model, which significantly reduces the computational cost and makes the method attractive for computationally intensive problems. The downdating algorithms have been applied to real and synthetic data to illustrate their efficiency.
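
The equivalence stated in the summary can be illustrated in the simplest special case. The following is only a minimal sketch, assuming an identity dispersion matrix (ordinary least squares) rather than the general GLS setting of the paper; it is not the authors' algorithm. The function name `downdate_via_imaginary_rows` and the test data are hypothetical. Appending the deleted rows multiplied by the imaginary unit, and forming the normal equations with the plain (non-conjugate) transpose, subtracts their contribution, so the result should agree with refitting on the retained rows.

```python
import numpy as np

def downdate_via_imaginary_rows(X, y, drop):
    """OLS estimate without rows `drop`, computed by *adding* i * (those rows).

    Illustrative assumption: identity dispersion matrix (plain OLS).
    """
    D, d = X[drop], y[drop]
    # Augment the model with the deleted observations scaled by the imaginary unit.
    Xa = np.vstack([X, 1j * D])
    ya = np.concatenate([y, 1j * d])
    # Plain transpose (no conjugation): (iD)^T (iD) = -D^T D, so
    # Xa^T Xa = X^T X - D^T D and Xa^T ya = X^T y - D^T d.
    A = Xa.T @ Xa
    b = Xa.T @ ya
    return np.linalg.solve(A, b).real

# Synthetic data (hypothetical) for a quick check.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(50)
drop = [3, 17, 42]

beta_imag = downdate_via_imaginary_rows(X, y, drop)

# Reference: refit from scratch on the retained observations.
keep = np.setdiff1d(np.arange(X.shape[0]), drop)
beta_direct = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]

print(np.allclose(beta_imag, beta_direct))
```

Forming the normal equations explicitly is done here only to make the identity visible; a numerically careful implementation would work with orthogonal factorizations of the kind provided by LAPACK.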

MSC:

62-08 Computational methods for problems pertaining to statistics
62J05 Linear regression; mixed models
62J20 Diagnostics, and linear inference and regression
62P20 Applications of statistics to economics

Software:

LAPACK; ScaLAPACK

References:

[1] Aitken, A.C.: On least squares and linear combination of observations. Proc. R. Soc. Edinb. 55, 42-48 (1934) · Zbl 0011.26603 · doi:10.1017/S0370164600014346
[2] Anderson, E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A., Sorensen, D.: LAPACK Users’ Guide, 3rd edn. Society for Industrial and Applied Mathematics, Philadelphia (1999) · Zbl 0934.65030 · doi:10.1137/1.9780898719604
[3] Bai, J., Shi, S.: Estimating high dimensional covariance matrices and its applications. Ann. Econ. Financ. 12(2), 199-215 (2011)
[4] Belsley, D.A., Kuh, A.E., Welsch, R.E.: Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. Wiley, New York (2004) · Zbl 0479.62056
[5] Bhimasankaram, P., Sengupta, D., Ramanathan, S.: Recursive inference in a general linear model. Sankhya: The Indian Journal of Statistics 57(A), 227-255 (1995) · Zbl 0857.62067
[6] Björck, Å.: Numerical methods for least squares problems. SIAM, Philadelphia (1996) · Zbl 0847.65023 · doi:10.1137/1.9781611971484
[7] Blackford, L.S., Choi, J., Cleary, A., D’Azevedo, E., Demmel, J., Dhillon, I., Dongarra, J., Hammarling, S., Henry, G., Petitet, A., Stanley, K., Walker, D., Whaley, R.C.: ScaLAPACK Users’ Guide. Society for Industrial and Applied Mathematics, Philadelphia (1997) · Zbl 0886.65022 · doi:10.1137/1.9780898719642
[8] Chambers, J.M.: Regression updating. J. Am. Stat. Assoc. 66(336), 744-748 (1971) · doi:10.1080/01621459.1971.10482338
[9] Chavas, J.-P.: Recursive estimation of simultaneous equation models. J. Econ. 18(2), 207-217 (1982) · Zbl 0489.62099 · doi:10.1016/0304-4076(82)90036-7
[10] Chib, S., Greenberg, E.: Hierarchical analysis of SUR models with extensions to correlated serial errors and time-varying parameter models. J. Econ. 68(2), 339-360 (1995) · Zbl 0833.62103 · doi:10.1016/0304-4076(94)01653-H
[11] Christensen, R., Pearson, L.M., Johnson, W.: Case-deletion diagnostics for mixed models. Technometrics 34(1), 38-45 (1992) · Zbl 0761.62098 · doi:10.2307/1269550
[12] Clark, T.E., McCracken, M.W.: Improving forecast accuracy by combining recursive and rolling forecasts. Int. Econ. Rev. 50(2), 363-395 (2009) · doi:10.1111/j.1468-2354.2009.00533.x
[13] Cook, R.D.: Detection of influential observation in linear regression. Technometrics 19(1), 15-18 (1977) · Zbl 0371.62096
[14] Deaton, A.: Demand analysis. In: Griliches, Z., Intriligator, M.D. (eds.), pp. 1767-1839. New York (1986) · Zbl 0603.62119
[15] Elden, L., Park, H.: Block downdating of least squares solutions. SIAM J. Matrix Anal. Appl. 15(3), 1018-1034 (1994) · Zbl 0808.65041 · doi:10.1137/S089547989223691X
[16] Foschi, P., Belsley, D.A., Kontoghiorghes, E.J.: A comparative study of algorithms for solving seemingly unrelated regressions models. Comput. Stat. Data Anal. 44(1-2), 3-35 (2003) · Zbl 1429.62289 · doi:10.1016/S0167-9473(03)00028-8
[17] Foschi, P., Kontoghiorghes, E.J.: Seemingly unrelated regression model with unequal size observations: computational aspects. Comput. Stat. Data Anal. 41(1), 211-229 (2002) · Zbl 1018.65015 · doi:10.1016/S0167-9473(02)00146-9
[18] Golub, G.H.: Matrix decompositions and statistical calculations. In: Milton, R.C., Nelder, J.A. (eds.), pp. 365-397. New York (1969) · doi:10.1016/B978-0-12-498150-8.50021-5
[19] Golub, G.H., Van Loan, C.F.: Matrix computations. Johns Hopkins Studies in the Mathematical Sciences, 3rd edn. Johns Hopkins University Press, Baltimore, Maryland (1996) · Zbl 0865.65009
[20] Gragg, W.B., Leveque, R.J., Trangenstein, J.A.: Numerically stable methods for updating regressions. J. Am. Stat. Assoc. 74(365), 161-168 (1979) · Zbl 0398.62058 · doi:10.1080/01621459.1979.10481633
[21] Griffiths, W.E., Valenzuela, M.R.: Gibbs samplers for a set of seemingly unrelated regressions. Aust. N. Z. J. Stat. 48(3), 335-351 (2006) · Zbl 1108.62025 · doi:10.1111/j.1467-842X.2006.00444.x
[22] Hadjiantoni, S.: Numerical methods for the recursive estimation of large-scale linear econometric models. PhD thesis, Queen Mary, University of London (2015)
[23] Haslett, J.: A simple derivation of deletion diagnostic results for the general linear model with correlated errors. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 61(3), 603-609 (1999) · Zbl 0924.62076 · doi:10.1111/1467-9868.00195
[24] Haslett, J., Dillane, D.: Application of ’delete = replace’ to deletion diagnostics for variance component estimation in the linear mixed model. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 66(1), 131-143 (2004) · Zbl 1060.62081 · doi:10.1046/j.1369-7412.2003.05211.x
[25] Holly, S., Pesaran, M.H., Yamagata, T.: Spatial and temporal diffusion of house prices in the UK. J. Urban Econ. 69(1), 2-23 (2011) · doi:10.1016/j.jue.2010.08.002
[26] Jammalamadaka, S., Sengupta, D.: Changes in the general linear model: a unified approach. Linear Algebra Appl. 289(1-3), 225-242 (1999) · Zbl 0933.62060 · doi:10.1016/S0024-3795(97)10047-7
[27] Jammalamadaka, S.R., Sengupta, D.: Inclusion and exclusion of data or parameters in the general linear model. Stat. Probab. Lett. 77(12), 1235-1247 (2007) · Zbl 1115.62066 · doi:10.1016/j.spl.2007.03.008
[28] Judge, G.G., Griffiths, W.E., Hill, R.C., Lütkepohl, H., Lee, T.-C.: The Theory and Practice of Econometrics, 2nd edn. Wiley, New York (1985) · Zbl 0731.62155
[29] Kmenta, J., Gilbert, R.F.: Estimation of seemingly unrelated regressions with autoregressive disturbances. J. Am. Stat. Assoc. 65(329), 186-197 (1970) · doi:10.1080/01621459.1970.10481073
[30] Kontoghiorghes, E.J.: Parallel Algorithms for Linear Models: Numerical Methods and Estimation Problems, Volume 15 of Advances in Computational Economics. Kluwer Academic Publishers, Boston (2000) · Zbl 0981.68176 · doi:10.1007/978-1-4615-4571-2
[31] Kontoghiorghes, E.J., Clarke, M.R.B.: An alternative approach for the numerical solution of seemingly unrelated regression equations models. Comput. Stat. Data Anal. 19(4), 369-377 (1995) · Zbl 0875.62362 · doi:10.1016/0167-9473(94)00010-G
[32] Kourouklis, S., Paige, C.C.: A constrained least squares approach to the general Gauss-Markov linear model. J. Am. Stat. Assoc. 76(375), 620-625 (1981) · Zbl 0475.62052 · doi:10.1080/01621459.1981.10477694
[33] Lee, B.-S.: Causal relations among stock returns, interest rates, real activity, and inflation. J. Financ. 47(4), 1591-1603 (1992) · doi:10.1111/j.1540-6261.1992.tb04673.x
[34] Lütkepohl, H.: New Introduction to Multiple Time Series Analysis, 1st edn., corrected 2nd printing. Springer, New York (2007) · Zbl 1072.62075
[35] Martin, R.J.: Leverage, influence and residuals in regression models when observations are correlated. Commun. Stat. Theory Methods 21(5), 1183-1212 (1992) · Zbl 0800.62369 · doi:10.1080/03610929208830840
[36] Paige, C.C.: Numerically stable computations for general univariate linear models. Commun. Stat. Simul. Comput. 7(5), 437-453 (1978) · Zbl 0429.62052 · doi:10.1080/03610917808812090
[37] Paige, C.C.: Fast numerically stable computations for generalized linear least squares problems. SIAM J. Numer. Anal. 16(1), 165-171 (1979) · Zbl 0402.65006
[38] Pesaran, M.H., Pick, A.: Forecast combination across estimation windows. J. Bus. Econ. Stat. 29(2), 307-318 (2011) · Zbl 1214.62095 · doi:10.1198/jbes.2010.09018
[39] Pesaran, M.H., Pick, A., Pranovich, M.: Optimal forecasts in the presence of structural breaks. J. Econ. 177(2), 134-152 (2013). Dynamic Econometric Modeling and Forecasting · Zbl 1288.62140 · doi:10.1016/j.jeconom.2013.04.002
[40] Pollock, D.S.G.: Recursive estimation in econometrics. Comput. Stat. Data Anal. 44(1-2), 37-75 (2003) · Zbl 1429.62692 · doi:10.1016/S0167-9473(03)00150-6
[41] Preisser, J., Perin, J.: Deletion diagnostics for marginal mean and correlation model parameters in estimating equations. Stat. Comput. 17(4), 381-393 (2007) · doi:10.1007/s11222-007-9031-1
[42] Preisser, J.S., Qaqish, B.F.: Deletion diagnostics for generalised estimating equations. Biometrika 83(3), 551-562 (1996) · Zbl 0866.62041 · doi:10.1093/biomet/83.3.551
[43] Rader, C., Steinhardt, A.: Hyperbolic Householder transformations. IEEE Trans. Acoust. Speech Signal Process. 34(6), 1589-1602 (1986) · Zbl 0629.93063 · doi:10.1109/TASSP.1986.1164998
[44] Rao, C.R.: Linear Statistical Inference and its Applications, 2nd edn. Wiley, New York (2002)
[45] Rochon, J.: Accounting for covariates observed post randomization for discrete and continuous repeated measures data. J. R. Stat. Soc. Ser. B (Methodol.) 58(1), 205-219 (1996) · Zbl 0850.62841
[46] Rossi, B., Inoue, A.: Out-of-sample forecast tests robust to the choice of window size. J. Bus. Econ. Stat. 30(3), 432-453 (2012) · doi:10.1080/07350015.2012.693850
[47] Shieh, G.: General multivariate linear models for longitudinal studies. Commun. Stat. Theory Methods 29(4), 735-753 (2000) · Zbl 1061.62547 · doi:10.1080/03610920008832512
[48] Smith, M., Kohn, R.: Nonparametric seemingly unrelated regression. J. Econ. 98(2), 257-281 (2000) · Zbl 0957.62033 · doi:10.1016/S0304-4076(00)00018-X
[49] Srivastava, V., Dwivedi, T.: Estimation of seemingly unrelated regression equations: a brief survey. J. Econ. 10(1), 15-32 (1979) · Zbl 0408.62060 · doi:10.1016/0304-4076(79)90061-7
[50] Srivastava, V.K., Giles, D.E.A. (eds.): Seemingly Unrelated Regression Equations Models: Estimation and Inference. Marcel Dekker Inc., New York (1987) · Zbl 0638.62108
[51] Telser, L.G.: Iterative estimation of a set of linear regression equations. J. Am. Stat. Assoc. 59(307), 845-862 (1964) · Zbl 0131.35902 · doi:10.1080/01621459.1964.10480731
[52] Verbyla, A.P., Venables, W.N.: An extension of the growth curve model. Biometrika 75(1), 129-138 (1988) · Zbl 0636.62073 · doi:10.1093/biomet/75.1.129
[53] Wang, H.: Sparse seemingly unrelated regression modelling: applications in finance and econometrics. Comput. Stat. Data Anal. 54(11), 2866-2877 (2010) · Zbl 1284.91461 · doi:10.1016/j.csda.2010.03.028
[54] Yanev, P.I., Kontoghiorghes, E.J.: Parallel algorithms for downdating the least squares estimator of the regression model. Parallel Comput. 34(6-8), 451-468 (2008) · doi:10.1016/j.parco.2008.01.002
[55] Yanev, P.I., Kontoghiorghes, E.J.: Graph-based strategies for performing the exhaustive and random k-fold cross-validations. J. Comput. Graph. Stat. 18(4), 894-914 (2009) · doi:10.1198/jcgs.2009.08019
[56] Zellner, A.: An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. J. Am. Stat. Assoc. 57(298), 348-368 (1962) · Zbl 0113.34902 · doi:10.1080/01621459.1962.10480664
[57] Zhu, H., Ibrahim, J.G., Cho, H.: Perturbation and scaled Cook’s distance. Ann. Stat. 40(2), 785-811 (2012) · Zbl 1273.62180 · doi:10.1214/12-AOS978
[58] Zyskind, G.: On canonical forms, non-negative covariance matrices and best and simple least squares linear estimators in linear models. Ann. Math. Stat. 38(4), 1092-1109 (1967) · Zbl 0171.17103 · doi:10.1214/aoms/1177698779
[59] Zyskind, G., Martin, F.B.: On best linear estimation and general Gauss-Markov theorem in linear models with arbitrary nonnegative covariance structure. SIAM J. Appl. Math. 17(6), 1190-1202 (1969) · Zbl 0193.47301 · doi:10.1137/0117110
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.