High-dimensional VAR with low-rank transition. (English) Zbl 1448.62130

Summary: We propose a vector auto-regressive model with a low-rank constraint on the transition matrix. This model is well suited to predict high-dimensional series that are highly correlated, or that are driven by a small number of hidden factors. While our model has formal similarities with factor models, its structure is more a way to reduce the dimension in order to improve the predictions than a way to define interpretable factors. We provide an estimator for the transition matrix in a very general setting and study its performance in terms of prediction and adaptation to the unknown rank. Our method obtains good results on simulated data, in particular when the rank of the underlying process is small. On macroeconomic data from [D. Giannone et al., “Prior selection for vector autoregressions”, Rev. Econ. Stat. 97, No. 2, 436–451 (2015; doi:10.1162/REST_a_00483)], our method is competitive with state-of-the-art methods in small dimension and even improves on them in high dimension.
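To make the low-rank constraint concrete, the following is a minimal sketch in Python with NumPy, not necessarily the estimator studied in the paper. It assumes a VAR(1) model X_t = A X_{t-1} + noise with a known rank, and it swaps in classical reduced-rank regression (ordinary least squares followed by a projection onto the leading singular directions of the fitted values). The function names `simulate_var1` and `fit_low_rank_var1`, the dimension, the noise level, and the seeds are all illustrative assumptions.

```python
import numpy as np


def simulate_var1(A, n_steps, noise_std=0.1, seed=0):
    """Simulate X_t = A X_{t-1} + noise, starting from X_0 = 0 (illustrative helper)."""
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    X = np.zeros((n_steps, d))
    for t in range(1, n_steps):
        X[t] = A @ X[t - 1] + noise_std * rng.standard_normal(d)
    return X


def fit_low_rank_var1(X, rank):
    """Rank-constrained least-squares fit of a VAR(1) transition matrix.

    Classical reduced-rank regression with a known rank; this is an
    assumption made for illustration, not the paper's procedure.
    """
    past, future = X[:-1], X[1:]
    # Unconstrained least squares: future ≈ past @ B, where B = A^T.
    B_ols, *_ = np.linalg.lstsq(past, future, rcond=None)
    # Project onto the top `rank` right singular vectors of the fitted values.
    fitted = past @ B_ols
    _, _, Vt = np.linalg.svd(fitted, full_matrices=False)
    V_r = Vt[:rank].T
    B_rr = B_ols @ V_r @ V_r.T
    return B_rr.T


# Hypothetical example: dimension 20, true rank 2.
d, r = 20, 2
rng = np.random.default_rng(1)
A_true = rng.standard_normal((d, r)) @ rng.standard_normal((r, d))
A_true *= 0.9 / np.linalg.norm(A_true, 2)  # spectral norm 0.9 keeps the process stable

X = simulate_var1(A_true, n_steps=500)
A_hat = fit_low_rank_var1(X, rank=r)

# One-step-ahead prediction from the last observation.
x_next = A_hat @ X[-1]
```

In this sketch the rank is fixed by hand; choosing it from the data (the "adaptation to the unknown rank" mentioned in the summary) would require an additional selection step that is not shown here.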

MSC:

62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
62M20 Inference from stochastic processes and prediction
62P20 Applications of statistics to economics

Software:

GitHub; CAPUSHE

References:

[1] Alquier, P.; Guedj, B., Simpler PAC-Bayesian bounds for hostile data, Mach. Learn., 107, 5, 887-902 (2018) · Zbl 1464.62238
[2] Alquier, P., Li, X.: Prediction of quantiles by statistical learning and application to GDP forecasting. In: International Conference on Discovery Science, pp. 22-36. Springer, Berlin (2012)
[3] Alquier, P.; Marie, N., Matrix factorization for multivariate time series analysis, Electron. J. Stat., 13, 2, 4346-4366 (2019) · Zbl 1442.62195
[4] Alquier, P.; Wintenberger, O., Model selection for weakly dependent time series forecasting, Bernoulli, 18, 3, 883-913 (2012) · Zbl 1243.62117
[5] Alquier, P.; Li, X.; Wintenberger, O., Prediction of time series by statistical learning: general losses and fast rates, Depend. Model., 1, 65-93 (2013) · Zbl 06297673
[6] Alquier, P.; Cottet, V.; Lecué, G., Estimation bounds and sharp oracle inequalities of regularized procedures with Lipschitz loss functions, Ann. Stat., 47, 4, 2117-2144 (2019) · Zbl 1466.62289
[7] Alquier, P.; Doukhan, P.; Fan, X., Exponential inequalities for nonstationary Markov chains, Depend. Model., 7, 150-168 (2019) · Zbl 1434.60171
[8] Anderson, TW, Estimating linear restrictions on regression coefficients for multivariate normal distributions, Ann. Math. Stat., 22, 3, 327-351 (1951) · Zbl 0043.13902
[9] Arlot, S.: Minimal penalties and the slope heuristics: a survey. arXiv preprint arXiv:1901.07277 (2019) · Zbl 1437.62121
[10] Arlot, S.; Massart, P., Data-driven calibration of penalties for least-squares regression, J. Mach. Learn. Res., 10, Feb, 245-279 (2009)
[11] Basu, S.; Meckesheimer, M., Automatic outlier detection for time series: an application to sensor data, Knowl. Inf. Syst., 11, 2, 137-154 (2007)
[12] Basu, S.; Li, X.; Michailidis, G., Low rank and structured modeling of high-dimensional vector autoregressions, IEEE Trans. Signal Process., 67, 5, 1207-1222 (2019) · Zbl 1415.94051
[13] Baudry, J-P; Maugis, C.; Michel, B., Slope heuristics: overview and implementation, Stat. Comput., 22, 2, 455-470 (2012) · Zbl 1322.62007
[14] Bauwens, L.; Laurent, S.; Rombouts, JVK, Multivariate GARCH models: a survey, J. Appl. Econom., 21, 79-109 (2006)
[15] Bing, X.; Wegkamp, MH, Adaptive estimation of the rank of the coefficient matrix in high-dimensional multivariate response regression models, Ann. Stat., 47, 6, 3157-3184 (2019) · Zbl 1477.62140
[16] Birgé, L.; Massart, P., Minimal penalties for Gaussian model selection, Probab. Theory Relat. Fields, 138, 1-2, 33-73 (2007) · Zbl 1112.62082
[17] Boyd, S.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J., Distributed optimization and statistical learning via the alternating direction method of multipliers, Found. Trends® Mach. Learn., 3, 1, 1-122 (2011) · Zbl 1229.90122
[18] Buja, A.; Hastie, T.; Tibshirani, R., Linear smoothers and additive models, Ann. Stat., 17, 2, 453-510 (1989) · Zbl 0689.62029
[19] Bunea, F.; She, Y.; Wegkamp, MH, Optimal selection of reduced rank estimators of high-dimensional matrices, Ann. Stat., 39, 2, 1282-1309 (2011) · Zbl 1216.62086
[20] Candès, EJ; Plan, Y., Tight oracle inequalities for low-rank matrix recovery from a minimal number of noisy random measurements, IEEE Trans. Inf. Theory, 57, 4, 2342-2359 (2011) · Zbl 1366.90160
[21] Carel, L.: Big data analysis in the field of transportation. Ph.D. thesis, Université Paris-Saclay (2019)
[22] Cesa-Bianchi, N.; Lugosi, G., Prediction, Learning, and Games (2006), Cambridge: Cambridge University Press · Zbl 1114.91001
[23] Chan, J.; Leon-Gonzalez, R.; Strachan, RW, Invariant inference and efficient computation in the static factor model, J. Am. Stat. Assoc., 113, 522, 819-828 (2018) · Zbl 1398.62062
[24] Chen, L.; Wu, WB, Testing for trends in high-dimensional time series, J. Am. Stat. Assoc., 114, 869-881 (2018) · Zbl 1420.62375
[25] Cornec, M., Constructing a conditional GDP fan chart with an application to French business survey data, OECD J. J. Bus. Cycle Meas. Anal., 2013, 2, 109-127 (2014)
[26] Davis, RA; Zang, P.; Zheng, T., Sparse vector autoregressive modeling, J. Comput. Graph. Stat., 25, 4, 1077-1096 (2016)
[27] De Castro, Y., Goude, Y., Hébrail, G., Mei, J.: Recovering multiple nonnegative time series from a few temporal aggregates. In: ICML 2017-34th International Conference on Machine Learning, pp. 1-9 (2017)
[28] Dedecker, J., Doukhan, P., Lang, G., León, JR., Louhichi, S., Prieur, C.: Weak Dependence: With Examples and Applications. Lecture Notes in Statistics, vol. 190, p. xiv+318. Springer, New York (2007) · Zbl 1165.62001
[29] Dedecker, J.; Fan, X., Deviation inequalities for separately Lipschitz functionals of iterated random functions, Stoch. Process. Appl., 125, 1, 60-90 (2015) · Zbl 1301.60055
[30] Dedecker, J.; Doukhan, P.; Fan, X., Deviation inequalities for separately Lipschitz functionals of composition of random functions, J. Math. Anal. Appl., 479, 2, 1549-1568 (2019) · Zbl 1479.60041
[31] Doukhan, P., Stochastic Models for Time Series (2018), Berlin: Springer
[32] Engle, R., ARCH: Selected Readings (1995), Oxford: Oxford University Press
[33] Francq, C.; Zakoian, J-M, GARCH Models: Structure, Statistical Inference and Financial Applications (2019), New York: Wiley · Zbl 1431.62004
[34] Gaillard, P.; Goude, Y.; Nedellec, R., Additive models and robust aggregation for GEFCom2014 probabilistic electric load and electricity price forecasting, Int. J. Forecast., 32, 3, 1038-1050 (2016)
[35] Garnier, R.: Simulations for penalized estimation of VAR with low rank. https://github.com/garnier94/Simulation_low_rank_VAR1 (2019)
[36] Ge, R., Jin, C., Zheng, Y.: No spurious local minima in nonconvex low rank problems: a unified geometric analysis. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 1233-1242. JMLR.org (2017)
[37] Giannone, D.; Lenza, M.; Primiceri, GE, Prior selection for vector autoregressions, Rev. Econ. Stat., 97, 2, 436-451 (2015)
[38] Giordani, P.; Pitt, M.; Kohn, R.; Geweke, J.; Koop, G.; Van Dijk, H., Bayesian inference for time series state space models, The Oxford Handbook of Bayesian Econometrics (2011), Oxford: Oxford University Press
[39] Giraud, C.; Roueff, F.; Sanchez-Perez, A., Aggregation of predictors for nonstationary sub-linear processes and online adaptive forecasting of time varying autoregressive processes, Ann. Stat., 43, 6, 2412-2450 (2015) · Zbl 1327.62478
[40] Hallin, M.; Lippi, M., Factor models in high-dimensional time series—a time-domain approach, Stoch. Process. Appl., 123, 7, 2678-2695 (2013) · Zbl 1285.62106
[41] Hang, H.; Steinwart, I., Fast learning from \(\alpha \)-mixing observations, J. Multivar. Anal., 127, 184-199 (2014) · Zbl 1359.62242
[42] Izenman, AJ, Reduced-rank regression for the multivariate linear model, J. Multivar. Anal., 5, 2, 248-264 (1975) · Zbl 0313.62042
[43] Ji, S., Ye, J.: An accelerated gradient method for trace norm minimization. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 457-464. ACM (2009)
[44] Jochmann, M.; Koop, G.; Leon-Gonzalez, R.; Strachan, RW, Stochastic search variable selection in vector error correction models with an application to a model of the UK macroeconomy, J. Appl. Econom., 28, 1, 62-81 (2013)
[45] Klopp, O.; Lounici, K.; Tsybakov, AB, Robust matrix completion, Probab. Theory Relat. Fields, 169, 1-2, 523-564 (2017) · Zbl 1383.62167
[46] Klopp, O.; Lu, Y.; Tsybakov, AB; Zhou, HH, Structured matrix estimation and completion, Bernoulli, 25, 4, 3883-3911 (2019) · Zbl 1428.62281
[47] Koltchinskii, V.; Lounici, K.; Tsybakov, AB, Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion, Ann. Stat., 39, 5, 2302-2329 (2011) · Zbl 1231.62097
[48] Koop, G.; Potter, S., Forecasting in dynamic factor models using Bayesian model averaging, Econom. J., 7, 2, 550-565 (2004) · Zbl 1063.62032
[49] Kumar, M.; Patel, NR, Using clustering to improve sales forecasts in retail merchandising, Ann. Oper. Res., 174, 1, 33-46 (2010) · Zbl 1185.62116
[50] Kuznetsov, V., Mohri, M.: Learning theory and algorithms for forecasting non-stationary time series. In: Advances in Neural Information Processing Systems, pp. 541-549 (2015)
[51] Lam, C.; Yao, Q., Factor modeling for high-dimensional time series: inference for the number of factors, Ann. Stat., 40, 2, 694-726 (2012) · Zbl 1273.62214
[52] Lam, C.; Yao, Q.; Bathia, N., Estimation of latent factors for high-dimensional time series, Biometrika, 98, 4, 901-918 (2011) · Zbl 1228.62110
[53] Lippi, M.; Bertini, M.; Frasconi, P., Short-term traffic flow forecasting: an experimental comparison of time-series analysis and supervised learning, IEEE Trans. Intell. Transp. Syst., 14, 2, 871-882 (2013)
[54] Lipton, Z.C., Kale, D.C., Wetzel, R.: Modeling missing data in clinical time series with RNNs. In: Machine Learning for Healthcare, vol. 56 (2016)
[55] London, B., Huang, B., Taskar, B., Getoor, L.: PAC-Bayesian collective stability. In: Artificial Intelligence and Statistics, pp. 585-594 (2014)
[56] McDonald, DJ; Shalizi, CR; Schervish, M., Nonparametric risk bounds for time-series forecasting, J. Mach. Learn. Res., 18, 32, 1-40 (2017) · Zbl 1437.62337
[57] Meir, R., Nonparametric time series prediction through adaptive model selection, Mach. Learn., 39, 1, 5-34 (2000) · Zbl 0954.68124
[58] Moridomi, K.; Hatano, K.; Takimoto, E., Tighter generalization bounds for matrix completion via factorization into constrained matrices, IEICE Trans. Inf. Syst., 101, 8, 1997-2004 (2018)
[59] Negahban, S.; Wainwright, MJ, Estimation of (near) low-rank matrices with noise and high-dimensional scaling, Ann. Stat., 39, 2, 1069-1097 (2011) · Zbl 1216.62090
[60] Poignard, B.: Sparse multivariate ARCH models: finite sample properties. https://arxiv.org/pdf/1808.05352v1.pdf (2018)
[61] Purser, A.; Bergmann, M.; Lundälv, T.; Ontrup, J.; Nattkemper, TW, Use of machine-learning algorithms for the automated detection of cold-water coral habitats: a pilot study, Mar. Ecol. Prog. Ser., 397, 241-251 (2009)
[62] Saha, A., Sindhwani, V.: Learning evolving and emerging topics in social media: a dynamic NMF approach with temporal regularization. In: Proceedings of the Fifth ACM International Conference on Web Search and Data Mining, pp. 693-702. ACM (2012)
[63] Shalizi, C., Kontorovich, A.: Predictive PAC learning and process decompositions. In: Advances in Neural Information Processing Systems, pp. 1619-1627 (2013)
[64] Steinwart, I., Christmann, A.: Fast learning from non-iid observations. In: Advances in Neural Information Processing Systems, pp. 1768-1776 (2009)
[65] Steinwart, I.; Hush, D.; Scovel, C., Learning from dependent observations, J. Multivar. Anal., 100, 1, 175-194 (2009) · Zbl 1158.68040
[66] Suzuki, T.: Convergence rate of Bayesian tensor estimator and its minimax optimality. In: International Conference on Machine Learning, pp. 1273-1282 (2015)
[67] Vapnik, V., Statistical Learning Theory (1998), New York: Wiley · Zbl 0935.62007
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.