Ma, Shaohui; Fildes, Robert; Huang, Tao
Demand forecasting with high dimensional data: the case of SKU retail sales forecasting with intra- and inter-category promotional information. (English) Zbl 1346.62165
Eur. J. Oper. Res. 249, No. 1, 245-257 (2016).

Summary: In marketing analytics applications in OR, the modeler often faces the problem of selecting key variables from a large number of possibilities. For example, SKU-level retail store sales are affected by inter- and intra-category effects, which potentially need to be considered when deciding on promotional strategy and producing operational forecasts. But no research has yet put this well-accepted concept into forecasting practice: an obvious obstacle is the ultra-high dimensionality of the variable space. This paper develops a four-step methodological framework to overcome the problem. It is illustrated by investigating the value of both intra- and inter-category SKU-level promotional information in improving forecast accuracy. The method consists of (1) the identification of potentially influential categories, (2) the building of the explanatory variable space, (3) variable selection and model estimation by a multistage LASSO regression, and (4) the use of a rolling scheme to generate forecasts. The success of this new method for dealing with high dimensionality is demonstrated by improvements in forecasting accuracy compared to alternative methods of simplifying the variable space. The empirical results show that models integrating more information perform significantly better than the baseline model when using the proposed methodological framework. In general, we can improve the forecasting accuracy by 12.6 percent over the model using only the SKU’s own predictors. Of the improvement achieved, however, 95 percent comes from the intra-category information and only 5 percent from the inter-category information. The substantive marketing results also have implications for promotional category management.
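The estimation and forecasting steps named in the summary (variable selection by a multistage LASSO, then a rolling scheme) can be illustrated with a minimal sketch. The summary does not specify how the stages are arranged, so the version below rests on assumptions: each stage fits a LASSO to the residuals of the previous stage, moving from the SKU's own predictors to intra-category and then inter-category predictors; promotions at the forecast date are taken as known; the data are synthetic and centred, so no intercept is fitted. All function names (`lasso_cd`, `fit_staged`, `rolling_mae`, `make_data`, `demo`) are hypothetical and are not taken from the paper.

```python
import random


def soft_threshold(z, g):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def lasso_cd(X, y, lam, n_iter=50):
    """Minimise (1/2n)*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    r = list(y)  # residuals y - Xb, starting from b = 0
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue
            # Correlation of column j with the partial residual (b_j removed).
            rho = sum(v * (ri + v * b[j]) for v, ri in zip(col, r)) / n
            new = soft_threshold(rho, lam) / z
            delta = new - b[j]
            if delta != 0.0:
                r = [ri - delta * v for v, ri in zip(col, r)]
                b[j] = new
    return b


def predict(X, b):
    return [sum(v * bj for v, bj in zip(row, b)) for row in X]


def fit_staged(blocks, y, lam):
    """Assumed staging: one LASSO per block, fitted to the previous stage's residuals."""
    coefs, resid = [], list(y)
    for X in blocks:
        b = lasso_cd(X, resid, lam)
        resid = [ri - fi for ri, fi in zip(resid, predict(X, b))]
        coefs.append(b)
    return coefs


def rolling_mae(blocks, y, lam, n_test):
    """Refit at each origin on all earlier data, forecast one step ahead, return the MAE."""
    T, errs = len(y), []
    for t in range(T - n_test, T):
        coefs = fit_staged([X[:t] for X in blocks], y[:t], lam)
        forecast = sum(predict([X[t]], b)[0] for X, b in zip(blocks, coefs))
        errs.append(abs(y[t] - forecast))
    return sum(errs) / len(errs)


def make_data(T=60, seed=0):
    """Synthetic example: sales depend on one own and one intra-category predictor."""
    rng = random.Random(seed)
    own = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(T)]
    intra = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(T)]
    inter = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(T)]
    y = [2.0 * o[0] + 1.5 * a[1] + rng.gauss(0, 0.1) for o, a in zip(own, intra)]
    return own, intra, inter, y


def demo():
    own, intra, inter, y = make_data()
    mae_own = rolling_mae([own], y, lam=0.05, n_test=10)
    mae_staged = rolling_mae([own, intra, inter], y, lam=0.05, n_test=10)
    return mae_own, mae_staged


print(demo())
```

Comparing the two values returned by `demo()` mirrors the kind of comparison the summary reports, a model using only the SKU's own predictors against one that adds category information, but only on artificial data; it says nothing about the magnitudes found in the paper.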
Cited in 10 Documents

MSC:
62P20 Applications of statistics to economics
90B60 Marketing, advertising

Keywords: analytics; OR in marketing; forecasting; retailing; promotions

Software: CHAN4CAST; elasticnet; rms