## Lasso-constrained regression analysis for interval-valued data. (English) Zbl 1414.62305

Summary: A new method of regression analysis for interval-valued data is proposed. The relationship between an interval-valued response variable and a set of interval-valued explanatory variables is investigated by considering two regression models, one for the midpoints and the other one for the radii. The estimation problem is approached by introducing Lasso-based constraints on the regression coefficients. This can improve the prediction accuracy of the model and, taking into account the nature of the constraints, can sometimes produce a parsimonious model with a common subset of regression coefficients for the midpoint and the radius models. The effectiveness of our method, called Lasso-IR (Lasso-based Interval-valued Regression), is shown by a simulation experiment and some applications to real data.
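
The midpoint/radius decomposition described in the summary can be illustrated with a minimal sketch. It is not the Lasso-IR estimator itself: the summary does not state how the constraints are formulated or how they couple the two models, so the sketch simply fits two independent Lasso models in penalised form by coordinate descent. The toy data, the function names (`midpoint_radius`, `lasso_cd`, `predict_interval`), the penalty value `lam=0.1`, the choice to regress radii on radii, and the clipping of negative predicted radii are all assumptions made for illustration.

```python
def midpoint_radius(interval):
    """Split an interval (lower, upper) into (midpoint, radius)."""
    lower, upper = interval
    return (lower + upper) / 2.0, (upper - lower) / 2.0


def soft_threshold(z, t):
    """Soft-thresholding operator used in Lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - b0 - X b||^2 + lam * ||b||_1.

    X is a list of rows, y a list of responses. Data are centred so the
    intercept b0 is unpenalised and recovered after the loop.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of feature j with the residual that leaves out feature j
            rho = sum(
                Xc[i][j] * (yc[i] - sum(Xc[i][k] * b[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            scale = sum(Xc[i][j] ** 2 for i in range(n)) / n
            b[j] = soft_threshold(rho, lam) / scale if scale > 0 else 0.0
    b0 = y_mean - sum(x_mean[j] * b[j] for j in range(p))
    return b0, b


# Toy data, invented for illustration: one interval-valued predictor and response.
x_intervals = [(1.0, 3.0), (2.0, 6.0), (4.0, 6.0), (5.0, 9.0), (7.0, 9.0)]
y_intervals = [(3.5, 6.5), (6.5, 11.5), (9.5, 12.5), (12.5, 17.5), (15.5, 18.5)]

x_mid, x_rad = zip(*(midpoint_radius(iv) for iv in x_intervals))
y_mid, y_rad = zip(*(midpoint_radius(iv) for iv in y_intervals))

# One model per component (assumption: midpoints on midpoints, radii on radii).
mid_b0, mid_b = lasso_cd([[m] for m in x_mid], list(y_mid), lam=0.1)
rad_b0, rad_b = lasso_cd([[r] for r in x_rad], list(y_rad), lam=0.1)


def predict_interval(x_interval):
    """Rebuild a predicted interval from the two fitted component models."""
    m, r = midpoint_radius(x_interval)
    mid_hat = mid_b0 + mid_b[0] * m
    # Clipping at zero is our own safeguard: a radius cannot be negative.
    rad_hat = max(rad_b0 + rad_b[0] * r, 0.0)
    return mid_hat - rad_hat, mid_hat + rad_hat


print(predict_interval((3.0, 5.0)))
```

Because the two fits here are independent, nothing forces them to select the same predictors. The common subset of coefficients mentioned in the summary depends on the specific form of the Lasso-IR constraints, which this sketch does not reproduce.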

### MSC:

 62J05 Linear regression; mixed models

### Keywords:

interval-valued data; regression; Lasso; prediction accuracy

### Software:

SODAS; bootstrap
