Lasso-constrained regression analysis for interval-valued data. (English) Zbl 1414.62305

Summary: A new method of regression analysis for interval-valued data is proposed. The relationship between an interval-valued response variable and a set of interval-valued explanatory variables is investigated by considering two regression models, one for the midpoints and one for the radii. The estimation problem is approached by introducing Lasso-based constraints on the regression coefficients. This can improve the prediction accuracy of the model and, given the nature of the constraints, can sometimes produce a parsimonious model with a common subset of regression coefficients for the midpoint and the radius models. The effectiveness of our method, called Lasso-IR (Lasso-based Interval-valued Regression), is shown by a simulation experiment and some applications to real data.
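
The midpoint/radius decomposition described in the summary can be illustrated with a minimal sketch. This is not the Lasso-IR estimator itself: the summary does not give its exact formulation, so the code below simply fits two independent L1-penalized least squares models (one on midpoints, one on radii) by coordinate descent, and it does not couple the two coefficient sets as Lasso-IR is said to do. All function names (`to_mid_rad`, `lasso_cd`, `fit_interval_lasso`, `predict_interval`) and the penalty parameters `lam_mid` and `lam_rad` are hypothetical, and clipping the predicted radii at zero is an assumption added so that the predicted intervals remain valid.

```python
import numpy as np

def to_mid_rad(lower, upper):
    """Convert interval bounds to midpoints and radii (half-widths)."""
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    return (lower + upper) / 2.0, (upper - lower) / 2.0

def soft_threshold(z, t):
    """Scalar soft-thresholding operator used by the lasso update."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=500):
    """Minimise (1/2n)||y - b0 - X b||^2 + lam * ||b||_1 by coordinate descent.

    The intercept b0 is unpenalised and handled by centring X and y.
    """
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    col_sq = (Xc ** 2).sum(axis=0) / n
    b = np.zeros(p)
    resid = yc.copy()  # residual of the centred problem, kept up to date
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column carries no information
            rho = Xc[:, j] @ resid / n + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            resid -= Xc[:, j] * (new_bj - b[j])
            b[j] = new_bj
    b0 = y_mean - x_mean @ b
    return b0, b

def fit_interval_lasso(X_lo, X_hi, y_lo, y_hi, lam_mid, lam_rad):
    """Fit separate lasso models for midpoints and radii (assumed simplification)."""
    Xm, Xr = to_mid_rad(X_lo, X_hi)
    ym, yr = to_mid_rad(y_lo, y_hi)
    return lasso_cd(Xm, ym, lam_mid), lasso_cd(Xr, yr, lam_rad)

def predict_interval(model, X_lo, X_hi):
    """Predict interval bounds; radii are clipped at zero (an assumption)."""
    (a0, a), (b0, b) = model
    Xm, Xr = to_mid_rad(X_lo, X_hi)
    mid = a0 + Xm @ a
    rad = np.maximum(b0 + Xr @ b, 0.0)
    return mid - rad, mid + rad
```

With both penalties set to zero the sketch reduces to ordinary least squares on midpoints and radii separately; increasing a penalty shrinks the corresponding coefficients and can set some of them exactly to zero.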

MSC:

62J05 Linear regression; mixed models

Software:

SODAS; bootstrap
