Finite mixture of regression models for censored data based on scale mixtures of normal distributions. (English) Zbl 1474.62259

Summary: In statistical analysis, particularly in econometrics, finite mixtures of regression models based on the normality assumption are routinely used to analyze censored data. In this work, an extension of this model is proposed by considering the class of scale mixtures of normal (SMN) distributions. This approach allows us to model data flexibly, accommodating multimodality and heavy tails at the same time. The main virtue of considering the finite mixture of regression models for censored data under the SMN class is that this class of models has a convenient hierarchical representation, which makes inference easy to implement. We develop a simple EM-type algorithm to perform maximum likelihood inference of the parameters in the proposed model. To examine the performance of the proposed method, we present some simulation studies and analyze a real dataset. The proposed algorithm and methods are implemented in the new R package CensMixReg.
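
The hierarchical representation mentioned in the summary can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the paper's algorithm and not the CensMixReg interface: it takes the Student-t distribution as one member of the SMN class, uses a single covariate drawn uniformly on [0, 5], and applies left-censoring at a fixed threshold. The function name `simulate_censored_mixture` and all parameter values are hypothetical.

```python
import random


def simulate_censored_mixture(n, weights, betas, sigmas, nu, censor_at, seed=0):
    """Simulate n draws from a finite mixture of linear regressions with
    Student-t errors, then left-censor the response at `censor_at`.

    Returns a list of (x, y_observed, is_censored, component) tuples.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        # Single covariate; the uniform design is an illustrative assumption.
        x = rng.uniform(0.0, 5.0)
        # Latent component label: Z ~ Categorical(weights).
        j = rng.choices(range(len(weights)), weights=weights)[0]
        # Latent scale factor: U ~ Gamma(shape = nu/2, rate = nu/2).
        # random.gammavariate takes (shape, scale), so scale = 2/nu.
        u = rng.gammavariate(nu / 2.0, 2.0 / nu)
        # Given Z = j and U = u, the response is normal with variance
        # sigma_j^2 / u; marginally over U it is Student-t with nu d.f.
        mean = betas[j][0] + betas[j][1] * x
        y_latent = rng.gauss(mean, sigmas[j] / u ** 0.5)
        # Left-censoring: values at or below the threshold are recorded
        # as the threshold itself.
        is_censored = y_latent <= censor_at
        y_observed = censor_at if is_censored else y_latent
        data.append((x, y_observed, is_censored, j))
    return data


sample = simulate_censored_mixture(
    n=500,
    weights=[0.6, 0.4],
    betas=[(0.0, 1.0), (4.0, -1.0)],  # (intercept, slope) per component
    sigmas=[1.0, 0.5],
    nu=4.0,
    censor_at=0.5,
)
censored_share = sum(1 for _, _, c, _ in sample if c) / len(sample)
```

Conditioning on the latent pair (component label, scale factor) reduces each observation to a normal censored regression, which is the structure an EM-type algorithm can exploit; the E-step and M-step themselves are not shown here.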


62H30 Classification and discrimination; cluster analysis (statistical aspects)
62J05 Linear regression; mixed models
62N01 Censored data models

