Generalised linear model trees with global additive effects. (English) Zbl 1474.62269

Summary: Model-based trees are used to find subgroups in data which differ with respect to model parameters. In some applications it is natural to keep some parameters fixed globally for all observations while asking if and how other parameters vary across subgroups. Existing implementations of model-based trees can only deal with the scenario where all parameters depend on the subgroups. We propose partially additive linear model trees (PALM trees) as an extension of (generalised) linear model trees (LM and GLM trees, respectively), in which the model parameters are specified a priori to be estimated either globally from all observations or locally from the observations within the subgroups determined by the tree. Simulations show that the method has high power for detecting subgroups in the presence of global effects and reliably recovers the true parameters. Furthermore, treatment-subgroup differences are detected in an empirical application of the method to data from a mathematics exam: the PALM tree is able to detect a small subgroup of students that had a disadvantage in an exam with two versions while adjusting for overall ability effects.
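
The split between global and local parameters described in the summary can be illustrated with a minimal sketch. It assumes the subgroups are already known, so it does not reproduce the tree search or the authors' estimation procedure; it only fits, by ordinary least squares, a linear model with one slope shared by all observations and an intercept and treatment effect estimated separately within each subgroup. All names (`fit_partially_additive_lm`, `x_global`, `treatment`, `subgroup`) and the simulated data are hypothetical and chosen for illustration only.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_partially_additive_lm(y, x_global, treatment, subgroup, n_subgroups):
    """Least squares with one global slope and subgroup-specific terms.

    Design row: [x_global, 1(g=0), t*1(g=0), 1(g=1), t*1(g=1), ...],
    so the first coefficient is shared by all observations and the rest
    are estimated from the observations of one subgroup each.
    """
    p = 1 + 2 * n_subgroups
    rows = []
    for xg, t, g in zip(x_global, treatment, subgroup):
        row = [0.0] * p
        row[0] = xg                 # global (additive) effect
        row[1 + 2 * g] = 1.0        # local intercept of subgroup g
        row[2 + 2 * g] = float(t)   # local treatment effect of subgroup g
        rows.append(row)
    # Normal equations (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    coef = solve_linear_system(xtx, xty)
    return {
        "global_slope": coef[0],
        "local": [(coef[1 + 2 * g], coef[2 + 2 * g]) for g in range(n_subgroups)],
    }


# Simulated data (assumption): the treatment only matters in subgroup 1,
# while x_global has the same effect for everyone.
rng = random.Random(1)
n = 200
x_global = [rng.gauss(0, 1) for _ in range(n)]
treatment = [rng.randint(0, 1) for _ in range(n)]
subgroup = [int(rng.random() > 0.5) for _ in range(n)]
true_slope = 0.5
true_local = [(1.0, 0.0), (1.0, -2.0)]  # (intercept, treatment effect)
y = [
    true_slope * xg + true_local[g][0] + true_local[g][1] * t + rng.gauss(0, 0.1)
    for xg, t, g in zip(x_global, treatment, subgroup)
]

fit = fit_partially_additive_lm(y, x_global, treatment, subgroup, n_subgroups=2)
print(fit)
```

In a PALM tree the subgroups are not given but are found by the tree; the sketch fixes them in advance only to show which parameters are shared and which are local.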


62J05 Linear regression; mixed models
62J12 Generalized linear models (logistic models)

