## Tree-structured modelling of categorical predictors in generalized additive regression. (English) Zbl 1416.62364

Summary: Generalized linear and additive models are very efficient regression tools, but many parameters have to be estimated if categorical predictors with many categories are included. The method proposed here focusses on the main effects of categorical predictors by using tree-type methods to obtain clusters of categories. When the predictor has many categories, one wants to know in particular which of the categories have to be distinguished with respect to their effect on the response. The tree-structured approach makes it possible to detect clusters of categories that share the same effect while letting other predictors, in particular metric predictors, have a linear or additive effect on the response. An algorithm for the fitting is proposed and various stopping criteria are evaluated. The preferred stopping criterion is based on $$p$$-values representing a conditional inference procedure. In addition, the stability of clusters and the relevance of predictors are investigated by bootstrap methods. Several applications show the usefulness of the tree-structured approach, and small simulation studies demonstrate that the fitting procedure works well.
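The clustering idea in the summary can be illustrated with a minimal sketch. It is not the authors' algorithm and not the `structree` implementation. It assumes a Gaussian response, one categorical predictor and one metric predictor with a linear effect. The function names `fit` and `tree_cluster`, the ordering of categories by their estimated effect, and the `min_gain` threshold (a crude stand-in for the $$p$$-value-based stopping criterion) are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def fit(y, x, groups):
    """Least-squares fit of y ~ group intercepts + slope * x.

    Uses within-group centring, which yields the same slope and residual
    sum of squares as regressing on group dummies plus x.
    Returns (rss, slope, intercepts).
    """
    sums = defaultdict(lambda: [0, 0.0, 0.0])  # count, sum of x, sum of y
    for yi, xi, g in zip(y, x, groups):
        s = sums[g]
        s[0] += 1
        s[1] += xi
        s[2] += yi
    means = {g: (sx / n, sy / n) for g, (n, sx, sy) in sums.items()}
    sxx = sxy = syy = 0.0
    for yi, xi, g in zip(y, x, groups):
        xd, yd = xi - means[g][0], yi - means[g][1]
        sxx += xd * xd
        sxy += xd * yd
        syy += yd * yd
    slope = sxy / sxx if sxx > 0 else 0.0
    rss = syy - slope * sxy
    intercepts = {g: my - slope * mx for g, (mx, my) in means.items()}
    return rss, slope, intercepts


def tree_cluster(y, x, cat, min_gain=0.05):
    """Greedy binary splitting of categories into clusters sharing one effect.

    Assumption: categories are first ordered by their estimated effect in the
    unrestricted model, so only splits of that ordering are searched.
    Assumption: splitting stops when the best split reduces the residual sum
    of squares by less than `min_gain` (relative), not by a p-value.
    """
    _, _, effects = fit(y, x, cat)
    ordered = sorted(effects, key=effects.get)
    clusters = [ordered]  # start with all categories in one cluster

    def labels_for(partition):
        lookup = {c: i for i, group in enumerate(partition) for c in group}
        return [lookup[c] for c in cat]

    current = fit(y, x, labels_for(clusters))[0]
    while True:
        best = None
        for i, group in enumerate(clusters):
            for cut in range(1, len(group)):
                candidate = clusters[:i] + [group[:cut], group[cut:]] + clusters[i + 1:]
                r = fit(y, x, labels_for(candidate))[0]
                if best is None or r < best[0]:
                    best = (r, candidate)
        if best is None or current <= 0 or (current - best[0]) / current < min_gain:
            return clusters
        current, clusters = best
```

Calling `tree_cluster(y, x, cat)` returns a list of clusters, each a list of category labels that are modelled with a common effect, while `x` keeps its linear effect throughout. A relative-gain threshold is used here only because it keeps the sketch short and deterministic; a test-based stopping rule as described in the summary would replace that single comparison.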

### MSC:

- 62H30 Classification and discrimination; cluster analysis (statistical aspects)
- 62J12 Generalized linear models (logistic models)
- 62J02 General nonlinear regression

### Software:

gamair; CasANOVA; BayesX; R; gvcm.cat; structree