zbMATH — the first resource for mathematics

Generalization bounds and complexities based on sparsity and clustering for convex combinations of functions from random classes. (English) Zbl 1222.62072
Summary: A unified approach is taken to derive new data-dependent generalization bounds for several classes of algorithms that have been studied in the existing literature by different approaches. This unified approach is based on an extension of Vapnik’s inequality for VC classes of sets to random classes of sets, that is, classes that depend on the random data, are invariant under permutation of the data, and possess the increasing property. Generalization bounds are derived for convex combinations of functions from random classes with certain properties. Algorithms such as support vector machines (SVMs), boosting with decision stumps, radial basis function networks, some hierarchies of kernel machines, and convex combinations of indicator functions over sets with finite VC dimension generate classifier functions that fall into this category. We also explore the individual complexities of the classifiers, such as the sparsity of weights and the weighted variance over clusters of the convex combination, introduced by V. Koltchinskii and D. Panchenko [Ann. Stat. 33, No. 4, 1455–1496 (2005; Zbl 1080.62045)], and show sparsity-type and cluster-variance-type generalization bounds for random classes.
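
For orientation, the objects named in the summary can be written in the standard textbook form below. This is a sketch under assumptions, not the paper's own notation: the symbols $h_j$, $\lambda_j$, $T$ and $\mathcal{H}$ are introduced here for illustration only, and in the random-class setting the base class $\mathcal{H}$ would be allowed to depend on the sample.

```latex
% Convex combination of base functions h_j from a class \mathcal{H}
% (illustrative notation, not taken from the paper):
f(x) = \sum_{j=1}^{T} \lambda_j h_j(x),
\qquad \lambda_j \ge 0,
\qquad \sum_{j=1}^{T} \lambda_j = 1,
\qquad h_j \in \mathcal{H}.

% Margin of f on a labeled example (x, y) with y \in \{-1, +1\}:
\operatorname{margin}(f; x, y) = y \, f(x).
```

Informally, such a combination is sparse when only a few of the weights $\lambda_j$ are far from zero. The precise sparsity and cluster-variance complexities are those defined in the cited work of Koltchinskii and Panchenko.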

62H30 Classification and discrimination; cluster analysis (statistical aspects)
68T05 Learning and adaptive systems in artificial intelligence
65C60 Computational problems in statistics (MSC2010)