Data-independent random projections from the feature-map of the homogeneous polynomial kernel of degree two. (English) Zbl 1440.68230

Summary: This paper presents a novel non-linear extension of the Random Projection method based on the degree-2 homogeneous polynomial kernel. Our algorithm implicitly maps data points into the high-dimensional feature space of that kernel and from there performs a Random Projection onto a Euclidean space of the desired dimensionality. Pairwise distances between data points in the kernel feature space are approximately preserved in the resulting representation. As opposed to previous kernelized variants of Random Projection, our method is data-independent and preserves much of the computational simplicity of the original algorithm. This is achieved by focusing on a specific kernel function, which allows us to analyze the effect of its associated feature mapping on the distribution of the Random Projection hyperplanes. Finally, we present empirical evidence that the proposed method outperforms alternative approaches in terms of pairwise distance preservation, while being significantly more efficient. We also show how, on some datasets, our method can be used to approximate the accuracy of non-linear classifiers with efficient linear ones.
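
As a rough illustration of the setting (not the authors' construction), the sketch below explicitly materializes the feature map of the degree-2 homogeneous polynomial kernel, phi(x) = vec(x x^T), for which <phi(x), phi(y)> = (x . y)^2, and then applies a plain Gaussian Random Projection to the mapped points. The point of a data-independent implicit method is precisely to avoid forming these d^2-dimensional vectors; the explicit version only shows which quantity is being preserved. All names and sizes here (phi2, n, d, k) are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.spatial.distance import pdist

def phi2(X):
    """Explicit feature map of the degree-2 homogeneous polynomial kernel
    k(x, y) = (x . y)^2, i.e. phi(x) = vec(x x^T), applied row by row."""
    return np.einsum('ni,nj->nij', X, X).reshape(len(X), -1)

rng = np.random.default_rng(0)
n, d, k = 100, 50, 200                  # points, input dim, target dim (illustrative)
X = rng.standard_normal((n, d))

F = phi2(X)                             # n x d^2 points in the kernel feature space
R = rng.standard_normal((F.shape[1], k)) / np.sqrt(k)  # Gaussian projection matrix
Y = F @ R                               # n x k projected representation

# Johnson-Lindenstrauss-style check: pairwise distance ratios concentrate near 1.
ratios = pdist(Y) / pdist(F)
print(f"mean ratio {ratios.mean():.3f}, std {ratios.std():.3f}")
```

A data-independent method in the sense of the summary realizes an equivalent projection without ever computing F, which is what keeps it tractable when d^2 is too large to store.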

MSC:

68T05 Learning and adaptive systems in artificial intelligence
62H30 Classification and discrimination; cluster analysis (statistical aspects)
68U05 Computer graphics; computational geometry (digital and algorithmic aspects)