×

Kernel cuts: kernel and spectral clustering meet regularization. (English) Zbl 1464.62331

Summary: This work bridges the gap between two popular methodologies for data partitioning: kernel clustering and regularization-based segmentation. While addressing closely related practical problems, these general methodologies may seem very different based on how they are covered in the literature. The differences may show up in motivation, formulation, and optimization, e.g. spectral relaxation versus max-flow. We explain how regularization and kernel clustering can work together and why this is useful. Our joint energy combines standard regularization, e.g. MRF potentials, and kernel clustering criteria like normalized cut. Complementarity of such terms is demonstrated in many applications using our bound optimization Kernel Cut algorithm for the joint energy (code is publicly available). While detailing combinatorial move-making, our main focus are new linear kernel and spectral bounds for kernel clustering criteria allowing their integration with any regularization objectives with existing discrete or continuous solvers.

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62M15 Inference from stochastic processes and spectral analysis
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., & Ssstrunk, S. (2012). Slic superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2274-2282.
[2] Adams, A., Baek, J., & Davis, M. A. (2010). Fast high-dimensional filtering using the permutohedral lattice. Computer Graphics Forum, 29(2), 753-762.
[3] Aggarwal, C. C., & Reddy, C. K. (eds.) (2014). Data clustering: Algorithms and applications. London: Chapman & Hall/CRC. · Zbl 1331.62026
[4] Arbelaez, P., Maire, M., Fowlkes, C., & Malik, J. (2011). Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5), 898-916.
[5] Bach, F., & Jordan, M. (2003). Learning spectral clustering. Advances in Neural Information Processing Systems, 16, 305-312.
[6] Bazaraa, M. S., Sherali, H. D., & Shetty, C. M. (2006). Nonlinear programming: Theory and algorithms. Hoboken: Wiley. · Zbl 1140.90040
[7] Belkin, M., & Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6), 1373-1396. · Zbl 1085.68119
[8] Belongie, S., & Malik, J. (1998). Finding boundaries in natural images: A new method using point descriptors and area completion. In Proceedings of the European Conference on Computer Vision (ECCV).
[9] Ben Ayed, I., Gorelick, L., & Boykov, Y. (2013). Auxiliary cuts for general classes of higher order functionals. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1304-1311). Portland, Oregon. http://www.csd.uwo.ca/ yuri/Abstracts/cvpr13-auxcut-abs.shtml.
[10] Ben Ayed, I., Mitiche, A., & Belhadj, Z. (2005). Multiregion level set partitioning of synthetic aperture radar images. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 27(5), 793-800.
[11] Bishop, C. M. (2006). Pattern recognition and machine learning. Berlin: Springer. · Zbl 1107.68072
[12] Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press. · Zbl 1058.90049
[13] Boykov, Y., & Jolly, M. P. (2001). Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In ICCV (vol. I, pp. 105-112).
[14] Boykov, Y., & Kolmogorov, V. (2003). Computing geodesics and minimal surfaces via graph cuts. In International Conference on Computer Vision (vol. I, pp. 26-33).
[15] Boykov, Y., & Funka-Lea, G. (2006). Graph cuts and efficient N-D image segmentation. International Journal of Computer Vision (IJCV), 70(2), 109-131.
[16] Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222-1239.
[17] Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222-1239.
[18] Breiman, L. (1996). Technical note: Some properties of splitting criteria. Machine Learning, 24(1), 41-47. · Zbl 0849.68095
[19] Brox, T., & Malik, J. (2010). Object segmentation by long term analysis of point trajectories. In Computer Vision-ECCV 2010 (pp. 282-295). Springer.
[20] Brox, T., & Malik, J. (2011). Large displacement optical flow: Descriptor matching in variational motion estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(3), 500-513.
[21] Carreira-Perpinan, M. A., & Wang, W. (2013). The K-modes algorithm for clustering. arXiv:1304.6478v1 [cs.LG]
[22] Caselles, V., Kimmel, R., & Sapiro, G. (1997). Geodesic active contours. International Journal of Computer Vision, 22(1), 61-79. · Zbl 0894.68131
[23] Chambolle, A. (2004). An algorithm for total variation minimization and applications. Journal of Mathematical imaging and vision, 20(1-2), 89-97. · Zbl 1366.94048
[24] Chambolle, A., & Pock, T. (2011). A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision, 40(1), 120-145. · Zbl 1255.68217
[25] Chan, T., & Vese, L. (2001). Active contours without edges. IEEE Transactions on Image Processing, 10(2), 266-277. · Zbl 1039.68779
[26] Chew, S. E., & Cahill, N. D. (2015). Semi-supervised normalized cuts for image segmentation. In The IEEE International Conference on Computer Vision (ICCV).
[27] Chitta, R., Jin, R., Havens, T., & Jain, A. (2011). Scalable kernel clustering: Approximate kernel k-means. In KDD (pp. 895-903).
[28] Collins, M.D., Liu, J., Xu, J., Mukherjee, L., & Singh, V. (2014). Spectral clustering with a convex regularizer on millions of images. In Computer Vision-ECCV 2014 (pp. 282-298). Springer.
[29] Comaniciu, D., & Meer, P. (2002). Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 24(5), 603-619.
[30] Cox, I., Rao, S., & Zhong, Y. (1996). “Ratio regions”: A technique for image segmentation. In International Conference on Pattern Recognition (ICPR) (pp. 557-564).
[31] Cox, T., & Cox, M. (2000). Multidimensional scaling. Boca Raton: CRC Press. · Zbl 0853.62047
[32] Cremers, D., Rousson, M., & Deriche, R. (2007). A review of statistical approaches to level set segmentation: Integrating color, texture, motion and shape. International Journal of Computer Vision, 72(2), 195-215.
[33] Delong, A., Osokin, A., Isack, H., & Boykov, Y. (2012). Fast approximate energy minization with label costs. International Journal of Computer Vision (IJCV), 96(1), 1-27. · Zbl 1235.68257
[34] Deng, Z., Todorovic, S., & Latecki, L. J. (2015). Semantic segmentation of RGBD images with mutex constraints. In International Conference on Computer Vision (ICCV). Santiago, Chile.
[35] Dhillon, I., Guan, Y., & Kulis, B. (2004). Kernel k-means, spectral clustering and normalized cuts. In KDD.
[36] Dhillon, I., Guan, Y., & Kulis, B. (2007). Weighted graph cuts without eigenvectors: A multilevel approach. IEEE Transactions on Pattern Analysis and Machine Learning (PAMI), 29(11), 1944-1957.
[37] Dou, M., Taylor, J., Fuchs, H., Fitzgibbon, A., & Izadi, S. (2015). 3D scanning deformable objects with a single RGBD sensor. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 493-501).
[38] Duda, R., Hart, P., & Stork, D. (2001). Pattern classification. Hoboken: Wiley. · Zbl 0968.68140
[39] Duda, R. O., Hart, P. E., & Stork, D. G. (2001). Pattern classification. Hoboken: Wiley. · Zbl 0968.68140
[40] Eriksson, A., Olsson, C., & Kahl, F. (2011). Normalized cuts revisited: A reformulation for segmentation with linear grouping constraints. Journal of Mathematical Imaging and Vision, 39(1), 45-61. · Zbl 1255.68171
[41] Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721-741. · Zbl 0573.62030
[42] Girolami, M. (2002). Mercer kernel-based clustering in feature space. IEEE Transactions on Neural Networks, 13(3), 780-784.
[43] Golub, G. H., & Van Loan, C. F. (2012). Matrix computations (Vol. 3). Baltimore: JHU Press. · Zbl 1268.65037
[44] Gottfried, J. M., Fehr, J., & Garbe, C. S. (2011). Computing range flow from multi-modal kinect data. In Advances in visual computing (pp. 758-767). Springer.
[45] Gulshan, V., Lempitsky, V., & Zisserman, A. (2011). Humanising grabcut: Learning to segment humans using the kinect. In Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on (pp. 1127-1133). IEEE.
[46] Hein, M., Lal, T. N., & Bousquet, O. (2004). Hilbertian metrics on probability measures and their application in SVM’s. Pattern Recognition LNCS, 3175, 270-277.
[47] Hochbaum, D. S. (2010). Polynomial time algorithms for ratio regions and a variant of normalized cut. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(5), 889-898.
[48] Hofmann, T., & Buhmann, J. (1997). Pairwise data clustering by deterministic annealing. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 19(1), 1-14.
[49] Ishikawa, H. (2003). Exact optimization for Markov random fields with convex priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(10), 1333-1336.
[50] Jayasumana, S., Hartley, R., Salzmann, M., Li, H., & Harandi, M. (2015). Kernel methods on Riemannian manifolds with Gaussian RBF kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37, 2464-2477.
[51] Jermyn, I., & Ishikawa, H. (2001). Globally optimal regions and boundaries as minimum ratio weight cycles. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 23(10), 1075-1088.
[52] Kappes, J. H., Andres, B., Hamprecht, F. A., Schnörr, C., Nowozin, S., Batra, D., et al. (2015). A comparative study of modern inference techniques for structured discrete energy minimization problems. International Journal of Computer Vision, 115(2), 155-184.
[53] Kearns, M., Mansour, Y., & Ng, A. (1997). An information-theoretic analysis of hard and soft assignment methods for clustering. In Conference on Uncertainty in Artificial Intelligence (UAI).
[54] Kohli, P., Torr, P. H., et al. (2009). Robust higher order potentials for enforcing label consistency. International Journal of Computer Vision, 82(3), 302-324.
[55] Kolmogorov, V., Boykov, Y., & Rother, C. (2007). Applications of parametric maxflow in computer vision. In IEEE International Conference on Computer Vision (ICCV).
[56] Kolmogorov, V. (2006). Convergent tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10), 1568-1583.
[57] Kolmogorov, V. (2006). Convergent tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10), 1568-1583.
[58] Krahenbuhl, P., & Koltun, V. (2011). Efficient inference in fully connected CRFs with Gaussian edge potentials. In NIPS.
[59] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097-1105).
[60] Kulis, B., Basu, S., Dhillon, I., & Mooney, R. (2009). Semi-supervised graph clustering: A kernel approach. Machine Learning, 74(1), 1-22. · Zbl 1472.68144
[61] Lange, K., Hunter, D. R., & Yang, I. (2000). Optimization transfer using surrogate objective functions. Journal of Computational and Graphical Statistics, 9(1), 1-20.
[62] Lempitsky, V., Blake, A., & Rother, C. (2008). Image segmentation by branch-and-mincut. In ECCV.
[63] Lempitsky, V., Kohli, P., Rother, C., & Sharp, T. (2009). Image segmentation with a bounding box prior. In International Conference on Computer Vision (ICCV) (pp. 277-284).
[64] Li, T., Ma, S., & Ogihara, M. (2004). Entropy-based criterion in categorical clustering. In: International Conference Mobile Learning.
[65] Li, S. (2009). Markov random field modeling in image analysis (3rd ed.). Berlin: Springer. · Zbl 1183.68691
[66] Louppe, G., Wehenkel, L., Sutera, A., & Geurts, P. (2013). Understanding variable importances in forests of randomized trees. In NIPS (pp. 431-439).
[67] MacKay, D. J. (2003). Information theory, inference, and learning algorithms. Cambridge: Cambridge University Press. · Zbl 1055.94001
[68] Malik, J., Belongie, S., Leung, T., & Shi, J. (2001). Contour and texture analysis for image segmentation. International Journal of Computer Vision, 43(1), 7-27. · Zbl 0972.68604
[69] Marin, D., Tang, M., Ayed, I. B., & Boykov, Y. (2018). Kernel clustering: Density biases and solutions. https://doi.org/10.1109/TPAMI.2017.2780166.
[70] Martin, D., Fowlkes, C., Tal, D., & Malik, J. (2001). A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Computer Vision, 2001. ICCV 2001. Proceedings. Eighth IEEE International Conference on (vol. 2, pp. 416-423). IEEE.
[71] McGuinness, K., & Oconnor, N. E. (2010). A comparative evaluation of interactive segmentation algorithms. Pattern Recognition, 43(2), 434-444. · Zbl 1186.68417
[72] Menze, M., & Geiger, A. (2015). Object scene flow for autonomous vehicles. In Conference on Computer Vision and Pattern Recognition (CVPR).
[73] Mitiche, A., & Ayed, I. B. (2010). Variational and level set methods in image segmentation. Berlin: Springer. · Zbl 1371.94035
[74] Muller, K., Mika, S., Ratsch, G., Tsuda, K., & Scholkopf, B. (2001). An introduction to kernel-based learning algorithms. IEEE Transactions on Neural Networks, 12(2), 181-201.
[75] Müller, K. R., Mika, S., Rätsch, G., Tsuda, K., & Schölkopf, B. (2001). An introduction to kernel-based learning algorithms. IEEE Transactions on Neural Networks, 12(2), 181-201.
[76] Mumford, D., & Shah, J. (1989). Optimal approximations by piecewise smooth functions and associated variational problems. Communications on Pure and Applied Mathematics, 42, 577-685. · Zbl 0691.49036
[77] Narasimhan, M., & Bilmes, J. A. (2005). A submodular – supermodular procedure with applications to discriminative structure learning. In UAI (pp. 404-412).
[78] Newcombe, R. A., Izadi, S., Hilliges, O., Molyneaux, D., Kim, D., Davison, A. J., Kohi, P., Shotton, J., Hodges, S., & Fitzgibbon, A. (2011). Kinectfusion: Real-time dense surface mapping and tracking. In Mixed and augmented reality (ISMAR), 2011 10th IEEE international symposium on (pp. 127-136). IEEE.
[79] Ng, A., Jordan, M., & Weiss, Y. (2002). On spectral clustering: analysis and an algorithm. In Advances in Neural Information Processing Systems (NIPS) (vol. 2, pp. 849-856).
[80] Nieuwenhuis, C., & Cremers, D. (2013). Spatially varying color distributions for interactive multilabel segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(5), 1234-1247.
[81] Ochs, P., & Brox, T. (2011). Object segmentation in video: A hierarchical variational approach for turning point trajectories into dense regions. In Computer Vision (ICCV), 2011 IEEE International Conference on (pp. 1583-1590). IEEE.
[82] Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision, 42(3), 145-175. · Zbl 0990.68601
[83] Paris, S., & Durand, F. (2006). A fast approximation of the bilateral filter using a signal processing approach. In Computer Vision-ECCV 2006 (pp. 568-580). Springer.
[84] Park, K., & Gould, S. (2012). On learning higher-order consistency potentials for multi-class pixel labeling. In ECCV.
[85] Pock, T., Chambolle, A., Cremers, D., & Bischof, H. (2009). A convex relaxation approach for computing minimal partitions. In IEEE conference on Computer Vision and Pattern Recognition (CVPR).
[86] Ren, X., Bo, L., & Fox, D. (2012). Rgb-(d) scene labeling: Features and algorithms. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on (pp. 2759-2766). IEEE.
[87] Rose, K. (1998). Deterministic annealing for clustering, compression, classification, regression, and related optimization problems. Proceedings of the IEEE, 86(11), 2210-2239.
[88] Rother, C., Kolmogorov, V., & Blake, A. (2004). Grabcut: Interactive foreground extraction using iterated graph cuts. In ACM Transactions on Graphics (SIGGRAPH).
[89] Roth, V., Laub, J., Kawanabe, M., & Buhmann, J. (2003). Optimal cluster preserving embedding of nonmetric proximity data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(12), 1540-1551.
[90] Rousson, M., & Deriche, R. (2002). A variational framework for active and adaptative segmentation of vector valued images. In Workshop on motion and video computing.
[91] Salah, M. B., Mitiche, A., & Ayed, I. B. (2010). Effective level set image segmentation with a kernel induced data term. IEEE Transactions on Image Processing, 19(1), 220-232. · Zbl 1371.68284
[92] Shawe-Tayler, J., & Cristianini, N. (2004). Kernel methods for pattern analysis. Cambridge: Cambridge University Press.
[93] Shi, J., & Malik, J. (2000). Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 888-905.
[94] Silberman, N., Hoiem, D., Kohli, P. & Fergus, R. (2012). Indoor segmentation and support inference from rgbd images. In ECCV.
[95] Sung, K. K., & Poggio, T. (1995). Example based learning for viewbased human face detection. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 20, 39-51.
[96] Tang, M., Ayed, I. B., & Boykov, Y. (2014). Pseudo-bound optimization for binary energies. In European Conference on Computer Vision (ECCV) (pp. 691-707).
[97] Tang, M., Ayed, I. B., Marin, D., & Boykov, Y. (2015). Secrets of grabcut and kernel k-means. In International Conference on Computer Vision (ICCV). Santiago, Chile.
[98] Tang, M., Gorelick, L., Veksler, O., & Boykov, Y. (2013). Grabcut in one cut. In International Conference on Computer Vision (ICCV). Sydney, Australia.
[99] Tang, M., Marin, D., Ayed, I. B., & Boykov, Y. (2016). Kernel cuts: MRF meets kernel and spectral clustering. In arXiv:1506.07439
[100] Tang, M., Marin, D., Ayed, I. B., & Boykov, Y. (2016). Normalized cut meets MRF. In European Conference on Computer Vision (ECCV). Amsterdam, Netherlands.
[101] Vapnik, V. (1998). Statistical learning theory. Hoboken: Wiley. · Zbl 0935.62007
[102] Varma, M., & Zisserman, A. (2005). A statistical approach to texture classification from single images. International Journal of Computer Vision, 62(1-2), 61-81.
[103] Veksler, O. (2018). Efficient graph cut optimisation for full CRFs with quantized edges. IEEE Transactions on Pattern Analysis and Machine Intelligence. arXiv:1809.04995.
[104] Vicente, S., Kolmogorov, V., & Rother, C. (2009). Joint optimization of segmentation and appearance models. In International Conference on Computer Vision (ICCV).
[105] Von Luxburg, U. (2007). A tutorial on spectral clustering. Statistics and Computing, 17(4), 395-416.
[106] Wang, S., & Siskind, J. M. (2003). Image segmentation with ratio cut. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 25(6), 675-690.
[107] Werner, T. (2007). A linear programming approach to max-sum problem: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(7), 1165-1179.
[108] Xu, L., Li, W., & Schuurmans, D. (2009). Fast normalized cut with linear constraints. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 2866-2873).
[109] Yedidia, J. S., Freeman, W. T., & Weiss, Y. (2005). Constructing free-energy approximations and generalized belief propagation algorithms. IEEE Transactions on Information Theory, 51(7), 2282-2312. · Zbl 1283.94023
[110] Yu, S., & Shi, J. (2003). Multiclass spectral clustering. In International Conference on Computer Vision (ICCV).
[111] Yu, Y., Fang, C., & Liao, Z. (2015). Piecewise flat embedding for image segmentation. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1368-1376).
[112] Yu, S. X., & Shi, J. (2004). Segmentation given partial grouping constraints. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 26(2), 173-183.
[113] Zelnik-Manor, L., & Perona, P. (2004). Self-tuning spectral clustering. In Advances in NIPS (pp. 1601-1608).
[114] Zhu, S. C., & Yuille, A. (1996). Region competition: Unifying snakes, region growing, and Bayes/MDL for multiband image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(9), 884-900.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.