On multi-modal fusion learning in constraint propagation. (English) Zbl 1440.68227

Summary: Constraint propagation methods perform well in constrained clustering tasks. Although several multi-modal constraint propagation methods have been proposed in recent years, a feasible and robust approach to multi-modal feature fusion in pairwise constraint propagation is still needed. This paper presents a novel fusion approach, called Multi-modal Fusion Learning (MFL), for constraint propagation on multi-modal datasets. The proposed method learns the fusion from the observed constraint information and the propagation process. It can handle any number of modalities without prior knowledge of each modality. We merge fusion learning and constraint propagation into one unified problem and solve it by bound-constrained quadratic optimization. The method has been tested on clustering tasks on two publicly available multi-modal datasets, where it shows superior performance.
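As a rough illustration of what a bound-constrained quadratic optimization looks like, the sketch below minimizes a quadratic objective over a box using projected gradient descent. It is not the formulation or the solver used in the paper: the function name `solve_box_qp`, the toy matrix `Q`, the vector `c`, the bounds [0, 1], and the reading of the solution as per-modality weights are all assumptions made here, and the objective is assumed to be convex (symmetric positive semidefinite `Q`).

```python
import numpy as np

def solve_box_qp(Q, c, lower, upper, iters=500):
    """Minimize 0.5 * x^T Q x + c^T x subject to lower <= x <= upper.

    Generic projected gradient descent; Q is assumed symmetric
    positive semidefinite (an assumption of this sketch).
    """
    Q = np.asarray(Q, dtype=float)
    c = np.asarray(c, dtype=float)
    # Step size 1/L, with L the largest eigenvalue of Q
    # (the Lipschitz constant of the gradient).
    L = max(np.linalg.eigvalsh(Q).max(), 1e-12)
    # Start from the origin projected onto the box.
    x = np.clip(np.zeros_like(c), lower, upper)
    for _ in range(iters):
        grad = Q @ x + c
        # Gradient step followed by projection onto the box.
        x = np.clip(x - grad / L, lower, upper)
    return x

# Toy problem: two hypothetical modality weights, each bounded to [0, 1].
Q = np.array([[2.0, 0.0], [0.0, 2.0]])
c = np.array([-1.0, -6.0])
weights = solve_box_qp(Q, c, 0.0, 1.0)
print(weights)
```

In this toy problem the unconstrained minimizer is (0.5, 3), so the second coordinate is pushed onto its upper bound while the first stays in the interior of the box.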

MSC:

68T05 Learning and adaptive systems in artificial intelligence
62H30 Classification and discrimination; cluster analysis (statistical aspects)
90C26 Nonconvex programming, global optimization

Software:

TagProp; LANCELOT

References:

[1] Belkin, M.; Niyogi, P., Laplacian eigenmaps and spectral techniques for embedding and clustering, NIPS (2001)
[2] Bertsekas, D. P.; Rheinboldt, W., Constrained Optimization and Lagrange Multiplier Methods (1982) · Zbl 0572.90067
[3] Chung, F. R., Spectral Graph Theory, 92 (1997), American Mathematical Soc. · Zbl 0867.05046
[4] Conn, A.; Gould, N.; Toint, P. L., Lancelot: A Fortran Package for Large-scale Nonlinear Optimization (Release A) (1992) · Zbl 0761.90087
[5] Davis, J. V.; Kulis, B.; Jain, P.; Sra, S.; Dhillon, I. S., Information-theoretic metric learning, ICML (2007)
[6] Everingham, M.; Van Gool, L.; Williams, C. K. I.; Winn, J.; Zisserman, A., The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results (2007), http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html
[7] Fu, Z.; Ip, H. H.; Lu, H.; Lu, Z., Multi-modal constraint propagation for heterogeneous image clustering, Proceedings of the 19th ACM International Conference on Multimedia, 143-152 (2011), ACM
[8] Fu, Z.; Lu, H.; Ip, H. H.; Lu, Z., Modalities consensus for multi-modal constraint propagation, Proceedings of the 20th ACM International Conference on Multimedia, 773-776 (2012), ACM
[9] Fu, Z.; Lu, Z.; Ip, H. H.-S.; Peng, Y.; Lu, H., Symmetric graph regularized constraint propagation, AAAI (2011)
[10] Guillaumin, M.; Mensink, T.; Verbeek, J.; Schmid, C., Tagprop: Discriminative metric learning in nearest neighbor models for image auto-annotation, 2009 IEEE 12th International Conference on Computer Vision, 309-316 (2009), IEEE
[11] Han, P.; Liu, G.; Huang, S.; Yuan, W.; Lu, Z., Segmentation with selectively propagated constraints, International Conference on Neural Information Processing, 585-592 (2016), Springer
[12] Jian, M.; Jung, C., Interactive image segmentation using adaptive constraint propagation, IEEE Trans. Image Process., 25, 3, 1301-1311 (2016) · Zbl 1408.94285
[13] Kahou, S. E.; Bouthillier, X.; Lamblin, P.; Gulcehre, C.; Michalski, V.; Konda, K.; Jean, S.; Froumenty, P.; Dauphin, Y.; Boulanger-Lewandowski, N., Emonets: multimodal deep learning approaches for emotion recognition in video, J. Multimodal User Interfaces, 10, 2, 99-111 (2016)
[14] Kan, M.; Shan, S.; Zhang, H.; Lao, S.; Chen, X., Multi-view discriminant analysis, IEEE Trans. Pattern Anal. Mach. Intell., 38, 1, 188-194 (2016)
[15] Lahat, D.; Adali, T.; Jutten, C., Multimodal data fusion: an overview of methods, challenges, and prospects, Proc. IEEE, 103, 9, 1449-1477 (2015)
[16] Lu, Z.; Carreira-Perpinan, M. A., Constrained spectral clustering through affinity propagation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1-8 (2008)
[17] Lu, Z.; Ip, H. H., Constrained spectral clustering via exhaustive and efficient constraint propagation, European Conference on Computer Vision, 1-14 (2010), Springer
[18] Lu, Z.; Peng, Y., Exhaustive and efficient constraint propagation: a graph-based learning approach and its applications, Int. J. Comput. Vis., 103, 3, 306-325 (2013) · Zbl 1270.68351
[19] Lu, Z.; Peng, Y., Unified constraint propagation on multi-view data, AAAI (2013)
[20] Nocedal, J.; Wright, S. J., Numerical Optimization (2006), Springer · Zbl 1104.65059
[21] Poria, S.; Cambria, E.; Bajpai, R.; Hussain, A., A review of affective computing: from unimodal analysis to multimodal fusion, Inf. Fusion, 37, 98-125 (2017)
[22] Poria, S.; Peng, H.; Hussain, A.; Howard, N.; Cambria, E., Ensemble application of convolutional neural networks and multiple kernel learning for multimodal sentiment analysis, Neurocomputing, 261, 217-230 (2017)
[23] Shao, L.; Liu, L.; Yu, M., Kernelized multiview projection for robust action recognition, Int. J. Comput. Vis., 118, 2, 115-129 (2016)
[24] Shen, G.; Jia, J.; Nie, L.; Feng, F.; Zhang, C.; Hu, T.; Chua, T.-S.; Zhu, W., Depression detection via harvesting social media: a multimodal dictionary learning solution, Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17), 3838-3844 (2017)
[25] Shi, J.; Malik, J., Normalized cuts and image segmentation, IEEE Trans. Pattern Anal. Mach. Intell., 22, 8, 888-905 (2000)
[26] Slawski, M.; Hein, M., Sparse recovery by thresholded non-negative least squares, Advances in Neural Information Processing Systems, 1926-1934 (2011)
[27] Slawski, M.; Hein, M., Non-negative least squares for high-dimensional linear models: consistency and sparse recovery without regularization, Electron. J. Stat., 7, 3004-3056 (2013) · Zbl 1280.62086
[28] Tibshirani, R., Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Ser. B, 267-288 (1996) · Zbl 0850.62538
[29] Von Luxburg, U., A tutorial on spectral clustering, Stat. Comput., 17, 4, 395-416 (2007)
[30] Wagstaff, K.; Cardie, C., Clustering with instance-level constraints, AAAI/IAAI, 1097 (2000)
[31] Wagstaff, K.; Cardie, C.; Rogers, S.; Schrödl, S., Constrained k-means clustering with background knowledge, ICML, 1, 577-584 (2001)
[32] Wang, D.; Gao, X.; Wang, X., Semi-supervised nonnegative matrix factorization via constraint propagation, IEEE Trans. Cybern., 46, 1, 233-244 (2016)
[33] Wang, M.; Hua, X.-S.; Hong, R.; Tang, J.; Qi, G.-J.; Song, Y., Unified video annotation via multigraph learning, IEEE Trans. Circuits Syst. Video Technol., 19, 5, 733-746 (2009)
[34] Weinberger, K. Q.; Blitzer, J.; Saul, L. K., Distance metric learning for large margin nearest neighbor classification, NIPS (2005)
[35] Wu, B.; Hu, B.-G.; Ji, Q., A coupled hidden Markov random field model for simultaneous face clustering and tracking in videos, Pattern Recognit., 64, 361-373 (2017)
[36] Xing, E. P.; Jordan, M. I.; Russell, S.; Ng, A. Y., Distance metric learning with application to clustering with side-information, NIPS (2002)
[37] Xu, J.; Han, J.; Nie, F., Discriminatively embedded k-means for multi-view clustering, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5356-5364 (2016)
[38] Xu, J.; Jagadeesh, V.; Manjunath, B., Multi-label learning with fused multimodal bi-relational graph, IEEE Trans. Multimedia, 16, 2, 403-412 (2014)
[39] Yu, J.; Yang, X.; Gao, F.; Tao, D., Deep multimodal distance metric learning using click constraints for image ranking, IEEE Trans. Cybern., 47, 12, 4014-4024 (2017)
[40] Zhou, D.; Bousquet, O.; Lal, T. N.; Weston, J.; Schölkopf, B., Learning with local and global consistency, Adv. Neural Inf. Process. Syst., 16, 321-328 (2004)
[41] Zhu, X.; Loy, C. C.; Gong, S., Constrained clustering with imperfect oracles, IEEE Trans. Neural Netw. Learn. Syst., 27, 6, 1345-1357 (2016)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced with data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.