×

zbMATH — the first resource for mathematics

A novel mixed integer linear programming model for clustering relational networks. (English) Zbl 1384.90065
Summary: Integer programming models for clustering have applications in diverse fields addressing many problems such as market segmentation and location of facilities. Integer programming models are flexible in expressing objectives subject to some special constraints of the clustering problem. They are also important for guiding clustering algorithms that are capable of handling high-dimensional data. Here, we present a novel mixed integer linear programming model especially for clustering relational networks, which have important applications in social sciences and bioinformatics. Our model is applied to several social network data sets to demonstrate its ability to detect natural network structures.
MSC:
90C11 Mixed integer programming
90C27 Combinatorial optimization
90C35 Programming involving graphs or networks
90C90 Applications of mathematical programming
91D30 Social networks; opinion dynamics
05C82 Small world graphs, complex networks (graph-theoretic aspects)
05C12 Distance in graphs
68R05 Combinatorics in computer science
68R10 Graph theory (including graph drawing) in computer science
90C05 Linear programming
PDF BibTeX XML Cite
Full Text: DOI
References:
[1] Do, JH; Choi, DK, Clustering approaches to identifying gene expression patterns from DNA microarray data, Mol. Cells, 25, 279-288, (2008)
[2] Glover, FW; Kochenberger, G, New optimization models for data mining, Int. J. Inf. Technol. Decis. Mak., 05, 605-609, (2006)
[3] Shamir, R; Sharan, R; Jiang, T (ed.); Smith, T (ed.); Xu, Y (ed.); Zhang, M (ed.), Algorithmic approaches to clustering gene expression data, 269-299, (2002), Cambridge
[4] Pirim, H; Eksioglu, B; Perkins, AD; Yuceer, C, Clustering of high throughput gene expression data, Comput. Oper. Res., 39, 3046-3061, (2012) · Zbl 1349.62554
[5] Rao, MR, Cluster analysis and mathematical programming, J. Am. Stat. Assoc., 66, 622-626, (1971) · Zbl 0238.90042
[6] Kusiak, A, Analysis of integer programming formulations of clustering problems, Image Vis. Comput., 2, 35-40, (1984)
[7] Saglam, B; Salman, FS; Sayin, S; Turkay, M, A mixed-integer programming approach to the clustering problem with an application in customer segmentation, Eur. J. Oper. Res., 173, 866-879, (2006) · Zbl 1131.90434
[8] Mehrotra, A; Trick, MA, Cliques and clustering: a combinatorial approach, Oper. Res. Lett., 22, 1-12, (1998) · Zbl 0911.90337
[9] Xu, G; Tsoka, S; Papageorgiou, LG, Finding community structures in complex networks using mixed integer optimisation, Eur. Phys. J. B, 60, 231-239, (2007) · Zbl 1189.90027
[10] Cafieri, S; Hansen, P; Batsyn, M (ed.); Kalyagin, V (ed.); Pardalos, P (ed.), Using mathematical programming to refine heuristic solutions for network clustering, No. 104, (2014), Switzerland
[11] Agarwal, G; Kempe, D, Modularity-maximizing graph communities via mathematical programming, Eur. Phys. J. B, 66, 409-418, (2008) · Zbl 1188.90262
[12] Martins, P.: Modeling the maximum edge-weight k-plex partitioning problem. Cornell University. arxiv:1612.06243 [math.co] (2016)
[13] Nascimento, M; Toledo, F; Carvalho, A, Investigation of a new GRASP-based clustering algorithm applied to biological data, Comput. Oper. Res., 37, 1381-1388, (2010) · Zbl 1183.68494
[14] Pirim, H; Eksioglu, B; Perkins, AD, Clustering high throughput biological data with B-MST, a minimum spanning tree based heuristic, Comput. Biol. Med., 62, 94-102, (2015)
[15] Pirim, H; Gautam, D; Bhowmik, T; Perkins, AD; Eksioglu, B; Alkan, A, Performance of an ensemble clustering algorithm on biological data sets, Math. Comput. Appl., 16, 87-96, (2011)
[16] Tan, MP; Smith, EN; Broach, JR; Floudas, CA, Microarray data mining: a novel optimization-based approach to uncover biologically coherent structures, BMC Bioinform., 9, 1-21, (2008)
[17] Radicchi, F; Castellano, C; Cecconi, F; Loreto, V; Parisi, D, Defining and identifying communities in networks, Proc. Natl. Acad. Sci. USA, 101, 2658-2663, (2004)
[18] Cafieri, S; Hansen, P; Liberti, L, Locally optimal heuristic for modularity maximization of networks, Phys. Rev. E, 83, 1-8, (2011)
[19] Prieto, C; Risueno, A; Fontanillo, C; Rivas, JDL, Human gene coexpression landscape: confident network derived from tissue transcriptomic profiles, PLoS ONE, 3, 1-14, (2008)
[20] IBM ILOG CPLEX 12.6 (2014)
[21] R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2014). http://www.R-project.org/
[22] Csardi, G; Nepusz, T, The igraph software package for complex network research, Int. Complex Syst., 1695, 1-9, (2006)
[23] Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., Hornik, K.: Cluster: Cluster Analysis Basics and Extensions, R package version 2.0.6 edn. (2017) · Zbl 1349.62554
[24] Brock, G., Pihur, V., Datta, S., Datta, S.: clValid: Validation of Clustering Results, R package version 0.6-6 edn. (2014)
[25] Lusseau, D, The emergent properties of a dolphin social network, Proc. R. Soc. Lond. B Biol. Sci., 270, 186-188, (2003)
[26] Newman, MEJ; Girvan, M, Finding and evaluating community structure in networks, Phys. Rev. E, 69, 026113, (2004)
[27] Arbelaitz, O; Gurrutxaga, I; Muguerza, J; Perez, JM; Perona, I, An extensive comparative study of cluster validity indices, Pattern Recognit., 46, 243-256, (2013)
[28] Rousseeuw, PJ, Silhouettes: A graphical aid to the interpretation and validation of cluster analysis, J. Comput. Appl. Math., 20, 53-65, (1987) · Zbl 0636.62059
[29] Dunn, JC, Well-separated clusters and optimal fuzzy partitions, J. Cybernet., 4, 95-104, (1974) · Zbl 0304.68093
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.