A study of similarity measures through the paradigm of measurement theory: the classic case. (English) Zbl 1418.03156

Summary: Similarity measures are used in many tasks dealing with the management of data or information, such as decision-making, case-based reasoning, case-based information retrieval, recommendation systems and user profile analysis, to cite but a few. The paper aims to provide information on similarity measures that can help in choosing one of them “a priori”, on the basis of the semantics behind this choice. To this end, we study similarity measures from the point of view of the ranking relation they induce on object pairs. Using a classic method of measurement theory, we establish necessary and sufficient conditions for the existence of a particular class of numerical similarity measures representing a given binary relation among pairs of objects, where the relation expresses the idea of “no more similar than”. These conditions are all (and only) the rules that one accepts when deciding to evaluate similarity through any element of a specific class of similarity measures. We illustrate the possible application of these conditions and the relevant results on a real-world problem and discuss them in the context of cognitive psychology. We consider here the classical context, while the fuzzy context will be studied in a companion paper.
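
The idea of comparing similarity measures through the ranking they induce on object pairs can be illustrated with a minimal sketch. It is not taken from the paper: the choice of the Jaccard, Dice and simple matching (Sokal-Michener) measures, the attribute universe `UNIVERSE`, the sample `pairs` and the helper `same_ordering` are all assumptions made for illustration. Objects are modelled as sets of attributes, and two measures are treated as equivalent when they agree on every “no more similar than” comparison between pairs.

```python
from fractions import Fraction
from itertools import product

# Hypothetical attribute universe; objects are subsets of it.
UNIVERSE = frozenset(range(1, 7))

def jaccard(x, y):
    """|X ∩ Y| / |X ∪ Y|, with two empty sets treated as identical."""
    union = len(x | y)
    return Fraction(len(x & y), union) if union else Fraction(1)

def dice(x, y):
    """2|X ∩ Y| / (|X| + |Y|), with two empty sets treated as identical."""
    total = len(x) + len(y)
    return Fraction(2 * len(x & y), total) if total else Fraction(1)

def simple_matching(x, y):
    """Share of attributes on which X and Y agree (both present or both absent)."""
    agree = len(x & y) + len(UNIVERSE - (x | y))
    return Fraction(agree, len(UNIVERSE))

def same_ordering(m1, m2, pairs):
    """True if m1 and m2 induce the same 'no more similar than' relation on pairs."""
    return all(
        (m1(*p) <= m1(*q)) == (m2(*p) <= m2(*q))
        for p, q in product(pairs, repeat=2)
    )

# Illustrative object pairs (assumed data, not from the paper).
pairs = [
    (frozenset({1}), frozenset({2})),
    (frozenset({1, 2}), frozenset({1, 2, 3})),
    (frozenset({1, 2, 3, 4}), frozenset({1, 2, 3, 5, 6})),
    (frozenset({1, 2, 3, 4, 5}), UNIVERSE),
]

print(same_ordering(jaccard, dice, pairs))             # Dice = 2J / (1 + J), increasing in J
print(same_ordering(jaccard, simple_matching, pairs))  # shared absences change the ranking
```

On these pairs, Jaccard and Dice rank every pair identically because Dice is a strictly increasing transformation of Jaccard, whereas simple matching also rewards attributes absent from both objects and so orders the pairs differently. Exact `Fraction` arithmetic is used so that ties are detected without floating-point error.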

MSC:

03E72 Theory of fuzzy sets, etc.
68T37 Reasoning under uncertainty in the context of artificial intelligence
91B06 Decision theory
