×

Characterizing functional dependencies in formal concept analysis with pattern structures. (English) Zbl 1320.68064

Summary: Computing functional dependencies from a relation is an important database topic, with many applications in database management, reverse engineering and query optimization. Whereas it has been deeply investigated in those fields, strong links exist with the mathematical framework of Formal Concept Analysis. Considering the discovery of functional dependencies, it is indeed known that a relation can be expressed as the binary relation of a formal context, whose implications are equivalent to those dependencies. However, this leads to a new data representation that is quadratic in the number of objects w.r.t. the original data. Here, we present an alternative avoiding such a data representation and show how to characterize functional dependencies using the formalism of pattern structures, an extension of classical FCA to handle complex data. We also show how another class of dependencies can be characterized with that framework, namely, degenerated multivalued dependencies. Finally, we discuss and compare the performances of our new approach in a series of experiments on classical benchmark datasets.

MSC:

68P15 Database theory
06A15 Galois correspondences, closure operators (in relation to ordered sets)
68T30 Knowledge representation

Software:

LCM; TANE; FastFDs
PDF BibTeX XML Cite
Full Text: DOI Link

References:

[1] Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley, Reading (1995) · Zbl 0848.68031
[2] Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., Verkamo, A.I.: Fast discovery of association rules. In: Advances in Knowledge Discovery and Data Mining, pp. 307-328. AAAI/MIT Press (1996)
[3] Babin, MA; Kuznetsov, SO, Computing premises of a minimal cover of functional dependencies is intractable, Discret. Appl. Math., 161, 742-749, (2013) · Zbl 1263.68046
[4] Baixeries, J.: Lattice Characterization of Armstrong and Symmetric Dependencies (PhD Thesis). Universitat Politècnica de Catalunya (2007)
[5] Baixeries, J., Balcázar, J.L.: Characterization and armstrong relations for degenerate multivalued dependencies using formal concept analysis. In: Ganter, B., Godin, R. (eds.) ICFCA, Volume 3403 of Lecture Notes in Computer Science, pp. 162-175. Springer (2005 · Zbl 1078.68768
[6] Baudinet, M; Chomicki, J; Wolper, P, Constraint-generating dependencies, J. Comput. Syst. Sci., 59, 94-115, (1999) · Zbl 0939.68030
[7] Beeri, C; Dowd, M; Fagin, R; Statman, R, On the structure of Armstrong relations for functional dependencies, J. ACM, 31, 30-46, (1984) · Zbl 0629.68096
[8] Beeri, C; Vardi, MY, Formal systems for tuple and equality generating dependencies, SIAM J. Comput., 13, 76-98, (1984) · Zbl 0544.68064
[9] Belohlávek, R., Vychodil, V.: Data tables with similarity relations: functional dependencies, complete rules and non-redundant bases. In: Lee,M.-L., Tan, K.-L.,Wuwongse, V. (eds.) DASFAA, Volume 3882 of Lecture Notes in Computer Science, pp. 644-658. Springer (2006) · Zbl 0462.68082
[10] Bohannon, P., Fan, W., Geerts, F., Jia, X., Kementsietsidis, A.: Conditional functional dependencies for data cleaning. In: Chirkova, R., Dogac, A., Özsu, M.T., Sellis, T.K. (eds.) ICDE, pp. 746-755. IEEE (2007) · Zbl 1026.06008
[11] Caspard, N; Monjardet, B, The lattices of closure systems, closure operators, and implicational systems on a finite set: a survey, Discret. Appl. Math., 127, 241-269, (2003) · Zbl 1026.06008
[12] Davey, B.A., Priestley, H.A.: Introduction to Lattices and Order. Cambridge University Press, Cambridge (1990) · Zbl 0701.06001
[13] Diallo, T; Novelli, N; Petit, J-M, Discovering (frequent) constant conditional functional dependencies, IJDMMM, 4, 205-223, (2012)
[14] Fan,W.: Dependencies revisited for improving data quality. In: Proceedings of the Twenty-Seventh ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS ’08, pp. 159-170. ACM, New York (2008)
[15] Fan, W; Geerts, F; Li, J; Xiong, M, Discovering conditional functional dependencies, IEEE Trans. Knowl. Data Eng., 23, 683-698, (2011)
[16] Ganter, B., Kuznetsov, S.O.: Pattern structures and their projections. In: Delugach, H.S., Stumme, G. (eds.) Conceptual Structures: Broadening the Base, Proceedings of the 9th International Conference on Conceptual Structures (ICCS 2001), LNCS 2120, pp. 129-142. Springer (2001) · Zbl 0994.68147
[17] Ganter, B., Wille, R.: Formal Concept Analysis. Springer, Berlin (1999) · Zbl 0909.06001
[18] Graetzer, G., Davey, B., Freese, R., Ganter, B., Greferath, M., Jipsen, P., Priestley, H., Rose, H., Schmidt, E., Schmidt, S., Wehrung, F., Wille, R.: General Lattice Theory. Freeman, San Francisco (1971)
[19] Guigues, J-L; Duquenne, V, Familles minimales d’implications informatives résultant d’un tableau de données binaires, Math. Sci. Hum., 95, 5-18, (1986)
[20] Huhtala, Y; Kärkkäinen, J; Porkka, P; Toivonen, H, Tane: an efficient algorithm for discovering functional and approximate dependencies, Comput. J., 42, 100-111, (1999) · Zbl 0944.68054
[21] Kanellakis, P.C.: Elements of relational database theory. In: van Leeuwen, J. (ed.) Handbook of Theoretical Computer Science, vol. B, pp. 1073-1156. MIT Press, Cambridge (1990) · Zbl 0900.68090
[22] Kaytoue, M., Kuznetsov, S.O., Napoli, A.: Revisiting numerical pattern mining with formal concept analysis. In: Walsh, T. (ed.) IJCAI 2011, Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, Catalonia, Spain, July 16-22, 2011, pp. 1342-1347. IJCAI/AAAI, Barcelona (2011) · Zbl 0544.68064
[23] Kaytoue, M; Kuznetsov, SO; Napoli, A; Duplessis, S, Mining gene expression data with pattern structures in formal concept analysis, Inf. Sci., 181, 1989-2001, (2011)
[24] Kuznetsov, SO, Machine learning on the basis of formal concept analysis, Autom. Remote Control, 62, 1543-1564, (2001) · Zbl 1069.68580
[25] Kuznetsov, S.O.: Galois connections in data analysis: contributions from the soviet era and modern russian research. In: Ganter, B., Stumme, G., Wille, R. (eds.) Formal Concept Analysis, Foundations and Applications. Lecture Notes in Computer Science 3626, pp. 196-225. Springer (2005) · Zbl 1152.68628
[26] Kuznetsov, SO; Obiedkov, SA, Comparing performance of algorithms for generating concept lattices, J. Exp. Theor. Artif. Intell., 14, 189-216, (2002) · Zbl 1024.68020
[27] Lopes, S; Petit, J-M; Lakhal, L, Functional and approximate dependency mining: database and fca points of view, J. Exp. Theor. Artif. Intell., 14, 93-114, (2002) · Zbl 1024.68027
[28] Maier, D.: The Theory of Relational Databases. Computer Science, Rockville (1983) · Zbl 0519.68082
[29] Medina, R., Nourine, L.: A unified hierarchy for functional dependencies, conditional functional dependencies and association rules. In: Ferré, S., Rudolph, S. (eds.) ICFCA, Volume 5548 of Lecture Notes in Computer Science, pp. 98-113. Springer (2009) · Zbl 1248.68184
[30] Medina, R., Nourine, L.: Conditional functional dependencies: an fca point of view. In: Kwuida, L., Sertkaya, B. (eds.) ICFCA, Volume 5986 of Lecture Notes in Computer Science, pp. 161-176. Springer (2010) · Zbl 1274.68493
[31] Nedjar, S., Pesci, F., Lakhal, L., Cicchetti, R.: The agree concept lattice for multidimensional database analysis. In: Valtchev, P., Jäschke, R. (eds.) ICFCA, Volume 6628 of Lecture Notes in Computer Science, pp. 219-234. Springer (2011) · Zbl 1326.68288
[32] Ramakrishnan, R., Gehrke, J.: Database Management Systems, 2nd edn. Osborne/McGraw-Hill, Berkeley (2000) · Zbl 1058.68050
[33] Sagiv, Y; Delobel, C; Parker, DS; Fagin, R, An equivalence between relational database dependencies and a fragment of propositional logic, J. ACM, 28, 435-453, (1981) · Zbl 0462.68082
[34] Simovici, DA; Cristofor, D; Cristofor, L, Impurity measures in databases, Acta Inf., 38, 307-324, (2002) · Zbl 1025.68034
[35] Simovici, D.A., Tenney, R.L.: Relational Database Systems, 1st edn. Academic, Orlando (1995)
[36] Song, S; Chen, L, Differential dependencies: reasoning and discovery, ACM Trans. Database Syst., 36, 16:1-16:41, (2011)
[37] Song, S; Chen, L, Efficient discovery of similarity constraints for matching dependencies, Data Knowl. Eng., 87, 146-166, (2013)
[38] Song, S; Chen, L; Yu, PS, Comparable dependencies over heterogeneous data, VLDB J., 22, 253-274, (2013)
[39] Ullman, J.: Principles of Database Systems and Knowledge-Based Systems, vol. 1-2. Computer Science, Rockville (1989)
[40] Uno, T., Kiyomi, M., Arimura, H.: Lcm ver. 2: efficient mining algorithms for frequent/closed/maximal itemsets. In: Bayardo, R.J. Jr., Goethals, B., Zaki, M.J. (eds.) FIMI, Volume 126 of CEUR Workshop Proceedings. CEUR-WS.org (2004)
[41] Valtchev, P., Missaoui, R., Godin, R.: Formal concept analysis for knowledge discovery and data mining: The new challenges. In: Eklund, P.W. (ed.) ICFCA, Volume 2961 of Lecture Notes in Computer Science, pp. 352-371. Springer (2004) · Zbl 1198.68246
[42] Wille, R, Why can concept lattices support knowledge discovery in databases?, J. Exp. Theor. Artif. Intell., 14, 81-92, (2002) · Zbl 1022.68042
[43] Wyss, C., Giannella, C., Robertson, E.L.: Fastfds: a heuristic-driven, depth-first algorithm for mining functional dependencies from relation instances—extended abstract. In: Proceedings of the Third International Conference on Data Warehousing and Knowledge Discovery, DaWaK ’01, pp. 101-110. Springer, London (2001) · Zbl 0986.68805
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.