×

A simple greedy algorithm for finding functional relations: Efficient implementation and average case analysis. (English) Zbl 1026.68035

Summary: Inferring functional relations from relational databases is important for the discovery of scientific knowledge because many experimental data are represented in the form of tables and many rules are represented in the form of functions. A simple greedy algorithm has been known as an approximation algorithm for this problem. This paper presents an efficient implementation of the algorithm. This paper also shows that the algorithm can identify an exact solution for simple functions if input data for each function are generated uniformly at random and the size of the domain is bounded by a constant. Results of computational experiments using artificially generated data are presented to verify the approach.

MSC:

68P15 Database theory
68W05 Nonnumerical algorithms
68T05 Learning and adaptive systems in artificial intelligence

Software:

C4.5
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] R. Agrawal, T. Imielinski, A. Swami, Mining association rules between sets of items in large databases, in: Proc. ACM SIGMOD Conf. on Management of Data, ACM, New York, 1993, pp. 207-216.; R. Agrawal, T. Imielinski, A. Swami, Mining association rules between sets of items in large databases, in: Proc. ACM SIGMOD Conf. on Management of Data, ACM, New York, 1993, pp. 207-216.
[2] Akutsu, T.; Bao, F., Approximating minimum keys and optimal substructure screens, (Proc. COCOON ’96. Proc. COCOON ’96, Lecture Notes in Computer Science, Vol. 1090 (1996), Springer: Springer Berlin), 290-299
[3] T. Akutsu, S. Miyano, Selecting informative genes for cancer classification using gene expression data, in: Proc. 2001 IEEE—EURASIP Workshop on Nonlinear Signal and Image Processing (CD-ROM book, http://www.ee.udel.edu/nsip/index.html; T. Akutsu, S. Miyano, Selecting informative genes for cancer classification using gene expression data, in: Proc. 2001 IEEE—EURASIP Workshop on Nonlinear Signal and Image Processing (CD-ROM book, http://www.ee.udel.edu/nsip/index.html
[4] Akutsu, T.; Miyano, S.; Kuhara, S., Algorithms for identifying Boolean networks and related biological networks based on matrix multiplication and fingerprint function, J. Comput. Biol., 7, 331-343 (2000)
[5] T. Akutsu, A. Takasu, Inferring approximate functional dependencies from example data, in: Proc. AAAI93 Workshop on Knowledge Discovery in Databases, AAAI, New York, 1993, pp. 138-152.; T. Akutsu, A. Takasu, Inferring approximate functional dependencies from example data, in: Proc. AAAI93 Workshop on Knowledge Discovery in Databases, AAAI, New York, 1993, pp. 138-152.
[6] Coppersmith, D.; Winograd, S., Matrix multiplication via arithmetic progression, J. Symbol. Comput., 9, 251-280 (1990) · Zbl 0702.65046
[7] Cormen, T. H.; Leiserson, C. E.; Rivest, R. L., Introduction to Algorithms (1990), The MIT Press: The MIT Press Cambridge, MA · Zbl 1158.68538
[8] Johnson, D. S., Approximation algorithms for combinatorial problems, J. Comput. System Sci., 9, 256-278 (1974) · Zbl 0296.65036
[9] J. Kivinen, H. Mannila, Approximate dependency inference from relations, in: Proc. Fourth Internat. Conf. on Database Theory, 1992, pp. 86-98.; J. Kivinen, H. Mannila, Approximate dependency inference from relations, in: Proc. Fourth Internat. Conf. on Database Theory, 1992, pp. 86-98.
[10] H. Mannila, K. Räihä, Dependency inference, in: Proc. 13th VLDB Conf., 1987, pp. 155-158.; H. Mannila, K. Räihä, Dependency inference, in: Proc. 13th VLDB Conf., 1987, pp. 155-158.
[11] Mannila, H.; Räihä, K., On the complexity of inferring functional dependencies, Discrete Appl. Math., 40, 237-243 (1992) · Zbl 0767.68034
[12] Motowani, R.; Raghavan, P., Randomized Algorithms (1995), Cambridge University Press: Cambridge University Press Cambridge
[13] Quinlan, J. R., C4.5 Programs for Machine Learning (1993), Morgan Kaufmann: Morgan Kaufmann Los Altos, CA
[14] Somogyi, R.; Sniegoski, C. A., Modeling the complexity of genetic networks: understanding multigene and pleiotropic regulation, Complexity, 1, 45-63 (1996)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.