×

Rough sets approach to symbolic value partition. (English) Zbl 1184.68518

Summary: In data mining, searching for simple representations of knowledge is a very important issue. Attribute reduction, continuous attribute discretization and symbolic value partition are three preprocessing techniques which are used in this regard. This paper investigates the symbolic value partition technique, which divides each attribute domain of a data table into a family for disjoint subsets, and constructs a new data table with fewer attributes and smaller attribute domains. Specifically, we investigates the optimal symbolic value partition (OSVP) problem of supervised data, where the optimal metric is defined by the cardinality sum of new attribute domains. We propose the concept of partition reducts for this problem. An optimal partition reduct is the solution to the OSVP-problem. We develop a greedy algorithm to search for a suboptimal partition reduct, and analyze major properties of the proposed algorithm. Empirical studies on various datasets from the UCI library show that our algorithm effectively reduces the size of attribute domains. Furthermore, it assists in computing smaller rule sets with better coverage compared with the attribute reduction approach.

MSC:

68T37 Reasoning under uncertainty in the context of artificial intelligence
68T05 Learning and adaptive systems in artificial intelligence

Software:

UCI-ml
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Jensen, R.; Shen, Q., Semantics-preserving dimensionality reduction: rough and fuzzy-rough-based approaches, IEEE Transactions on Knowledge and Data Engineering, 16, 12, 1457-1471 (2004)
[2] Chan, C. C.; Grzymała-Busse, J. W., On the two local inductive algorithms: PRISM and LEM2, Foundations of Computing and Deision Sciences, 19, 185-203 (1994) · Zbl 0939.68756
[3] Quinlan, J. R., Induction of decision trees, Machine Learning, 1, 81-106 (1986)
[4] (Słowiński, R., Intelligent Decision Support - Handbook of Applications and Advances of the Rough Sets Theory (1992), Kluwer Academic Publishers: Kluwer Academic Publishers Dordrecht) · Zbl 0820.68001
[5] Pawlak, Z., Rough sets, International Journal of Computer and Information Sciences, 11, 341-356 (1982) · Zbl 0501.68053
[6] J.G. Bazan, A. Skowron, P. Synak, Dynamic reducts as a tool for extracting laws from decision tables, in: Proceeding of the Symposium on Methodologies for Intelligent Systems, LNAI 869, 1994, pp. 346-355.; J.G. Bazan, A. Skowron, P. Synak, Dynamic reducts as a tool for extracting laws from decision tables, in: Proceeding of the Symposium on Methodologies for Intelligent Systems, LNAI 869, 1994, pp. 346-355.
[7] Kryszkiewicz, M., Comparative studies of alternative type of knowledge reduction in inconsistent systems, International Journal of Intelligent Systems, 16, 1, 105-120 (2001) · Zbl 0969.68146
[8] Liu, Q.; Chen, L.; Zhang, J.; Min, F., Knowledge reduction in inconsistent decision tables, (Li, X.; Zaïane, O. R.; Li, Z., ADMA, LNCS 4093 (2006), Springer), 626-635
[9] Xu, C.; Min, F., Weighted reduction for decision tables, (Wang, L.; Jiao, L.; Shi, G.; Li, X.; Liu, J., Proceedings of The Third International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2006), LNCS 4223 (2006), Springer), 246-255
[10] J. Wróblewski, Finding minimal reducts using genetic algorithms, in: P.P. Wang (Ed.), JCIS’95, Wrightsville Beach, North Carolina, 1995, pp. 186-189.; J. Wróblewski, Finding minimal reducts using genetic algorithms, in: P.P. Wang (Ed.), JCIS’95, Wrightsville Beach, North Carolina, 1995, pp. 186-189.
[11] Wang, G.; Yu, H.; Yang, D., Decision table reduction based on conditional information entropy, Chinese Journal of Computers, 25, 7, 1-8 (2002)
[12] Yao, Y. Y.; Zhao, Y.; Wang, J., On reduct construction algorithms, (Wang, G. Y.; Peters, J. F.; Skowron, A.; Yao, Y. Y., Rough Sets and Knowledge Technology, First International Conference, RSKT 2006, Chongquing, China, July 24-26, 2006, Proceedings. Rough Sets and Knowledge Technology, First International Conference, RSKT 2006, Chongquing, China, July 24-26, 2006, Proceedings, LNCS 4062 (2006), Springer), 297-304 · Zbl 1196.68275
[13] F. Min, Q. Liu, C. Fang, J. Zhang, Reduction based symbolic value partition, in: ICHIT 2006 Electronic Proceeding, Soft Computing and Rough Sets, 2006, pp. 1-10.; F. Min, Q. Liu, C. Fang, J. Zhang, Reduction based symbolic value partition, in: ICHIT 2006 Electronic Proceeding, Soft Computing and Rough Sets, 2006, pp. 1-10.
[14] Ching, J. Y.; Wong, A. K.C.; Chan, K. C.C., Class-dependent discretization for inductive learning from continuous and mixed-mode data, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17, 641-651 (1995)
[15] H.S. Nguyen, Discretization of Real Value Attributes, Boolean Reasoning Approach, Ph.D. Thesis, Warsaw University, Warsaw, Poland, 1997.; H.S. Nguyen, Discretization of Real Value Attributes, Boolean Reasoning Approach, Ph.D. Thesis, Warsaw University, Warsaw, Poland, 1997.
[16] Kurgan, L. A.; Cios, K. J., CAIM discretization algorithm, IEEE Transactions on Knowledge and Data Engineering, 16, 2, 145-153 (2004)
[17] (Quinlan, J. R., C4.5 Programs for Machine Learning (1993), Morgan Kaufman Publisher: Morgan Kaufman Publisher San Mateo, California)
[18] N.C. Berkman, Value Grouping for Binary Decision Trees, 1995.; N.C. Berkman, Value Grouping for Binary Decision Trees, 1995.
[19] Ho, K. M., Reducing decision tree fragmentation through attribute value grouping: a comparative study, Journal Intelligent Data Analysis, 4, 3-4, 255-274 (2000) · Zbl 1088.68784
[20] Pawlak, Z., Some issues on rough sets, (Peters, J. F.; Skowron, A.; Grzymała-Busse, J. W.; Kostek, B.; Świniarski, R. W.; Szczuka, M. S., Transactions on Rough Sets I, LNCS 3100 (2004), Springer-Verlag: Springer-Verlag Berlin, Heidelberg), 1-58 · Zbl 1104.68108
[21] S.H. Nguyen, Regularity Analysis and its Applications in Data Mining, Ph.D. Thesis, Warsaw University, Warsaw, Poland, 1999.; S.H. Nguyen, Regularity Analysis and its Applications in Data Mining, Ph.D. Thesis, Warsaw University, Warsaw, Poland, 1999.
[22] Yao, Y. Y., A partition model of granular computing, Transactions on Rough Sets I, 239-259 (2004) · Zbl 0949.68144
[23] Min, F.; Liu, Q.; Tan, H.; Chen, L., The \(M\)-relative reduct problem, (Wang, G. Y.; Peters, J. F.; Skowron, A.; Yao, Y. Y., Rough Sets and Knowledge Technology, First International Conference, RSKT 2006, Chongquing, China, July 24-26, 2006, Proceedings. Rough Sets and Knowledge Technology, First International Conference, RSKT 2006, Chongquing, China, July 24-26, 2006, Proceedings, LNCS 4062 (2006), Springer), 170-175 · Zbl 1196.68232
[24] (Ganter, B.; Wille, R., Formal Concept Analysis: Mathematical Foundations (1999), Springer-Verlag: Springer-Verlag New York) · Zbl 0909.06001
[25] Skowron, A.; Rauszer, C., The discernibility matrices and functions in information systems, (Słowiński, R., Intelligent Decision Support - Handbook of Applications and Advances of the Rough Sets Theory (1992), Kluwer Academic Publishers: Kluwer Academic Publishers Dordrecht), 331-362
[26] C.L. Blake, C.J. Merz, UCI Repository of Machine Learning Databases, 1998, <http://www.ics.uci.edu/ mlearn/mlrepository.html; C.L. Blake, C.J. Merz, UCI Repository of Machine Learning Databases, 1998, <http://www.ics.uci.edu/ mlearn/mlrepository.html
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.