
Ensemble of randomized soft decision trees for robust classification. (English) Zbl 1348.62023

Summary: Decision trees have become very popular for classification because of their simplicity, interpretability and good performance. To induce a decision tree classifier for data with continuous-valued attributes, the most common approach is to split the range of each continuous attribute into a hard (crisp) partition of two or more blocks, using one or more crisp (sharp) cut points. However, this can make the resulting decision tree very sensitive to noise. An existing solution to this problem is to split the continuous attribute into a fuzzy (soft) partition using soft or fuzzy cut points, grounded in fuzzy set theory, and to use fuzzy decisions at the nodes of the tree. Such trees are called soft decision trees in the literature and have been shown to perform better than conventional decision trees, especially in the presence of noise. The current paper first proposes an ensemble of soft decision trees for robust classification, in which parameters such as the splitting attribute and the fuzzy cut point are chosen randomly from a probability distribution of fuzzy information gain over the various attributes and their various cut points. The paper further proposes a probability-based information gain to achieve better results. The effectiveness of the proposed method is shown by experimental studies on three standard data sets, where the ensemble of randomized soft decision trees outperforms the related existing soft decision tree. Robustness to noise is demonstrated by injecting various levels of noise into the training set; a comparison with other related methods favors the proposed method.
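The mechanics summarized above (soft cut points, membership-weighted information gain, and gain-proportional random split selection) can be sketched concretely. The following Python fragment is a minimal illustration under stated assumptions, not the authors' implementation: it assumes linear-ramp (trapezoidal) memberships of half-width `width` around each cut point, and all function names (`soft_memberships`, `fuzzy_entropy`, `fuzzy_information_gain`, `sample_split`) are hypothetical.

```python
import numpy as np

def soft_memberships(x, cut, width):
    """Fuzzy split of values x at `cut` (width > 0 assumed).

    Values below cut - width belong fully to the 'low' branch, values
    above cut + width fully to 'high'; membership ramps linearly between.
    """
    low = np.clip((cut + width - x) / (2.0 * width), 0.0, 1.0)
    return low, 1.0 - low

def fuzzy_entropy(weights, y):
    """Shannon entropy of the labels y, weighted by fuzzy memberships."""
    total = weights.sum()
    if total == 0.0:
        return 0.0
    probs = np.array([weights[y == c].sum() for c in np.unique(y)]) / total
    probs = probs[probs > 0]
    return float(-(probs * np.log2(probs)).sum())

def fuzzy_information_gain(x, y, cut, width):
    """Gain of a soft split: crisp parent entropy minus the
    membership-weighted entropies of the two fuzzy branches."""
    low, high = soft_memberships(x, cut, width)
    parent = fuzzy_entropy(np.ones(len(x)), y)
    n = float(len(x))
    return parent - (low.sum() / n) * fuzzy_entropy(low, y) \
                  - (high.sum() / n) * fuzzy_entropy(high, y)

def sample_split(X, y, width, rng):
    """Randomized split choice: score every (attribute, cut) pair by its
    fuzzy information gain, then sample one pair with probability
    proportional to gain instead of greedily taking the maximum."""
    candidates, gains = [], []
    for j in range(X.shape[1]):
        for cut in np.unique(X[:, j])[:-1]:   # every observed value but the max
            candidates.append((j, cut))
            gains.append(max(fuzzy_information_gain(X[:, j], y, cut, width), 0.0))
    gains = np.asarray(gains)
    if gains.sum() > 0:
        probs = gains / gains.sum()
    else:
        probs = np.full(len(gains), 1.0 / len(gains))   # all gains zero: uniform
    return candidates[rng.choice(len(candidates), p=probs)]

# Toy usage: noisy binary labels driven by the first of four attributes.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = (X[:, 0] + 0.3 * rng.normal(size=100) > 0).astype(int)
attribute, cut_point = sample_split(X, y, width=0.5, rng=rng)
```

An ensemble in the spirit of the paper would grow many such trees by calling a routine like `sample_split` at every node, so that each tree follows a different random sequence of soft splits, and would classify a test example by combining the membership-weighted class distributions reached in the trees' leaves.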

MSC:

62C86 Statistical decision theory and fuzziness
91B06 Decision theory
62H86 Multivariate analysis and fuzziness
62H30 Classification and discrimination; cluster analysis (statistical aspects)

References:

[1] Han Jiawei and Micheline Kamber 2001 Data mining: Concepts and techniques. Academic Press · Zbl 1445.68004
[2] Tan Pang-Ning et al 2006 Introduction to data mining. Pearson Addison Wesley Boston
[3] Duda Richard O and Peter E Hart 1973 Pattern classification and scene analysis. Wiley-Interscience · Zbl 0277.68056
[4] Vapnik Vladimir 1999 An overview of statistical learning theory. In: IEEE Trans. Neural Netw. 10:988-999
[5] Zhu Ling 2013 Support vector machine. In: PSTAT, pp. 132-135
[6] Xu Yong et al 2013 Coarse to fine K nearest neighbor classifier. In: Pattern Recognit. Lett. 34:980-986
[7] Kolodner Janet 2014 Case-based reasoning. Morgan Kaufmann
[8] Kotsiantis S B 2013 Decision trees: a recent overview. In: Artif. Intell. Rev. 39:261-283
[9] Rodrigo C Barros, Marcio P Basgalupp, André C P L F de Carvalho and Alex A Freitas 2012 A survey of evolutionary algorithms for decision tree induction. In: IEEE Trans. Syst. Man Cybern. Part C 42:291-312
[10] Fayyad and Irani 1992 On the handling of continuous-valued attributes in decision tree generation. In: Mach. Learn. 8:87-102 · Zbl 0767.68084
[11] Fayyad and Irani 1993 Multi-interval discretization of continuous-valued attributes for classification learning. In: Int. Joint Conf. Artif. Intell. 93:1022-1027
[12] Carter and Catlett 1987 Assessing credit card applications using machine learning. In: IEEE Expert 2:71-79
[13] Dietterich and Kong 1995 Machine learning bias, statistical bias, and statistical variance of decision tree algorithms. Technical Report, Department of Computer Science, Oregon State University
[14] Marsala Christophe 2009 Data mining with ensembles of fuzzy decision trees. In: IEEE Symposium on Computational Intelligence and Data Mining, pp. 348-354
[15] Cristina Olaru, Louis Wehenkel 2003 A complete fuzzy decision tree technique. In: Fuzzy Sets Syst. 138:221-254
[16] Quinlan 1996 Improved use of continuous attributes in C4.5. In: J. Artif. Intell. Res. 4:77-90 · Zbl 0900.68112
[17] Buntine 1992 Learning classification trees. In: Stat. Comput. 2: 63-73
[18] Gür Ali and Wallace 1993 Induction of rules subject to a quality constraint: Probabilistic inductive learning. In: IEEE Trans. Knowl. Data Eng. 5:979-984
[19] Xiaomeng Wang and Christian Borgelt 2004 Information measures in fuzzy decision trees. In: Proceedings of the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2004). IEEE
[20] Wang, Liu, Hong and Tseng 1999 A fuzzy inductive learning strategy for modular rules. In: Fuzzy Sets Syst. 103:91-105
[21] Peng Yonghong and Peter A Flach 2001 Soft discretization to enhance the continuous decision tree induction. In: Proceedings of the ECML/PKDD Workshop on Integrating Aspects of Data Mining, Decision Support and Meta-Learning (IDDM)
[22] Chen Min and Simone A Ludwig 2013 Fuzzy decision tree using soft discretization and a genetic algorithm based feature selection method. In: 2013 World Congress on Nature and Biologically Inspired Computing (NaBIC). IEEE, pp. 238-244
[23] Umano, Okamoto, Hatono and Tamura 1994 Fuzzy decision trees by fuzzy ID3 algorithm and its application to diagnosis systems. In: Proceedings of the Third IEEE Conference on Fuzzy Systems. IEEE, pp. 2113-2118
[24] Detyniecki and Marsala 2007 Forest of fuzzy decision trees and their application in video mining. In: Proceedings of the 5th EUSFLAT Conference. pp. 345-352
[25] Pradhan Biswajeet 2013 A comparative study on the predictive ability of the decision tree, support vector machine and neuro-fuzzy models in landslide susceptibility mapping using GIS. In: Comput. Geosci. 51: 350-365
[26] Freund and Schapire 1996 Experiments with a new boosting algorithm. In: Proceedings of the 13th International Conference on Machine Learning. Morgan Kaufmann, San Francisco, pp. 148-156
[27] Schapire 1990 The strength of weak learnability. In: Mach. Learn. 5:197-227
[28] Zhang Cha and Yunqian Ma 2012 Ensemble machine learning. Springer · Zbl 1303.68022
[29] Breiman L 1996 Bagging predictors. In: Mach. Learn. 24:123-140 · Zbl 0858.68080
[30] Efron and Tibshirani 1993 An introduction to the bootstrap. Chapman and Hall/CRC Press · Zbl 0835.62038
[31] Freund, Iyer, Schapire and Singer 2003 An efficient boosting algorithm for combining preferences. In: J. Mach. Learn. Res. 4:933-969 · Zbl 1098.68652
[32] Breiman 2001 Random forests. In: Mach. Learn. 45: 5-32
[33] Cunningham Padraig 2007 Ensemble techniques. Technical report
[34] Geurts, Ernst and Wehenkel 2006 Extremely randomized trees. In: Mach. Learn. 63:3-42 · Zbl 1110.68124
[35] Hamza and Larocque 2005 An empirical comparison of ensemble methods on classification trees. In: J. Stat. Comput. Simul. 75:629-643 · Zbl 1075.62051
[36] Wei Fan and Sheng Ma 2003 Is random model better? On its accuracy and efficiency. In: Proceedings of the Third IEEE International Conference on Data Mining (ICDM 2003). IEEE, pp. 51-58
[37] Wei Fan and J McCloskey 2005 Effective estimation of posterior probabilities: Explaining the accuracy of randomized decision tree approaches. In: Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM 2005). IEEE
[38] Amit Yali and Donald Geman 1997 Shape quantization and recognition with randomized trees. In: Neural Comput. 9:1545-1588
[39] Dietterich T 2000 An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization. In: Mach. Learn. 40:139-157
[40] Marsala and Bouchon-Meunier 1997 Forest of fuzzy decision trees. In: Seventh International Fuzzy Systems Association World Congress, pp. 369-374 · Zbl 0886.68062
[41] Janikow and Faifer 2000 Fuzzy decision forest. In: 19th International Conference of the North American Fuzzy Information Processing Society, pp. 218-221
[42] Crockett, Bandar and McLean 2001 Growing a fuzzy decision forest. In: 10th International Conference on Fuzzy Systems. IEEE, pp. 614-617
[43] Janikow 2003 Fuzzy decision forest. In: 22nd International Conference of the North American Fuzzy Information Processing Society, pp. 480-483
[44] Bonissone, Cadenas, Garrido and Diaz-Valladares 2008 A fuzzy random forest: Fundamental for design and construction. In: 12th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems. Malaga, Spain, pp. 1231-1238
[45] Kuncheva 2003 Fuzzy vs non-fuzzy in combining classifiers designed by boosting. In: IEEE Trans. Fuzzy Syst. 11:729-741
[46] Fumera Giorgio and Fabio Roli 2005 A theoretical and experimental analysis of linear combiners for multiple classifier systems. In: IEEE Trans. Pattern Anal. Mach. Intell. 27:942-956 · Zbl 1040.68634
[47] Zadeh 1965 Fuzzy sets. In: Inform. Control 8:338-353
[48] Lior Rokach 2010 Ensemble-based classifiers. In: Artif. Intell. Rev. 33:1-39 · Zbl 1187.68495
[49] Lippmann, Fried, Graf and Zissman 2000 Evaluating intrusion detection systems: The 1998 DARPA off-line intrusion detection evaluation. In: Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX'00), pp. 12-26
[50] DARPA Intrusion Detection Data Sets. URL: http://www.ll.mit.edu/mission/communications/ist/corpora/ideval/data/index.html
[51] Mahbod Tavallaee, Ebrahim Bagheri, Wei Lu and Ali Ghorbani 2009 A detailed analysis of the KDD CUP 99 data set. In: Proceedings of the 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications (CISDA)
[52] Machine Learning Repository. URL: http://archive.ics.uci.edu/ml
[53] Quinlan 1993 C4.5: Programs for machine learning. Morgan Kaufmann, San Mateo
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.