Task decomposition and modular single-hidden-layer perceptron classifiers for multi-class learning problems.

*(English)* Zbl 1111.68636

Summary: One key to solving multi-class learning problems with multilayer perceptrons (MLPs) is achieving good convergence and generalization by learning only small-scale subsets, i.e., small parts of the original larger-scale data sets. This paper first decomposes an \(n\)-class problem into \(n\) two-class problems and then uses \(n\) class-modular MLPs to solve them one by one. A class-modular MLP is responsible for forming the decision boundaries of the class it represents, and can therefore be trained using only the samples from that class and some neighboring ones. When solving a two-class problem, an MLP has to face unfavorable situations such as unbalanced training data, locally sparse and weak distribution regions, and open decision boundaries. One solution is to virtually reinforce the samples from the minority classes or in the thin regions by suitable enlargement factors. Next, the effective range of an MLP is localized by a correction coefficient related to the distribution of its represented class. In brief, this paper focuses on the formation of economical learning subsets, the virtual balancing of imbalanced training sets, and the localization of the generalization regions of MLPs. Results on letter recognition and extended handwritten digit recognition show that the proposed methods are effective.
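The decomposition and virtual balancing described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `decompose_one_vs_rest` and `enlargement_factors` are hypothetical, and the weighting rule (majority count divided by the sample's class count) is an assumed stand-in for the paper's enlargement factors, whose exact form the summary does not give. The selection of neighboring samples and the correction coefficient are not modeled here.

```python
from collections import Counter


def decompose_one_vs_rest(labels):
    """Turn an n-class label list into n binary target lists, one per class.

    For class c, a sample's target is 1 if it belongs to c and 0 otherwise.
    """
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def enlargement_factors(binary_targets):
    """Per-sample weights that virtually balance a two-class problem.

    Assumption (not from the paper): each sample is weighted by
    majority_count / count_of_its_own_class, so majority samples keep
    weight 1.0 and minority samples are enlarged.
    """
    counts = Counter(binary_targets)
    majority = max(counts.values())
    return [majority / counts[t] for t in binary_targets]


# Toy 3-class label set; class "c" is the minority.
labels = ["a", "b", "c", "a", "b", "a"]
problems = decompose_one_vs_rest(labels)
weights_c = enlargement_factors(problems["c"])
```

Each entry of `problems` would serve as the target vector for one class-modular MLP, and the corresponding weights could be passed to any training routine that accepts per-sample weights.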

##### MSC:

68T10 | Pattern recognition, speech recognition |

68T05 | Learning and adaptive systems in artificial intelligence |

##### Keywords:

task decomposition; multi-class learning data sets; modular multilayer perceptrons; unbalanced classes; weak distribution regions; output amendment

*G. Daqi* et al., Pattern Recognition 40, No. 8, 2226-2236 (2007; Zbl 1111.68636)
