
Robust sequential learning of feedforward neural networks in the presence of heavy-tailed noise. (English) Zbl 1325.68202

Summary: Feedforward neural networks (FFNN) are among the most widely used neural networks for modeling nonlinear problems in engineering. In sequential, and especially real-time, processing, all neural network models fail when faced with outliers, and outliers are found across a wide range of engineering problems. Recent research has shown that a new approach is needed to avoid overfitting or divergence of the model, especially if the FFNN is to run sequentially or in real time. To address the limitations of FFNN when the training data contain outliers, this paper presents a new learning algorithm based on an improvement of the conventional extended Kalman filter (EKF). The extended Kalman filter robust to outliers (EKF-OR) is a probabilistic generative model in which the measurement noise covariance is not constant; the sequence of measurement noise covariances is modeled as a stochastic process over the set of symmetric positive-definite matrices, with an inverse Wishart distribution as the prior. In each iteration, EKF-OR simultaneously estimates the noise covariance and the current best estimate of the FFNN parameters. The Bayesian framework allows the update expressions to be derived mathematically, while the analytical intractability of the Bayes update step is handled by a structured variational approximation. All mathematical expressions in the paper are derived from first principles. An extensive experimental study shows that an FFNN trained with the developed learning algorithm achieves low prediction error and good generalization regardless of the presence of outliers in the training data.
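The idea in the summary, an EKF parameter update in which the measurement noise covariance is re-estimated at every step, can be illustrated with a small sketch. The code below is not the paper's EKF-OR derivation. It assumes a scalar output (so the inverse Wishart prior reduces to a prior on a single variance), static network parameters without process noise, and a simplified fixed-point update for the effective noise variance, `r_eff = (s * r_nominal + E[residual^2]) / (s + 1)`. The names `ffnn`, `robust_ekf_step`, `r_nominal`, `s`, and the parameter layout are hypothetical and chosen only for illustration.

```python
import math


def ffnn(w, x, hidden=2):
    """Tiny 1-input, 1-output network: y = sum_j v_j * tanh(a_j * x + b_j) + c.

    Assumed parameter layout: [a_0.., b_0.., v_0.., c] (illustrative only).
    Returns the output and its gradient with respect to every parameter.
    """
    a, b = w[:hidden], w[hidden:2 * hidden]
    v, c = w[2 * hidden:3 * hidden], w[3 * hidden]
    t = [math.tanh(a[j] * x + b[j]) for j in range(hidden)]
    y = sum(v[j] * t[j] for j in range(hidden)) + c
    grad = ([v[j] * (1.0 - t[j] ** 2) * x for j in range(hidden)]  # dy/da_j
            + [v[j] * (1.0 - t[j] ** 2) for j in range(hidden)]    # dy/db_j
            + t                                                    # dy/dv_j
            + [1.0])                                               # dy/dc
    return y, grad


def robust_ekf_step(w, P, x, y, r_nominal, s=5.0, n_iter=5, hidden=2):
    """One sequential update of the parameters w with symmetric covariance P.

    Alternates between an EKF update that uses the current effective noise
    variance and a re-estimate of that variance from the expected squared
    residual. With n_iter=1 the parameter update is the standard EKF update.
    """
    n = len(w)
    y_pred, H = ffnn(w, x, hidden)          # linearize around the prior mean
    innov = y - y_pred
    PHt = [sum(P[i][j] * H[j] for j in range(n)) for i in range(n)]  # P H^T
    hph = sum(H[i] * PHt[i] for i in range(n))                       # H P H^T
    r_eff = r_nominal
    for _ in range(n_iter):
        S = hph + r_eff                     # innovation variance
        K = [p / S for p in PHt]            # Kalman gain
        w_post = [w[i] + K[i] * innov for i in range(n)]
        P_post = [[P[i][j] - K[i] * PHt[j] for j in range(n)]
                  for i in range(n)]
        resid = innov * r_eff / S           # linearized residual at w_post
        spread = hph * r_eff / S            # H P_post H^T
        # Assumed simplified variance update: blend of the nominal variance
        # (weight s) and the expected squared residual (weight 1).
        r_eff = (s * r_nominal + resid ** 2 + spread) / (s + 1.0)
    return w_post, P_post, r_eff


# Usage: compare a plain EKF step (n_iter=1) with the iterated robust step
# on a single grossly corrupted target value.
w0 = [0.5, -0.3, 0.1, 0.2, 0.8, -0.6, 0.0]
P0 = [[1.0 if i == j else 0.0 for j in range(7)] for i in range(7)]
x0 = 0.4
y_outlier = ffnn(w0, x0)[0] + 50.0
w_std, _, _ = robust_ekf_step(w0, P0, x0, y_outlier, r_nominal=0.1, n_iter=1)
w_rob, _, r_eff = robust_ekf_step(w0, P0, x0, y_outlier, r_nominal=0.1)
print("effective noise variance after outlier:", r_eff)
print("parameter step, plain EKF :", sum((p - q) ** 2 for p, q in zip(w_std, w0)) ** 0.5)
print("parameter step, robust    :", sum((p - q) ** 2 for p, q in zip(w_rob, w0)) ** 0.5)
```

In this sketch a large innovation inflates the effective noise variance, which shrinks the Kalman gain, so a single outlier moves the parameters less than it would under a fixed noise variance. The weight `s` plays the role of a prior strength: very large values keep the variance near `r_nominal` and recover ordinary EKF behaviour.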

MSC:

68T05 Learning and adaptive systems in artificial intelligence

Software:

UCI-ml; PRMLT
