Concept drift detection and adaptation with hierarchical hypothesis testing. (English) Zbl 1453.62558

Summary: A fundamental issue for statistical classification models in a streaming environment is that the joint distribution of the predictor and response variables changes over time (a phenomenon known as concept drift), so that classification performance deteriorates dramatically. In this paper, we first present a hierarchical hypothesis testing (HHT) framework that can detect and adapt to various types of concept drift (e.g., recurrent or irregular, gradual or abrupt), even when the class labels are imbalanced. We then implement a novel concept drift detector, hierarchical linear four rates (HLFR), under the HHT framework. By replacing the widely used retraining scheme with an adaptive training strategy, we further demonstrate that the drift adaptation capability of HLFR can be significantly improved. We also analyze the type-I and type-II errors of HLFR theoretically. Experiments on both simulated and real-world datasets show that our methods outperform state-of-the-art methods in detection precision, detection delay, and adaptability across different concept drift types.
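
The summary does not spell out the statistics that HLFR monitors. As a rough illustration only, and assuming that "four rates" refers to the four confusion-matrix rates (true positive rate, true negative rate, positive predictive value, negative predictive value), the sketch below tracks exponentially weighted estimates of those rates over a stream of (label, prediction) pairs. The class name `FourRateTracker`, the `decay` parameter, the starting value 0.5, and the helper `max_rate_drop` are all hypothetical; the largest-drop heuristic is a stand-in and not the hypothesis test analyzed in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict


def _initial_rates() -> Dict[str, float]:
    # Hypothetical neutral starting point for every rate.
    return {"tpr": 0.5, "tnr": 0.5, "ppv": 0.5, "npv": 0.5}


@dataclass
class FourRateTracker:
    """Exponentially weighted estimates of TPR, TNR, PPV and NPV (illustrative)."""

    decay: float = 0.9  # assumed forgetting factor, not taken from the paper
    rates: Dict[str, float] = field(default_factory=_initial_rates)

    def update(self, y_true: int, y_pred: int) -> Dict[str, float]:
        """Fold one binary (label, prediction) pair into the running rates."""
        correct = 1.0 if y_true == y_pred else 0.0
        # The true label selects TPR or TNR; the prediction selects PPV or NPV.
        by_label = "tpr" if y_true == 1 else "tnr"
        by_prediction = "ppv" if y_pred == 1 else "npv"
        for name in (by_label, by_prediction):
            old = self.rates[name]
            self.rates[name] = self.decay * old + (1.0 - self.decay) * correct
        return dict(self.rates)


def max_rate_drop(reference: Dict[str, float], current: Dict[str, float]) -> float:
    """Largest decrease of any rate relative to a reference snapshot.

    A simple heuristic signal only; it does not control type-I or type-II error.
    """
    return max(reference[name] - current[name] for name in reference)
```

Tracking all four rates instead of overall accuracy means that a change affecting only the minority class still moves at least one monitored quantity, which matters in the imbalanced-label setting the summary mentions.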

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H15 Hypothesis testing in multivariate analysis
