zbMATH — the first resource for mathematics

Anomaly detection with inexact labels. (English) Zbl 07255759
Summary: We propose a supervised anomaly detection method for data with inexact anomaly labels, where each label, which is assigned to a set of instances, indicates that at least one instance in the set is anomalous. Although many anomaly detection methods have been proposed, they cannot handle inexact anomaly labels. To measure the performance with inexact anomaly labels, we define the inexact AUC, which is our extension of the area under the ROC curve (AUC) for inexact labels. The proposed method trains an anomaly score function so that the smooth approximation of the inexact AUC increases while anomaly scores for non-anomalous instances become low. We model the anomaly score function by a neural network-based unsupervised anomaly detection method, e.g., autoencoders. The proposed method performs well even when only a small number of inexact labels are available by incorporating an unsupervised anomaly detection mechanism with inexact AUC maximization. Using various datasets, we experimentally demonstrate that our proposed method improves the anomaly detection performance with inexact anomaly labels, and outperforms existing unsupervised and supervised anomaly detection and multiple instance learning methods.
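As an illustration of the quantities named in the summary, the following minimal sketch computes one plausible reading of the inexact AUC together with a smooth surrogate. It rests on assumptions not stated in this record: a labeled set is scored by the maximum anomaly score among its instances (since at least one instance is anomalous), that set-level score is compared against each non-anomalous instance score, and the step function is smoothed with a sigmoid. The function names and the toy scores are hypothetical, and the anomaly scores are taken as given rather than produced by an autoencoder.

```python
import math


def inexact_auc(anomalous_sets, normal_scores):
    """Fraction of (set, normal instance) pairs ranked correctly.

    Assumption: a set counts as ranked correctly against a non-anomalous
    instance when the largest score in the set exceeds that instance's score.
    """
    wins = 0
    pairs = 0
    for set_scores in anomalous_sets:
        top = max(set_scores)  # at least one instance in the set is anomalous
        for score in normal_scores:
            pairs += 1
            if top > score:
                wins += 1
    return wins / pairs


def smooth_inexact_auc(anomalous_sets, normal_scores):
    """Differentiable surrogate: the step function is replaced by a sigmoid."""
    total = 0.0
    pairs = 0
    for set_scores in anomalous_sets:
        top = max(set_scores)
        for score in normal_scores:
            total += 1.0 / (1.0 + math.exp(-(top - score)))
            pairs += 1
    return total / pairs


# Hypothetical anomaly scores: two inexactly labeled sets, three normal instances.
labeled_sets = [[0.1, 0.9, 0.2], [0.3, 0.4]]
normal = [0.2, 0.5, 0.35]

print(inexact_auc(labeled_sets, normal))         # 5 of the 6 pairs are ranked correctly
print(smooth_inexact_auc(labeled_sets, normal))  # a value strictly between 0 and 1
```

Under this reading, when every labeled set contains exactly one instance, the quantity reduces to the ordinary empirical AUC.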
MSC:
68T05 Learning and adaptive systems in artificial intelligence
References:
[1] Akcay, S., Atapour-Abarghouei, A., & Breckon, T. P. (2018). GANomaly: Semi-supervised anomaly detection via adversarial training. In 14th Asian conference on computer vision.
[2] Aleskerov, E., Freisleben, B., & Rao, B. (1997). Cardwatch: A neural network based database mining system for credit card fraud detection. In IEEE/IAFE computational intelligence for financial engineering (pp. 220-226).
[3] An, J.; Cho, S., Variational autoencoder based anomaly detection using reconstruction probability, Special Lecture on IE, 2, 1-18 (2015)
[4] Andrews, S., Tsochantaridis, I., & Hofmann, T. (2003). Support vector machines for multiple-instance learning. In Advances in neural information processing systems (pp. 577-584).
[5] Babenko, B., Yang, M.-H., & Belongie, S. (2009). Visual tracking with online multiple instance learning. In IEEE conference on computer vision and pattern recognition (pp. 983-990). IEEE.
[6] Blanchard, G.; Lee, G.; Scott, C., Semi-supervised novelty detection, Journal of Machine Learning Research, 11, Nov, 2973-3009 (2010) · Zbl 1242.68205
[7] Brefeld, U., & Scheffer, T. (2005). AUC maximizing support vector learning. In Proceedings of the ICML workshop on ROC analysis in machine learning.
[8] Breiman, L., Random forests, Machine Learning, 45, 1, 5-32 (2001) · Zbl 1007.68152
[9] Breunig, MM; Kriegel, H-P; Ng, RT; Sander, J., LOF: Identifying density-based local outliers, ACM SIGMOD Record, 29, 2, 93-104 (2000)
[10] Bunescu, R., & Mooney, R. (2007). Learning to extract relations from the web using minimal supervision. In Proceedings of the 45th annual meeting of the association of computational linguistics (pp. 576-583).
[11] Campos, GO; Zimek, A.; Sander, J.; Campello, RJ; Micenková, B.; Schubert, E.; Assent, I.; Houle, ME, On the evaluation of unsupervised outlier detection: Measures, datasets, and an empirical study, Data Mining and Knowledge Discovery, 30, 4, 891-927 (2016)
[12] Carbonneau, M-A; Cheplygina, V.; Granger, E.; Gagnon, G., Multiple instance learning: A survey of problem characteristics and applications, Pattern Recognition, 77, 329-353 (2018)
[13] Chandola, V.; Banerjee, A.; Kumar, V., Anomaly detection: A survey, ACM Computing Surveys, 41, 3, 15 (2009)
[14] Chen, Y.; Bi, J.; Wang, JZ, MILES: Multiple-instance learning via embedded instance selection, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 12, 1931-1947 (2006)
[15] Chong, Y. S., & Tay, Y. H. (2017). Abnormal event detection in videos using spatiotemporal autoencoder. In International symposium on neural networks (pp. 189-196). Springer.
[16] Cinbis, RG; Verbeek, J.; Schmid, C., Weakly supervised object localization with multi-fold multiple instance learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 1, 189-203 (2017)
[17] Cortes, C., & Mohri, M. (2004). AUC optimization vs. error rate minimization. In Advances in neural information processing systems (pp. 313-320).
[18] Das, S., Wong, W.-K., Dietterich, T., Fern, A., & Emmott, A. (2016). Incorporating expert feedback into active anomaly discovery. In 16th international conference on data mining (pp. 853-858). IEEE.
[19] Das, S., Wong, W.-K., Fern, A., Dietterich, T. G., & Siddiqui, M. A. (2017). Incorporating feedback into tree-based anomaly detection. In KDD workshop on interactive data exploration and analytics.
[20] Davis, J., Santos Costa, V., Ray, S., & Page, D. (2007). Tightly integrating relational learning and multiple-instance regression for real-valued drug activity prediction. In International conference on machine learning.
[21] Demšar, J., Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning Research, 7, Jan, 1-30 (2006) · Zbl 1222.68184
[22] Dietterich, TG; Lathrop, RH; Lozano-Pérez, T., Solving the multiple instance problem with axis-parallel rectangles, Artificial Intelligence, 89, 1-2, 31-71 (1997) · Zbl 1042.68650
[23] Dodd, LE; Pepe, MS, Partial AUC estimation and regression, Biometrics, 59, 3, 614-623 (2003) · Zbl 1210.62152
[24] Dokas, P., Ertoz, L., Kumar, V., Lazarevic, A., Srivastava, J., & Tan, P.-N. (2002). Data mining for network intrusion detection. In NSF workshop on next generation data mining (pp. 21-30).
[25] Eskin, E. (2000). Anomaly detection over noisy data using learned probability distributions. In International conference on machine learning.
[26] Feng, J., & Zhou, Z.-H. (2017). Deep MIML network. In Thirty-First AAAI conference on artificial intelligence.
[27] Forman, G.; Scholz, M., Apples-to-apples in cross-validation studies: Pitfalls in classifier performance measurement, ACM SIGKDD Explorations Newsletter, 12, 1, 49-57 (2010)
[28] Fujimaki, R., Yairi, T., & Machida, K. (2005). An approach to spacecraft anomaly detection problem using kernel feature space. In International conference on knowledge discovery in data mining (pp. 401-410).
[29] Fujino, A., & Ueda, N. (2016). A semi-supervised AUC optimization method with generative models. In 16th international conference on data mining (pp. 883-888). IEEE.
[30] Gao, J., Cheng, H., & Tan, P.-N. (2006). A novel framework for incorporating labeled examples into anomaly detection. In Proceedings of the 2006 SIAM international conference on data mining (pp. 594-598). SIAM.
[31] Hanley, JA; McNeil, BJ, The meaning and use of the area under a receiver operating characteristic (ROC) curve, Radiology, 143, 1, 29-36 (1982)
[32] Herrera, F.; Ventura, S.; Bello, R.; Cornelis, C.; Zafra, A.; Sánchez-Tarragó, D.; Vluymans, S., Multiple Instance Learning: Foundations and Algorithms (2016), Berlin: Springer, Berlin · Zbl 1398.68008
[33] Hodge, V.; Austin, J., A survey of outlier detection methodologies, Artificial Intelligence Review, 22, 2, 85-126 (2004) · Zbl 1101.68023
[34] Idé, T., & Kashima, H. (2004). Eigenspace-based anomaly detection in computer systems. In International conference on knowledge discovery and data mining (pp. 440-449).
[35] Ilse, M., Tomczak, J., & Welling, M. (2018). Attention-based deep multiple instance learning. In International conference on machine learning (pp. 2132-2141).
[36] Iwata, T., & Yamanaka, Y. (2019). Supervised anomaly detection based on deep autoregressive density estimators. arXiv preprint arXiv:1904.06034
[37] Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. In International conference on learning representations.
[38] Kingma, D. P., & Welling, M. (2014). Auto-encoding variational Bayes. In 2nd international conference on learning representations.
[39] Komori, O.; Eguchi, S., A boosting method for maximizing the partial area under the ROC curve, BMC Bioinformatics, 11, 1, 314 (2010)
[40] Laxhammar, R., Falkman, G., & Sviestins, E. (2009). Anomaly detection in sea traffic—A comparison of the Gaussian mixture model and the kernel density estimator. In International conference on information fusion (pp. 756-763).
[41] Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2008). Isolation forest. In Proceedings of the 8th IEEE international conference on data mining (pp. 413-422). IEEE.
[42] Markou, M.; Singh, S., Novelty detection: A review, Signal Processing, 83, 12, 2481-2497 (2003) · Zbl 1145.94402
[43] Maron, O., & Lozano-Pérez, T. (1998). A framework for multiple-instance learning. In Advances in neural information processing systems (pp. 570-576).
[44] Mukkamala, S., Sung, A., & Ribeiro, B. (2005). Model selection for kernel based intrusion detection systems. In Adaptive and natural computing algorithms (pp. 458-461). Springer.
[45] Munawar, A., Vinayavekhin, P., & De Magistris, G. (2017). Limiting the reconstruction capability of generative neural network using negative learning. In 27th international workshop on machine learning for signal processing. IEEE.
[46] Nadeem, M., Marshall, O., Singh, S., Fang, X., & Yuan, X. (2016). Semi-supervised deep neural network for network intrusion detection. In KSU conference on cybersecurity education, research and practice.
[47] Narasimhan, H.; Agarwal, S., Support vector algorithms for optimizing the partial area under the ROC curve, Neural Computation, 29, 7, 1919-1963 (2017) · Zbl 1456.68160
[48] Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., & Lerer, A. (2017). Automatic differentiation in PyTorch. In NIPS autodiff workshop.
[49] Patcha, A.; Park, J-M, An overview of anomaly detection techniques: Existing solutions and latest technological trends, Computer Networks, 51, 12, 3448-3470 (2007)
[50] Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V., Scikit-learn: Machine learning in Python, Journal of Machine Learning Research, 12, 2825-2830 (2011) · Zbl 1280.68189
[51] Pimentel, T., Monteiro, M., Viana, J., Veloso, A., & Ziviani, N. (2018). A generalized active learning approach for unsupervised anomaly detection. arXiv preprint arXiv:1805.09411.
[52] Pinheiro, P. O., & Collobert, R. (2015). From image-level to pixel-level labeling with convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1713-1721).
[53] Rapaka, A.; Novokhodko, A.; Wunsch, D., Intrusion detection using radial basis function network on sequences of system calls, International Joint Conference on Neural Networks, 3, 1820-1825 (2003)
[54] Sabokrou, M.; Fathy, M.; Hoseini, M., Video anomaly detection and localisation based on the sparsity and reconstruction error of auto-encoder, Electronics Letters, 52, 13, 1122-1124 (2016)
[55] Sakai, T.; Niu, G.; Sugiyama, M., Semi-supervised AUC optimization based on positive-unlabeled learning, Machine Learning, 107, 4, 767-794 (2018) · Zbl 1458.68178
[56] Sakurada, M., & Yairi, T. (2014). Anomaly detection using autoencoders with nonlinear dimensionality reduction. In Proceedings of the MLSDA 2nd workshop on machine learning for sensory data analysis. ACM.
[57] Schölkopf, B.; Platt, JC; Shawe-Taylor, J.; Smola, AJ; Williamson, RC, Estimating the support of a high-dimensional distribution, Neural Computation, 13, 7, 1443-1471 (2001) · Zbl 1009.62029
[58] Schölkopf, B.; Smola, AJ, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond (2002), Cambridge: MIT Press, Cambridge
[59] Shewhart, WA, Economic Control of Quality of Manufactured Product (1931), Milwaukee: ASQ Quality Press, Milwaukee
[60] Singh, S., & Silakari, S. (2009). An ensemble approach for feature selection of cyber attack dataset. arXiv preprint arXiv:0912.1014
[61] Suh, S., Chae, D. H., Kang, H.-G., & Choi, S. (2016). Echo-state conditional variational autoencoder for anomaly detection. In International joint conference on neural networks (pp. 1015-1022).
[62] Wong, W.-K., Moore, A. W., Cooper, G. F., & Wagner, M. M. (2003). Bayesian network anomaly pattern detection for disease outbreaks. In International conference on machine learning (pp. 808-815).
[63] Wu, J., Yu, Y., Huang, C., & Yu, K. (2015). Deep multiple instance learning for image classification and auto-annotation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3460-3469).
[64] Xu, H., Chen, W., Zhao, N., Li, Z., Bu, J., Li, Z., Liu, Y., Zhao, Y., Pei, D., Feng, Y., et al. (2018). Unsupervised anomaly detection via variational auto-encoder for seasonal KPIs in web applications. In World wide web conference (pp. 187-196).
[65] Yamanishi, K.; Takeuchi, J-I; Williams, G.; Milne, P., On-line unsupervised outlier detection using finite mixtures with discounting learning algorithms, Data Mining and Knowledge Discovery, 8, 3, 275-300 (2004)
[66] Ying, Y., Wen, L., & Lyu, S. (2016). Stochastic online AUC maximization. In Advances in neural information processing systems (pp. 451-459).
[67] Zhai, S., Cheng, Y., Lu, W., & Zhang, Z. (2016). Deep structured energy based models for anomaly detection. In International conference on machine learning (pp. 1100-1109).
[68] Zhang, Q., Goldman, S. A., Yu, W., & Fritts, J. E. (2002). Content-based image retrieval using multiple-instance learning. In International conference on machine learning.
[69] Zhou, C., & Paffenroth, R. C. (2017). Anomaly detection with robust deep autoencoders. In Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 665-674). ACM.
[70] Zhou, Z.-H., Sun, Y.-Y., & Li, Y.-F. (2009). Multi-instance learning by treating instances as non-iid samples. In Proceedings of the 26th annual international conference on machine learning (pp. 1249-1256).
[71] Zhu, W., Lou, Q., Vang, Y. S., & Xie, X. (2017). Deep multi-instance networks with sparse label assignment for whole mammogram classification. In International conference on medical image computing and computer-assisted intervention (pp. 603-611).
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.