×

Adaptively learning probabilistic deterministic automata from data streams. (English) Zbl 1317.68089

Summary: Markovian models with hidden state are widely-used formalisms for modeling sequential phenomena. Learnability of these models has been well studied when the sample is given in batch mode, and algorithms with PAC-like learning guarantees exist for specific classes of models such as Probabilistic Deterministic Finite Automata (PDFA). Here we focus on PDFA and give an algorithm for inferring models in this class in the restrictive data stream scenario: Unlike existing methods, our algorithm works incrementally and in one pass, uses memory sublinear in the stream length, and processes input items in amortized constant time. We also present extensions of the algorithm that (1) reduce to a minimum the need for guessing parameters of the target distribution and (2) are able to adapt to changes in the input distribution, relearning new models when needed. We provide rigorous PAC-like bounds for all of the above. Our algorithm makes a key usage of stream sketching techniques for reducing memory and processing time, and is modular in that it can use different tests for state equivalence and for change detection in the stream.

MSC:

68Q45 Formal languages and automata
68Q32 Computational learning theory
68T05 Learning and adaptive systems in artificial intelligence
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Aggarwal, C. (Ed.) (2007). Data streams—models and algorithms. Berlin: Springer. · Zbl 1126.68033
[2] Balle, B.; Castro, J.; Gavaldà, R., Bootstrapping and learning pdfa in data streams (2012)
[3] Balle, B., Castro, J., & Gavaldà, R. (2012b). Learning probabilistic automata: a study in state distinguishability. Theoretical Computer Science. · Zbl 1257.68085
[4] Bifet, A. (2010). Frontiers of artificial intelligence series and applications. Adaptive stream mining: pattern learning and mining from evolving data streams. Amsterdam: IOS Press. · Zbl 1204.68153
[5] Bousquet, O., Boucheron, S., & Lugosi, G. (2004). Introduction to statistical learning theory. Advanced Lectures on Machine Learning. · Zbl 1120.68428
[6] Carrasco, R. C., & Oncina, J. (1999). Learning deterministic regular grammars from stochastic samples in polynomial time. Informatique Théorique Et Applications, 33(1), 1-20. · Zbl 0940.68071 · doi:10.1051/ita:1999102
[7] Castro, J.; Gavaldà, R., Towards feasible PAC-learning of probabilistic deterministic finite automata (2008) · Zbl 1177.68108
[8] Clark, A., & Thollard, F. (2004). PAC-learnability of probabilistic deterministic finite state automata. Journal of Machine Learning Research. · Zbl 1222.68094
[9] Dupont, P., Denis, F., & Esposito, Y. (2005). Links between probabilistic automata and hidden Markov models: probability distributions, learning models and induction algorithms. Pattern Recognition, 38, 1349-1371. · Zbl 1101.68651 · doi:10.1016/j.patcog.2004.03.020
[10] Gama, J. (2010). Knowledge discovery from data streams. London: Taylor and Francis. · Zbl 1230.68017 · doi:10.1201/EBK1439826119
[11] Guttman, O.; Vishwanathan, S. V. N.; Williamson, R. C., Learnability of probabilistic automata via oracles (2005) · Zbl 1168.68400
[12] de la Higuera, C. (2010). Grammatical inference: learning automata and grammars. Cambridge: Cambridge University Press. · Zbl 1227.68112
[13] Hsu, D.; Kakade, S. M.; Zhang, T., A spectral algorithm for learning hidden Markov models (2009)
[14] Kearns, M. J.; Mansour, Y.; Ron, D.; Rubinfeld, R.; Schapire, R. E.; Sellie, L., On the learnability of discrete distributions (1994) · Zbl 1345.68252
[15] Lin, X.; Zhang, Y., Aggregate computation over data streams (2008)
[16] Menascé, D. A.; Almeida, V. A. F.; Fonseca, R.; Mendes, M. A., A methodology for workload characterization of e-commerce sites, 119-128 (1999), New York · doi:10.1145/336992.337024
[17] Metwally, A.; Agrawal, D.; Abbadi, A., Efficient computation of frequent and top-k elements in data streams (2005) · Zbl 1112.68371
[18] Muthukrishnan, S. (2005). Data streams: algorithms and applications. Foundations and Trends in Theoretical Computer Science.
[19] Palmer, N., & Goldberg, P. W. (2007). PAC-learnability of probabilistic deterministic finite state automata in terms of variation distance. Theoretical Computer Science. · Zbl 1143.68024
[20] Palmer, N. J. (2008). Pattern classification via unsupervised learners. PhD thesis, University of Warwick.
[21] Ron, D., Singer, Y., & Tishby, N. (1998). On the learnability and usage of acyclic probabilistic finite automata. Journal of Computing Systems Science. · Zbl 0915.68124
[22] Schmidt, J.; Kramer, S., Online induction of probabilistic real time automata, 625-634 (2012) · doi:10.1109/ICDM.2012.121
[23] Schmidt, J.; Ansorge, S.; Kramer, S., Scalable induction of probabilistic real-time automata using maximum frequent pattern based clustering, 272-283 (2012)
[24] Terwijn, S., On the learnability of hidden Markov models (2002) · Zbl 1028.68616
[25] Vershynin, R.; Eldar, Y. (ed.); Kutyniok, G. (ed.), Introduction to the non-asymptotic analysis of random matrices (2012)
[26] Vidal, E., Thollard, F., de la Higuera, C., Casacuberta, F., & Carrasco, R. C. (2005a). Probabilistic finite-state machines—part I. IEEE Transactions on Pattern Analysis and Machine Intelligence.
[27] Vidal, E., Thollard, F., de la Higuera, C., Casacuberta, F., & Carrasco, R. C. (2005b). Probabilistic finite-state machines—part II. IEEE Transactions on Pattern Analysis and Machine Intelligence.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.