an:06842363
Zbl 1425.68338
Aoga, John O. R.; Guns, Tias; Schaus, Pierre
Mining time-constrained sequential patterns with constraint programming
EN
Constraints 22, No. 4, 548-570 (2017).
00369522
2017
j
68T05 68T10 68T20 90C59
data mining; sequential pattern mining; constraint programming; global constraint; gap constraint; span constraint; time constraint
Summary: Constraint Programming (CP) has proven to be an effective platform for constraint based sequence mining. Previous work has focused on standard frequent sequence mining, as well as frequent sequence mining with a maximum `gap' between two matching events in a sequence. The main challenge in the latter is that this constraint can not be imposed independently of the omnipresent frequency constraint. Indeed, the gap constraint changes whether a subsequence is included in a sequence, and hence its frequency. In this work, we go beyond that and investigate the integration of timed events and constraining the minimum/maximum gap as well as minimum/maximum span. The latter constrains the allowed time between the first and last matching event of a pattern. We show how the three are interrelated, and what the required changes to the frequency constraint are. Key in our approach is the concept of an extension window defined by gap/span and we develop techniques to avoid scanning the sequences needlessly, as well as using a backtracking-aware data structure. Experiments demonstrate that the proposed approach outperforms both specialized and CP-based approaches in almost all cases and that the advantage increases as the minimum frequency threshold decreases.