×

Elastic translation invariant matching of trajectories. (English) Zbl 1075.68080

Summary: We investigate techniques for analysis and retrieval of object trajectories. We assume that a trajectory is a sequence of two or three dimensional points. Trajectory datasets are very common in environmental applications, mobility experiments, video surveillance and are especially important for the discovery of certain biological patterns. Such kind of data usually contain a great amount of noise, that makes all previously used metrics fail. Therefore, here we formalize non-metric similarity functions based on the longest common subsequence, which are very robust to noise and furthermore provide an intuitive notion of similarity between trajectories by giving more weight to the similar portions of the sequences. Stretching of sequences in time is allowed, as well as global translating of the sequences in space. Efficient approximate algorithms that compute these similarity measures are also provided. We compare these new methods to the widely used Euclidean and dynamic time warping distance functions (for real and synthetic data) and show the superiority of our approach, especially under the strong presence of noise. We prove a weaker version of the triangle inequality and employ it in an indexing structure to answer nearest neighbor queries. Finally, we present experimental results that validate the accuracy and efficiency of our approach.

MSC:

68T10 Pattern recognition, speech recognition
68P20 Information storage and retrieval of data
68T05 Learning and adaptive systems in artificial intelligence
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Aach, J., & Church, G. (2001). Aligning gene expression time series with time warping algorithms. Bioinformatics, 17, 495-508. · doi:10.1093/bioinformatics/17.6.495
[2] Aggarwal, C. C., Hinneburg, A., & Keim, D. A. (2001). On the surprising behavior of distance metrics in high dimensional spaces. In ICDT (pp. 420-434). · Zbl 1047.68038
[3] Agrawal, R., Faloutsos, C., & Swami, A. (1993). Efficient similarity search in sequence databases. In Proc. of the 4th International Conference of Foundations of Data Organization (FODO) (pp. 69-84).
[4] Agrawal, R., Lin, K., Sawhney, H. S., & Shim, K. (1995). Fast similarity search in the presence of noise, scaling and translation in time-series databases. In Proc of International Conference on Very Large Databases (VLDB) (pp. 490-501).
[5] Bar-Joseph, Z., Gerber, G., Gifford, D., Jaakkola, T., & Simon, I. (2002). A new approach to analyzing gene expression time series data. In Proc. of 6th RECOMB (pp. 39-48).
[6] Barbarä, D. (1999). Mobile computing and databases?A survey. IEEE TKDE (pp. 108-117).
[7] Berndt, D., & Clifford, J. (1994). Using dynamic time warping to find patterns in time series. In AAAI Workshop on Knowledge Discovery in Databases (pp. 229-248).
[8] Betke, M., Gips, J., & Fleming, P. (2002). The camera mouse: Visual tracking of body features to provide computer access for people with severe disabilities. IEEE Transactions on Neural Systems and Rehabilitation Engineering 10:1, 1-10. · doi:10.1109/TNSRE.2002.1021581
[9] Beyer, K., Goldstein, J., Ramakrishnan, R., & Shaft, U. (1999). When is nearest neighbor meaningful? In Proc of ICDT (pp. 217-235).
[10] Bollobäs, B., Das, G., Gunopulos, D., & Mannila, H. (1997). Time-Series similarity problems and well-separated geometric sets. In Proc of the 13th SCG, Nice, France. · Zbl 1005.68059
[11] Bozkaya, T., Yazdani, N., & Ozsoyoglu, M. (1997). Matching and indexing sequences of different lengths. In Proc. of the CIKM, Las Vegas.
[12] Chakrabarti, K., & Mehrotra, S. (2000). Local dimensionality reduction: A new approach to indexing high dimensional spaces. In Proc. of VLDB (pp. 89-100).
[13] Chu, K., & Wong, M. (1999). Fast time-series searching with scaling and shifting. ACM Principles of Database Systems (pp. 237-248).
[14] Das, G., Gunopulos, D., & Mannila, H. (1997). Finding similar time series. In Proc. of the First PKDD Symp. (pp. 88-100).
[15] Duda, R., & Hart, P. (1973). Pattern Classification and Scene Analysis. John Wiley and Sons, Inc. · Zbl 0277.68056
[16] Faloutsos, C., Jagadish, H., Mendelzon, A., & Milo, T. (1997). Signature technique for similarity-based queries. In Proc. of the Compression and Complexity of Sequences.
[17] Faloutsos, C., & Lin, K.-I. (1995). FastMap: A fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets. In Proc. ACM SIGMOD (pp. 163-174).
[18] Faloutsos, C., Ranganathan, M., & Manolopoulos, I. (1994). Fast subsequence matching in time series databases. In Proc. of ACM SIGMOD (pp. 419-429).
[19] Gaffney, S., & Smyth, P. (1999). Trajectory clustering with mixtures of regression models. In Proc. of the 5th ACM SIGKDD, San Diego, CA (pp. 63-72).
[20] Gavrila, D., & Davis, L. (1995). Towards 3-d model-based tracking and recognition of human movement: A multi-view approach. In Int. Workshop on Face and Gesture Recognition (pp. 272-277).
[21] Ge, X., & Smyth, P. (2000). Deformable Markov model templates for time-series pattern matching. In Proc. of ACM SIGKDD (pp. 81-90).
[22] Gips, J., Betke, M., & Fleming, P. (2000). The camera mouse: Preliminary investigation of automated visual tracking for computer access. In Proceedings of the Rehabilitation Engineering and Assistive Technology Society of North America 2000 Annual Conference (pp. 98-100) Orlando, FL.
[23] Goldin, D., & Kanellakis, P. (1995). On similarity queries for time-series data. In Proc. Int. Conference on Principles and Practice of Constraint Programming (pp. 137-153).
[24] Goldstein, J., & Ramakrishnan, R. (2000). Contrast plots and P-Sphere trees: Space vs. time in nearest neighbor searches. In Proc. of VLDB (pp. 429-440).
[25] Guha, S., Rastogi, R., & Shim, K. (1998). CURE: An efficient clustering algorithm for large databases. In Proc. of ACM SIGMOD (pp. 73-84). · Zbl 1006.68661
[26] Jagadish, H. V., Mendelzon, A. O., & Milo, T. (1995). Similarity-Based aueries. In Proc. of the 14th ACM PODS (pp. 36-45).
[27] Kahveci, T., Singh, A., & Gurel, A. (2002). Similarity searching for Multi-attribute sequences. In Proc. of SSDBM.
[28] Kahveci, T., & Singh, A. K. (2001). Variable length queries for time series data. In Proc. of IEEE ICDE (pp. 273-282).
[29] Keogh, E. (2002). Exact indexing of dynamic time warping. In Proc. of VLDB.
[30] Keogh, E., Chakrabarti, K., Mehrotra, S., & Pazzani, M. (2001). Locally adaptive dimensionality reduction for indexing large time series databases. In Proc. of ACM SIGMOD (pp. 151-162). · Zbl 0989.68039
[31] Keogh, E., & Pazzani, M. (2000). Scaling up dynamic time warping for datamining applications. In Proc. 6th SIGKDD.
[32] Keogh, E., & Ratanamahatana, A. (2004). Everything you know about dynamic time warping is wrong. In Third Workshop on Mining Temporal and Sequential Data (SIGKDD).
[33] Kim, S., Park, S., & Chu, W. (2001). An Index-based approach for similarity search supporting time warping in large sequence databases. In Proc. of 17th ICDE.
[34] Kivinen, J., & Mannila, H. (1994). The power of sampling in knowledge discovery. Proceedings of PODS. · Zbl 0874.68247
[35] Kovács-Vajna, Z. (2000). A fingerprint verification system based on triangular matching and dynamic time warping. In IEEE Transactions on PAMI, 22:11, 1266-1276.
[36] Lee, S.-L., Chun, S.-J., Kim, D.-H., Lee, J.-H., & Chung, C.-W. (2000). Similarity search for multidimensional data sequences. In Proceedings of ICDE (pp. 599-608).
[37] Levenshtein, V. (1966). Binary codes capable of correcting deletions, insertions, and reversals. In Soviet Physics?Doklady, 10:10, 707-710.
[38] Medin, D., Goldstone, R., & Gentner, D. (1993). Respects for similarity. In Psychological Review, 100:2, 254-278. · doi:10.1037/0033-295X.100.2.254
[39] Munich, M., & Perona, P. (1999). Continuous dynamic time warping for translation-invariant curve alignment with applications to signature verification. In 7th International Conference on Computer Vision (pp. 108-115).
[40] Perng, S., Wang, H., Zhang, S., & Parker, D. S. (2000). Landmarks: A New model for similarity-based pattern querying in time series databases. In Proc. of International Conference on Data Engineering (ICDE) (pp. 33-42).
[41] Qu, Y., Wang, C., & Wang, X. (1998). Supporting fast search in time series for movement patterns in multiple scales. In Proc of the ACM CIKM (pp. 251-258).
[42] Rafiei, D., & Mendelzon, A. (2000). Querying time series data based on similarity. IEEE Transactions on Knowledge and Data Engineering, 12:5, 675-693. · Zbl 05109519 · doi:10.1109/69.877502
[43] Ratanamahatana, C. A., & Keogh, E. (2004). Making time-series classification more accurate using learned constraints. In Proc. of SIAM International Conference on Data Mining (SDM).
[44] Rath, T., & Manmatha, R. (2002). Word image matching using dynamic time warping. In Tec Report MM-38. Center for Intelligent Information Retrieval, University of Massachusetts Amherst.
[45] Strik, H., & Boves, L. (1988). Averaging physiological signals with the use of a DTW algorithm. In SPEECH?88, 7th FASE Symposium, Book 3 (pp. 883-890).
[46] Vlachos, M., Hadjieleftheriou, M., Gunopulos, D., & Keogh, E. (2003). Indexing multi-dimensional time-series with support for multiple distance measures. In Proc. of 9th SIGKDD (pp. 216-225).
[47] Vlachos, M., Kollios, G., & Gunopulos, D. (2002). Discovering similar multidimensional trajectories. In Proc. of ICDE (pp. 673-684).
[48] Weber, R., Schek, H.-J., & Blott, S. (1998). A quantitative analysis and performance study for similarity search methods in high-dimensional spaces. In Proc. of VLDB (pp. 194-205).
[49] Yi, B.-K., & Faloutsos, C. (2000). Fast time sequence indexing for arbitrary Lp norms. In Proc. of VLDB.
[50] Yi, B.-K., Jagadish, H. V., & Faloutsos, C. (1998). Efficient retrieval of similar time sequences under time warping. In Proc. of ICDE (pp. 201-208).
[51] Zaki, M. J., Parthasarathy, S., Li, W., & Ogihara, M. (1997). Evaluation of sampling for data mining of association rules. Proceedings of RIDE.
[52] Zhu, Y., & Shasha, D. (2003). Warping indexes with envelope transforms for query by humming. In Proc. of ACM SIGMOD. pp. 181-192.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.