
zbMATH — the first resource for mathematics

An approach of top-\(k\) keyword querying for fuzzy XML. (English) Zbl 1387.68105
Summary: Keyword search on XML documents has received wide attention, and many search semantics and algorithms have been proposed for XML keyword queries. However, existing approaches fall short in supporting keyword queries over fuzzy XML documents. To overcome this limitation, we discuss how to obtain and evaluate the top-\(k\) smallest lowest common ancestor (SLCA) results of keyword queries on fuzzy XML documents. We define the fuzzy SLCA semantics on fuzzy XML documents and then propose a novel encoding scheme to denote the different types of nodes in such documents. We then propose two efficient algorithms that find the \(k\) SLCA results with the highest possibilities for a given keyword query on a fuzzy XML document. The first algorithm obtains the top-\(k\) SLCA results and their possibilities using a stack-based technique. The second obtains the top-\(k\) SLCA results based on a set of properties of SLCAs. Finally, we compare and evaluate the performance of the two algorithms.
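To make the notion of a top-\(k\) SLCA result concrete, the following is a minimal brute-force sketch. It is not either of the paper's two algorithms. It assumes nodes are identified by Dewey-style labels (tuples of integers, so the LCA of two nodes is their longest common label prefix), and it assumes, purely for illustration, that the possibility of a result is the minimum membership degree along the path from the root to the SLCA node. The paper's encoding scheme and fuzzy SLCA semantics are not given in the summary and may differ. The names `slca`, `possibility`, `top_k_slca`, `matches` and `degree` are hypothetical.

```python
import heapq
from itertools import product


def lca(a, b):
    """LCA of two Dewey labels is their longest common prefix."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return tuple(prefix)


def is_proper_ancestor(a, b):
    """True if label a is a strict prefix of label b."""
    return len(a) < len(b) and b[:len(a)] == a


def slca(match_lists):
    """Brute-force SLCA: take the LCA of every combination of one match
    per keyword, then keep only LCAs with no descendant in that set."""
    if not match_lists:
        return []
    candidates = set()
    for combo in product(*match_lists):
        node = combo[0]
        for other in combo[1:]:
            node = lca(node, other)
        candidates.add(node)
    return sorted(n for n in candidates
                  if not any(is_proper_ancestor(n, m) for m in candidates))


def possibility(node, degree):
    """ASSUMPTION (not from the paper): possibility of a node is the minimum
    membership degree on its root-to-node path; unlisted nodes count as 1.0."""
    return min((degree.get(node[:i], 1.0) for i in range(1, len(node) + 1)),
               default=1.0)


def top_k_slca(match_lists, degree, k):
    """Return the k SLCA nodes with the highest possibility."""
    scored = ((possibility(n, degree), n) for n in slca(match_lists))
    return heapq.nlargest(k, scored)


# Hypothetical document: root (1,) with two subtrees of differing membership.
matches = [
    [(1, 1, 1), (1, 2, 1)],   # nodes containing the first keyword
    [(1, 1, 2), (1, 2, 2)],   # nodes containing the second keyword
]
degree = {(1, 1): 0.8, (1, 2): 0.6}

print(top_k_slca(matches, degree, 1))   # [(0.8, (1, 1))]
```

The brute-force enumeration is exponential in the number of keywords and serves only to pin down the definition. The stack-based and property-based algorithms mentioned in the summary are presumably designed to avoid exactly this cost.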
MSC:
68P20 Information storage and retrieval of data
68P05 Data structures
Software:
ProTDB; XSEarch