XML is currently the most popular format for exchanging and representing data on the web. It is used in various applications and for different types of data, including structured, semistructured, and unstructured heterogeneous data. While XML was establishing itself, data streaming applications gained increasing attention and importance. Because of these developments, the querying and efficient processing of XML streams have become a central issue. In this study, we survey the state of the art in XML streaming evaluation techniques. We focus on the streaming evaluation of both XPath expressions and XQuery queries. We classify the XPath streaming evaluation approaches into three categories according to the main data structure used for the evaluation: automaton-based, array-based, and stack-based. We review, analyze, and compare the major techniques proposed for each approach. We also review techniques for the streaming evaluation of multiple queries. For the XQuery streaming evaluation problem, we identify and discuss four processing paradigms adopted by existing XQuery stream query engines: the transducer-based paradigm, the algebra-based paradigm, the automata-algebra paradigm, and the pull-based paradigm. In addition, we review optimization techniques for XQuery streaming evaluation, treating the optimization of XQuery streaming evaluation as a buffer optimization problem. For all techniques discussed, we describe the research issues and the proposed algorithms, and we compare them with other relevant techniques.
All Science Journal Classification (ASJC) codes
- Information Systems
- Hardware and Architecture
Keywords
- XML query optimization
- XML streaming evaluation