XML Stream Attribute Grammars (XSAGs)

(Christoph Koch and Stefanie Scherzinger)
Abstract
We introduce the new notion of XML Stream Attribute Grammars (XSAGs).
XSAGs are the first scalable query language for XML streams (running
strictly in linear time with bounded memory consumption independent of
the size of the stream) that allows for actual data transformations
rather than just document filtering. XSAGs are also relatively easy
for humans to use.
Moreover, the XSAG formalism provides a strong intuition for which
queries can or cannot be processed scalably on streams. We introduce
XSAGs together with the necessary language-theoretic machinery, study
their theoretical properties such as their expressiveness and
complexity, and discuss their implementation.
Publications
Christoph Koch, Stefanie Scherzinger: Attribute Grammars for Scalable Query Processing on XML Streams. DBPL 2003: 233-256. (209kB PDF file).
Christoph Koch, Stefanie Scherzinger: Attribute Grammars for Scalable Query Processing on XML Streams. To appear in VLDB Journal, 2005. Journal version.
Additional Resources

Our platform is a laptop with a 1.3 GHz processor and 256 MB of RAM
running Linux. The compiler is gcc version 3.2 with the -O3
command-line option. We parse XML files with a simple SAX parser based
on flex (version 2.5.4).
Two values are of immediate interest in XML stream processing, namely the memory consumption during query evaluation and the throughput of the prototype implementations.
We first study the evaluation of the following yXSAG queries: Q1: "Validate the input", Q2: "Select all books", Q3: "Select all books published in 2003", and Q4: "Select all books".
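To give a rough feel for what a query such as Q2 asks of a stream processor, the sketch below hand-writes a SAX-style filter in Python that echoes every book subtree and drops everything else. It is not XSAG syntax and not the prototype described here (which is built with gcc and a flex-based parser). The element name `book`, the class `BookSelector`, the helper `select_books`, and the sample document in the usage note are all assumptions made for illustration.

```python
import io
import xml.sax
from xml.sax.saxutils import escape, quoteattr


class BookSelector(xml.sax.ContentHandler):
    """Echo every <book> subtree to `out`; ignore all other content.

    Hypothetical illustration: the only state kept between SAX events
    is one integer, so the handler's memory does not grow with the
    length of the stream.
    """

    def __init__(self, out):
        super().__init__()
        self.out = out
        self.depth_in_book = 0  # 0 means "currently outside any book"

    def startElement(self, name, attrs):
        # Start echoing at a <book> tag, keep echoing while inside one.
        if self.depth_in_book or name == "book":
            self.depth_in_book += 1
            rendered = "".join(
                f" {key}={quoteattr(value)}" for key, value in attrs.items()
            )
            self.out.write(f"<{name}{rendered}>")

    def endElement(self, name):
        if self.depth_in_book:
            self.out.write(f"</{name}>")
            self.depth_in_book -= 1

    def characters(self, content):
        if self.depth_in_book:
            self.out.write(escape(content))


def select_books(xml_text):
    """Run the filter over a string and return the selected output.

    The output is collected in a StringIO only to make the sketch easy
    to try; a real stream processor would write to the output stream
    directly instead of buffering it.
    """
    out = io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), BookSelector(out))
    return out.getvalue()
```

For example, calling `select_books('<bib><book><title>C</title></book><article/></bib>')` passes the book element through and discards the article. A selection with a condition, such as Q3, is harder to process in a single pass if the publication year can appear after content that must already be emitted, which is the kind of distinction the XSAG formalism is meant to make explicit.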