The DBToaster Project
DBToaster is a novel SQL compiler that generates database engines for high-performance main-memory processing of streaming data. In a nutshell, DBToaster aggressively compiles aggregate queries to incremental (or delta-) form, enabling stream data to be processed highly efficiently, in contrast to today's operator-centric query plan interpreters. The DBToaster compiler produces database engines in native code to perform incremental view maintenance of continuous queries posed on update streams. Update streams cannot be addressed efficiently by today's systems, and one clear motivating application is that of algorithmic trading on orderbook data, where buy and sell orders on an exchange's orderbooks are updated arbitrarily. Algorithmic trading applications require high-performance processing, have bounded size state in practice, and often involve queries with a processing scope that cannot be addressed by constructs such as windows or punctuations.
The novelty behind DBToaster's query compilation to view maintenance code is twofold, first in its use of a recursive compilation technique to aggressively simplify queries to delta forms, and second, its maintenance of multiple map data structures throughout recursive compilation to support delta processing. To define recursive compilation, we contrast to traditional view maintenance algorithms. Current view maintenance techniques use query plans to compute result deltas from changes to a single base relation. To emphasize, the incremental form of a user-specified query is itself a query, but critically, a simpler query that exploits new query inputs. Typically, the new inputs allow the elimination of joins and simplifications of aggregate computations.
Today's view maintenance algorithms stop at this point. DBToaster on the other hand ploughs straight ahead, recursively applying compilation to transform delta forms that are themselves queries, to simpler and simpler queries, by considering combinations of base relation deltas. Our recursive compilation bottoms out at queries that can be represented as very simple procedural code statements. Furthermore, DBToaster maintains each delta form encountered as a map datastructure, essentially a group-by aggregate index derived from applying aggregate distributivity properties together with join-graph decompositions. These maps are incrementally maintained by the procedural code generated by recursively compiling the maps' delta forms. DBToaster internally uses a map algebra to represent and reason about queries and map datastructures, and performs recursive compilation through a set of transformation rules defined in our map algebra.
Here is an announcement of our demo at VLDB 2009.
Project Participants
- Yanif Ahmad
- Oliver Kennedy
- Christoph Koch
- Ki Suh Lee
- Anton Morozov
- Richard Yu Cheng
Publications
-
DBToaster: A SQL Compiler for High-Performance Delta Processing in Main-Memory Databases
Yanif Ahmad and Christoph Koch.
To appear in Proc. VLDB 2009. Demo paper. (pdf)