Stanford InfoLab Publication Server

Continuous Queries over Data Streams

Arasu, Arvind (2006) Continuous Queries over Data Streams. PhD thesis, Stanford University.




Continuous queries (CQs) represent a new paradigm for interacting with dynamically changing data. Unlike traditional one-time queries, a CQ is registered with a data management system and provides continuous results as data and updates stream into the system. Applications include tracking real-time trends in stock market data, monitoring the health of a computer network, and online processing of sensor data. This thesis addresses several fundamental challenges in building a system for processing declaratively specified continuous queries. We first present a new language for specifying CQs, designed as an intuitive and natural extension of a traditional database query language. The language has been implemented in a comprehensive, publicly available research prototype called STREAM (for STanford stREam datA Manager). Since CQs are long-running, potentially requiring large amounts of memory, we next present a precise characterization of the amount of memory required for any query in the language. For an important class of queries that require unbounded memory, we describe algorithms that trade off answer accuracy for a lower memory requirement. Finally, we describe techniques for sharing resources such as computation and state across multiple CQs, enabling scalability to a very large number of concurrent CQs.
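
As an illustration of the difference between a one-time query and a continuous query, the following minimal Python sketch emits an updated answer each time a new value arrives on a stream. It is not the thesis's query language and does not use the STREAM prototype's API; the function name `continuous_avg`, the `window_size` parameter, and the sample prices are hypothetical and chosen only for this example. The sliding window over the most recent values is likewise an assumption, used here to keep the state bounded.

```python
from collections import deque
from typing import Iterable, Iterator


def continuous_avg(stream: Iterable[float], window_size: int) -> Iterator[float]:
    """Yield the average of the most recent `window_size` values after each arrival.

    Hypothetical illustration only: a one-time query would compute a single
    average over stored data, whereas this generator produces a new result
    every time the stream delivers a value.
    """
    window = deque()   # values currently inside the sliding window
    total = 0.0        # running sum of the values in the window
    for value in stream:
        window.append(value)
        total += value
        # Evict the oldest value once the window exceeds its size,
        # so memory use stays bounded by window_size.
        if len(window) > window_size:
            total -= window.popleft()
        yield total / len(window)


# Assumed sample data standing in for a stream of stock prices.
prices = [10.0, 12.0, 11.0, 15.0]
results = list(continuous_avg(prices, window_size=2))
print(results)  # [10.0, 11.0, 11.5, 13.0]
```

Because the window holds at most `window_size` values, this particular query needs only bounded memory; dropping the window and keeping every value for an exact answer over the whole stream is the kind of case where memory can grow without bound.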

Item Type: Thesis (PhD)
Uncontrolled Keywords: Data Streams, Continuous Queries
Subjects: Computer Science > Data Streams
ID Code: 792
Deposited By: Import Account
Deposited On: 26 Feb 2006 16:00
Last Modified: 18 Dec 2008 14:37
