Babcock, Brian and Olston, Chris (2003) Distributed Top-K Monitoring. In: ACM International Conference on Management of Data (SIGMOD 2003), June 9-12, 2003, San Diego, California.
This is the latest version of this item.
The querying and analysis of data streams has been a topic of much recent interest, motivated by applications from the fields of networking, web usage analysis, sensor instrumentation, telecommunications, and others. Many of these applications involve monitoring answers to continuous queries over data streams produced at physically distributed locations, and most previous approaches require streams to be transmitted to a single location for centralized processing. Unfortunately, the continual transmission of a large number of rapid data streams to a central location can be impractical or expensive. We study a useful class of queries that continuously report the k largest values obtained from distributed data streams ("top-k monitoring queries"), which are of particular interest because they can be used to reduce the overhead incurred while running other types of monitoring queries. We show that transmitting entire data streams is unnecessary to support these queries and present an alternative approach that reduces communication significantly. In our approach, arithmetic constraints are maintained at remote stream sources to ensure that the most recently provided top-k answer remains valid to within a user-specified error tolerance. Distributed communication is only necessary on occasion, when constraints are violated, and we show empirically through extensive simulation on real-world data that our approach reduces overall communication cost by an order of magnitude compared with alternatives that offer the same error guarantees.
|Item Type:||Conference or Workshop Item (Paper)|
|Uncontrolled Keywords:||top-k, online monitoring, continuous queries, data streams|
|Subjects:||Computer Science > Data Streams|
Computer Science > Distributed Systems
|Related URLs:||Project Homepage, Project Homepage||http://infolab.stanford.edu/stream/, http://infolab.stanford.edu/trapp/|
|Deposited By:||Import Account|
|Deposited On:||27 Feb 2003 16:00|
|Last Modified:||24 Dec 2008 08:37|
Available Versions of this Item
- Distributed Top-K Monitoring. (deposited 18 Dec 2002 16:00)
- Distributed Top-K Monitoring. (deposited 27 Feb 2003 16:00) [Currently Displayed]
Repository Staff Only: item control page