Stanford InfoLab Publication Server

Distributed Top-K Monitoring

Babcock, Brian and Olston, Chris (2002) Distributed Top-K Monitoring. Technical Report. Stanford.

BibTeXDublinCoreEndNoteHTML
WarningThere is a more recent version of this item available.

[img]
Preview
PDF
380Kb

Abstract

The querying and analysis of data streams has been a topic of much recent interest, motivated by applications from the fields of networking, web usage analysis, sensor instrumentation,telecommunications, and others. Many of these applications involve onitoring answers to continuous queries over data streams produced at physically distributed locations, and most previous approaches require streams to be transmitted to a single location for centralized rocessing. Unfortunately, the continual transmission of a large number of rapid data streams to a central location can be impractical or expensive. We study a useful class of queries that continuously report the k largest values obtained from distributed data streams "top-k monitoring queries"), which are of particular interest ecause they can be used to reduce the overhead incurred while running other types of monitoring queries. We show that transmitting entire data streams is unnecessary to support these queries and present an alternative approach that reduces communication significantly. In our approach, arithmetic constraints are maintained at remote stream sources to ensure that the most recently provided top-k answer remains valid to within a user-specified error tolerance. Distributed communication is only necessary on occasion, when constraints are violated, and we show empirically through extensive simulation on real-world data that our approach reduces overall communication cost by an order of magnitude compared with alternatives that offer the same error guarantees.

Item Type:Techreport (Technical Report)
Uncontrolled Keywords:top-k; online monitoring; data streams; distributed streams
Subjects:Computer Science > Data Streams
Projects:TRAPP
Related URLs:Project Homepagehttp://infolab.stanford.edu/trapp/
ID Code:568
Deposited By:Import Account
Deposited On:18 Dec 2002 16:00
Last Modified:25 Dec 2008 08:35

Available Versions of this Item

Download statistics

Repository Staff Only: item control page