Stanford InfoLab Publication Server

Query Optimization over Crowdsourced Data

Park, Hyunjung and Widom, Jennifer (2013) Query Optimization over Crowdsourced Data. In: 39th International Conference on Very Large Data Bases (VLDB), Trento, Italy.




Deco is a comprehensive system for answering declarative queries posed over stored relational data together with data obtained on demand from the crowd. In this paper we describe Deco's cost-based query optimizer, building on Deco's data model, query language, and query execution engine presented earlier. Deco's objective in query optimization is to find the best query plan to answer a query, in terms of estimated monetary cost. Deco's query semantics and plan execution strategies require several fundamental changes to traditional query optimization. Novel techniques incorporated into Deco's query optimizer include a cost model distinguishing between "free" existing data and paid new data, a cardinality estimation algorithm coping with changes to the database state during query execution, and a plan enumeration algorithm maximizing reuse of common subplans in a setting that makes reuse challenging. We experimentally evaluate Deco's query optimizer, focusing on the accuracy of cost estimation and the efficiency of plan enumeration.

Item Type: Conference or Workshop Item (Paper)
ID Code: 1063
Deposited By: Hyunjung Park
Deposited On: 31 Dec 2012 21:28
Last Modified: 18 Jun 2013 17:54
