Stanford InfoLab Publication Server

Exploiting Correlations for Evaluating Complex Queries

Joglekar, Manas and Garcia-Molina, Hector and Parameswaran, Aditya and Re, Chris. Exploiting Correlations for Evaluating Complex Queries. Technical Report. Stanford InfoLab.

Warning: There is a more recent version of this item available.

PDF - Submitted for Publication
528Kb

Abstract

User-defined functions (UDFs) are increasingly used to augment query languages with extra, application-dependent functionality. UDFs tend to be expensive, either in terms of monetary cost or latency. In this paper, we study ways to efficiently evaluate selection queries involving UDFs. We provide a family of techniques for processing queries at low cost while satisfying user-specified precision and recall constraints. Our techniques are applicable to a wide variety of scenarios, such as when selection probabilities of tuples are available beforehand, when this information is available but noisy, or when no such prior information is available. We also generalize our techniques to more complex queries. Finally, we test our techniques on real datasets and show that they achieve significant cost savings of up to 80%, while incurring only a small reduction in accuracy.

Item Type: Techreport (Technical Report)
ID Code: 1092
Deposited By: Aditya Parameswaran
Deposited On: 02 Apr 2014 15:38
Last Modified: 29 May 2014 17:08
