Stanford InfoLab Publication Server

Exploiting Correlations for Expensive Predicate Evaluation

Joglekar, Manas and Garcia-Molina, Hector and Parameswaran, Aditya and Re, Chris. Exploiting Correlations for Expensive Predicate Evaluation. Technical Report. Stanford InfoLab.


This is the latest version of this item.



User-defined functions (UDFs) are increasingly used to augment query languages with extra, application-dependent functionality. UDFs tend to be expensive, in terms of either monetary cost or latency. In this paper, we study ways to efficiently evaluate selection queries involving UDFs. We provide a family of techniques for processing queries at low cost while satisfying user-specified precision and recall constraints. Our techniques apply to a wide variety of scenarios: when selection probabilities of tuples are available beforehand, when this information is available but noisy, and when no such prior information is available. We also generalize our techniques to more complex queries. Finally, we test our techniques on real datasets and show that they achieve cost savings of up to 80% while incurring only a small reduction in accuracy.
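To make the first scenario concrete (selection probabilities known beforehand), here is a minimal sketch of how a plan that skips some UDF calls trades cost against precision and recall. It is an illustration only, not the technique from the report: the two-threshold policy, the names `PlanEstimate` and `estimate_threshold_plan`, and the use of a ratio of expectations as the precision and recall estimate are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PlanEstimate:
    evaluations: int   # number of UDF calls the plan makes
    precision: float   # estimated precision (ratio of expectations)
    recall: float      # estimated recall (ratio of expectations)


def estimate_threshold_plan(probs: Sequence[float], lo: float, hi: float) -> PlanEstimate:
    """Estimate cost and accuracy of a hypothetical two-threshold plan.

    probs[i] is the assumed prior probability that tuple i satisfies the UDF.
    Tuples with p >= hi are returned without calling the UDF, tuples with
    p <= lo are dropped without calling it, and the rest are evaluated.
    """
    if not 0.0 <= lo <= hi <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= lo <= hi <= 1")

    evaluations = 0
    expected_true_returned = 0.0  # expected number of correct tuples in the output
    expected_returned = 0.0       # expected output size
    expected_true_total = 0.0     # expected number of tuples satisfying the UDF

    for p in probs:
        expected_true_total += p
        if p >= hi:
            # Returned blindly: always in the output, correct with probability p.
            expected_true_returned += p
            expected_returned += 1.0
        elif p > lo:
            # Evaluated: in the output only if the UDF is true, and then correct.
            evaluations += 1
            expected_true_returned += p
            expected_returned += p
        # Otherwise dropped blindly: contributes nothing to the output.

    precision = expected_true_returned / expected_returned if expected_returned > 0 else 1.0
    recall = expected_true_returned / expected_true_total if expected_true_total > 0 else 1.0
    return PlanEstimate(evaluations, precision, recall)
```

Under these assumptions, a caller could sweep `lo` and `hi` and keep the setting with the fewest evaluations whose estimated precision and recall still meet the user-specified constraints; setting `lo=0.0` and `hi=1.0` evaluates every tuple whose probability lies strictly between 0 and 1.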

Item Type: Techreport (Technical Report)
ID Code: 1108
Deposited By: Aditya Parameswaran
Deposited On: 12 Nov 2014 13:53
Last Modified: 12 Nov 2014 13:54
