Joglekar, Manas and Garcia-Molina, Hector and Parameswaran, Aditya and Re, Chris. Exploiting Correlations for Expensive Predicate Evaluation. Technical Report. Stanford InfoLab.
This is the latest version of this item.
User-Defined Functions (UDFs) are increasingly used to augment query languages with extra, application-dependent functionality. UDFs tend to be expensive, either in terms of monetary cost or latency. In this paper, we study ways to efficiently evaluate selection queries involving UDFs. We provide a family of techniques for processing queries at low cost while satisfying user-specified precision and recall constraints. Our techniques are applicable to a wide variety of scenarios, such as when selection probabilities of tuples are available beforehand, when this information is available but noisy, or when no such prior information is available. We also generalize our techniques to more complex queries. Finally, we test our techniques on real datasets and show that they achieve significant cost savings of up to 80%, while incurring only a small reduction in accuracy.
|Item Type:||Techreport (Technical Report)|
|Deposited By:||Aditya Parameswaran|
|Deposited On:||12 Nov 2014 13:53|
|Last Modified:||12 Nov 2014 13:54|
Available Versions of this Item
- Exploiting Correlations for Evaluating Complex Queries. (deposited 02 Apr 2014 15:38)
- Exploiting Correlations for Evaluating Complex Queries. (deposited 29 May 2014 17:08)
- Exploiting Correlations for Expensive Predicate Evaluation. (deposited 12 Nov 2014 13:53) [Currently Displayed]