Stanford InfoLab Publication Server

Efficient Query Processing for Modern Data Management

Srivastava, Utkarsh (2006) Efficient Query Processing for Modern Data Management. PhD thesis, Stanford University.




Efficient query processing in any data management system typically relies on: (a) a profiling component that gathers statistics used to evaluate possible query execution plans, and (b) a planning component that picks the plan with the best predicted performance. For query processing in a range of new data management scenarios, e.g., query processing over data streams and web services, traditional profiling and planning techniques developed for conventional relational database management systems are inadequate. This thesis develops several novel profiling and planning techniques to enable efficient query processing in these new scenarios. When data is arriving rapidly in the form of streams, and many registered queries must be continuously executed over this data, system resources such as memory and processing power may be stretched to their limit. First, for a class of computation-intensive queries, we describe how system throughput can be increased by exploiting sharing of computation among the registered queries. Then, for a class of memory-intensive queries, we consider the case when system memory is insufficient for obtaining exact answers, and give techniques for maximizing result accuracy under the given memory constraints. We then consider a distributed setting such as that of a sensor network, and give techniques for deciding the placement of query operators at network nodes in order to minimize system-wide consumption of resources. We then consider the scenario of web services, which have been emerging as a popular standard for sharing data and functionality among loosely coupled systems. For queries involving multiple web services, we give algorithms for finding the optimal execution plan. Finally, we turn to the profiling component, and describe new techniques for gathering statistics by looking not at the data but only at the query results. Such a technique is required when data access for collecting statistics is infeasible, as for web services, but can also be useful in traditional databases.

Item Type: Thesis (PhD)
Uncontrolled Keywords: data streams, query optimization, query processing, statistics, web services
Subjects: Computer Science > Data Streams; Computer Science > Query Processing
Related URLs: Project Homepage
ID Code: 786
Deposited By: Import Account
Deposited On: 25 Sep 2006 17:00
Last Modified: 19 Dec 2008 10:24
