Integrating Heterogeneous Databases: Lazy or Eager?

Widom, J. (1996) Integrating Heterogeneous Databases: Lazy or Eager? Technical Report. Stanford InfoLab. (Publication Note: ACM Computing Surveys 28A(4), December 1996, invited position paper)




Providing integrated access to multiple, distributed, heterogeneous, autonomous databases and other information sources is a topic that has been studied in the database research community for well over a decade. There has been a surge of work in the area recently, due primarily to increased demand from customers ("real" customers as well as funding agencies). Nevertheless, despite the longevity of the subfield and the current large population of researchers working in the area, no winning solution or even consensus of approach has emerged. In the research community, most approaches to solving the data integration problem are based very roughly on the following two-step process:

1. Accept a query, determine the appropriate set of information sources to answer the query, and generate the appropriate subqueries or commands for each information source.
2. Obtain results from the information sources, perform appropriate translation, filtering, and merging of the information, and return the final answer to the user or application (hereafter called the client).
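The two-step process above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the report: the `Source` wrapper, the `answer` function, the common row format, and the toy data are all hypothetical, and a real system would need far richer query planning and schema mapping.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]  # common (integrated) row format, an assumption of this sketch


@dataclass
class Source:
    """A hypothetical wrapper around one autonomous information source."""
    name: str
    relations: set                        # relations this source can contribute to
    run: Callable[[int], list]            # executes a source-specific subquery
    translate: Callable[[object], Row]    # maps a native result to the common format


# Two toy sources holding overlapping, made-up data in different native formats.
LIBRARY = [{"title": "Datalog Basics", "year": 1989},
           {"title": "View Maintenance", "year": 1995}]
ARCHIVE = [("View Maintenance", "1995"), ("Mediators", "1992")]

SOURCES = [
    Source("library", {"papers"},
           run=lambda min_year: [r for r in LIBRARY if r["year"] >= min_year],
           translate=lambda r: {"title": r["title"], "year": r["year"]}),
    Source("archive", {"papers"},
           # This source cannot filter by year, so it returns everything.
           run=lambda min_year: list(ARCHIVE),
           translate=lambda t: {"title": t[0], "year": int(t[1])}),
    Source("payroll", {"employees"},
           run=lambda min_year: [],
           translate=lambda r: r),
]


def answer(relation: str, min_year: int, sources: List[Source]) -> List[Row]:
    # Step 1: determine the relevant sources and issue a subquery to each.
    relevant = [s for s in sources if relation in s.relations]
    merged: Dict[str, Row] = {}
    for source in relevant:
        for native in source.run(min_year):
            # Step 2: translate, filter, and merge the returned information.
            row = source.translate(native)
            if row["year"] >= min_year:               # filter what the source could not
                merged.setdefault(row["title"], row)  # merge duplicates by title
    return sorted(merged.values(), key=lambda r: r["title"])


result = answer("papers", 1990, SOURCES)
print(result)
```

In this sketch the client asks for papers from 1990 onward. The `payroll` source is skipped in step 1, and the record that appears in both remaining sources is returned only once after step 2. Merging by title is a deliberately naive choice; deciding when two records from different sources describe the same entity is one of the harder parts of the problem.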

