Chang, C. and Garcia-Molina, H. (1999) Approximate Query Translation (Extended Version). Technical Report. Stanford.
In this paper we present a mechanism for approximately translating Boolean query constraints across heterogeneous information sources. Achieving the best translation is challenging because sources support different constraints for formulating queries, and often these constraints cannot be precisely translated. For instance, a query [score > 8] might be "perfectly" translated as [rating > 0.8] at some site, but can only be approximated as [grade = A] at another. Unlike other work, our general framework adopts a customizable "closeness" metric for the translation that combines both precision and recall. Our results show that for query translation we need to handle interdependencies among both query conjuncts and disjuncts. As the basis, we identify the essential requirements of a rule system for users to encode the mappings for atomic semantic units. Our algorithm then translates complex queries by rewriting them in terms of the semantic units. We show that, under practical assumptions, our algorithm generates the best approximate translations with respect to the closeness metric of choice. We also discuss how the precision and recall of the translated queries can be estimated.
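The idea of ranking candidate translations by a closeness metric that combines precision and recall can be illustrated with a small sketch. It is not the report's algorithm or its metric: the weighted harmonic mean (F-beta) used here, the sample of integer scores 0 to 10, the mapping rating = score / 10, and the assumption that grade A covers scores of 8 and above are all hypothetical choices made only for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple

Item = Dict[str, float]
Predicate = Callable[[Item], bool]


def precision_recall(original: Predicate, translated: Predicate,
                     items: Iterable[Item]) -> Tuple[float, float]:
    """Estimate precision and recall of a translated query on sample items."""
    items = list(items)
    wanted = [original(it) for it in items]
    returned = [translated(it) for it in items]
    hits = sum(1 for w, r in zip(wanted, returned) if w and r)
    n_returned = sum(returned)
    n_wanted = sum(wanted)
    # Convention (an assumption): an empty answer set has perfect precision,
    # and an empty target set has perfect recall.
    precision = hits / n_returned if n_returned else 1.0
    recall = hits / n_wanted if n_wanted else 1.0
    return precision, recall


def closeness(precision: float, recall: float, beta: float = 1.0) -> float:
    """Assumed closeness metric: F-beta; beta > 1 weights recall more."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical sample: one item per integer score from 0 to 10.
sample = [{"score": float(s)} for s in range(11)]

# Original constraint from the abstract's example.
original = lambda it: it["score"] > 8

# Hypothetical target-side constraints expressed over the same items.
candidates = {
    "rating > 0.8": lambda it: it["score"] / 10 > 0.8,  # assumes rating = score / 10
    "grade = A": lambda it: it["score"] >= 8,           # assumes A means score >= 8
}

scores = {
    name: closeness(*precision_recall(original, pred, sample))
    for name, pred in candidates.items()
}
best = max(scores, key=scores.get)
```

Under these assumptions the rating constraint selects exactly the same items as the original query, while the grade constraint also returns items with a score of 8, so it keeps full recall but loses precision. Changing `beta` shows how a customizable metric can shift the preference between precision and recall.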
|Item Type:||Techreport (Technical Report)|
|Uncontrolled Keywords:||constraint mapping, approximate query translation, heterogeneity, information integration|
|Subjects:||Computer Science > Digital Libraries; Computer Science > Query Processing|
|Related URLs:||Project Homepage: http://www-diglib.stanford.edu/diglib/pub/|
|Deposited By:||Import Account|
|Deposited On:||25 Feb 2000 16:00|
|Last Modified:||27 Dec 2008 16:39|