Stanford InfoLab Publication Server

Precision and Recall of GlOSS Estimators for Database Discovery

Gravano, L., Garcia-Molina, H., and Tomasic, A. (1994) Precision and Recall of GlOSS Estimators for Database Discovery. Technical Report. Stanford University. (Publication Note: Part of this paper appeared in the Third International Conference on Parallel and Distributed Information Systems (PDIS 1994), Austin, Texas, September 28-30, 1994.)




The availability of large numbers of network information sources has led to a new problem: finding which text databases (out of perhaps thousands of choices) are the most relevant to a query. We call this the text-database discovery problem. Our solution to this problem, GlOSS (Glossary-Of-Servers Server), keeps statistics on the available databases to decide which ones are potentially useful for a given query. In this paper we present different query-result size estimators for GlOSS and we evaluate them with metrics based on the precision and recall concepts of text-document information-retrieval theory. Our generalization of these metrics uses different notions of the set of relevant databases to define different query semantics.
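
The evaluation idea in the abstract can be illustrated with a minimal sketch. It is not the report's actual estimators or metric definitions: the function names, the "any" and "best" notions of relevance, the empty-set conventions, and the sample numbers are all assumptions made for illustration. It treats the databases an estimator selects as a "chosen" set and compares it with a "relevant" set using ordinary set-based precision and recall.

```python
def precision_recall(chosen, relevant):
    """Set-based precision and recall of a database-selection decision."""
    chosen, relevant = set(chosen), set(relevant)
    hits = len(chosen & relevant)
    # Treating an empty set as a perfect score is an assumption of this sketch.
    precision = hits / len(chosen) if chosen else 1.0
    recall = hits / len(relevant) if relevant else 1.0
    return precision, recall


def relevant_databases(true_sizes, semantics="any"):
    """Hypothetical notions of relevance (not taken from the report).

    "any":  every database holding at least one matching document.
    "best": only the database(s) holding the most matching documents.
    """
    nonempty = {db for db, n in true_sizes.items() if n > 0}
    if semantics == "any":
        return nonempty
    if semantics == "best":
        if not nonempty:
            return set()
        top = max(true_sizes[db] for db in nonempty)
        return {db for db in nonempty if true_sizes[db] == top}
    raise ValueError(f"unknown semantics: {semantics!r}")


# Made-up numbers: true result sizes per database, and an estimator's guesses.
true_sizes = {"db_a": 40, "db_b": 0, "db_c": 5}
estimated_sizes = {"db_a": 35.0, "db_b": 2.0, "db_c": 0.0}

# Select every database whose estimated result size is positive.
chosen = {db for db, est in estimated_sizes.items() if est > 0}

p_any, r_any = precision_recall(chosen, relevant_databases(true_sizes, "any"))
p_best, r_best = precision_recall(chosen, relevant_databases(true_sizes, "best"))
```

Changing the notion of relevance changes the score for the same selection, which is the sense in which different relevant-set definitions give different query semantics.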

Item Type: Techreport (Technical Report)
Subjects: Computer Science
Related URLs: Project Homepage
ID Code: 62
Deposited By: Import Account
Deposited On: 25 Feb 2000 16:00
Last Modified: 05 Feb 2009 15:24
