Stanford InfoLab Publication Server

The Efficacy of GlOSS for the Text Database Discovery Problem

Gravano, L. and Garcia-Molina, H. and Tomasic, A. (1993) The Efficacy of GlOSS for the Text Database Discovery Problem. Technical Report. Stanford InfoLab. (Publication Note: Part of paper appeared in SIGMOD '94.)




The popularity of information retrieval has led users to a new problem: finding which text databases (out of thousands of candidate choices) are the most relevant to a user. Answering a given query with a list of relevant databases is the text database discovery problem. The first part of this paper presents a practical method for attacking this problem, based on estimating the result size of a query at a database. The method is termed GlOSS (Glossary-Of-Servers Server). The second part of this paper evaluates GlOSS using four different semantics for answering a user's queries. Real users' queries were used in the experiments. We also describe several variations of GlOSS and compare their efficacy. In addition, we analyze the storage cost of our approach to the problem.
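The idea of ranking databases by estimated result size can be illustrated with a minimal sketch. It is not taken from the report: it assumes each database is summarized by its document count and per-word document frequencies, that queries are conjunctive (AND), and that query words occur independently of one another. The function names and the sample summaries are hypothetical.

```python
from math import prod


def estimate_result_size(db_size, doc_freq, query_terms):
    """Estimate how many documents match an AND query.

    Assumption (not stated in the abstract): query words occur
    independently, so the estimate is db_size * product(f_t / db_size).
    """
    if db_size == 0:
        return 0.0
    return db_size * prod(doc_freq.get(t, 0) / db_size for t in query_terms)


def rank_databases(summaries, query_terms):
    """Return (name, estimate) pairs with a nonzero estimate, largest first.

    `summaries` maps a database name to (document count, {word: doc frequency}).
    """
    estimates = [
        (name, estimate_result_size(size, freqs, query_terms))
        for name, (size, freqs) in summaries.items()
    ]
    return sorted((p for p in estimates if p[1] > 0), key=lambda p: p[1], reverse=True)


# Hypothetical summaries, for illustration only.
summaries = {
    "A": (100, {"text": 50, "database": 20}),
    "B": (1000, {"text": 10, "database": 10}),
    "C": (10, {"text": 5}),
}
ranking = rank_databases(summaries, ["text", "database"])
```

In this sketch, database C is dropped because one query word never appears in it, and the remaining databases are ordered by their estimates. Only the summaries are consulted, never the databases themselves, which is why the storage cost of the summaries matters.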

Item Type: Techreport (Technical Report)
Subjects: Computer Science
Related URLs: Project Homepage
ID Code: 29
Deposited By: Import Account
Deposited On: 25 Feb 2000 16:00
Last Modified: 02 Dec 2008 14:29
