Stanford InfoLab Publication Server

Generalizing GlOSS to Vector-Space Databases and Broker Hierarchies

Gravano, L. and Garcia-Molina, H. (1995) Generalizing GlOSS to Vector-Space Databases and Broker Hierarchies. Technical Report. Stanford InfoLab. (Publication Note: Shortened version appeared in VLDB '95.)




As large numbers of text databases have become available on the Internet, it is getting harder to locate the right sources for a given query. In this paper we present gGlOSS, a generalized Glossary-Of-Servers Server, which keeps statistics on the available databases in order to estimate which databases are potentially the most useful for a given query. gGlOSS extends our previous work [GGMT94a], which focused on databases using the boolean model of document retrieval, to cover databases using the more sophisticated vector-space retrieval model. We evaluate our new techniques using real user queries and 53 databases. Finally, we further generalize our approach by showing how to build a hierarchy of gGlOSS brokers. The top level of the hierarchy is so small that it could be widely replicated, even at end-user workstations.

Keywords: resource discovery, database selection, vector-space retrieval model, information retrieval, text databases
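
The abstract describes the idea only in outline: a broker keeps per-database statistics and uses them to rank databases for a query. The sketch below is one possible illustration of that idea, not the estimators defined in the report. It assumes that each database is summarized by the summed weight and document count of each term, and that a database's usefulness is approximated by the dot product of the query weights with those summed weights. The names `build_summary`, `estimate_usefulness`, and `rank_databases` are hypothetical.

```python
from collections import defaultdict


def build_summary(documents):
    """Collapse a database into per-term statistics (assumed summary format).

    Each document is a dict mapping term -> weight in the vector-space model.
    """
    summary = defaultdict(lambda: {"weight_sum": 0.0, "doc_count": 0})
    for doc in documents:
        for term, weight in doc.items():
            summary[term]["weight_sum"] += weight
            summary[term]["doc_count"] += 1
    return dict(summary)


def estimate_usefulness(query, summary):
    """Assumed estimate: dot product of query weights and summed term weights."""
    return sum(
        q_weight * summary[term]["weight_sum"]
        for term, q_weight in query.items()
        if term in summary
    )


def rank_databases(query, summaries):
    """Return database names ordered from most to least promising."""
    scores = {name: estimate_usefulness(query, s) for name, s in summaries.items()}
    # Sort by descending score, breaking ties by name for a stable order.
    return sorted(scores, key=lambda name: (-scores[name], name))


# Toy data, invented purely for illustration.
summaries = {
    "A": build_summary([{"database": 0.5, "retrieval": 0.5}, {"database": 0.25}]),
    "B": build_summary([{"network": 1.0}, {"retrieval": 0.5}]),
}
query = {"database": 1.0, "retrieval": 0.5}
ranking = rank_databases(query, summaries)
```

Under these assumptions the broker stores only one small record per term per database, never the documents themselves, which is what would make a summary compact enough to replicate widely.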

Item Type: Techreport (Technical Report)
Subjects: Computer Science
Related URLs: Project Homepage
ID Code: 105
Deposited By: Import Account
Deposited On: 25 Feb 2000 16:00
Last Modified: 02 Dec 2008 15:48
