Stanford InfoLab Publication Server

Caching and Database Scaling in Distributed Shared-Nothing Information Retrieval Systems

Tomasic, A. and Garcia-Molina, H. (1993) Caching and Database Scaling in Distributed Shared-Nothing Information Retrieval Systems. In: ACM SIGMOD International Conference on Management of Data (SIGMOD 1993), May 26-28, 1993, Washington, D.C.

Abstract

A common class of existing information retrieval systems provides access to abstracts. For example, Stanford University, through its FOLIO system, provides access to the INSPEC database of abstracts of the literature on physics, computer science, electrical engineering, etc. In this paper, this database is studied using a trace-driven simulation. We focus on physical index design, inverted index caching, and database scaling in a distributed shared-nothing system. All three issues are shown to have a strong effect on response time and throughput. Database scaling is explored in two ways. The first assumes an "optimal" configuration for a single host and then linearly scales the database by duplicating the host architecture as needed. The second determines the optimal number of hosts given a fixed database size.
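
As a rough illustration of what a trace-driven study of inverted index caching involves, the sketch below replays a list of queries against a cache of inverted lists and reports the fraction of term lookups served from the cache. It is a minimal sketch under stated assumptions, not the simulator used in the paper: the LRU replacement policy, the function name simulate_lru_cache, and the three-query trace are all hypothetical, since the abstract specifies neither the cache policy nor the trace format.

```python
from collections import OrderedDict


def simulate_lru_cache(trace, capacity):
    """Replay a query trace against an LRU cache of inverted lists.

    trace: iterable of queries, each a list of terms.
    capacity: maximum number of inverted lists held in the cache.
    Returns the fraction of term lookups served from the cache.
    LRU is an assumed policy, chosen only for illustration.
    """
    cache = OrderedDict()  # term -> placeholder for its inverted list
    hits = 0
    lookups = 0
    for query in trace:
        for term in query:
            lookups += 1
            if term in cache:
                hits += 1
                cache.move_to_end(term)  # mark as most recently used
            else:
                cache[term] = True  # stands in for reading the list from disk
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used
    return hits / lookups if lookups else 0.0


# Hypothetical trace, not taken from the INSPEC workload.
trace = [["database", "cache"], ["cache", "index"], ["database", "index"]]
hit_rate = simulate_lru_cache(trace, capacity=2)
```

Replaying the same trace at several cache capacities gives a hit-rate curve, which is the kind of measurement a trace-driven simulation can then relate to response time and throughput.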

Item Type: Conference or Workshop Item (Paper)
Subjects: Computer Science
Projects: Miscellaneous
Related URLs: Project Homepage: http://infolab.stanford.edu/
ID Code: 36
Deposited By: Import Account
Deposited On: 25 Feb 2000 16:00
Last Modified: 05 Feb 2009 16:08
