Stanford InfoLab Publication Server

Building a Distributed Full-Text Index for the Web

Melnik, Sergey and Raghavan, Sriram and Yang, Beverly and Garcia-Molina, Hector (2000) Building a Distributed Full-Text Index for the Web. Technical Report. Stanford.

Warning: There is a more recent version of this item available.



We identify crucial design issues in building a distributed inverted index for a large collection of web pages. We introduce a novel pipelining technique for structuring the core index-building system that substantially reduces the index construction time. We also propose a storage scheme for creating and managing inverted files using an embedded database system. We propose and compare different strategies for addressing various issues relevant to distributed index construction. Finally, we present performance results from experiments on a testbed distributed indexing system that we have implemented.
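
As a rough illustration of the idea of keeping inverted files in an embedded database, the sketch below builds term postings for a tiny page collection and stores one record per term. It is only a minimal sketch under stated assumptions: Python's built-in sqlite3 module stands in for the embedded database system, which this abstract does not name, and the table layout, the text encoding of postings, and the function names (build_postings, store_inverted_file, lookup) are hypothetical and not taken from the report. It does not attempt to show the pipelining technique or any distribution strategy.

```python
import re
import sqlite3
from collections import defaultdict


def build_postings(pages):
    """Map each term to a list of (page_id, position) postings.

    `pages` is a dict of page_id -> page text (a hypothetical input format).
    Pages are visited in sorted order so each postings list is sorted.
    """
    postings = defaultdict(list)
    for page_id in sorted(pages):
        tokens = re.findall(r"[a-z0-9]+", pages[page_id].lower())
        for position, term in enumerate(tokens):
            postings[term].append((page_id, position))
    return postings


def store_inverted_file(conn, postings):
    """Write one row per term; postings are encoded as 'page:pos' pairs."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS inverted_file ("
        "term TEXT PRIMARY KEY, postings TEXT NOT NULL)"
    )
    rows = [
        (term, " ".join(f"{page_id}:{pos}" for page_id, pos in plist))
        for term, plist in postings.items()
    ]
    with conn:  # commit all rows in a single transaction
        conn.executemany(
            "INSERT OR REPLACE INTO inverted_file VALUES (?, ?)", rows
        )


def lookup(conn, term):
    """Return the stored postings list for `term`, or [] if absent."""
    row = conn.execute(
        "SELECT postings FROM inverted_file WHERE term = ?", (term,)
    ).fetchone()
    if row is None:
        return []
    return [tuple(int(x) for x in item.split(":")) for item in row[0].split()]


if __name__ == "__main__":
    # Hypothetical two-page collection, held in an in-memory database.
    pages = {1: "Web index construction", 2: "Distributed index for the web"}
    conn = sqlite3.connect(":memory:")
    store_inverted_file(conn, build_postings(pages))
    print(lookup(conn, "index"))
```

Storing each term's postings as a single keyed record keeps lookups to one key access, which is the kind of access pattern an embedded key-value or relational store handles well. A real system would likely use a compact binary encoding rather than text.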

Item Type: Techreport (Technical Report)
Additional Information: Extended version of paper submitted to ICDE 2001
Uncontrolled Keywords: Full-text index, Web, WebBase, Text retrieval
Subjects: Computer Science > Databases and the Web; Computer Science > Digital Libraries
Projects: Digital Libraries
Related URLs: Project Homepage
ID Code: 448
Deposited By: Import Account
Deposited On: 30 Jul 2000 17:00
Last Modified: 27 Dec 2008 15:01
