Melnik, Sergey and Raghavan, Sriram and Yang, Beverly and Garcia-Molina, Hector (2000) Building a Distributed Full-Text Index for the Web. Technical Report. Stanford InfoLab.
This is the latest version of this item.
We identify crucial design issues in building a distributed inverted index for a large collection of web pages. We introduce a novel pipelining technique for structuring the core index-building system that substantially reduces the index construction time. We also propose a storage scheme for creating and managing inverted files using an embedded database system. We propose and compare different strategies for collecting global statistics from distributed inverted indexes. Finally, we present performance results from experiments on a testbed distributed indexing system that we have implemented.
| Field | Value |
|---|---|
| Item Type | Techreport (Technical Report) |
| Additional Information | Previous number SIDL-WP-2000-0141 |
| Subjects | Computer Science > Digital Libraries |
| Related URLs | Project Homepage: http://www-diglib.stanford.edu/diglib/pub/ |
| Deposited By | Import Account |
| Deposited On | 30 Oct 2001 16:00 |
| Last Modified | 27 Dec 2008 15:08 |
Available Versions of this Item
- Building a Distributed Full-Text Index for the Web. (deposited 30 Jul 2000 17:00)
- Building a Distributed Full-Text Index for the Web. (deposited 30 Oct 2001 16:00) [Currently Displayed]