Stanford InfoLab Publication Server

Beyond Document Similarity: Understanding Value-Based Search and Browsing Technologies

Paepcke, A. and Garcia-Molina, H. and Rodriguez-Mula, G. and Cho, J. (2000) Beyond Document Similarity: Understanding Value-Based Search and Browsing Technologies. SIGMOD Record, 29 (1).




In the face of short, one- or two-word queries, the high volume of diverse documents on the Web is overwhelming search and ranking technologies that are based on document similarity measures. The growth of multimedia data within documents sharply exacerbates the shortcomings of these approaches. Recently, research prototypes and commercial experiments have added techniques that augment similarity-based search and ranking. These techniques rely on judgments about the 'value' of documents. Judgments are obtained directly from users, are derived by conjecture from observations of user behavior, or are surmised from analyses of documents and collections. These systems have all been pursued independently, and no common understanding of the underlying processes has been presented. We survey existing value-based approaches, develop a reference architecture that helps compare the approaches, and categorize the constituent algorithms. We explain the options for collecting value metadata and for using that metadata to improve search, the ranking of results, and information browsing. Based on our survey and analysis, we then point to several open problems.

Item Type: Article
Uncontrolled Keywords: Information retrieval, information filters, metadata, relevance, World-Wide Web, search engines, ranking, links, hypertext, collaborative filtering
Subjects: Computer Science > Digital Libraries
Projects: Digital Libraries
Related URLs: Project Homepage
ID Code: 481
Deposited By: Import Account
Deposited On: 25 Feb 2000 16:00
Last Modified: 27 Dec 2008 15:19

