Paepcke, A. and Garcia-Molina, H. and Rodriguez-Mula, G. and Cho, J. (2000) Beyond Document Similarity: Understanding Value-Based Search and Browsing Technologies. SIGMOD Record, 29 (1).
In the face of short, one- or two-word queries, high volumes of diverse documents on the Web are overwhelming search and ranking technologies that are based on document similarity measures. The increase of multimedia data within documents sharply exacerbates the shortcomings of these approaches. Recently, research prototypes and commercial experiments have added techniques that augment similarity-based search and ranking. These techniques rely on judgments about the 'value' of documents. Judgments are obtained directly from users, are derived by conjecture based on observations of user behavior, or are surmised from analyses of documents and collections. All these systems have been pursued independently, and no common understanding of the underlying processes has been presented. We survey existing value-based approaches, develop a reference architecture that helps compare the approaches, and categorize the constituent algorithms. We explain the options for collecting value metadata, and for using that metadata to improve search, the ranking of results, and information browsing. Based on our survey and analysis, we then point to several open problems.
|Uncontrolled Keywords:||Information retrieval, information filters, metadata, relevance, World-Wide Web, search engines, ranking, links, hypertext, collaborative filtering|
|Subjects:||Computer Science > Digital Libraries|
|Related URLs:||Project Homepage||http://www-diglib.stanford.edu/diglib/pub/|
|Deposited By:||Import Account|
|Deposited On:||25 Feb 2000 16:00|
|Last Modified:||27 Dec 2008 15:19|