Stanford InfoLab Publication Server

A path-based approach for web page retrieval

Li, JianQiang and Zhao, Yu and Garcia-Molina, Hector. A path-based approach for web page retrieval. Technical Report. Stanford InfoLab.


PDF (PathRank) - Submitted for Publication


The use of links to enhance page ranking has been widely studied. The underlying assumption is that links convey recommendations. Although this technique has been used successfully in global web search, it produces poor results for website search, because the majority of links within a website serve to organize information and convey no recommendation. This paper distinguishes these two kinds of links, those for recommendation and those for information organization, and describes a path-based method for web page ranking. We define the Hierarchical Navigation Path (HNP) as a new resource for improving web search. An HNP is composed of the multi-step navigation information in visitors’ browsing of a website, and it provides indications of the content of the destination page. We first classify the links inside a website. Then, the links used for web page organization are exploited to construct the HNPs for each page. Finally, we describe the PathRank algorithm for web page retrieval. Experiments show that our approach yields significant improvements over existing solutions.
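
The pipeline outlined in the abstract (classify links, build navigation paths from the organizational links, then use the paths for retrieval) can be illustrated with a minimal sketch. This is not the report's PathRank algorithm, whose details are not given on this page. It assumes links arrive already labeled as "org" or "rec" (hypothetical labels), takes the shortest chain of organizational links from the home page as a page's navigation path, and scores a page by simple query-term overlap with the anchor texts on that path. The function names `build_navigation_paths` and `path_term_overlap` are likewise hypothetical.

```python
from collections import deque


def build_navigation_paths(links, root):
    """Return {page: [anchor texts along a shortest organizational path from root]}.

    links: iterable of (source, target, anchor_text, kind), where kind is
    "org" (information organization) or "rec" (recommendation).
    These labels are an assumption made for this sketch.
    """
    # Keep only the links assumed to organize information.
    adjacency = {}
    for source, target, anchor, kind in links:
        if kind == "org":
            adjacency.setdefault(source, []).append((target, anchor))

    # Breadth-first search from the root gives one shortest path per page.
    paths = {root: []}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target, anchor in adjacency.get(page, []):
            if target not in paths:
                paths[target] = paths[page] + [anchor]
                queue.append(target)
    return paths


def path_term_overlap(path, query):
    """Count query terms that appear in the anchor texts of a navigation path.

    A stand-in scoring function for illustration, not the PathRank score.
    """
    path_terms = {term for anchor in path for term in anchor.lower().split()}
    return sum(1 for term in query.lower().split() if term in path_terms)


# Toy website: three organizational links and one recommendation link.
links = [
    ("/", "/research", "Research", "org"),
    ("/research", "/research/databases", "Database Systems", "org"),
    ("/", "/people", "People", "org"),
    ("/people", "/research/databases", "see also", "rec"),
]
paths = build_navigation_paths(links, "/")
score = path_term_overlap(paths["/research/databases"], "database research")
```

In this toy site the recommendation link from `/people` is ignored when paths are built, so the path for `/research/databases` carries only the anchor texts "Research" and "Database Systems", which hint at the content of the destination page.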

Item Type: Techreport (Technical Report)
ID Code: 967
Deposited By: JianQiang Li
Deposited On: 20 Apr 2010 09:21
Last Modified: 22 Apr 2010 09:33
