
Generic Entity Resolution in the SERF Project

Benjelloun, Omar and Garcia-Molina, Hector and Kawai, Hideki and Larson, Tait Eliott and Menestrina, David and Su, Qi and Thavisomboon, Sutthipong and Widom, Jennifer (2006) Generic Entity Resolution in the SERF Project. Technical Report. Stanford InfoLab. (Publication Note: IEEE Data Engineering Bulletin, June 2006)




The SERF project at Stanford deals with the Entity Resolution (ER) problem, in which records determined to represent the same real-life "entities" (such as people or products) are successively located and combined. The approach we pursue is "generic", in the sense that the specific functions used to match and merge records are viewed as black boxes, which permits efficient, expressive and extensible ER solutions. This paper motivates and introduces the principles of generic ER, and gives an overview of the research directions we have been exploring in the SERF project over the past two years.
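To make the black-box idea concrete, the following is a minimal illustrative sketch in Python, not the algorithm described in the report. It assumes records are sets of (field, value) pairs, and the names resolve, share_contact and union_merge are hypothetical, chosen only for this example. The resolution loop never inspects a record itself; it only calls the supplied match and merge functions, combining matching records until no two remaining records match.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple, TypeVar

R = TypeVar("R")


def resolve(records: Iterable[R],
            match: Callable[[R, R], bool],
            merge: Callable[[R, R], R]) -> List[R]:
    """Merge matching records until no two remaining records match.

    `match` and `merge` are treated as black boxes. The loop terminates
    only if repeated merging eventually stops producing new records.
    """
    resolved: List[R] = []
    pending = list(records)
    while pending:
        current = pending.pop()
        # Look for an already-resolved record that matches the current one.
        partner = next((r for r in resolved if match(current, r)), None)
        if partner is None:
            resolved.append(current)
        else:
            # Replace the pair by its merge and re-examine the merged record.
            resolved.remove(partner)
            pending.append(merge(current, partner))
    return resolved


# Hypothetical record model for illustration: a set of (field, value) pairs.
Record = FrozenSet[Tuple[str, str]]


def share_contact(a: Record, b: Record) -> bool:
    """Example match function: the records share an email or phone value."""
    return any(field in ("email", "phone") for field, _ in a & b)


def union_merge(a: Record, b: Record) -> Record:
    """Example merge function: keep every value seen in either record."""
    return a | b


people: List[Record] = [
    frozenset({("name", "J. Doe"), ("email", "jd@example.com")}),
    frozenset({("name", "John Doe"), ("email", "jd@example.com"),
               ("phone", "555-0100")}),
    frozenset({("name", "John Doe"), ("phone", "555-0100")}),
    frozenset({("name", "Alice"), ("email", "alice@example.com")}),
]

result = resolve(people, share_contact, union_merge)
```

In this sketch the three Doe records collapse into one combined record and the Alice record is left alone. Swapping in a different match or merge function changes the outcome without touching resolve, which is the sense in which the approach is generic.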

Item Type: Techreport (Technical Report)
Uncontrolled Keywords: Data Cleaning, Generic Entity Resolution, Record Linkage, Deduplication
Subjects: Computer Science > Data Integration and Mediation
ID Code: 779
Deposited By: Import Account
Deposited On: 11 Jun 2006 17:00
Last Modified: 18 Dec 2008 14:42
