Benjelloun, Omar and Garcia-Molina, Hector and Kawai, Hideki and Larson, Tait Eliott and Menestrina, David and Su, Qi and Thavisomboon, Sutthipong and Widom, Jennifer (2006) Generic Entity Resolution in the SERF Project. Technical Report. Stanford InfoLab. (Publication Note: IEEE Data Engineering Bulletin, June 2006)
Abstract
The SERF project at Stanford deals with the Entity Resolution (ER) problem, in which records determined to represent the same real-life "entities" (such as people or products) are successively located and combined. The approach we pursue is "generic", in the sense that the specific functions used to match and merge records are viewed as black boxes, which permits efficient, expressive and extensible ER solutions. This paper motivates and introduces the principles of generic ER, and gives an overview of the research directions we have been exploring in the SERF project over the past two years.
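As an illustration of the "black box" idea in the abstract, the following is a minimal sketch of a naive resolution loop that takes the match and merge functions as parameters. It is not one of the SERF algorithms and is not taken from the report: the record representation (a frozenset of attribute/value pairs), the function names `resolve`, `share_contact` and `union_merge`, and the sample data are all hypothetical, chosen only to show how records can be successively matched and combined without the loop knowing anything about the matching or merging logic.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Tuple

# Assumption: a record is a frozenset of (attribute, value) pairs.
Record = FrozenSet[Tuple[str, str]]


def resolve(
    records: List[Record],
    match: Callable[[Record, Record], bool],
    merge: Callable[[Record, Record], Record],
) -> List[Record]:
    """Naive fixed-point loop: merge matching pairs until none remain.

    `match` and `merge` are treated as black boxes supplied by the caller.
    """
    resolved = list(dict.fromkeys(records))  # drop exact duplicates, keep order
    changed = True
    while changed:
        changed = False
        for a, b in combinations(resolved, 2):
            if match(a, b):
                merged = merge(a, b)
                # Replace the two source records with their merged record.
                resolved = [r for r in resolved if r != a and r != b]
                if merged not in resolved:
                    resolved.append(merged)
                changed = True
                break  # restart the scan, since the record set changed
    return resolved


# Hypothetical black boxes, for illustration only.
def share_contact(a: Record, b: Record) -> bool:
    """Match if the records share an email or phone value."""
    return any(attr in ("email", "phone") for attr, _ in a & b)


def union_merge(a: Record, b: Record) -> Record:
    """Merge by keeping every attribute/value pair from both records."""
    return a | b


r1 = frozenset({("name", "J. Doe"), ("email", "jd@example.com")})
r2 = frozenset({("name", "John Doe"), ("email", "jd@example.com"),
                ("phone", "555-0100")})
r3 = frozenset({("name", "John Doe"), ("phone", "555-0100")})
r4 = frozenset({("name", "A. Smith"), ("email", "as@example.com")})

result = resolve([r1, r2, r3, r4], share_contact, union_merge)
print(len(result))  # 2
```

In this sample, `r1` and `r2` match on email, and their merged record then matches `r3` on phone, which shows why resolution proceeds successively: a merge can enable matches that no original pair exhibited. Swapping in different `share_contact` or `union_merge` functions changes the outcome without touching `resolve`.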
Item Type: | Techreport (Technical Report) |
---|---|
Uncontrolled Keywords: | Data Cleaning, Generic Entity Resolution, Record Linkage, Deduplication |
Subjects: | Computer Science > Data Integration and Mediation |
Projects: | Miscellaneous |
ID Code: | 779 |
Deposited By: | Import Account |
Deposited On: | 11 Jun 2006 17:00 |
Last Modified: | 18 Dec 2008 14:42 |