Agrawal, Parag and Ikeda, Robert and Park, Hyunjung and Widom, Jennifer (2009) Trio-ER: The Trio System as a Workbench for Entity-Resolution. Technical Report. Stanford InfoLab.
Abstract
Entity-resolution was one of the original motivating applications for the Trio system, which has been under development at Stanford over the past several years. Trio-ER is a new variant of the Trio system tailored specifically as a workbench for entity-resolution. Trio-ER enables rapid prototyping of an important basic class of entity-resolution algorithms. We begin by showing several new (and some old) constructs in Trio's data model and query language, and how they enable easy specification and refinement of entity-resolution matching and merging. We then show how iterative entity-resolution algorithms are performed using Trio, how Trio's lineage capabilities are integral to the process, and how confidence values are incorporated at several levels.
| Field | Value |
|---|---|
| Item Type | Techreport (Technical Report) |
| ID Code | 912 |
| Deposited By | Jennifer Widom |
| Deposited On | 20 Mar 2009 09:49 |
| Last Modified | 29 Oct 2009 18:10 |