Agrawal, Parag and Ikeda, Robert and Park, Hyunjung and Widom, Jennifer (2009) Trio-ER: The Trio System as a Workbench for Entity-Resolution. Technical Report. Stanford InfoLab.
Entity-resolution was one of the original motivating applications for the Trio system, which has been under development at Stanford over the past several years. Trio-ER is a new variant of the Trio system tailored specifically as a workbench for entity-resolution. Trio-ER enables rapid prototyping of an important basic class of entity-resolution algorithms. We begin by showing several new (and some old) constructs in Trio's data model and query language, and how they enable easy specification and refinement of entity-resolution matching and merging. We then show how iterative entity-resolution algorithms are performed using Trio, how Trio's lineage capabilities are integral to the process, and how confidence values are incorporated at several levels.
Item Type: Techreport (Technical Report)
Deposited By: Jennifer Widom
Deposited On: 20 Mar 2009 09:49
Last Modified: 29 Oct 2009 18:10