Stanford InfoLab Publication Server

Trio-ER: The Trio System as a Workbench for Entity-Resolution

Agrawal, Parag and Ikeda, Robert and Park, Hyunjung and Widom, Jennifer (2009) Trio-ER: The Trio System as a Workbench for Entity-Resolution. Technical Report. Stanford InfoLab.

Abstract

Entity-resolution was one of the original motivating applications for the Trio system, which has been under development at Stanford over the past several years. Trio-ER is a new variant of the Trio system tailored specifically as a workbench for entity-resolution. Trio-ER enables rapid prototyping of an important basic class of entity-resolution algorithms. We begin by showing several new (and some old) constructs in Trio's data model and query language, and how they enable easy specification and refinement of entity-resolution matching and merging. We then show how iterative entity-resolution algorithms are performed using Trio, how Trio's lineage capabilities are integral to the process, and how confidence values are incorporated at several levels.

Item Type: Techreport (Technical Report)
ID Code: 912
Deposited By: Jennifer Widom
Deposited On: 20 Mar 2009 09:49
Last Modified: 29 Oct 2009 18:10
