Stanford InfoLab Publication Server

Compare Me Maybe: Crowd Entity Resolution Interfaces

Whang, Steven Euijong and McAuley, Julian and Garcia-Molina, Hector. Compare Me Maybe: Crowd Entity Resolution Interfaces. Technical Report. Stanford InfoLab.



We study the problem of enhancing entity resolution (ER) with the help of crowdsourcing. ER is the problem of identifying records that refer to the same real-world entity, and it can be extremely difficult for computer algorithms alone. For example, figuring out which images refer to the same person can be a hard task for computers, but an easy one for humans. An important component of crowdsourcing is the interface through which humans and algorithms interact. In this paper, we explore how interface design, along with other factors, affects the quality of human record comparisons. We also propose a model for separating good human workers from bad workers. Our analysis is based on extensive experiments on Amazon Mechanical Turk using real and synthetic image datasets.

Item Type: Techreport (Technical Report)
ID Code: 1061
Deposited By: Steven Whang
Deposited On: 25 Nov 2012 21:19
Last Modified: 25 Nov 2012 21:24
