Verroios, Vasilis and Garcia-Molina, Hector. Entity Resolution with Crowd Errors. Technical Report. Stanford InfoLab.
This is the latest version of this item.
Abstract
Given a set of records, an entity resolution (ER) algorithm finds the records that refer to the same real-world entity. Humans can often determine whether two records refer to the same entity, so we study the problem of selecting which questions to ask error-prone humans. We give a Maximum Likelihood formulation for the problem of finding the "most beneficial" questions to ask next. Our theoretical results lead to a lightweight and practical algorithm, bDENSE, for selecting these questions. Our experimental results show that bDENSE reaches an accurate outcome more quickly than two recently proposed approaches. Our experimental evaluation also identifies the strengths and weaknesses of all three approaches.
| Item Type: | Techreport (Technical Report) |
|---|---|
| ID Code: | 1097 |
| Deposited By: | Vasilis Verroios |
| Deposited On: | 03 Aug 2014 14:42 |
| Last Modified: | 03 Aug 2014 14:42 |
Available Versions of this Item
- Entity Resolution with Crowd Errors. (deposited 24 Feb 2014 19:25)
- Entity Resolution with Crowd Errors. (deposited 03 Aug 2014 14:42) [Currently Displayed]