Guo, Stephen and Parameswaran, Aditya and Garcia-Molina, Hector (2011) So Who Won? Dynamic Max Discovery with the Crowd. Technical Report. Stanford InfoLab.
A crowdsourcing database system is one that uses human workers to cleanse, populate, or filter its data. Just like a conventional DB system, such a crowdsourcing DB system requires data manipulation functions such as select, aggregate, maximum, average, and so on, except that it must now rely on human operators (that, for example, compare two objects) with very different latency, cost, and accuracy characteristics. In this paper, we focus on one such function, maximum, which finds the highest-ranked object or tuple in a set. In particular, we study two problems: first, given a set of crowdsourced votes (pairwise comparisons among objects), how do we select the maximum? Second, how do we improve our estimate by requesting additional votes? We show that in a crowdsourcing DB system, the optimal solution to both problems is NP-hard. We then provide heuristic functions to select the maximum given the evidence, and to select additional votes. We experimentally evaluate our functions to highlight their strengths and weaknesses.
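The max-selection problem described in the abstract can be illustrated with a naive win-count (Copeland-style) baseline: count each object's pairwise wins and return the object with the most. This is only a minimal sketch to make the input/output shape concrete; it is not the heuristic functions proposed in the report, and the `votes` data below is invented for illustration.

```python
from collections import defaultdict

def max_from_votes(votes):
    """Pick a likely maximum from noisy pairwise crowd votes.

    votes: list of (winner, loser) pairs, one per worker judgment.
    Returns the object with the most pairwise wins -- a simple
    win-count baseline, not the report's heuristic estimators.
    """
    wins = defaultdict(int)
    objects = set()
    for winner, loser in votes:
        wins[winner] += 1
        objects.update((winner, loser))
    # Break ties deterministically by object name.
    return max(objects, key=lambda o: (wins[o], o))

# Hypothetical votes: workers compared pairs of objects A, B, C.
votes = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "B"), ("A", "B")]
print(max_from_votes(votes))  # "A" has the most pairwise wins
```

With conflicting votes (here B and C disagree head-to-head), a simple count can still name a winner, but it ignores vote reliability; the report's point is that choosing the true maximum optimally under such noise is NP-hard, motivating heuristics.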
|Item Type:||Techreport (Technical Report)|
|Uncontrolled Keywords:||Crowdsourcing, Human Computation, Max, Voting, Aggregation|
|Deposited By:||Stephen Guo|
|Deposited On:||13 Mar 2012 11:05|
|Last Modified:||13 Mar 2012 11:46|
Available Versions of this Item
- So Who Won? Dynamic Max Discovery with the Crowd. (deposited 13 Mar 2012 11:05) [Currently Displayed]