Stanford InfoLab Publication Server

Matching Bounds for the All-Pairs MapReduce Problem

Afrati, Foto N. and Ullman, Jeffrey D. (2013) Matching Bounds for the All-Pairs MapReduce Problem. Technical Report. Stanford InfoLab.


PDF (Upper and matching lower bounds for all-pairs MapReduce problem) - Submitted for Publication


The all-pairs problem is an input-output relationship where each output corresponds to a pair of inputs, and each pair of inputs has a corresponding output. It models similarity joins where no simplification of the search for similar pairs, e.g., locality-sensitive hashing, is possible, and each input must be compared with every other input to determine those pairs that are "similar." When implemented by a MapReduce algorithm, there was a gap, a factor of 2, between the lower bound on necessary communication and the communication required by the best known algorithm. In this brief paper we show that the lower bound can essentially be met.

Item Type:Techreport (Technical Report)
Uncontrolled Keywords:MapReduce, similarity join, lower bound, all-pairs problem, replication rate, reducer size
ID Code:1074
Deposited By:Prof Jeffrey Ullman
Deposited On:09 Feb 2015 22:43
Last Modified:09 Feb 2015 22:43

Download statistics

Repository Staff Only: item control page