Park, Hyunjung and Ikeda, Robert and Widom, Jennifer (2011) RAMP: A System for Capturing and Tracing Provenance in MapReduce Workflows. In: 37th International Conference on Very Large Data Bases (VLDB), Seattle, Washington.
Abstract
RAMP (Reduce And Map Provenance) is an extension to Hadoop that supports provenance capture and tracing for workflows of MapReduce jobs. RAMP uses a wrapper-based approach, requiring little if any user intervention in most cases, while retaining Hadoop’s parallel execution and fault tolerance. We demonstrate RAMP on a real-world MapReduce workflow generated from a Pig script that performs sentiment analysis over Twitter data. We show how RAMP’s automatic provenance capture and tracing capabilities provide a convenient and efficient means of drilling down into and verifying output elements.
| Field | Value |
|---|---|
| Item Type | Conference or Workshop Item (Paper) |
| ID Code | 995 |
| Deposited By | Hyunjung Park |
| Deposited On | 22 Mar 2011 23:49 |
| Last Modified | 17 Jul 2011 11:03 |