Online Aggregation for Large MapReduce Jobs

Online Aggregation for Large MapReduce Jobs. Niketan Pansare¹, Vinayak Borkar², Chris Jermaine¹, Tyson Condie³. ¹Rice University, ²UC Irvine, ³Yahoo! Research. Outline: Motivation, Implementation, Experiments, Conclusion.


  1–3. OLA over a single machine. The confidence interval is found using classical sampling theory. Tuples are bundled into blocks, and the blocks arrive in random order. Example: find the SUM of the values in the blocks {7, 4, 2}, {8, 3}, {5, 9}, {1, 10, 6}. After three of the four blocks have arrived, the sample of block sums is {13, 11, 14} and the estimate is (13 + 11 + 14) × 4 / 3 = 50.67. Once the fourth block arrives, the sample is {13, 11, 14, 17} and the estimate is (13 + 11 + 14 + 17) × 4 / 4 = 55, the exact answer.
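A minimal sketch of this block-level estimator, in Python; the scale-up and the normal-approximation interval below are illustrative assumptions, not necessarily the paper's exact formulas.

```python
import random
import statistics

# Toy data: tuples bundled into blocks; blocks arrive in random order.
blocks = [[7, 4, 2], [8, 3], [5, 9], [1, 10, 6]]
random.shuffle(blocks)                 # random arrival order

N = len(blocks)                        # total number of blocks
seen = blocks[:3]                      # suppose only 3 of the 4 blocks have arrived
sample_sums = [sum(blk) for blk in seen]

n = len(sample_sums)
estimate = sum(sample_sums) * N / n    # scale the observed block sums up to N blocks
# Rough normal-approximation interval on the SUM (illustrative only;
# no finite-population correction).
stderr = statistics.stdev(sample_sums) * N / n ** 0.5
print(estimate, (estimate - 1.96 * stderr, estimate + 1.96 * stderr))
```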

  4. Extend existing approaches. OLA over a single machine: confidence interval from classical sampling theory; tuples are bundled into blocks, which arrive in random order. OLA over multiple machines: blocks are non-uniform (size, locality, machine, network), and the processing time for a block can be large and highly variable. Why won't the single-machine approach work, and how do we deal with these issues?

  5–17. OLA over multiple machines. Blocks are non-uniform (size, locality, machine, network), and the processing time for a block can be large and highly variable. Example: find the SUM of the values in the blocks {7, 4, 2}, {8, 3}, {5, 9}, {1, 10, 6}. [Figure: the blocks are laid out on a timeline whose x-axis is processing time; blocks that take a long time to process are drawn in red, blocks that finish quickly in green; arrows mark random time instances at which the blocks are polled.]

  18–19. OLA over multiple machines (same example). Notice that there are more arrows in the red regions than in the green regions. This is the inspection paradox: at any random time t, you are (stochastically) more likely to be processing the blocks that take a long time.

  20. Extend existing approaches (recap). Next question: why won't the single-machine approach work?

  21–24. Why won't the previous approach work? Because of the inspection paradox, at the time of estimation you are disproportionately processing the longer blocks. A correlation between processing time and value is also possible (e.g., for a count query, a block that takes longer to process typically contributes a larger count). Together these produce biased estimates, so the current techniques won't work; the effect was found experimentally in the 'MapReduce Online' paper. Therefore, we need to deal with the inspection paradox in a principled fashion.
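To make the bias concrete, here is a toy single-worker simulation (not from the paper); the block times, the value-equals-time "count query" setup, and the naive scale-up estimator are all assumptions for illustration.

```python
import random

TIMES = [1, 1, 1, 1, 1, 1, 1, 50]   # one "straggler" block among 8
VALUES = TIMES[:]                    # value proportional to time (e.g. a COUNT)

def one_trial():
    order = list(range(len(TIMES)))
    random.shuffle(order)                    # single worker, random block order
    poll = random.uniform(0, sum(TIMES))     # random polling time during the job
    elapsed, finished, in_flight = 0.0, [], 0.0
    for i in order:
        if elapsed + TIMES[i] > poll:
            in_flight = TIMES[i]             # block being processed at the poll
            break
        elapsed += TIMES[i]
        finished.append(VALUES[i])
    naive = sum(finished) * len(TIMES) / len(finished) if finished else None
    return in_flight, naive

trials = [one_trial() for _ in range(20000)]
naive = [n for _, n in trials if n is not None]
print("mean block time:", sum(TIMES) / len(TIMES))
print("mean time of the in-flight block:", sum(t for t, _ in trials) / len(trials))
print("true SUM:", sum(VALUES), " mean naive estimate:", sum(naive) / len(naive))
# The polled (in-flight) block is far longer than an average block -- the
# inspection paradox -- and the naive scale-up of finished blocks sits far
# below the true SUM, because long, high-value blocks rarely finish early.
```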

  25. Extend existing approaches (recap). Next question: how do we deal with these issues?

  26. How do we deal with the inspection paradox? Capture timing information (the processing time of each block) along with the values. Then, instead of using classical sampling theory, output estimates using a Bayesian model that allows for correlation between processing time and values and that also takes into account the processing time of the block currently being processed.

  27. Outline: Motivation, Implementation, Experiments, Conclusion.

  28–30. Implementation overview. Framework for distributed systems: MapReduce. Hadoop: staged processing → online; Hyracks (developed at UC Irvine): pipelining → "online", with an architecture (and API) similar to Hadoop (http://code.google.com/p/hyracks/). For estimates of the aggregation, two pieces are needed: (1) modifications to MapReduce (Hyracks), and (2) a Bayesian estimator.

  31. Modifications to MapReduce (Hyracks). Master: maintains a random ordering of the blocks (a logical queue, not a physical one) and assigns the block at the head of the queue; when a block reaches the head of the queue, a timer starts (its processing time). Two intermediate sets of files: a data file holding the values and a metadata file holding the timing information, consumed in the shuffle phase of the reducer.
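A minimal sketch of the master-side change (a randomized logical block queue with per-block timing), assuming a single shared queue; the class and method names are hypothetical, not the Hyracks API.

```python
import random
import time
from collections import deque

class ToyMaster:
    """Toy sketch of the master-side change: a randomized logical block queue
    plus per-block timing. Names are illustrative, not Hyracks APIs."""

    def __init__(self, block_ids):
        order = list(block_ids)
        random.shuffle(order)              # randomize once; logical, not physical
        self.queue = deque(order)
        self.started = {}                  # block id -> wall-clock start time

    def assign_next(self):
        """Hand the block at the head of the queue to the next idle worker."""
        if not self.queue:
            return None
        blk = self.queue.popleft()
        self.started[blk] = time.time()    # timer starts: this feeds t_process
        return blk

    def elapsed(self, blk):
        """Processing time so far (exact once the block finishes,
        a lower bound while it is still in flight)."""
        return time.time() - self.started[blk]

# Usage: master = ToyMaster(["Blk1", "Blk2", "Blk3"]); blk = master.assign_next()
```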

  32–48. Modifications to MapReduce (Hyracks): example walkthrough. The client submits the query select sum(stock_price) from nasdaq_db group by company; the input consists of blocks Blk1 {MSFT 2, AAPL 4}, Blk2 {ORCL 3}, Blk3 {AAPL 4}, Blk4 {MSFT 2}, Blk5 {ORCL 3}, Blk6 {MSFT 2}, Blk7 {AAPL 4}.
t = 1: the Master builds a logical queue of the blocks and randomizes it: Blk6, Blk5, Blk3, Blk1, Blk4, Blk7, Blk2.
t = 2: the Master forks the workers (Worker 1 and Worker 2).
t = 3: the workers request blocks.
t = 4: the Master reads the head of the queue and assigns Blk6 to Worker 1.
t = 5: Worker 1 starts reading Blk6.
t = 6: the Master assigns Blk5 to Worker 2.
t = 7: Worker 1 runs its map task on Blk6, producing <MSFT, 2>.
t = 8: Blk6 finishes with t_process = 4; <MSFT, 2> enters the reducer's shuffle phase (followed by the reduce phase).
t = 9: a random time instance triggers estimation. Reducer-MSFT has received <MSFT, 2> from Blk6 (t_process = 4), while Blk5 is still being processed by Worker 2 (t_process > 3). The estimation code is given Blk6: <MSFT, 2>, Blk6: t_process = 4, and Blk5: t_process > 3, and reports the interval [5.8, 8].
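The walkthrough ends with the reducer handing per-block values and timing to the estimation code. The record layout below is invented for illustration; the paper only specifies that there is a data file (values) and a metadata file (timing information).

```python
# Hypothetical record layout: field names and structure are illustrative only.
data_file = [
    ("Blk6", "MSFT", 2),            # value produced by a finished block
]
metadata_file = [
    ("Blk6", "finished", 4.0),      # t_process for a completed block
    ("Blk5", "in_flight", 3.0),     # lower bound: still running after 3 time units
]

def estimator_input(group, data, metadata, total_blocks):
    """Assemble, for one reduce group, what an estimator would need:
    observed (value, t_process) pairs plus lower bounds for in-flight blocks."""
    values = {blk: v for blk, g, v in data if g == group}
    finished, in_flight = [], []
    for blk, status, t in metadata:
        if status == "finished":
            finished.append((values.get(blk, 0), t))
        else:
            in_flight.append(t)      # only a censored observation of t_process
    return {"finished": finished, "in_flight": in_flight,
            "unseen_blocks": total_blocks - len(finished) - len(in_flight)}

print(estimator_input("MSFT", data_file, metadata_file, total_blocks=7))
# {'finished': [(2, 4.0)], 'in_flight': [3.0], 'unseen_blocks': 5}
```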

  49. Implementation overview (recap). For estimates of the aggregation we need (1) the modifications to MapReduce (Hyracks) described above and (2) a Bayesian estimator, described next.

  50–52. Bayesian estimator. Why? To deal with the inspection paradox. How? It allows for correlation between processing time and values, and it also takes into account the processing time of the block currently being processed. Implementation: C++ code using the GNU Scientific Library and Minuit2; input: the data file and metadata file from the reducer; output: a confidence interval, e.g. [995, 1005] with 95% probability.

  53–59. Bayesian estimator (model). A parameterized model over the timing information, T_process and T_scheduling, and the value X. Classical sampling theory works with the underlying distribution f(X); our approach models the joint distribution f(X, T_process, T_scheduling). This allows correlation between X, T_process, and T_scheduling, so that, for example, f(X | T_process > 100000000, T_scheduling = 22) ≠ f(X). Estimation uses Bayesian machinery: a Gibbs sampler, with the probability (update) equations developed for it. Detailed discussion is in the paper.
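As a toy illustration of why the joint distribution matters (this is not the paper's model; the lognormal block sizes and the threshold are assumptions), conditioning on a long processing time visibly shifts the distribution of X:

```python
import random

# A toy joint distribution f(X, T_process): both driven by block size, so they
# are positively correlated. This is NOT the paper's model, just an illustration
# of why f(X | T_process, T_scheduling) differs from f(X).
def draw():
    size = random.lognormvariate(0, 1)        # heavy-tailed block size
    t_process = size * random.uniform(0.8, 1.2)
    x = size * 10 * random.uniform(0.9, 1.1)  # block value grows with size
    return x, t_process

samples = [draw() for _ in range(200000)]
all_x = [x for x, _ in samples]
big_t = [x for x, t in samples if t > 5.0]    # condition on a long processing time

print("E[X]               ~", sum(all_x) / len(all_x))
print("E[X | T_process>5] ~", sum(big_t) / len(big_t))
# The conditional mean is several times larger: ignoring timing when a long
# block is observed (or still in flight) at estimation time biases the answer,
# which is why the estimator models X jointly with T_process and T_scheduling.
```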

  60. Outline: Motivation, Implementation, Experiments, Conclusion.

  61–64. Experiments. Hypotheses: a randomized queue is required; the estimator must allow correlation between processing time and value; the estimates converge. Experiment 1 (real dataset): select sum(page_count) from wikipedia_log group by language over 6 months of Wikipedia logs (220 GB compressed, 3,960 blocks); an 11-node cluster (4 disks, 4 cores, 12 GB RAM); uniform configuration of machines and blocks; 80 mappers and 10 reducers. Reading the figures: results are plotted against the percentage of data processed. Experiment 2 (simulated dataset): increased correlation between processing time and value (non-uniform configuration).
