Large-Scale Data Engineering: Designing and implementing algorithms for MapReduce
event.cwi.nl/lsde2015

PROGRAMMING FOR A DATA CENTRE

Programming for a data centre
• Understanding the design of warehouse-sized computers
  – Different techniques for a different setting
  – Requires quite a bit of rethinking
• MapReduce algorithm design
  – How do you express everything in terms of map(), reduce(), combine(), and partition()?
  – Are there any design patterns we can leverage?

Building Blocks
Source: Luiz André Barroso and Urs Hölzle (2009)

Storage Hierarchy

Scaling up vs. out
• No single machine is large enough
  – Smaller cluster of large SMP machines vs. larger cluster of commodity machines (e.g., 8 128-core machines vs. 128 8-core machines)
• Nodes need to talk to each other!
  – Intra-node latencies: ~100 ns
  – Inter-node latencies: ~100 μs
• Let’s model communication overhead

Modelling communication overhead
• Simple execution cost model:
  – Total cost = cost of computation + cost to access global data
  – Fraction of local access inversely proportional to size of cluster
  – n nodes (ignore cores for now)

    1 ms + f × [100 ns × (1/n) + 100 μs × (1 - 1/n)]

• Light communication: f = 1
• Medium communication: f = 10
• Heavy communication: f = 100
• What is the cost of communication?
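The cost model above can be evaluated directly. The following is a minimal sketch, assuming the formula is read as nanoseconds throughout; the function names and the choice to report cost relative to a single node are illustrative assumptions, not taken from the slides.

```python
# Cost model from the slide: 1 ms of computation plus f accesses to global
# data, each local with probability 1/n and remote otherwise.
COMPUTE_NS = 1_000_000   # 1 ms of computation
LOCAL_NS = 100           # intra-node latency, ~100 ns
REMOTE_NS = 100_000      # inter-node latency, ~100 us


def total_cost_ns(n, f):
    """Total cost in nanoseconds for n nodes and communication factor f."""
    local_fraction = 1 / n
    access = LOCAL_NS * local_fraction + REMOTE_NS * (1 - local_fraction)
    return COMPUTE_NS + f * access


def overhead(n, f):
    """Cost relative to a single node with the same f (an assumed metric)."""
    return total_cost_ns(n, f) / total_cost_ns(1, f)


if __name__ == "__main__":
    for f in (1, 10, 100):
        row = ", ".join(f"n={n}: {overhead(n, f):.2f}x" for n in (1, 2, 16, 128))
        print(f"f={f:>3}: {row}")
```

Even at n = 2, half of all accesses become remote, so for heavy communication (f = 100) the access term dominates the 1 ms of computation.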

Overhead of communication

Seeks vs. scans
• Consider a 1TB database with 100 byte records
  – We want to update 1 percent of the records
• Scenario 1: random access
  – Each update takes ~30 ms (seek, read, write)
  – 10^8 updates = ~35 days
• Scenario 2: rewrite all records
  – Assume 100MB/s throughput
  – Time = 5.6 hours(!)
• Lesson: avoid random seeks!
Source: Ted Dunning, on Hadoop mailing list
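The two estimates can be reproduced with a few lines of arithmetic. This sketch assumes decimal units (1 TB = 10^12 bytes) and that rewriting means reading the whole database once and writing it once; the second assumption is what makes the result match the 5.6 hours on the slide.

```python
DB_BYTES = 10**12              # 1 TB database
RECORD_BYTES = 100             # 100-byte records
UPDATE_PERCENT = 1             # update 1 percent of the records
SECONDS_PER_UPDATE = 0.030     # ~30 ms per random update (seek, read, write)
THROUGHPUT_BPS = 100 * 10**6   # 100 MB/s sequential throughput

records = DB_BYTES // RECORD_BYTES              # 10^10 records
updates = records * UPDATE_PERCENT // 100       # 10^8 updates

# Scenario 1: one random access per updated record.
random_days = updates * SECONDS_PER_UPDATE / 86_400

# Scenario 2: read the whole database and write it back (assumed 2 passes).
scan_hours = 2 * DB_BYTES / THROUGHPUT_BPS / 3_600

if __name__ == "__main__":
    print(f"random access: {random_days:.1f} days")
    print(f"full rewrite:  {scan_hours:.1f} hours")
```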

Numbers everyone should know
L1 cache reference                               0.5 ns
Branch mispredict                                  5 ns
L2 cache reference                                 7 ns
Mutex lock/unlock                                 25 ns
Main memory reference                            100 ns
Send 2K bytes over 1 Gbps network             20,000 ns
Read 1 MB sequentially from memory           250,000 ns
Round trip within same datacenter            500,000 ns
Disk seek                                 10,000,000 ns
Read 1 MB sequentially from disk          20,000,000 ns
Send packet CA → Netherlands → CA        150,000,000 ns
* According to Jeff Dean (LADIS 2009 keynote)

DEVELOPING ALGORITHMS

Optimising computation
• The cluster management software orchestrates the computation
• But we can still optimise the computation
  – Just as we can write better code and use better algorithms and data structures
  – At all times confined within the capabilities of the framework
• Cleverly-constructed data structures
  – Bring partial results together
• Sort order of intermediate keys
  – Control order in which reducers process keys
• Partitioner
  – Control which reducer processes which keys
• Preserving state in mappers and reducers
  – Capture dependencies across multiple keys and values

Preserving State
• Mapper object and Reducer object: one object per task, each holding state
• setup: API initialization hook
• map: one call per input key-value pair
• reduce: one call per intermediate key
• cleanup / close: API cleanup hook

Importance of local aggregation
• Ideal scaling characteristics:
  – Twice the data, twice the running time
  – Twice the resources, half the running time
• Why can’t we achieve this?
  – Synchronization requires communication
  – Communication kills performance
• Thus… avoid communication!
  – Reduce intermediate data via local aggregation
  – Combiners can help

Word count: baseline
class Mapper
  method map (docid a, doc d)
    for all term t in d do
      emit (t, 1);
class Reducer
  method reduce (term t, counts [c1, c2, …])
    sum = 0;
    for all counts c in [c1, c2, …] do
      sum = sum + c;
    emit (t, sum);
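The baseline can be simulated on a single machine to see how much intermediate data it produces. This is a sketch only: the in-memory grouping stands in for the framework's shuffle and sort, terms are split on whitespace, and `run_job` is a hypothetical driver, not a Hadoop API.

```python
from collections import defaultdict


def map_fn(doc_id, doc):
    """Emit (term, 1) for every term occurrence in the document."""
    for term in doc.split():
        yield term, 1


def reduce_fn(term, counts):
    """Sum all partial counts for one term."""
    return term, sum(counts)


def run_job(docs):
    """Return (final counts, number of intermediate pairs shuffled)."""
    groups = defaultdict(list)
    shuffled = 0
    for doc_id, doc in docs.items():
        for term, one in map_fn(doc_id, doc):
            groups[term].append(one)   # stand-in for shuffle and sort
            shuffled += 1
    result = dict(reduce_fn(term, counts) for term, counts in groups.items())
    return result, shuffled


if __name__ == "__main__":
    counts, shuffled = run_job({"d1": "a b a", "d2": "b c"})
    print(counts, "intermediate pairs:", shuffled)
```

Every term occurrence becomes one intermediate pair, which is the traffic that local aggregation tries to cut down.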

Word count: introducing combiners
class Mapper
  method map (docid a, doc d)
    H = associative_array(term → count);
    for all term t in d do
      H[t]++;
    for all term t in H do
      emit (t, H[t]);
Local aggregation reduces further computation

Word count: introducing combiners
class Mapper
  method initialise ()
    H = associative_array(term → count);
  method map (docid a, doc d)
    for all term t in d do
      H[t]++;
  method close ()
    for all term t in H do
      emit (t, H[t]);
Compute sums across documents!
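The same pattern in runnable form, as a sketch: one mapper object per task keeps a dictionary across map calls and emits only when the task closes. The class name, the `run_job` driver, and the representation of an input split as a list of (docid, doc) tuples are assumptions made for illustration.

```python
from collections import Counter, defaultdict


class InMapperCombiningMapper:
    def __init__(self):
        # initialise(): state that survives across map() calls in one task
        self.counts = Counter()

    def map(self, doc_id, doc):
        # Nothing is emitted here; counts accumulate across documents.
        self.counts.update(doc.split())

    def close(self):
        # Emit one (term, partial count) pair per distinct term.
        yield from self.counts.items()


def run_job(splits):
    """Return (final counts, number of intermediate pairs shuffled)."""
    groups = defaultdict(list)
    shuffled = 0
    for split in splits:
        mapper = InMapperCombiningMapper()   # one object per task
        for doc_id, doc in split:
            mapper.map(doc_id, doc)
        for term, count in mapper.close():
            groups[term].append(count)
            shuffled += 1
    return {term: sum(cs) for term, cs in groups.items()}, shuffled


if __name__ == "__main__":
    splits = [[("d1", "a b a"), ("d2", "b c")], [("d3", "a a")]]
    print(run_job(splits))
```

With these three documents the tasks emit 4 pairs in total, where the baseline would emit 7, one per term occurrence.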

Design pattern for local aggregation
• In-mapper combining
  – Fold the functionality of the combiner into the mapper by preserving state across multiple map calls
• Advantages
  – Speed
  – Why is this faster than actual combiners?
• Disadvantages
  – Explicit memory management required
  – Potential for order-dependent bugs
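One common way to handle the memory-management disadvantage is to flush the in-mapper dictionary whenever it grows past a limit. The slides do not prescribe this; the sketch below is an assumption-laden illustration in which `max_terms` is a hypothetical threshold and `emit` is a callback supplied by the caller.

```python
from collections import Counter


class BoundedInMapperCombiner:
    def __init__(self, emit, max_terms=100_000):
        self.emit = emit              # callback taking (term, count)
        self.max_terms = max_terms    # hypothetical memory bound
        self.counts = Counter()

    def map(self, doc_id, doc):
        for term in doc.split():
            self.counts[term] += 1
            # Flush partial counts as soon as the dictionary is full.
            if len(self.counts) >= self.max_terms:
                self.flush()

    def flush(self):
        for term, count in self.counts.items():
            self.emit(term, count)
        self.counts.clear()

    def close(self):
        # Emit whatever is still buffered when the task ends.
        self.flush()
```

Flushing early emits several partial counts for the same term, which is harmless because the reducer sums them anyway; the trade-off is more intermediate data in exchange for bounded memory.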

Combiner design
• Combiners and reducers share same method signature
  – Effectively they are map-side reducers
  – Sometimes, reducers can serve as combiners
  – Often, not…
• Remember: combiners are optional optimisations
  – Should not affect algorithm correctness
  – May be run 0, 1, or multiple times
• Example: find average of integers associated with the same key

Computing the mean: version 1
class Mapper
  method map (string t, integer r)
    emit (t, r);
class Reducer
  method reduce (string t, integers [r1, r2, …])
    sum = 0; count = 0;
    for all integers r in [r1, r2, …] do
      sum = sum + r; count++;
    r_avg = sum / count;
    emit (t, r_avg);
Can we use a reducer as the combiner?
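A small numeric check suggests the answer is no: averaging per-task averages generally differs from the true average. The split of values across two map tasks below is a made-up example.

```python
from statistics import mean

values = [1, 2, 3, 4, 5]

# Hypothetical split of the same key's values across two map tasks.
task_a, task_b = values[:2], values[2:]

# What the reducer alone would compute over all values.
true_mean = mean(values)

# What we would get if the reducer also ran as a combiner on each task.
mean_of_means = mean([mean(task_a), mean(task_b)])

if __name__ == "__main__":
    print("true mean:", true_mean, "mean of means:", mean_of_means)
```

The mean is not associative over partial results, so the reducer cannot double as the combiner here; the partial results must carry both a sum and a count.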

Computing the mean: version 2
class Mapper
  method map (string t, integer r)
    emit (t, r);
class Combiner
  method combine (string t, integers [r1, r2, …])
    sum = 0; count = 0;
    for all integers r in [r1, r2, …] do
      sum = sum + r; count++;
    emit (t, pair(sum, count));
class Reducer
  method reduce (string t, pairs [(s1, c1), (s2, c2), …])
    sum = 0; count = 0;
    for all pair(s, c) in [(s1, c1), (s2, c2), …] do
      sum = sum + s; count = count + c;
    r_avg = sum / count;
    emit (t, r_avg);
Wrong! The combiner emits (sum, count) pairs while the mapper emits plain integers, so the reducer only receives the input it expects if the combiner runs exactly once.

Computing the mean: version 3
class Mapper
  method map (string t, integer r)
    emit (t, pair(r, 1));
class Combiner
  method combine (string t, pairs [(s1, c1), (s2, c2), …])
    sum = 0; count = 0;
    for all pair(s, c) in [(s1, c1), (s2, c2), …] do
      sum = sum + s; count = count + c;
    emit (t, pair(sum, count));
class Reducer
  method reduce (string t, pairs [(s1, c1), (s2, c2), …])
    sum = 0; count = 0;
    for all pair(s, c) in [(s1, c1), (s2, c2), …] do
      sum = sum + s; count = count + c;
    r_avg = sum / count;
    emit (t, r_avg);
Fixed!
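Version 3 can be checked against the rule that a combiner may run 0, 1, or multiple times. In this sketch the `run_job` driver and its `combiner_passes` parameter are hypothetical; they exist only to apply the combiner a chosen number of times to each task's output before the simulated shuffle.

```python
from collections import defaultdict


def map_fn(key, value):
    # Mapper output already has the combiner's output type: (sum, count).
    yield key, (value, 1)


def combine_fn(key, pairs):
    pairs = list(pairs)
    yield key, (sum(s for s, _ in pairs), sum(c for _, c in pairs))


def reduce_fn(key, pairs):
    pairs = list(pairs)
    total = sum(s for s, _ in pairs)
    count = sum(c for _, c in pairs)
    return key, total / count


def group_by_key(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def run_job(splits, combiner_passes):
    """Run map, then the combiner `combiner_passes` times per task, then reduce."""
    shuffled = []
    for split in splits:
        out = [kv for key, value in split for kv in map_fn(key, value)]
        for _ in range(combiner_passes):
            out = [kv for k, vs in group_by_key(out).items()
                   for kv in combine_fn(k, vs)]
        shuffled.extend(out)
    return dict(reduce_fn(k, vs) for k, vs in group_by_key(shuffled).items())
```

Because the combiner's input and output types match, the result is the same regardless of how many times it runs.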

Computing the mean: version 4
class Mapper
  method initialise ()
    S = associative_array(string → integer);
    C = associative_array(string → integer);
  method map (string t, integer r)
    S[t] = S[t] + r;
    C[t]++;
  method close ()
    for all t in keys(S) do
      emit (t, pair(S[t], C[t]));
Simpler, cleaner, with no need for combiner
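A sketch of version 4 in Python, assuming one `MeanMapper` object per task and the same pair-summing reducer as in version 3; the class and function names are illustrative.

```python
from collections import defaultdict


class MeanMapper:
    def __init__(self):
        # initialise(): per-task running sums and counts, keyed by string
        self.sums = defaultdict(int)
        self.counts = defaultdict(int)

    def map(self, key, value):
        self.sums[key] += value
        self.counts[key] += 1

    def close(self):
        # One (sum, count) pair per key seen by this task.
        for key in self.sums:
            yield key, (self.sums[key], self.counts[key])


def reduce_fn(key, pairs):
    pairs = list(pairs)
    total = sum(s for s, _ in pairs)
    count = sum(c for _, c in pairs)
    return key, total / count


if __name__ == "__main__":
    first, second = MeanMapper(), MeanMapper()
    for key, value in [("x", 1), ("x", 2), ("y", 10)]:
        first.map(key, value)
    for key, value in [("x", 3), ("x", 4), ("x", 5)]:
        second.map(key, value)
    x_pairs = [v for m in (first, second) for k, v in m.close() if k == "x"]
    print(reduce_fn("x", x_pairs))
```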

Algorithm design: term co-occurrence
• Term co-occurrence matrix for a text collection
  – M = N × N matrix (N = vocabulary size)
  – M_ij: number of times i and j co-occur in some context (for concreteness, let’s say context = sentence)
• Why?
  – Distributional profiles as a way of measuring semantic distance
  – Semantic distance useful for many language processing tasks

Using MapReduce for large counting problems
• Term co-occurrence matrix for a text collection is a specific instance of a large counting problem
  – A large event space (number of terms)
  – A large number of observations (the collection itself)
  – Goal: keep track of interesting statistics about the events
• Basic approach
  – Mappers generate partial counts
  – Reducers aggregate partial counts
How do we aggregate partial counts efficiently?

First try: pairs
• Each mapper takes a sentence:
  – Generate all co-occurring term pairs
  – For all pairs, emit (a, b) → count
• Reducers sum up counts associated with these pairs
• Use combiners!

Pairs: pseudo-code
class Mapper
  method map (docid a, doc d)
    for all w in d do
      for all u in neighbours (w) do
        emit (pair(w, u), 1);
class Reducer
  method reduce (pair p, counts [c1, c2, …])
    sum = 0;
    for all c in [c1, c2, …] do
      sum = sum + c;
    emit (p, sum);
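A runnable sketch of the pairs approach. It assumes that the neighbours of a term are all other term positions in the same sentence (context = sentence), that terms are split on whitespace, and that a `Counter` can stand in for the shuffle and the summing reducer.

```python
from collections import Counter


def map_pairs(sentence):
    """Emit ((w, u), 1) for every ordered pair of distinct positions."""
    terms = sentence.split()
    for i, w in enumerate(terms):
        for j, u in enumerate(terms):
            if i != j:               # assumed definition of neighbours(w)
                yield (w, u), 1


def cooccurrence_pairs(sentences):
    """Sum the emitted counts per pair, as the reducer would."""
    counts = Counter()
    for sentence in sentences:
        for pair, one in map_pairs(sentence):
            counts[pair] += one
    return counts


if __name__ == "__main__":
    for pair, count in sorted(cooccurrence_pairs(["a b c", "a b"]).items()):
        print(pair, count)
```

A sentence with k terms yields k × (k - 1) intermediate pairs under this definition, which illustrates why the approach generates so much data to sort and shuffle.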

Analysing pairs
• Advantages
  – Easy to implement, easy to understand
• Disadvantages
  – Lots of pairs to sort and shuffle around (upper bound?)
  – Not many opportunities for combiners to work

Another try: stripes
• Idea: group together pairs into an associative array
    (a, b) → 1
    (a, c) → 2
    (a, d) → 5        a → { b: 1, c: 2, d: 5, e: 3, f: 2 }
    (a, e) → 3
    (a, f) → 2
• Each mapper takes a sentence:
  – Generate all co-occurring term pairs
  – For each term, emit a → { b: count_b, c: count_c, d: count_d … }
• Reducers perform element-wise sum of associative arrays
      a → { b: 1,       d: 5, e: 3 }
    + a → { b: 1, c: 2, d: 2,       f: 2 }
      a → { b: 2, c: 2, d: 7, e: 3, f: 2 }
Cleverly-constructed data structure brings together partial results
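The stripes idea can be sketched with one `Counter` per term. As with the pairs sketch, the definition of co-occurrence (all other positions in the same sentence) and the function names are assumptions; the reducer's element-wise sum is checked against the example on the slide.

```python
from collections import Counter, defaultdict


def map_stripes(sentence):
    """Emit one (term, stripe) pair per distinct term in the sentence."""
    terms = sentence.split()
    stripes = defaultdict(Counter)
    for i, w in enumerate(terms):
        for j, u in enumerate(terms):
            if i != j:               # assumed definition of co-occurrence
                stripes[w][u] += 1
    yield from stripes.items()


def reduce_stripes(term, stripes):
    """Element-wise sum of the associative arrays for one term."""
    total = Counter()
    for stripe in stripes:
        total.update(stripe)         # adds counts key by key
    return term, dict(total)


if __name__ == "__main__":
    partial = [{"b": 1, "d": 5, "e": 3}, {"b": 1, "c": 2, "d": 2, "f": 2}]
    print(reduce_stripes("a", partial))
```

A three-term sentence produces 3 stripes here instead of 6 pairs, at the cost of heavier values that must be serialised and merged.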
