

slide-1
SLIDE 1

15-150 Fall 2020

Stephen Brookes

Lecture 17 Sequences and cost graphs

Halloween, a full moon and a time change all happening simultaneously

slide-2
SLIDE 2

announcements

  • Next Tuesday (3 Nov) is ELECTION DAY
  • Class will be held as usual (on zoom)
  • Homework is NOT DUE on Tuesday

to allow you time to participate in voting

  • NEW: the TAs will hold a Weekly Review

this evening (will be recorded, too)

slide-3
SLIDE 3

today

parallel programming

  • cost semantics
  • Brent’s Theorem
  • sequences:

an abstract type with efficient parallel operations

slide-4
SLIDE 4

parallelism

exploiting multiple processors by evaluating independent code simultaneously

  • low-level implementation
  • scheduling work onto processors
  • high-level planning
  • designing code abstractly
  • without baking in a schedule
slide-5
SLIDE 5
our approach

design abstractly

  • behavioral correctness
  • asymptotic runtime (work, span)

reason abstractly

  • independently of schedule
  • cost semantics and evaluation
slide-6
SLIDE 6
  • You design the code
  • The compiler schedules the work
slide-7
SLIDE 7

functional benefits

  • No side effects, so…

evaluation order doesn’t affect correctness

  • Can build abstract types that support

efficient parallel-friendly operations

  • Can use work and span to predict

potential for parallel speed-up

  • Work and span are independent of

scheduling details

slide-8
SLIDE 8

caveat

  • In practice, it’s hard to achieve speed-up
  • Current language implementations

don’t make it easy

  • Problems include:
  • scheduling overhead
  • locality of data (cache problems)
  • runtime sensitive to scheduling choices
slide-9
SLIDE 9

why bother?

  • It’s good to think abstractly first

and figure out details later

  • Focus on data dependencies

when you design your code

  • Our thesis: this approach to parallelism

will prevail... (and 15-210 builds on these ideas...)

slide-10
SLIDE 10

cost semantics

We already introduced work and span

  • Work estimates the sequential evaluation time
    on a single processor
  • Span takes account of data dependency, and estimates the
    parallel evaluation time with unlimited processors

slide-11
SLIDE 11

cost semantics

  • We showed how to calculate work and span

for recursive functions with recurrence relations

  • Now we introduce cost graphs,

another way to deal with work and span

  • Cost graphs also allow us to talk about schedules...

... and the potential for speed-up

slide-12
SLIDE 12

cost graphs

A cost graph is a series-parallel graph

  • a directed graph, with source and sink
  • nodes represent (constant-time) units of work
  • edges represent data dependencies
  • branching indicates potential parallelism

slide-13
SLIDE 13

series-parallel graphs

A series-parallel graph is built from:

  • a single node
  • sequential composition of G1 and G2: an edge from the sink of G1
    to the source of G2
  • parallel composition of G1 and G2: the two graphs side by side,
    between a new source and a new sink

slide-14
SLIDE 14

work and span

  • The work is the number of nodes
  • The span is the length of the longest path

from source to sink

of a cost graph G

  span(G) ≤ work(G)

slide-15
SLIDE 15

sequential composition (G1 then G2):

  work = work G1 + work G2 + c

parallel composition (G1 and G2 side by side):

  work = work G1 + work G2 + c

dependent code … add the work
independent code … add the work

slide-16
SLIDE 16

sequential composition (G1 then G2):

  span = span G1 + span G2 + c

parallel composition (G1 and G2 side by side):

  span = max(span G1, span G2) + c

dependent code … add the span
independent code … max the span
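As a sketch (not from the slides), the composition rules above translate directly into SML over a datatype of series-parallel cost graphs. The datatype and names here are ours; the additive constant c is dropped, since these pictures omit explicit source and sink nodes:

```sml
(* series-parallel cost graphs, as a sketch *)
datatype graph = One                      (* a single node: one unit of work *)
               | Series of graph * graph  (* sequential composition *)
               | Par of graph * graph     (* parallel composition *)

(* work = number of nodes: add for both compositions *)
fun work One = 1
  | work (Series (g1, g2)) = work g1 + work g2
  | work (Par (g1, g2))    = work g1 + work g2

(* span = longest path: add when sequential, max when parallel *)
fun span One = 1
  | span (Series (g1, g2)) = span g1 + span g2
  | span (Par (g1, g2))    = Int.max (span g1, span g2)
```

For every graph built this way, span(G) ≤ work(G), since max never exceeds sum.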

slide-17
SLIDE 17

sources and sinks

  • Sometimes we omit them from pictures
  • no loss of generality
  • easy to put them back in
  • No difference, asymptotically
  • a single node represents an additive

constant amount of work and span

  • Allows easier explanation of execution
slide-18
SLIDE 18

example

[cost graph with 11 numbered nodes ①–⑪]

work = 11  (number of nodes)
span = 4   (longest path length)

An edge means the earlier node must be done before the later one;
each node represents a single unit of work.

slide-19
SLIDE 19

using cost graphs

  • Every expression can be given a cost graph
  • Can calculate work and span using the graph
  • These are asymptotically the same as

the work and span derived from recurrence relations

Work and span provide asymptotic estimates of actual running time,
under certain assumptions:

  • basic ops take constant time
  • work: single processor
  • span: unlimited processors

slide-20
SLIDE 20

scheduling

  • Work: number of nodes
  • Span: length of critical path

[cost graph of nodes ①–⑪: w = 11, s = 4]

An optimal parallel schedule assigns units of work to processors,
respecting data dependency: rounds (i)–(v), i.e. 5 rounds, or 4 steps,
using 5 processors.

slide-21
SLIDE 21

example

What if there are only 2 processors?

[cost graph of nodes ①–⑪: w = 11, s = 4]

A best schedule for 2 processors: rounds (i)–(vi), i.e. 6 rounds, 5 steps.

2 processors cannot do the job as fast as 5 (!)

slide-22
SLIDE 22

Brent’s Theorem

An expression with work w and span s can be evaluated on a p-processor machine in time O(max(w/p, s)).

Optimal schedule using p processors: do (up to) p units of work each round.
Total work to do is w; needs at least s steps.

Find me the smallest p such that w/p ≤ s.
Using more than this many processors won’t yield any speed-up.

Richard Brent is an illustrious Australian mathematician and computer
scientist. He is known for Brent’s Theorem, which shows that a parallel
algorithm can always be adapted to run on fewer processors with only the
obvious time penalty: a beautiful example of an “obvious” but non-trivial
theorem.

David Brent is the manager of the Slough branch of Wernham-Hogg. He wants
to know how many computers to buy to improve office efficiency.
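The smallest p with w/p ≤ s is ⌈w/s⌉, so David Brent's question has a one-line answer. A sketch in SML (the function name minProcs is ours, not from the lecture):

```sml
(* smallest p such that w/p <= s, i.e. ceiling(w/s); assumes w, s > 0 *)
fun minProcs (w, s) = (w + s - 1) div s

val p = minProcs (11, 4)  (* 3: matches the graph with w = 11, s = 4 *)
```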

slide-23
SLIDE 23

example

[cost graph of nodes ①–⑪: w = 11, s = 4]

min {p | w/p ≤ s} is 3

A best schedule for 3 processors (5 rounds, 4 steps):

  (i) ① ③ ④   (ii) ② ⑤ ⑥   (iii) ⑦ ⑧   (iv) ⑨ ⑩   (v) ⑪

3 processors can do the work as fast as 5 (!)

slide-24
SLIDE 24

summary

  • Cost graphs give us another way to talk

about work and span

  • Brent’s Theorem tells us about the

potential for parallel speed-up

  • check if w/p ≤ s
slide-25
SLIDE 25

next

  • Exploiting parallelism in ML
  • A signature for parallel collections
  • Cost analysis of implementations
  • Cost benefits of parallel algorithm design
  • we revisit some list-based functions
  • sequence-based functions are faster
slide-26
SLIDE 26

sequences

signature SEQ =
sig
  type 'a seq
  exception Range
  val tabulate  : (int -> 'a) -> int -> 'a seq
  val length    : 'a seq -> int
  val nth       : int -> 'a seq -> 'a
  val split     : 'a seq -> 'a seq * 'a seq
  val map       : ('a -> 'b) -> 'a seq -> 'b seq
  val reduce    : ('a * 'a -> 'a) -> 'a -> 'a seq -> 'a
  val mapreduce : ('a -> 'b) -> 'b -> ('b * 'b -> 'b) -> 'a seq -> 'b
end

slide-27
SLIDE 27

SEQ

  • We may expand the SEQ signature later…

… with some extra functions

  • For today, let’s keep it simple
  • Purpose: a value of type t seq

is a sequence of values of type t

  • with faster operations

than those available for t list

slide-28
SLIDE 28

implementations

  • Many ways to implement the signature
  • lists, balanced trees, arrays, ...
  • For each one, can give a cost analysis
  • There may be implementation trade-offs
  • lists: access is O(n), length is O(n)
  • arrays: access is O(1), length is O(1)
  • trees: access is O(log n), length is ??

Obviously, a list-based implementation of sequences isn’t going to be faster than lists! But arrays and trees can be.

slide-29
SLIDE 29

Seq : SEQ

  • An abstract parameterized type of sequences
  • Think of a sequence as a parallel collection
  • With parallel-friendly operations
  • constant-time access to items
  • efficient map and reduce

We’ll work today with an implementation Seq : SEQ based on vectors

slide-30
SLIDE 30

notation

  • We have an abstract type of sequences
  • We want to think about sequence values

in a way that’s independent of any specific implementation

  • could be lists, arrays, trees, …
  • We need a neutral notation for sequences

⟨v0, ..., vn-1⟩ This is NOT program syntax!

slide-31
SLIDE 31

notation

  • Remember that if we have structures like

      ListSeq : SEQ    ArraySeq : SEQ    BalancedTreeSeq : SEQ

    we can use qualified names like ListSeq.empty, and types like int ListSeq.seq

slide-32
SLIDE 32

think abstractly

  • We’ll mostly use the abstract notation

for sequences

  • We’ll give abstract specifications
  • But we’ll discuss work/span characteristics

for a specific implementation

  • other implementations may have

different work/span

slide-33
SLIDE 33

sequence values

We use math notation like

  ⟨v1, ..., vn⟩   ⟨v0, ..., vn-1⟩   ⟨ ⟩

for sequence values.

⟨1, 2, 4, 8⟩ is a value of type int seq.

A value of type t seq is a sequence of values of type t.

slide-34
SLIDE 34

equality

Two sequence values are (extensionally) equal iff they have the same
length and have equal items at all positions:

  ⟨v1, ..., vn⟩ = ⟨u1, ..., um⟩  if and only if  n = m and for all i, vi = ui

Again, this is NOT program notation.

slide-35
SLIDE 35
operations

For our given structure Seq : SEQ, we specify

  • the (extensional) behavior
  • the cost semantics

of each operation.

Other implementations of SEQ are designed to have the same extensional
behavior but may have different work/span profiles. Learn to choose wisely!

slide-36
SLIDE 36

tabulate

REQUIRES f total & n ≥ 0
ENSURES  tabulate f n = ⟨f 0, ..., f (n-1)⟩

  • If Gi is the cost graph for f(i), the cost graph for
    tabulate f n puts G0, ..., Gn-1 in parallel

W(tabulate f n) = W(f 0) + … + W(f (n-1)) + c
S(tabulate f n) = max {S(f 0), …, S(f (n-1))} + c

If f is O(1), the work for tabulate f n is O(n)
If f is O(1), the span for tabulate f n is O(1)

slide-37
SLIDE 37

examples

  • tabulate (fn x:int => x) 6       = ⟨0, 1, 2, 3, 4, 5⟩
  • tabulate (fn x:int => x*x) 6     = ⟨0, 1, 4, 9, 16, 25⟩
  • tabulate (fn _ => raise Range) 0 = ⟨ ⟩

slide-38
SLIDE 38

length

length ⟨v1, ..., vn⟩ = n

  • Cost graph is a single node
  • Work is O(1)
  • Span is O(1)

Contrast: List.length [v1, ..., vn] = n has work and span O(n)

slide-39
SLIDE 39

nth

nth i ⟨v0, ..., vn-1⟩ = vi            if 0 ≤ i < n
                      = raise Range   otherwise

  • Cost graph is a single node
  • Work is O(1)
  • Span is O(1)

Seq provides constant-time access to items. Contrast with lists.

slide-40
SLIDE 40

examples

When f is total & 0 ≤ i < n:

  length (tabulate f n) = n
  nth i (tabulate f n) = f i

For all types t and values S : t seq,

  S = tabulate (fn i => nth i S) (length S)

slide-41
SLIDE 41

split

  • Work is O(n)
  • Span is O(1)

split ⟨v1, ..., v2n⟩   = (⟨v1, ..., vn⟩, ⟨vn+1, ..., v2n⟩)
split ⟨v1, ..., v2n+1⟩ = (⟨v1, ..., vn⟩, ⟨vn+1, ..., v2n+1⟩)
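One way to realize this spec generically, using only tabulate, length, and nth from SEQ (a sketch, not necessarily how Seq implements it). The left half gets ⌊n/2⌋ items, and with O(1) nth the work is O(n) and the span O(1), as stated above:

```sml
fun split s =
  let
    val n = length s
    val h = n div 2   (* left half gets floor(n/2) items *)
  in
    (tabulate (fn i => nth i s) h,
     tabulate (fn i => nth (i + h) s) (n - h))
  end
```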

slide-42
SLIDE 42

map

map f ⟨v0, ..., vn-1⟩ = ⟨f v0, ..., f vn-1⟩

  • If f is constant time, the work is O(n) and the span is O(1)
    (contrast with List.map)

map f ⟨v0, ..., vn-1⟩ has a cost graph with G0, ..., Gn-1 in parallel,
where each Gi is the cost graph for f vi.

slide-43
SLIDE 43

notes

  • The shape of the cost graph shows that

map f ⟨v0, ..., vn-1⟩ evaluates f v0, ..., f vn-1 in parallel,

producing a sequence value ⟨x0, ..., xn-1⟩

in which each xi is the value of f vi


slide-44
SLIDE 44

notes

  • For List.map f [v0, …, vn-1] the cost graph chains
    G0, ..., Gn-1 in sequence, showing sequential evaluation
    of f v0 … f vn-1

slide-45
SLIDE 45

examples

map (fn x => x*x) ⟨1, 2, 3⟩ = ⟨1, 4, 9⟩

If f, g total then map g (tabulate f n) = tabulate (g o f) n:

  map g ⟨f 0, ..., f (n-1)⟩ = ⟨g (f 0), ..., g (f (n-1))⟩
                            = ⟨(g o f) 0, ..., (g o f) (n-1)⟩
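Checking the fusion law on a small case (sq and inc are our own hypothetical helpers, not from the lecture):

```sml
fun sq (x : int) = x * x
fun inc x = x + 1

(* both sides should be <1, 2, 5, 10> *)
val lhs = map inc (tabulate sq 4)
val rhs = tabulate (inc o sq) 4
```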

slide-46
SLIDE 46

reduce

reduce should be used to combine a sequence using an associative
total function g with identity element z

  • g : t * t -> t is associative iff for all x1, x2, x3 : t,
      g(x1, g(x2, x3)) = g(g(x1, x2), x3)
  • z is an identity element for g iff for all x : t,
      g(x, z) = x

We write v0 g v1 g ... g vn-1 for the result of combining v0, …, vn-1.

reduce g z ⟨v0, ..., vn-1⟩ = v0 g v1 g ... g vn-1 g z
                           = v0 g v1 g ... g vn-1

slide-47
SLIDE 47

examples

(op +) : int * int -> int           associative, with identity element 0
(op * ) : int * int -> int          associative, with identity element 1
(op @) : t list * t list -> t list  associative, with identity element [ ]

reduce (op +) 0 ⟨v0, ..., vn-1⟩   = v0 + … + vn-1
reduce (op * ) 1 ⟨v0, ..., vn-1⟩  = v0 * … * vn-1
reduce (op @) [ ] ⟨v0, ..., vn-1⟩ = v0 @ … @ vn-1

slide-48
SLIDE 48

reduce

  • When g is total, associative & z is an identity for g,
      reduce g z ⟨v0, ..., vn-1⟩ = v0 g v1 g ... g vn-1
    (needs to use g n times)
  • If g is constant time,
      reduce g z ⟨v0, ..., vn-1⟩ has work O(n) and span O(log n),
    by divide-and-conquer (contrast with foldr, foldl on lists)

slide-49
SLIDE 49

reduce (op +) 0 ⟨1, 2, 3, 4, 5, 6, 7, 8⟩

[cost graph: a balanced binary tree of + nodes, combining
 1 2 3 4 5 6 7 8 pairwise, then combining those results pairwise,
 then one final +]

slide-50
SLIDE 50

cost graphs

reduce g z ⟨v1, ..., v2n⟩
  = g (reduce g z ⟨v1, ..., vn⟩, reduce g z ⟨vn+1, ..., v2n⟩)

reduce splits the sequence into halves, so the cost graph
G⟨1, ..., 2n⟩ puts G⟨1, ..., n⟩ and G⟨n+1, ..., 2n⟩ in parallel,
followed by a node for g.

Let W(m) = work for reduce g z S when length S = m. Then

  W(2n) = 2*W(n) + c     so W(n) is O(n)
  S(2n) = S(n) + c       so S(n) is O(log2 n)

slide-51
SLIDE 51

reduce

  • Can be defined (using length, split, nth) in a generic way,
    independent of the way sequences are implemented:

fun reduce g z S =
  case (length S) of
    0 => z
  | 1 => nth 0 S
  | n => let val (L, R) = split S
         in g (reduce g z L, reduce g z R) end

slide-52
SLIDE 52

reduce

  • For a specific structure : SEQ it may be possible (even better)
    to define reduce in a way that exploits how sequences are
    represented, e.g. in BalancedTreeSeq : SEQ:

fun reduce g z Empty = z
  | reduce g z (Node (A, x, B)) =
      let val (a, b) = (reduce g z A, reduce g z B)
      in g (a, g (x, b)) end

slide-53
SLIDE 53

reduce

REQUIRES g total, associative & z an identity for g
ENSURES  reduce g z ⟨v0, ..., vn-1⟩ = v0 g v1 g ... g vn-1

  • The requirements are crucial!
  • To see why, prove correctness of reduce,

given that nth, split, length satisfy their specs

slide-54
SLIDE 54

correctness

If g is total, associative, and z is an identity for g, then
  reduce g z ⟨v0, ..., vn-1⟩ = v0 g v1 g ... g vn-1

Proof: by induction on n

fun reduce g z S =
  case (length S) of
    0 => z
  | 1 => nth 0 S
  | n => let val (L, R) = split S
         in g (reduce g z L, reduce g z R) end

EXERCISE

slide-55
SLIDE 55

mapreduce

  • When g is associative and z is an identity,
      mapreduce f z g ⟨v0, ..., vn-1⟩ = (f v0) g ... g (f vn-1) g z
  • When f, g are constant time, mapreduce f z g ⟨v0, ..., vn-1⟩
    has work O(n) and span O(log n)
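One generic way to meet this spec is to fuse the map into a divide-and-conquer reduce, so the sequence is traversed once. A sketch in the style of the generic reduce, using only length, nth, and split (not necessarily how Seq defines it):

```sml
fun mapreduce f z g S =
  case (length S) of
    0 => z
  | 1 => f (nth 0 S)   (* apply f at the leaves *)
  | n => let val (L, R) = split S
         in g (mapreduce f z g L, mapreduce f z g R) end
```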

slide-56
SLIDE 56

examples

fun sum (s : int seq) : int = reduce (op +) 0 s

fun count (s : int seq seq) : int = sum (map sum s)

slide-57
SLIDE 57

analysis

  • Let s be a value of type int seq seq

consisting of n rows, each of length n

  • What are the work and span for

count s ?

fun sum (s : int seq) : int = reduce (op +) 0 s

fun count (s : int seq seq) : int = sum (map sum s)

slide-58
SLIDE 58

analysis

Let s = ⟨s1, ..., sn⟩, with si = ⟨xi1, ..., xin⟩, and let ti = sum si.

map sum s = ⟨sum s1, ..., sum sn⟩

For each i, sum si = reduce (op +) 0 ⟨xi1, ..., xin⟩.

cost graph of sum si: a balanced tree of depth log2 n
  work is O(n), span is O(log n)

cost graph of map sum s: the graphs for sum s1, ..., sum sn in parallel
  work is O(n²), span is O(log n)

slide-59
SLIDE 59

analysis

Let ti = sum si. Then

count s = sum ⟨t1, ..., tn⟩

cost graph of sum (map sum s): the graphs for sum s1, ..., sum sn
in parallel (depth log2 n), followed by the graph for sum ⟨t1, ..., tn⟩
(depth log2 n)

  work is O(n²), span is O(log n)

slide-60
SLIDE 60

sequence laws

length (tabulate f n) = n                  if f total, n ≥ 0
nth i (map f S) = f (nth i S)              if f total, 0 ≤ i < length S
S = tabulate (fn i => nth i S) (length S)

Every sensible implementation of SEQ should validate these equations

ListSeq ArraySeq BalancedTreeSeq

slide-61
SLIDE 61

exercises

  • Define functions

reverse : 'a seq -> 'a seq
zip : 'a seq * 'b seq -> ('a * 'b) seq

and analyze their work and span, with reasonable specifications

slide-62
SLIDE 62

reverse

fun reverse (s : 'a seq) : 'a seq =
  let val n = length s
  in tabulate (fn i => nth (n - i - 1) s) n end

What are the work and span for

reverse ⟨v1, ..., vn⟩ ?

Use the given information about work/span for length, nth, tabulate

slide-63
SLIDE 63

zip : 'a seq * 'b seq -> ('a * 'b) seq
REQUIRES length xs = length ys
ENSURES  zip (⟨x1, ..., xn⟩, ⟨y1, ..., yn⟩) = ⟨(x1,y1), ..., (xn,yn)⟩

fun zip (xs, ys) =
  let val n = Int.min (length xs, length ys)
  in tabulate (fn i => (nth i xs, nth i ys)) n end

zip

Use the given information about work/span for length, nth, tabulate

slide-64
SLIDE 64

zip analysis

  • See why we need to REQUIRE that the two sequences
    have the same length
  • What happens otherwise? E.g.

val xs = tabulate (fn i => i) 4
val ys = tabulate (fn i => i*i) 3
val squares = zip (xs, ys)
nth 4 squares
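Tracing this example with the zip defined on the previous slide, which truncates to the shorter length (a sketch of the evaluation):

```sml
val xs = tabulate (fn i => i) 4     (* <0, 1, 2, 3> *)
val ys = tabulate (fn i => i*i) 3   (* <0, 1, 4> *)

(* Int.min (4, 3) = 3, so the extra item of xs is silently dropped *)
val squares = zip (xs, ys)          (* <(0,0), (1,1), (2,4)> *)

(* valid indices are 0..2, so this raises the exception Range *)
val _ = nth 4 squares
```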

slide-65
SLIDE 65

summary

  • An abstract type of sequences
  • Signature SEQ
  • A variety of different implementations
  • ListSeq : SEQ
  • ArraySeq : SEQ
  • BalancedTreeSeq : SEQ
  • To think abstractly, ignore implementation

details and focus on the signature…

  • But work/span are implementation-dependent
slide-66
SLIDE 66

Happy Halloween (15150-style)

[joke slides: "HALLOWEEN N+1: THE INDUCTION", red-black trees,
vicious bishops, most generals, Barnes-Hut, and a parody book cover:
"Continuation Passing Style For Dummies: The Fun and Easy Way to Create
Complex Incomprehensible Programs that Work. Your First Aid Kit For
Cleaning Up Messy Direct-style Programs, Explained in Plain English.
How To Make Your Program Backtrack. REQUIRES tricks, ENSURES treats"]

slide-67
SLIDE 67

happy halloween

NIGHTMARE ON ML STREET

slide-68
SLIDE 68

Check out Vote411.org

Use it to help register to vote, check registrations,
or learn how to obtain absentee ballots.

Use it to find polling locations and check ID requirements for voting.