Charlie Garrod Michael Hilton School of Computer Science 15-214 1 - - PowerPoint PPT Presentation



SLIDE 1

15-214

School of Computer Science

Principles of Software Construction: Objects, Design, and Concurrency
Part 5: Concurrency
Introduction to concurrency, part 4: Concurrency frameworks

Charlie Garrod Michael Hilton

SLIDE 2

Administrivia

  • Homework 5b due tonight 11:59 p.m.

– Turn in by Wednesday 9 a.m. to be considered as a Best Framework

SLIDE 3

Key concepts from last Thursday

SLIDE 4

Summary of our RwLock example

  • Generally, avoid wait/notify
  • Never invoke wait outside a loop

– Must check coordination condition after waking

  • Generally use notifyAll, not notify
  • Do not use our RwLock – it's just a toy

– Instead, know the standard libraries…

  • Discuss: sun.misc.Unsafe
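
The "never invoke wait outside a loop" rule can be sketched with a tiny one-shot gate. This is an illustration only, not the RwLock from lecture; the class name Gate, the field isOpen, and the demo method are hypothetical. In practice java.util.concurrent.CountDownLatch covers this use case.

```java
public class Gate {
    private boolean isOpen = false;

    /** Blocks until the gate has been opened. */
    public synchronized void await() throws InterruptedException {
        while (!isOpen) {   // loop: re-check the condition after every wakeup
            wait();
        }
    }

    /** Opens the gate and wakes every waiting thread. */
    public synchronized void open() {
        isOpen = true;
        notifyAll();        // notifyAll rather than notify
    }

    /** Hypothetical demo: true if a waiting thread is released by open(). */
    public static boolean demo() {
        Gate gate = new Gate();
        Thread waiter = new Thread(() -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        gate.open();
        try {
            waiter.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !waiter.isAlive();
    }
}
```

The while loop matters because a thread can wake up without the condition being true (a spurious wakeup, or a notifyAll meant for a different condition).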

SLIDE 5

Concurrency bugs can be very subtle

private final List<Observer<E>> observers = new ArrayList<>();

public void addObserver(Observer<E> observer) {
    synchronized(observers) {
        observers.add(observer);
    }
}

public boolean removeObserver(Observer<E> observer) {
    synchronized(observers) {
        return observers.remove(observer);
    }
}

private void notifyOf(E element) {
    synchronized(observers) {
        for (Observer<E> observer : observers)
            observer.notify(this, element); // Risks liveness and safety failures!
    }
}
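
One common remedy, offered here as an assumption since the slide only flags the bug, is to copy the observer list while holding the lock and invoke the callbacks after releasing it. In the sketch below the names SnapshotNotifier, publish, and demo are hypothetical; the demo registers an observer that removes itself during notification, which in the version above would modify the list while it is being iterated.

```java
import java.util.ArrayList;
import java.util.List;

public class SnapshotNotifier<E> {
    public interface Observer<E> {
        void notify(SnapshotNotifier<E> source, E element);
    }

    private final List<Observer<E>> observers = new ArrayList<>();

    public void addObserver(Observer<E> observer) {
        synchronized (observers) {
            observers.add(observer);
        }
    }

    public boolean removeObserver(Observer<E> observer) {
        synchronized (observers) {
            return observers.remove(observer);
        }
    }

    public void publish(E element) {
        List<Observer<E>> snapshot;
        synchronized (observers) {
            snapshot = new ArrayList<>(observers);  // copy under the lock
        }
        for (Observer<E> observer : snapshot) {
            observer.notify(this, element);         // alien call made without the lock
        }
    }

    /** Hypothetical demo: counts calls to an observer that unsubscribes itself. */
    public static int demo() {
        SnapshotNotifier<String> notifier = new SnapshotNotifier<>();
        int[] calls = {0};
        notifier.addObserver(new Observer<String>() {
            @Override
            public void notify(SnapshotNotifier<String> source, String element) {
                calls[0]++;
                source.removeObserver(this);
            }
        });
        notifier.publish("first");
        notifier.publish("second");
        return calls[0];
    }
}
```

java.util.concurrent.CopyOnWriteArrayList provides the same snapshot behavior without explicit locking and is worth considering when notifications far outnumber registrations.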

SLIDE 6

The fork-join pattern

if (my portion of the work is small)
    do the work directly
else
    split my work into pieces
    invoke the pieces and wait for the results

Image from: Wikipedia

SLIDE 7

The membrane pattern

  • Multiple rounds of fork-join, each round waiting for the previous round to complete

Image from: Wikipedia

SLIDE 8

Today

  • An aside: Networking in Java
  • The Java executors framework
  • Concurrency in practice: In the trenches of parallelism

SLIDE 9

Basic types in Java

  • What is a byte?

– Answer: a signed, 8-bit integer (-128 to 127)

  • What is a char?

– Answer: a 16-bit Unicode character (strictly, one unsigned UTF-16 code unit)

SLIDE 10

The stream abstraction

  • A sequence of bytes
  • May read 8 bits at a time, and close

java.io.InputStream

void close();
abstract int read();
int read(byte[] b);

  • May write, flush and close

java.io.OutputStream

void close();
void flush();
abstract void write(int b);
void write(byte[] b);

SLIDE 11

Example streams

  • java.io.FileInputStream

– Reads from files, byte by byte

  • java.io.ByteArrayInputStream

– Provides a stream interface for a byte[]

  • Many APIs provide streams

– e.g., java.lang.System.in
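
A minimal sketch of reading a stream byte by byte, using ByteArrayInputStream so it needs no file. The class StreamDemo and its methods are hypothetical names. Note that read() returns an int in the range 0 to 255 (or -1 at end of stream), even though a Java byte is signed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class StreamDemo {
    /** Sums the unsigned values of all bytes remaining in the stream. */
    static int sumBytes(InputStream in) throws IOException {
        int sum = 0;
        int b;
        while ((b = in.read()) != -1) {   // -1 signals end of stream
            sum += b;
        }
        return sum;
    }

    /** Hypothetical demo over a three-byte array. */
    static int demo() {
        byte[] data = {1, 2, (byte) 200};  // (byte) 200 is stored as -56
        try (InputStream in = new ByteArrayInputStream(data)) {
            return sumBytes(in);           // read() yields 1, 2, 200
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```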

SLIDE 12

Aside: To read and write arbitrary objects

  • Your object must implement the java.io.Serializable

interface

– Methods: none

  • If all of your data fields are themselves Serializable, Java can automatically serialize your class

– If not, will get runtime NotSerializableException

  • Can customize serialization by overriding special methods
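
A small round-trip sketch of default serialization through in-memory byte streams. The Point class and the roundTrip helper are hypothetical, chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationDemo {
    /** All fields are primitives, so default serialization suffices. */
    public static final class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        public final int x;
        public final int y;

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Writes the point to a byte array and reads a copy back. */
    public static Point roundTrip(Point p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(p);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Point) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```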

SLIDE 13

Internet addresses and sockets

  • For IP version 4 (IPv4), a host address is a 4-byte number

– e.g., 127.0.0.1
– Hostnames mapped to host IP addresses via DNS
– ~4 billion distinct addresses

  • Port is a 16-bit number (0-65535)

– Assigned conventionally

  • e.g., port 80 is the standard port for web servers

SLIDE 14

Packet-oriented and stream-oriented connections

  • UDP: User Datagram Protocol

– Unreliable, discrete packets of data

  • TCP: Transmission Control Protocol

– Reliable data stream

SLIDE 15

Networking in Java

  • The java.net.InetAddress:

static InetAddress getByName(String host);
static InetAddress getByAddress(byte[] b);
static InetAddress getLocalHost();

  • The java.net.Socket:

Socket(InetAddress addr, int port);
boolean isConnected();
boolean isClosed();
void close();
InputStream getInputStream();
OutputStream getOutputStream();

  • The java.net.ServerSocket:

ServerSocket(int port);
Socket accept();
void close();
…
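
A sketch of how these classes fit together: a throwaway server on the loopback interface echoes back one byte. The class LoopbackEcho and the method echoOnce are hypothetical, and the example assumes the loopback interface is available. Port 0 asks the operating system for any free port.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class LoopbackEcho {
    /** Sends one byte to a local echo server and returns the reply. */
    static int echoOnce(int value) {
        try (ServerSocket server = new ServerSocket(0)) {   // 0 = any free port
            Thread serverThread = new Thread(() -> {
                try (Socket connection = server.accept()) { // blocks until a client connects
                    int b = connection.getInputStream().read();
                    connection.getOutputStream().write(b);
                    connection.getOutputStream().flush();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.start();

            try (Socket client = new Socket(InetAddress.getLoopbackAddress(),
                                            server.getLocalPort())) {
                client.getOutputStream().write(value);      // writes the low 8 bits
                client.getOutputStream().flush();
                return client.getInputStream().read();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```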

SLIDE 16

Today

  • An aside: Networking in Java
  • The Java executors framework
  • Concurrency in practice: In the trenches of parallelism

SLIDE 17

Execution of tasks

  • Natural boundaries of computation define tasks, e.g.:

public class SingleThreadWebServer {
    public static void main(String[] args) throws IOException {
        ServerSocket socket = new ServerSocket(80);
        while (true) {
            Socket connection = socket.accept();
            handleRequest(connection);
        }
    }

    private static void handleRequest(Socket connection) {
        … // request-handling logic here
    }
}

SLIDE 18

A poor design choice: A thread per task

public class ThreadPerRequestWebServer {
    public static void main(String[] args) throws IOException {
        ServerSocket socket = new ServerSocket(80);
        while (true) {
            Socket connection = socket.accept();
            new Thread(() -> handleRequest(connection)).start();
        }
    }

    private static void handleRequest(Socket connection) {
        … // request-handling logic here
    }
}

SLIDE 20

Recall the Java primitive concurrency tools

  • The java.lang.Runnable interface

void run();

  • The java.lang.Thread class

Thread(Runnable r);
void start();
void join();

  • The java.util.concurrent.Callable<V> interface

– Like java.lang.Runnable, but can return a value

V call();

SLIDE 22

A framework for asynchronous computation

  • The java.util.concurrent.Future<V> interface:

V get();
V get(long timeout, TimeUnit unit);
boolean isDone();
boolean cancel(boolean mayInterruptIfRunning);
boolean isCancelled();

  • The java.util.concurrent.ExecutorService interface:

Future<?> submit(Runnable task);
Future<V> submit(Callable<V> task);
List<Future<V>> invokeAll(Collection<? extends Callable<V>> tasks);
V invokeAny(Collection<? extends Callable<V>> tasks);
void shutdown();
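
A short sketch of the two interfaces working together: Callable tasks go in, Futures come out, and get() blocks until each result is ready. The class FutureDemo, the method sumOfSquares, and the pool size of 4 are illustrative choices, not part of the lecture code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureDemo {
    /** Computes 1^2 + 2^2 + ... + n^2, one task per term. */
    static int sumOfSquares(int n) {
        ExecutorService exec = Executors.newFixedThreadPool(4);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int k = i;
                tasks.add(() -> k * k);           // a Callable<Integer>
            }
            int sum = 0;
            for (Future<Integer> future : exec.invokeAll(tasks)) {
                sum += future.get();              // blocks until that task is done
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            exec.shutdown();                      // let the pool threads exit
        }
    }
}
```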

SLIDE 23

Executors for common computational patterns

  • From the java.util.concurrent.Executors class

static ExecutorService newSingleThreadExecutor();
static ExecutorService newFixedThreadPool(int n);
static ExecutorService newCachedThreadPool();
static ScheduledExecutorService newScheduledThreadPool(int n);

SLIDE 24

Example use of executor service

public class ThreadPoolWebServer {
    private static final Executor exec =
            Executors.newFixedThreadPool(100); // 100 threads

    public static void main(String[] args) throws IOException {
        ServerSocket socket = new ServerSocket(80);
        while (true) {
            Socket connection = socket.accept();
            exec.execute(() -> handleRequest(connection));
        }
    }

    private static void handleRequest(Socket connection) {
        … // request-handling logic here
    }
}
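
To see what the fixed pool buys over a thread per task, the sketch below records which threads actually run the tasks. The class PoolReuseDemo and the method distinctThreads are hypothetical; the point is that many tasks share a bounded number of threads.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolReuseDemo {
    /** Runs the given number of tasks and returns how many distinct threads ran them. */
    static int distinctThreads(int poolSize, int tasks) {
        ExecutorService exec = Executors.newFixedThreadPool(poolSize);
        Set<String> names = ConcurrentHashMap.newKeySet();   // thread-safe set
        for (int i = 0; i < tasks; i++) {
            exec.execute(() -> names.add(Thread.currentThread().getName()));
        }
        exec.shutdown();                                      // accept no new tasks
        try {
            if (!exec.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return names.size();                                  // never more than poolSize
    }
}
```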

SLIDE 25

Today

  • An aside: Networking in Java
  • The Java executors framework
  • Concurrency in practice: In the trenches of parallelism

SLIDE 26

Concurrency at the language level

  • Consider:

Collection<Integer> collection = …;
int sum = 0;
for (int i : collection) {
    sum += i;
}

  • In Python:

collection = …
sum = 0
for item in collection:
    sum += item
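
Both loops above prescribe a sequential order. As a point of comparison, and as an assumption about where the slide is heading, Java 8 streams let the same sum be stated without fixing the order, so the library may split the work across threads. The class ParallelSum is a hypothetical name.

```java
import java.util.Collection;

public class ParallelSum {
    /** Sums the collection; the library decides how to partition the work. */
    static int sum(Collection<Integer> collection) {
        return collection.parallelStream()
                         .mapToInt(Integer::intValue)
                         .sum();
    }
}
```

Addition is associative, which is what makes it safe for the library to combine partial sums in any grouping.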

SLIDE 27

Parallel quicksort in Nesl

function quicksort(a) =
  if (#a < 2) then a
  else
    let pivot   = a[#a/2];
        lesser  = {e in a| e < pivot};
        equal   = {e in a| e == pivot};
        greater = {e in a| e > pivot};
        result  = {quicksort(v): v in [lesser,greater]};
    in result[0] ++ equal ++ result[1];

  • Operations in {} occur in parallel
  • 210-esque questions: What is total work? What is depth?

SLIDE 28

Prefix sums (a.k.a. inclusive scan, a.k.a. scan)

  • Goal: given array x[0…n-1], compute array of the sum of each prefix of x

[ sum(x[0…0]), sum(x[0…1]), sum(x[0…2]), … sum(x[0…n-1]) ]

  • e.g., x = [13, 9, -4, 19, -6, 2, 6, 3]

prefix sums: [13, 22, 18, 37, 31, 33, 39, 42]
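
For reference, the obvious sequential version is a single pass with n-1 additions. This is a sketch with a hypothetical class name; the course's PrefixSumsSequential.java is not shown in the slides.

```java
public class SequentialPrefixSums {
    /** Replaces each x[i] with x[0] + ... + x[i], in place. */
    static void prefixSums(long[] x) {
        for (int i = 1; i < x.length; i++) {
            x[i] += x[i - 1];   // each step depends on the previous one
        }
    }
}
```

Every iteration reads the result of the one before it, which is why this form offers no parallelism as written.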

SLIDE 29

Parallel prefix sums

  • Intuition: If we have already computed the partial sums sum(x[0…3]) and sum(x[4…7]), then we can easily compute sum(x[0…7])

  • e.g., x =

[13, 9, -4, 19, -6, 2, 6, 3]

SLIDE 32

Parallel prefix sums algorithm, upsweep

Compute the partial sums in a more useful manner

[13, 9, -4, 19, -6, 2, 6, 3]
[13, 22, -4, 15, -6, -4, 6, 9]
[13, 22, -4, 37, -6, -4, 6, 5]
[13, 22, -4, 37, -6, -4, 6, 42]

SLIDE 34

Parallel prefix sums algorithm, downsweep

Now unwind to calculate the other sums

[13, 22, -4, 37, -6, -4, 6, 42]
[13, 22, -4, 37, -6, 33, 6, 42]
[13, 22, 18, 37, 31, 33, 39, 42]

  • Recall, we started with:

[13, 9, -4, 19, -6, 2, 6, 3]

SLIDE 35

Doubling array size adds two more levels

[Figure panels: Upsweep, Downsweep]

SLIDE 36

Parallel prefix sums pseudocode

prefix_sums(x):
  // Upsweep
  for d in 0 to (lg n)-1:                            // d is depth
    parallelfor i in 2^d-1 to n-1-2^d, by 2^(d+1):
      x[i+2^d] = x[i] + x[i+2^d]
  // Downsweep
  for d in (lg n)-1 to 0:
    parallelfor i in 2^d-1 to n-1-2^d, by 2^(d+1):
      if (i-2^d >= 0):
        x[i] = x[i] + x[i-2^d]

SLIDE 37

Parallel prefix sums algorithm, in code

  • An iterative Java-esque implementation:

void iterativePrefixSums(long[] a) {
    int gap = 1;
    // Upsweep
    for ( ; gap < a.length; gap *= 2) {
        parfor(int i=gap-1; i+gap < a.length; i += 2*gap) {
            a[i+gap] = a[i] + a[i+gap];
        }
    }
    // Downsweep
    for ( ; gap > 0; gap /= 2) {
        parfor(int i=gap-1; i < a.length; i += 2*gap) {
            a[i] = a[i] + ((i-gap >= 0) ? a[i-gap] : 0);
        }
    }
}
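
Since parfor is not Java, a runnable way to check the algorithm is to replace each parfor with an ordinary for loop. The sketch below does exactly that; it performs the parallel algorithm's work on one thread, presumably in the spirit of PrefixSumsSequentialWithParallelWork.java, which is not shown here. The class name is hypothetical.

```java
public class IterativePrefixSums {
    /** The slide's algorithm with every parfor run as a plain sequential loop. */
    static void prefixSums(long[] a) {
        int gap = 1;
        // Upsweep: combine pairs of partial sums at doubling distances.
        for ( ; gap < a.length; gap *= 2) {
            for (int i = gap - 1; i + gap < a.length; i += 2 * gap) {
                a[i + gap] = a[i] + a[i + gap];
            }
        }
        // Downsweep: fill in the remaining prefixes at halving distances.
        for ( ; gap > 0; gap /= 2) {
            for (int i = gap - 1; i < a.length; i += 2 * gap) {
                a[i] = a[i] + ((i - gap >= 0) ? a[i - gap] : 0);
            }
        }
    }
}
```

Within one round, the indexes written and the indexes read never coincide, which is what makes each inner loop safe to run in parallel.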

SLIDE 38

Parallel prefix sums algorithm, in code

  • A recursive Java-esque implementation:

void recursivePrefixSums(long[] a, int gap) {
    if (2*gap - 1 >= a.length) {
        return;
    }
    parfor(int i=gap-1; i+gap < a.length; i += 2*gap) {
        a[i+gap] = a[i] + a[i+gap];
    }
    recursivePrefixSums(a, gap*2);
    parfor(int i=gap-1; i < a.length; i += 2*gap) {
        a[i] = a[i] + ((i-gap >= 0) ? a[i-gap] : 0);
    }
}


SLIDE 40

Parallel prefix sums algorithm

  • How good is this?

– Work: O(n)
– Depth: O(lg n)

  • See PrefixSums.java, PrefixSumsSequentialWithParallelWork.java

SLIDE 41

Goal: parallelize the PrefixSums implementation

  • Specifically, parallelize the parallelizable loops

parfor(int i = gap-1; i+gap < a.length; i += 2*gap) {
    a[i+gap] = a[i] + a[i+gap];
}

  • Partition into multiple segments, run in different threads

for(int i = left+gap-1; i+gap < right; i += 2*gap) {
    a[i+gap] = a[i] + a[i+gap];
}

SLIDE 42

Recall: The membrane pattern

  • Multiple rounds of fork-join, each round waiting for the previous round to complete

Image from: Wikipedia

SLIDE 43

Fork/join in Java

  • The java.util.concurrent.ForkJoinPool class

– Implements ExecutorService
– Executes java.util.concurrent.ForkJoinTask<V> or java.util.concurrent.RecursiveTask<V> or java.util.concurrent.RecursiveAction

  • In a long computation:

– Fork a thread (or more) to do some work
– Join the thread(s) to obtain the result of the work

SLIDE 44

The RecursiveAction abstract class

public class MyActionFoo extends RecursiveAction {
    public MyActionFoo(…) {
        store the data fields we need
    }

    @Override
    public void compute() {
        if (the task is small) {
            do the work here;
            return;
        }
        invokeAll(new MyActionFoo(…),  // smaller
                  new MyActionFoo(…),  // subtasks
                  …);                  // …
    }
}
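
A sketch of how the template might be filled in for prefix sums. This is not the course's PrefixSumsParallelForkJoin.java, which is not shown; the class names, the Round task, and the THRESHOLD value are all assumptions. Each parfor round becomes one RecursiveAction that splits its range of loop iterations in half until the range is small, and pool.invoke waits for a round to finish before the next begins, giving the membrane pattern.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ForkJoinPrefixSums {
    // Hypothetical cutoff: ranges this small are processed directly.
    private static final int THRESHOLD = 1024;

    /** One parfor round, restricted to loop iterations [from, to). */
    private static final class Round extends RecursiveAction {
        private final long[] a;
        private final int gap, from, to;
        private final boolean upsweep;

        Round(long[] a, int gap, int from, int to, boolean upsweep) {
            this.a = a; this.gap = gap; this.from = from; this.to = to;
            this.upsweep = upsweep;
        }

        @Override
        protected void compute() {
            if (to - from <= THRESHOLD) {              // small task: do the work here
                for (int k = from; k < to; k++) {
                    int i = gap - 1 + 2 * gap * k;     // k-th index of the slide's parfor
                    if (upsweep) {
                        a[i + gap] = a[i] + a[i + gap];
                    } else if (i - gap >= 0) {
                        a[i] = a[i] + a[i - gap];
                    }
                }
                return;
            }
            int mid = from + (to - from) / 2;          // otherwise fork two halves
            invokeAll(new Round(a, gap, from, mid, upsweep),
                      new Round(a, gap, mid, to, upsweep));
        }
    }

    public static void prefixSums(long[] a) {
        ForkJoinPool pool = ForkJoinPool.commonPool();
        int n = a.length;
        int gap = 1;
        for ( ; gap < n; gap *= 2) {                   // upsweep rounds
            pool.invoke(new Round(a, gap, 0, n / (2 * gap), true));
        }
        for ( ; gap > 0; gap /= 2) {                   // downsweep rounds
            int count = (n < gap) ? 0 : (n - gap) / (2 * gap) + 1;
            pool.invoke(new Round(a, gap, 0, count, false));
        }
    }
}
```

Splitting by iteration number rather than by array offset sidesteps the question of aligning segment boundaries to multiples of 2*gap.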

SLIDE 45

A ForkJoin example

  • See PrefixSumsParallelForkJoin.java
  • See the processor go, go go!

SLIDE 48

Parallel prefix sums algorithm

  • How good is this?

– Work: O(n)
– Depth: O(lg n)

  • See PrefixSumsParallelArrays.java
  • See PrefixSumsSequential.java

– n-1 additions
– Memory access is sequential

  • For PrefixSumsSequentialWithParallelWork.java

– About 2n useful additions, plus extra additions for the loop indexes
– Memory access is non-sequential

  • The punchline:

– Don't roll your own
– Cache and constants matter
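
In the spirit of "don't roll your own": since Java 8 the standard library offers java.util.Arrays.parallelPrefix. The sketch below is a guess at what a library-based version could look like; PrefixSumsParallelArrays.java itself is not shown in the slides, and the class name here is hypothetical.

```java
import java.util.Arrays;

public class LibraryPrefixSums {
    /** Returns the prefix sums of x, leaving x itself unchanged. */
    static long[] prefixSums(long[] x) {
        long[] result = x.clone();
        Arrays.parallelPrefix(result, Long::sum);   // cumulates in place, in parallel
        return result;
    }
}
```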

SLIDE 49

In-class example for parallel prefix sums

[7, 5, 8, -36, 17, 2, 21, 18]