Programmability in the Era of Parallel Computing - Per Stenström


SLIDE 1

Programmability in the Era of Parallel Computing

Per Stenström Department of Computer Science and Engineering Chalmers University of Technology Sweden

SLIDE 2

Multicore Scaling

By 2020, several hundreds of powerful cores/chip

[Figure: cores/chip over time, including predictions. Time axis: 1990, 2000, 2014, 2020. Labels: 1 core, 16 cores, 100s of cores.]

Source: "Computing Performance: Game Over or Next Level?", IEEE Computer, Jan 2011

SLIDE 3

Programmability

SLIDE 4

High-Productivity Software Design in the Multi/Many-core Era

  • End user: plug & play
  • Productivity programmers: productivity programming languages (e.g. C/C++, Java)
  • Efficiency programmers: system-near programming

SLIDE 5

High-Productivity Software Stack for Multi/Many-core Systems

Layers of the stack, in order of increasing level of abstraction:

  • Hardware primitives (e.g. TM): computer architects
  • Runtime with parallelism capabilities: efficiency-only programmers
  • Software components oblivious to parallelism: productivity programmers

SLIDE 6

Topic 1: Task-based programming models

SLIDE 7

Task-based Dataflow Prog. Models

[Task graph: TaskA produces a; TaskB and TaskC consume a.]

#pragma css task output(a)
void TaskA(float a[M][M]);
#pragma css task input(a)
void TaskB(float a[M][M]);
#pragma css task input(a)
void TaskC(float a[M][M]);

  • Programmer annotations for task dependences
  • Annotations used by run-time for scheduling
  • Dataflow task graph constructed dynamically
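
The annotation style above can be approximated with OpenMP task dependences, which are swapped in here only because they are widely available; this is not the `#pragma css` runtime from the slide. The sketch below is a minimal illustration under that assumption. The names `task_a`, `task_b`, `task_c`, and `run_task_graph`, the size `M = 4`, and the sum/max computations are all hypothetical and chosen only to make the producer/consumer graph observable.

```c
#include <assert.h>
#include <stddef.h>

#define M 4  /* hypothetical matrix size, chosen only for illustration */

/* Producer, playing the role of TaskA (output(a)): fills a with 0..M*M-1. */
static void task_a(float a[M][M]) {
    for (size_t i = 0; i < M; i++)
        for (size_t j = 0; j < M; j++)
            a[i][j] = (float)(i * M + j);
}

/* Consumer, playing the role of TaskB (input(a)): sums all elements. */
static void task_b(float a[M][M], float *sum) {
    *sum = 0.0f;
    for (size_t i = 0; i < M; i++)
        for (size_t j = 0; j < M; j++)
            *sum += a[i][j];
}

/* Consumer, playing the role of TaskC (input(a)): finds the largest element. */
static void task_c(float a[M][M], float *max) {
    *max = a[0][0];
    for (size_t i = 0; i < M; i++)
        for (size_t j = 0; j < M; j++)
            if (a[i][j] > *max)
                *max = a[i][j];
}

/* Submits the three tasks; the runtime derives the graph from the
 * depend clauses: task_a must finish before task_b and task_c start,
 * while task_b and task_c may run concurrently with each other. */
void run_task_graph(float *sum, float *max) {
    float a[M][M];
    assert(sum != NULL && max != NULL);

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task depend(out: a)
        task_a(a);

        #pragma omp task depend(in: a)
        task_b(a, sum);

        #pragma omp task depend(in: a)
        task_c(a, max);

        /* Wait for all three tasks before a goes out of scope. */
        #pragma omp taskwait
    }
}
```

Compiled without OpenMP support, the pragmas are ignored and the three calls run in program order, which gives the same result. The programmer states only what each task reads and writes; the ordering is left to the runtime.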

Hypothesis: Programmers focus on extracting parallelism, system delivers performance. BUT: Is this a good idea?

SLIDE 8

Topic 2: Transactional memory

SLIDE 9

Transactional Memory (TM)

  • Transactional memory semantics:
    – Atomicity, consistency, and isolation
    – Tx_begin/Tx_end primitives
  • Allow for concurrency inside critical sections
  • Software implementations too slow
  • Hardware implementations complex but have been adopted (IBM Blue Gene, Intel Haswell)
  • 100s of papers in the open literature; design space fairly well understood

[Figure: data conflict between two transactions. TX1 writes A (WA) and commits; TX2 has read A (RA), so it re-executes and reads A again (RA).]

Hypothesis: TM simplifies programming, but is this a good idea?