  1. Terminology
     Programmierung Paralleler und Verteilter Systeme (PPV), Summer 2015
     Frank Feinbube, M.Sc., Felix Eberhardt, M.Sc., Prof. Dr. Andreas Polze

  2. Terminology
     Parallel Programming Concepts | 2013 / 2014

  3. Terminology
     „When two trains approach each other at a crossing, both shall come to a full stop and neither shall start up again until the other has gone.“ [Kansas legislature, early 20th century]

  4. Terminology
     ■ Concurrency
       □ Capability of a system to have two or more activities in progress at the same time
       □ May be independent, loosely coupled, or closely coupled
       □ Classical operating system responsibility, for better utilization of CPU, memory, network, and other resources
       □ Demands scheduling and synchronization
     ■ Parallelism
       □ Capability of a system to execute activities simultaneously
       □ Demands parallel hardware, concurrency support (and communication)
     ■ Any parallel program is a concurrent program
     ■ Some concurrent programs cannot be run as parallel programs

  5. Terminology
     ■ Concurrency vs. parallelism vs. distribution
       □ Two threads started by the application (sketched below)
         ◊ Define concurrent activities in the program code
         ◊ Might (!) be executed in parallel
         ◊ Can be distributed on different machines
     ■ Management of concurrent activities in an operating system
       □ Multiple applications being executed at the same time
       □ Single application leveraging threads for speedup / scale-up
       □ Non-sequential operating system activities
     „The vast majority of programmers today don’t grok concurrency, just as the vast majority of programmers 15 years ago didn’t yet grok objects“ [Herb Sutter, 2005]
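     A minimal sketch of the two-threads case above, using POSIX threads (the API choice is an assumption; the slides do not prescribe one):

         #include <pthread.h>
         #include <stdio.h>

         /* Both calls to pthread_create() below define concurrent
          * activities in the program code. */
         static void *activity(void *arg) {
             printf("activity %s running\n", (const char *)arg);
             return NULL;
         }

         int main(void) {
             pthread_t t1, t2;
             /* Concurrency is declared here; whether the two threads
              * actually run in parallel depends on the hardware and
              * the scheduler. */
             pthread_create(&t1, NULL, activity, "A");
             pthread_create(&t2, NULL, activity, "B");
             pthread_join(t1, NULL);
             pthread_join(t2, NULL);
             return 0;
         }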

  6. Concurrency [Breshears]
     ■ Processes / threads represent the execution of atomic statements
       □ „Atomic“ can be defined at different granularity levels, e.g. a source code line
       □ Concurrency should be treated as an abstract concept
     ■ Concurrent execution
       □ Interleaving of multiple atomic instruction streams
       □ Leads to unpredictable results (see the sketch below)
         ◊ Non-deterministic scheduling, interrupts
       □ A concurrent algorithm should maintain its properties for all possible interleavings of sequential activities
       □ Example: all instructions are eventually included (fairness)
     ■ Some literature distinguishes between interleaving (uniprocessor) and overlapping (multiprocessor) of statements
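     A minimal sketch of such unpredictability (POSIX threads assumed): two interleaved instruction streams increment a shared counter without synchronization, so increments can be lost.

         #include <pthread.h>
         #include <stdio.h>

         static long counter = 0;    /* shared, unprotected */

         static void *worker(void *arg) {
             (void)arg;
             for (int i = 0; i < 1000000; i++)
                 counter++;    /* load, add, store: not one atomic statement */
             return NULL;
         }

         int main(void) {
             pthread_t t1, t2;
             pthread_create(&t1, NULL, worker, NULL);
             pthread_create(&t2, NULL, worker, NULL);
             pthread_join(t1, NULL);
             pthread_join(t2, NULL);
             /* Different runs can print different values between
              * 1000000 and 2000000, depending on the interleaving. */
             printf("counter = %ld\n", counter);
             return 0;
         }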

  7. Concurrency
     ■ In hardware
       □ Context switch support
     ■ In operating systems
       □ Native process / thread support
       □ Synchronization support
     ■ In virtual runtime environments
       □ Java / .NET thread support
     ■ In middleware
       □ J2EE / CORBA thread pooling
     ■ In programming languages
       □ Asynchronous and event-based programming
     [Figure: layered stack of Application, Server, Middleware, Virtual Runtime, and Operating System]

  8. Example: Operating System
     [Figure: single-threaded vs. multi-threaded process; code, data, and files are shared, while each thread has its own registers and stack]

  9. Concurrency Is Hard
     ■ Sharing of global resources
       □ Concurrent reads and writes on the same global resource (variable) make ordering a critical issue
     ■ Optimal management of resource allocation
       □ A process gets control over an I/O channel and is then suspended before using it
     ■ Programming errors become non-deterministic
       □ The order of interleaving may or may not activate the bug
     ■ All of this happens even on uniprocessors
     ■ Race condition
       □ The result of an operation depends on the order of execution
       □ A well-known issue since the 1960s, identified by E. Dijkstra

  10. Race Condition

      #include <stdio.h>

      char char_in, char_out;    /* shared global variables */

      void echo() {
          char_in = getchar();
          char_out = char_in;
          putchar(char_out);
      }

      ■ One piece of code in one process, executed at the same time …
        □ … by two threads on a single core.
        □ … by two threads on two cores.
      ■ What happens? (One possible interleaving is sketched below.)
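      One possible outcome for the input "xy" (only one of several; scheduling is non-deterministic):

          /* T1: char_in = getchar();    reads 'x'
           * T2: char_in = getchar();    reads 'y', overwriting 'x'
           * T1: char_out = char_in;     copies 'y'
           * T1: putchar(char_out);      prints 'y'
           * T2: char_out = char_in;     still 'y'
           * T2: putchar(char_out);      prints 'y' again
           * Result: "yy" is echoed and the 'x' is lost, because both
           * threads share char_in and char_out. */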

  11. Potential Deadlock
      [Figure from Stallings]

  12. Actual Deadlock
      [Figure from Stallings]

  13. Terminology
      Deadlock (sketched below)
      ■ Two or more processes / threads are unable to proceed
      ■ Each is waiting for one of the others to do something
      Livelock
      ■ Two or more processes / threads continuously change their states in response to changes in the other processes / threads
      ■ No global progress for the application
      Race condition
      ■ Two or more processes / threads are executed concurrently
      ■ The final result of the application depends on the relative timing of their execution
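      A minimal deadlock sketch (POSIX threads assumed): two threads acquire the same two locks in opposite order.

          #include <pthread.h>

          static pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
          static pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;

          /* If thread_1 holds a while thread_2 holds b, each waits
           * forever for the other's lock: neither can proceed. */
          static void *thread_1(void *arg) {
              (void)arg;
              pthread_mutex_lock(&a);
              pthread_mutex_lock(&b);    /* may block forever */
              pthread_mutex_unlock(&b);
              pthread_mutex_unlock(&a);
              return NULL;
          }

          static void *thread_2(void *arg) {
              (void)arg;
              pthread_mutex_lock(&b);
              pthread_mutex_lock(&a);    /* may block forever */
              pthread_mutex_unlock(&a);
              pthread_mutex_unlock(&b);
              return NULL;
          }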

  14. Terminology
      Starvation
      ■ A runnable process / thread is overlooked indefinitely
      ■ Although it is able to proceed, it is never chosen to run (dispatching / scheduling)
      Atomic Operation
      ■ Function or action implemented as a sequence of one or more instructions
      ■ Appears to be indivisible: no other process / thread can see an intermediate state or interrupt the operation
      ■ Executed as a group, or not executed at all
      Mutual Exclusion
      ■ The requirement that while one process / thread is using a resource, no other shall be allowed to do so (see the sketch below)
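      A minimal sketch of the last two terms (POSIX threads and C11 atomics assumed):

          #include <pthread.h>
          #include <stdatomic.h>

          static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
          static long counter = 0;              /* protected by m */
          static atomic_long atomic_counter;    /* C11 atomic */

          void increment(void) {
              /* Mutual exclusion: only one thread at a time may run
               * the critical section between lock and unlock. */
              pthread_mutex_lock(&m);
              counter++;
              pthread_mutex_unlock(&m);

              /* Atomic operation: the fetch-and-add appears indivisible;
               * no thread can observe an intermediate state. */
              atomic_fetch_add(&atomic_counter, 1);
          }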

  15. From Concurrency to Parallelism
      [Figure: programs are decomposed into processes and tasks, which are mapped onto nodes consisting of processors and memories connected by a network]

  16. Parallelism for …
      ■ Speedup – compute faster
      ■ Throughput – compute more in the same time
      ■ Scalability – compute faster / more with additional resources
      ■ Price / performance – be as fast as possible for given money
      ■ Scavenging – compute faster / more with idle resources
      [Figure: scaling up (more processing elements and memory per machine) vs. scaling out (more machines)]

  17. The Parallel Programming Problem
      [Figure: matching a parallel application to a flexible execution environment, by configuration and type]

  18. Parallelism [Mattson et al.]
      ■ Task
        □ A parallel program breaks a problem into tasks
      ■ Execution unit
        □ Representation of a concurrently running task (e.g. a thread)
        □ Tasks are mapped to execution units during development time
      ■ Processing element
        □ Hardware element running one execution unit
        □ Depends on the scenario: logical processor vs. core vs. machine
        □ Execution units run simultaneously on processing elements, controlled by the scheduling entity
      ■ Synchronization
        □ Mechanism to order activities of parallel tasks
      ■ Race condition
        □ Program result depends on the scheduling order

  19. Parallel Processing
      ■ Inside the processor
        □ Instruction-level parallelism (ILP)
        □ Multicore
        □ Shared memory
      ■ With multiple processing elements in one machine
        □ Multiprocessing
        □ Shared memory
      ■ With multiple processing elements in many machines
        □ Multicomputer
        □ Shared nothing (in terms of a globally accessible memory)

  20. Multiprocessor: Flynn's Taxonomy (1966)
      ■ Classifies multiprocessor architectures along the instruction and data processing dimensions
        □ Single Instruction, Single Data (SISD)
        □ Single Instruction, Multiple Data (SIMD)
        □ Multiple Instruction, Single Data (MISD)
        □ Multiple Instruction, Multiple Data (MIMD)
      (C) Blaise Barney

  21. Another Taxonomy (Tanenbaum)
      MIMD: parallel and distributed computers
      ■ Multiprocessors (shared memory)
        □ Bus
        □ Switched
      ■ Multicomputers (private memory)
        □ Bus
        □ Switched

  22. Another Taxonomy (Foster)
      ■ Multicomputer
        □ Set of connected von Neumann computers (DM-MIMD)
        □ Each computer runs a local program in local memory and sends / receives messages
        □ Local memory access is less expensive than remote memory access
      [Figure: von Neumann machines (control unit, arithmetic logic unit, memory, input / output connected by a bus) linked by an interconnect]

  23. Shared Memory vs. Shared Nothing
      ■ Organization of parallel processing hardware as … (contrasted in the sketch below)
        □ Shared memory system
          ◊ Concurrent processes can directly access a common address space
          ◊ Typically implemented as a memory hierarchy, with different cache levels
          ◊ Examples: SMP systems, distributed shared memory systems, virtual runtime environments
        □ Shared nothing system
          ◊ Concurrent processes can only access local memory and exchange messages with other processes
          ◊ Message exchange is typically orders of magnitude slower than memory access
          ◊ Examples: cluster systems, distributed systems (Hadoop, grids, …)
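      A minimal sketch contrasting the two models on one POSIX machine (threads stand in for shared memory, fork() plus a pipe for shared nothing; the example is an assumption, not from the slides):

          #include <pthread.h>
          #include <stdio.h>
          #include <unistd.h>

          static int shared_value;    /* common address space for threads */

          static void *reader(void *arg) {
              (void)arg;
              /* Shared memory: direct access, no message needed. */
              printf("thread sees %d directly\n", shared_value);
              return NULL;
          }

          int main(void) {
              shared_value = 42;
              pthread_t t;
              pthread_create(&t, NULL, reader, NULL);
              pthread_join(t, NULL);

              /* Shared nothing: after fork(), parent and child have
               * private memories and must exchange messages (a pipe). */
              int fd[2];
              pipe(fd);
              if (fork() == 0) {
                  int received = 0;
                  read(fd[0], &received, sizeof received);    /* receive */
                  printf("child received %d as a message\n", received);
                  _exit(0);
              }
              int msg = 42;
              write(fd[1], &msg, sizeof msg);                 /* send */
              return 0;
          }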

  24. Shared Memory vs. Shared Nothing
      ■ Pfister: „shared memory“ vs. „distributed memory“
      ■ Foster: „multiprocessor“ vs. „multicomputer“
      ■ Tanenbaum: „shared memory“ vs. „private memory“
      [Figure: processes with private data exchanging messages, vs. processors accessing a shared memory]

  25. Shared Memory
      ■ All processors act independently and use the same global address space; changes in one memory location are visible to all others
      ■ Uniform memory access (UMA) system
        □ Equal load and store access for all processors to all memory
        □ Default approach for SMP systems of the past
      ■ Non-uniform memory access (NUMA) system (sketched below)
        □ Delay on memory access depends on the accessed region
        □ Typically realized by processor networks and local memories
          ◊ Cache-coherent NUMA (CC-NUMA), completely implemented in hardware
          ◊ Became the standard approach with recent x86 chips
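      A hedged sketch of NUMA-aware allocation using Linux's libnuma (assumptions: Linux with libnuma installed, link with -lnuma; the library choice is not from the slides):

          #include <numa.h>     /* Linux libnuma */
          #include <stdio.h>

          int main(void) {
              if (numa_available() < 0) {
                  printf("no NUMA support on this system\n");
                  return 1;
              }
              /* On a NUMA system, placement matters: memory on the node
               * local to the executing processor is cheaper to access
               * than memory on a remote node. */
              size_t size = 1 << 20;
              void *near_mem = numa_alloc_onnode(size, 0);
              void *far_mem  = numa_alloc_onnode(size, numa_max_node());
              /* ... access latency differs by accessed region ... */
              numa_free(near_mem, size);
              numa_free(far_mem, size);
              return 0;
          }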
