Dynamic Processors Demand Dynamic Operating Systems



slide-1
SLIDE 1

Dynamic Processors Demand Dynamic Operating Systems

Sankaralingam Panneerselvam Michael M. Swift

Computer Sciences Department University of Wisconsin, Madison, WI

HotPar 2010

slide-2
SLIDE 2

Motivation

• Chip multiprocessor
  • Does not support sequential workloads well

[Figure: possible configurations. Speedup versus number of effective cores (256 down to 1) for a symmetric system with up to 256 cores. Source: "Amdahl's Law in the Multicore Era" [IEEE Computer, July 2008]]

slide-3
SLIDE 3

Motivation

• Asymmetric chip multiprocessor
  • To satisfy diverse workloads

[Figure: speedup versus number of effective cores (256 down to 1) for an asymmetric system with up to 256 cores. Source: "Amdahl's Law in the Multicore Era" [IEEE Computer, July 2008]]

slide-4
SLIDE 4

Motivation

• Dynamic multiprocessor
  • Flexible enough to be cast into the right configuration based on the need

[Figure: speedup versus the number of elementary cores (1 to 256) that are configured dynamically to make one powerful core, for a dynamic system with up to 256 cores. Source: "Amdahl's Law in the Multicore Era" [IEEE Computer, July 2008]]
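The three configurations on the Motivation slides correspond to the speedup models in the cited "Amdahl's Law in the Multicore Era" article. The following is a minimal Python sketch of those models, assuming that article's simplification that a core built from r base cores performs like sqrt(r). The function names and the parallel fraction f = 0.975 are illustrative choices, not values taken from these slides.

```python
import math


def perf(r):
    """Assumed performance of one core built from r base cores: sqrt(r)."""
    return math.sqrt(r)


def speedup_symmetric(f, n, r):
    """Budget of n base cores split into n/r identical cores of r base cores each."""
    return 1.0 / ((1 - f) / perf(r) + f * r / (perf(r) * n))


def speedup_asymmetric(f, n, r):
    """One big core of r base cores plus n - r single base cores."""
    return 1.0 / ((1 - f) / perf(r) + f / (perf(r) + n - r))


def speedup_dynamic(f, n, r):
    """Sequential phases fuse r base cores; parallel phases use all n base cores."""
    return 1.0 / ((1 - f) / perf(r) + f / n)


if __name__ == "__main__":
    f, n = 0.975, 256  # illustrative parallel fraction and core budget
    for r in (1, 4, 16, 64, 256):
        print(r,
              round(speedup_symmetric(f, n, r), 1),
              round(speedup_asymmetric(f, n, r), 1),
              round(speedup_dynamic(f, n, r), 1))
```

Under these assumptions the dynamic model is never slower than the asymmetric one, which in turn is never slower than the symmetric one, for the same f, n, and r.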

slide-5
SLIDE 5

Examples of Dynamic Multiprocessors

• Core Fusion [ISCA’07]
• Intel Turbo Boost [Nehalem]

slide-6
SLIDE 6

Motivation

• Many mechanisms lead to dynamically variable processors
  • Performance
    • Merging resources: Core Fusion, Speculative Multithreading
    • Shifting power: Turbo Boost, over-provisioned systems
  • Reliability
    • Redundant execution [ISCA’07]

slide-7
SLIDE 7

Why reconfigure the OS?

• What happens if a processor goes offline without any notification?
  • Servicing of interrupts, IPIs, and bottom halves stops
  • Other processors might wait for a spinlock
  • RCU stalls
  • Thread execution stops

slide-8
SLIDE 8

Can the OS adapt to changing processors?

• Common theme: the number of physical execution contexts may change dynamically and frequently
• Our work:
  • Analysis of Linux mechanisms for changing processors
  • Two new techniques for dynamically varying processors
    • Processor Proxies
    • Deferred/Parallel Hotplug

slide-9
SLIDE 9

Outline

• Motivation
• Current Mechanisms
• Processor Proxies
• Deferred/Parallel hotplug

slide-10
SLIDE 10

Why is changing processors hard?

• Many pieces of code know which processors are available
  • Scheduler
  • Per-CPU structures
• Distributed operations require processors to communicate
  • Communication between processors: inter-processor interrupts (IPIs)
  • Read-Copy-Update (RCU) mechanism
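To illustrate why distributed operations tie the kernel to the set of available processors, here is a toy Python model of an RCU-style grace period: it completes only after every CPU the kernel believes is online has reported a quiescent state. The class and method names are hypothetical, and this is a sketch of the dependency, not the Linux implementation.

```python
class GracePeriod:
    """Toy model: a grace period ends once every online CPU reports a quiescent state."""

    def __init__(self, online_cpus):
        self.pending = set(online_cpus)  # CPUs that still have to report

    def report_quiescent(self, cpu):
        self.pending.discard(cpu)

    def completed(self):
        return not self.pending


if __name__ == "__main__":
    gp = GracePeriod(online_cpus={0, 1, 2, 3})
    # CPU 3 silently stops executing, so only CPUs 0 to 2 ever report.
    for cpu in (0, 1, 2):
        gp.report_quiescent(cpu)
    print("completed:", gp.completed(), "waiting on:", sorted(gp.pending))
```

In this model the grace period stays open until someone reports on behalf of CPU 3, which is the kind of stall a silently missing processor would cause.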

slide-11
SLIDE 11

CPU dependence in Linux

• Analysis of the Linux 2.6.31-4 kernel on a 4-CPU machine
• Inference: CPU dependences are widespread

Metric                                        Value
Number of per-CPU data structures             446 data structures
Number of callbacks when the CPU set changes  35 callbacks
Frequency of global RCU operations            90 callbacks/second

slide-12
SLIDE 12

Current solution: Linux Hotplug

• Hotplug allows dynamic addition/removal of a processor
  • Partitioning/virtualization
  • Physical repair
• Used for long-term reconfigurations
  • Assumes that processors, once offlined, never come online again
  • Notifies all relevant subsystems, creates/deletes all per-CPU state

slide-13
SLIDE 13

[Figure: timeline of CPU 3 going down on a system with CPUs 1 to 4. Notifications over time: CPU_DOWN_PREPARE, CPU_DYING, CPU_DEAD, CPU_POST_DEAD. CPUs 1, 2, and 4 each spin in a NOP loop while CPU 3 runs take_cpu_down.]

take_cpu_down
  • disables interrupts
  • removes the CPU from cpu_online_mask
  • schedules the idle thread on this CPU
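The offline sequence on this slide can be sketched as a small Python simulation. The event names come from the slide; the `Notifier` class, the subscriber callbacks, and the `cpu_down` function are hypothetical stand-ins for the kernel's notifier chain, not its actual code.

```python
OFFLINE_EVENTS = ["CPU_DOWN_PREPARE", "CPU_DYING", "CPU_DEAD", "CPU_POST_DEAD"]


class Notifier:
    """Hypothetical notifier chain: subsystems register callbacks for CPU events."""

    def __init__(self):
        self.callbacks = []

    def register(self, callback):
        self.callbacks.append(callback)

    def call_chain(self, event, cpu):
        for callback in self.callbacks:
            callback(event, cpu)


def cpu_down(cpu, online_mask, notifier):
    """Take `cpu` offline, notifying every registered subsystem at each step."""
    for event in OFFLINE_EVENTS:
        if event == "CPU_DYING":
            # Stand-in for take_cpu_down: the CPU leaves the online mask here,
            # while the other CPUs would be spinning in a NOP loop.
            online_mask.discard(cpu)
        notifier.call_chain(event, cpu)


if __name__ == "__main__":
    log = []
    chain = Notifier()
    chain.register(lambda event, cpu: log.append(("scheduler", event, cpu)))
    chain.register(lambda event, cpu: log.append(("rcu", event, cpu)))
    online = {1, 2, 3, 4}
    cpu_down(3, online, chain)
    print(sorted(online))
    for entry in log:
        print(entry)
```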

slide-14
SLIDE 14

Hotplug performance

• Good for virtualization but too slow for rapid reconfiguration

Hotplug operation  Cores  Latency (msec)
OFFLINE            1      25
OFFLINE            2      60
OFFLINE            3      137
ONLINE             1      106
ONLINE             2      214
ONLINE             3      331

slide-15
SLIDE 15

Outline

• Motivation
• Current Mechanisms
• Processor Proxies
• Deferred/Parallel hotplug

slide-16
SLIDE 16

Our approach

Strategy

Do very little for short-term changes

Do long-term changes offline, asynchronously, and in parallel

Solutions

Processor proxies address short-term reconfiguration

Deferred and parallel hotplug reduce the frequency and latency of long-term reconfiguration

slide-17
SLIDE 17

Processor Proxies

• A processor proxy is a fill-in for an offline processor
• Provides a separate execution context on the proxying CPU, called the proxy context
• Participates in operations that require the offline processor:
  • Servicing inter-processor interrupts (IPIs)
  • Ensuring progress in the RCU mechanism
• Does not execute threads

slide-18
SLIDE 18

[Figure: CPU B is offline and CPU A is proxying for B. CPU A has a native context, which services interrupts destined to CPU A, and a proxy context, which services interrupts and bottom halves destined to CPU B.]
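The arrangement on this slide can be sketched in Python as a routing rule: interrupts destined to an offline CPU are delivered to the CPU proxying for it and handled in that CPU's proxy context. The `Cpu` class, the `deliver` function, and the field names are hypothetical illustrations, not the authors' implementation.

```python
class Cpu:
    def __init__(self, name):
        self.name = name
        self.online = True
        self.proxy_for = None   # name of the offline CPU this CPU fills in for
        self.serviced = []      # (context, target CPU, interrupt) tuples

    def service(self, target, irq):
        # Work for this CPU runs in its native context; work for the
        # proxied CPU runs in the separate proxy context.
        context = "native" if target == self.name else "proxy"
        self.serviced.append((context, target, irq))


def deliver(cpus, target, irq):
    """Deliver `irq` to `target`, or to the CPU proxying for it if it is offline."""
    if cpus[target].online:
        cpus[target].service(target, irq)
        return
    for cpu in cpus.values():
        if cpu.online and cpu.proxy_for == target:
            cpu.service(target, irq)
            return
    raise RuntimeError(f"no CPU can service interrupts for {target}")


if __name__ == "__main__":
    cpus = {"A": Cpu("A"), "B": Cpu("B")}
    cpus["B"].online = False     # B goes offline
    cpus["A"].proxy_for = "B"    # A becomes B's proxy
    deliver(cpus, "A", "IPI")
    deliver(cpus, "B", "IPI")
    print(cpus["A"].serviced)
```

Note that the sketch only routes interrupt work; consistent with the slides, a proxy does not run the offline CPU's threads.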

slide-19
SLIDE 19

Processor Proxy Evaluation Results

• Offline/online performance compared to native

Hotplug operation  Cores  Native (msec)  Proxy (msec)
OFFLINE            1      25             1.7
OFFLINE            2      60             4
OFFLINE            3      137            6.5
ONLINE             1      106            1.2
ONLINE             2      214            2.8
ONLINE             3      331            6
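As a worked example, the ratio of native to proxy latency can be computed directly from the table above. The short Python sketch below uses only the numbers on this slide; the dictionary and function names are illustrative.

```python
# Latencies in msec, copied from the table above:
# (operation, cores) -> (native, proxy)
LATENCY_MSEC = {
    ("OFFLINE", 1): (25, 1.7),
    ("OFFLINE", 2): (60, 4),
    ("OFFLINE", 3): (137, 6.5),
    ("ONLINE", 1): (106, 1.2),
    ("ONLINE", 2): (214, 2.8),
    ("ONLINE", 3): (331, 6),
}


def proxy_speedup(native, proxy):
    """How many times faster the proxy path is than native hotplug."""
    return native / proxy


if __name__ == "__main__":
    for (op, cores), (native, proxy) in LATENCY_MSEC.items():
        print(f"{op:8s}{cores} cores: {proxy_speedup(native, proxy):.1f}x")
```

On these numbers the proxy path is roughly 15 to 21 times faster for offline and roughly 55 to 88 times faster for online.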

slide-20
SLIDE 20

Deferred and Parallel Hotplug

• Processor proxies are not a long-term solution
  • Threads don’t run on a proxy
• If the reconfiguration is long-lasting, move to a stable state
• Solutions:
  • Deferred hotplug: remove a CPU that is currently proxied
  • Parallel hotplug: reconfigure multiple CPUs simultaneously

slide-21
SLIDE 21

Evaluation Results

• Performance of CPU online is greatly improved
  • Most of the CPU online time is spent in initialization
  • Initialization can happen in parallel

Hotplug operation  Cores  Native (msec)  Parallel (msec)
OFFLINE            1      25             25
OFFLINE            2      60             60
OFFLINE            3      137            130
ONLINE             1      106            106
ONLINE             2      214            111
ONLINE             3      331            131
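The idea that per-CPU initialization can overlap is sketched below in Python. This is an analogy using threads, not kernel code: `init_cpu` and its sleep are hypothetical placeholders for the real initialization work, and the timings it prints are unrelated to the measurements in the table.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def init_cpu(cpu, init_seconds=0.05):
    """Hypothetical per-CPU initialization; the sleep stands in for the real work."""
    time.sleep(init_seconds)
    return cpu


def online_serial(cpus):
    """Bring CPUs online one after another, as one hotplug request per CPU would."""
    return [init_cpu(cpu) for cpu in cpus]


def online_parallel(cpus):
    """Initialize all requested CPUs at the same time, one worker per CPU."""
    with ThreadPoolExecutor(max_workers=len(cpus)) as pool:
        return list(pool.map(init_cpu, cpus))


if __name__ == "__main__":
    for bring_online in (online_serial, online_parallel):
        start = time.perf_counter()
        done = bring_online([1, 2, 3])
        print(bring_online.__name__, done, f"{time.perf_counter() - start:.2f}s")
```

In the serial version the total time grows with the number of CPUs, while in the parallel version it stays close to the time for a single CPU, which mirrors the shape of the ONLINE rows in the table.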

slide-22
SLIDE 22

Conclusions

• Dynamic reconfiguration
  • Operating systems are not prepared
  • The hotplug mechanism is too slow
• Low-latency solutions
  • Processor Proxies
  • Deferred and Parallel hotplug
• Future work
  • Resource management