ADAPTIVE TECHNIQUES FOR SCALABLE OPTIMISTIC PARALLEL DISCRETE EVENT - - PowerPoint PPT Presentation



slide-1
SLIDE 1

ADAPTIVE TECHNIQUES FOR SCALABLE OPTIMISTIC PARALLEL DISCRETE EVENT SIMULATION

Eric Mikida

slide-2
SLIDE 2

Presentation Overview

  • PDES / GVT Overview
  • GVT Framework Description
  • New GVT Algorithm
  • New Load Balancing Work
  • Summary and Future Work

2 5/2/19

slide-3
SLIDE 3

PDES / GVT Overview

  • Simulation driven by discrete, time-stamped events
  • Logical Processes (LPs) store state and execute events
  • Charades is optimistically synchronized

– Events executed speculatively
– Incorrect events rolled back via reverse computation
– Event efficiency = committed / total

  • Global Virtual Time (GVT) required for synchronization

– GVT is the virtual time that every processor and every event in flight has passed
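The two definitions on this slide can be illustrated with a minimal sketch. It assumes GVT is taken as the minimum over every processor's local virtual time and the timestamps of all events in flight; the function names and sample numbers are hypothetical and are not taken from Charades.

```python
def global_virtual_time(local_virtual_times, in_flight_timestamps):
    """Smallest virtual time among all processors and all in-flight events."""
    return min(list(local_virtual_times) + list(in_flight_timestamps))


def event_efficiency(committed, total_executed):
    """Fraction of executed events that were committed rather than rolled back."""
    if total_executed == 0:
        return 1.0  # assumption: no executed events means no wasted work
    return committed / total_executed


# Hypothetical snapshot: three processors and two events still in the network.
lvts = [12.0, 9.5, 15.0]
in_flight = [8.0, 20.0]
gvt = global_virtual_time(lvts, in_flight)

# Hypothetical run: 1200 events executed, 900 of them committed.
efficiency = event_efficiency(committed=900, total_executed=1200)
```

In this snapshot the in-flight event at time 8.0 holds GVT back even though every processor has already advanced past it.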

slide-4
SLIDE 4

GVT Framework Description

  • Separated GVT Management from Scheduler

– Each encapsulated into separate chare groups
– Common API between base classes

  • Allows for multiple different GVT implementations
  • Work and communication automatically overlapped with Scheduler and LPs

slide-5
SLIDE 5

GVT Framework Description

[Diagram: Scheduler timeline (Event Exec, Event Exec, ..., FC) and GVTManager timeline (GVT Work), connected by gvt_begin(), gvt_done(gvt), and resume() calls]
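The separation between Scheduler and GVTManager might look roughly like the sketch below. Only the names gvt_begin() and gvt_done(gvt) come from the slide; the direction of the calls, the class layout, and the toy FixedStepGVT implementation are assumptions made for illustration, and resume() is left out. The real framework uses Charm++ chare groups rather than Python classes.

```python
from abc import ABC, abstractmethod


class GVTManager(ABC):
    """Common interface; the scheduler depends only on this base class."""

    def __init__(self):
        self.scheduler = None  # set when a Scheduler attaches itself

    @abstractmethod
    def gvt_begin(self):
        """Start a GVT computation; must eventually call scheduler.gvt_done."""


class Scheduler:
    def __init__(self, gvt_manager):
        self.gvt_manager = gvt_manager
        gvt_manager.scheduler = self
        self.gvt = 0.0
        self.log = []

    def run_batch(self):
        self.log.append("event exec")   # stand-in for executing events
        self.gvt_manager.gvt_begin()    # assumed: scheduler triggers GVT work

    def gvt_done(self, gvt):
        self.gvt = gvt                  # assumed: manager reports the new GVT
        self.log.append(f"gvt_done({gvt})")


class FixedStepGVT(GVTManager):
    """Hypothetical manager that advances GVT by a fixed step each time."""

    def __init__(self, step):
        super().__init__()
        self.step = step

    def gvt_begin(self):
        self.scheduler.gvt_done(self.scheduler.gvt + self.step)


sched = Scheduler(FixedStepGVT(5.0))
sched.run_batch()
sched.run_batch()
```

Because the scheduler only sees the base-class interface, a different GVT algorithm can be swapped in by passing another GVTManager subclass.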

slide-6
SLIDE 6

New GVT Algorithm

  • Adaptive Bucketed GVT algorithm

– Virtual time divided into buckets
– Completion detection per bucket
– CD is timestamp aware
– Buckets included in a given computation can increase/decrease based on simulation conditions

slide-7
SLIDE 7

Adaptive Bucketed GVT Algorithm

[Figure: virtual-time axis divided into six buckets, each with its own counters (sent: s1 / recv: r1 through sent: s6 / recv: r6); markers for Current GVT and Current LVT; sending an event increments s4, receiving an event increments r6]
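The per-bucket bookkeeping in the figure can be sketched as follows. This assumes buckets are numbered from 1 and that an event is assigned to a bucket by its timestamp; the BucketCounters class and the bucket size of 10 are hypothetical.

```python
class BucketCounters:
    """Per-bucket sent/received counters kept by one processor."""

    def __init__(self, bucket_size, num_buckets):
        self.bucket_size = bucket_size
        # Index 0 is unused so that bucket b lives at index b, as on the slide.
        self.sent = [0] * (num_buckets + 1)
        self.recvd = [0] * (num_buckets + 1)

    def bucket_of(self, timestamp):
        """1-based bucket index covering [(b-1)*size, b*size)."""
        return int(timestamp // self.bucket_size) + 1

    def on_send(self, timestamp):
        self.sent[self.bucket_of(timestamp)] += 1

    def on_receive(self, timestamp):
        self.recvd[self.bucket_of(timestamp)] += 1


counters = BucketCounters(bucket_size=10, num_buckets=6)
counters.on_send(35.0)      # timestamp 35 falls in bucket 4, so s4 is incremented
counters.on_receive(52.0)   # timestamp 52 falls in bucket 6, so r6 is incremented
```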

slide-8
SLIDE 8

Adaptive Bucketed GVT Algorithm

Formally, a bucket b is completed iff:

1) sent[b] = recvd[b]
2) lvt_p > b × bucket_size for all processors p
3) bucket x is complete for all x in { 1 … b-1 }
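The three conditions translate directly into a small check. This is a sketch under assumptions: sent and recvd stand for globally reduced per-bucket totals (index 0 unused), lvts holds one local virtual time per processor, and advancing GVT to the end of the last completed bucket is an illustrative choice rather than something the slide states.

```python
def highest_completed_bucket(sent, recvd, lvts, bucket_size):
    """Return the largest b such that buckets 1..b are all complete (0 if none)."""
    done = 0
    for b in range(1, len(sent)):
        counts_match = sent[b] == recvd[b]                       # condition 1
        all_past = all(lvt > b * bucket_size for lvt in lvts)    # condition 2
        if not (counts_match and all_past):
            break  # condition 3: later buckets cannot complete before this one
        done = b
    return done


# Hypothetical global totals for buckets 1..4 (index 0 unused).
sent = [0, 3, 2, 4, 1]
recvd = [0, 3, 2, 3, 1]      # one event in bucket 3 is still in flight
lvts = [45.0, 38.0, 41.0]

highest = highest_completed_bucket(sent, recvd, lvts, bucket_size=10)
gvt_bound = highest * 10     # assumed: GVT may advance to the end of bucket 2
```

Bucket 4 has matching counts here, but it is not reported as complete because bucket 3 still has an event in flight.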

slide-9
SLIDE 9

Adaptive Bucketed Performance

[Figures: Speedup over Blocking; Speedup over Phase-Based]

slide-10
SLIDE 10

Adaptive Bucketed Interval Analysis

Tables: All-Reduces for Phase-Based; All-Reduces for Adaptive Bucketed

Model            Total   Per GVT
PHOLD Base       2005    1.98
PHOLD Work       2024    1.98
PHOLD Event      2011    1.98
PHOLD Combo      2040    1.99
Traffic Base     1276    1.58
Traffic Src      1965    1.92
Traffic Dest     1350    1.57
Traffic Route    2027    1.99

Model            Total   Per GVT
PHOLD Base       3887    4.11
PHOLD Work       4270    4.31
PHOLD Event      5553    4.28
PHOLD Combo      6890    4.33
Traffic Base
Traffic Src
Traffic Dest
Traffic Route
slide-11
SLIDE 11

Adaptive Event Throttling

  • SPEEDES halted event sending to flush the network for continuous GVT [1]

– Execution was allowed to continue

  • Anti-events are a source of significant overhead
  • Adaptive Bucketed GVT monitors all event sends (both regular and anti-events)

[1] Steinman ’95

slide-12
SLIDE 12

Adaptive Event Throttling

Approach

  • Track events by offset from GVT (in buckets)
  • Add tracing for off-line analysis
  • Analyze cancellation frequency and lag
  • Hold events based on offset

[Figure: PHOLD Combo Event Stats]
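A per-event hold decision based on bucket offset could be sketched as below. The slides do not give the actual policy, so the fixed max_offset threshold, the function names, and the sample timestamps are all hypothetical.

```python
def bucket_offset(event_ts, gvt, bucket_size):
    """How many buckets ahead of the current GVT this event lies."""
    return int((event_ts - gvt) // bucket_size)


def partition_sends(pending, gvt, bucket_size, max_offset):
    """Split pending event timestamps into those sent now and those held."""
    send_now, held = [], []
    for ts in pending:
        # Each event is judged on its own; holding one does not block others.
        if bucket_offset(ts, gvt, bucket_size) > max_offset:
            held.append(ts)
        else:
            send_now.append(ts)
    return send_now, held


pending = [105, 118, 131, 160]
sent_first, held_first = partition_sends(pending, gvt=100, bucket_size=10, max_offset=2)

# After GVT advances, previously held events are re-evaluated.
sent_later, held_later = partition_sends(held_first, gvt=120, bucket_size=10, max_offset=2)
```

Holding only far-ahead events targets the sends most likely to be cancelled later, while events close to GVT go out immediately.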

slide-13
SLIDE 13

Adaptive Event Throttling

[Figure: Model Event Rates; annotations: 1.1x, 1.2x, 1.15x, 1.2x, 1.75x]

slide-14
SLIDE 14

Adaptive Event Throttling

[Figures: Dragonfly Remote Events; Traffic Remote Events; annotations: 54%, 76%]

slide-15
SLIDE 15

Adaptive Event Throttling

[Figures: Dragonfly Event Efficiency; Traffic Event Efficiency]

slide-16
SLIDE 16

How does this differ from SPEEDES?

SPEEDES

  • Throttling required for the GVT computation to complete
  • Once throttling starts, all events are held until the next GVT cycle

CHARADES

  • GVT computation runs regardless of messages in flight; throttling is used only to improve performance
  • Choice to hold an event is per event; holding one does not preclude sending another

slide-17
SLIDE 17

Load Balancing with Bucketed GVT

  • Don’t want to stop the simulation
  • No obvious synchronization points

– GVTManager runs independently of Scheduler

  • Exploit anytime migration in Charm++
  • Throttling improves event efficiency to aid LB


slide-18
SLIDE 18

Load Balancing with Bucketed GVT

[Figure: PHOLD Speedup]

slide-19
SLIDE 19

Load Balancing with Bucketed GVT

[Figure: Traffic Speedup]

slide-20
SLIDE 20

Summary

  • Proposed the Adaptive Bucketed GVT algorithm

– Timestamp aware to adapt to simulation conditions
– Less communication required
– Allows for adaptive communication throttling

  • Load balancing can improve event efficiency

– Metric effectiveness depends on model

  • Best performance comes with decoupled solution

– GVT: sync cost, Throttling: event efficiency, LB: balance


slide-21
SLIDE 21

Future Work

  • On-line tuning for adaptive event throttling
  • Lightweight graph partitioning strategies
  • Vectors of load metrics
  • ML for load metrics


slide-22
SLIDE 22

THANK YOU!
