1
ITR: Non-equilibrium surface growth and the scalability of parallel discrete-event simulations for large asynchronous systems
NSF DMR-0113049
http://www.rpi.edu/~korniss/Research/gk_research.html
2
PIs: Gyorgy Korniss (Rensselaer), Mark Novotny (Mississippi State U.)
postdoc: Alice Kolakowska (Mississippi State U.)
graduate student: H. Guclu (Rensselaer)
undergraduate students: Katie Barbieri, John Marsh, Brad McAdams (Rensselaer); Shannon Wheeler (MSState)
collaborators: P.A. Rikvold (Florida State U.), Z. Toroczkai (CNLS, Los Alamos), B.D. Lubachevsky, Alan Weiss (Lucent/Bell Labs)
Funded by NSF DMR/ITR, The Research Corporation, DOE (NERSC, SCRI/CSIT, LANL), Rensselaer, Mississippi State U.
3
“Nature” ?
computer architectures + algorithms
4
Discrete-event systems
Modeling the evolution of spatially extended interacting systems: updates in the “local” configuration as discrete events
- Cellular communication networks (call arrivals)
- Internet traffic routing/queueing systems
- Magnetization dynamics in condensed matter (Ising model with single-spin-flip Glauber dynamics)
- Spatial epidemic models (contact process)
- Dynamics is asynchronous
- Updates in the local “configuration” are discrete events in continuous time (Poisson arrivals) ⇒ discrete-event simulation
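The discrete-event picture on this slide can be sketched in a few lines of Python. This is a minimal serial illustration, not code from the presentation: the function name `poisson_event_stream`, the unit rate, and the use of a heap as the event queue are assumptions made for the example. Each site carries its own Poisson clock, and the next event is always the earliest pending arrival, so updates happen one at a time in continuous time with no global clock tick.

```python
import heapq
import random


def poisson_event_stream(n_sites, t_end, rate=1.0, seed=0):
    """Return (time, site) update events in time order, up to t_end.

    Hypothetical helper for illustration: every site has an independent
    Poisson clock with the given rate.
    """
    rng = random.Random(seed)
    # One pending arrival per site, keyed by its arrival time.
    queue = [(rng.expovariate(rate), site) for site in range(n_sites)]
    heapq.heapify(queue)
    events = []
    while queue:
        time, site = heapq.heappop(queue)
        if time >= t_end:
            # The earliest pending event is past the horizon, so all others are too.
            break
        events.append((time, site))
        # Schedule this site's next arrival after an exponential waiting time.
        heapq.heappush(queue, (time + rng.expovariate(rate), site))
    return events


events = poisson_event_stream(n_sites=10, t_end=100.0)
print(len(events), "events; first:", events[0])
```

A real model (Ising spin flip, contact process) would replace the bare `(time, site)` record with an attempted update of the local configuration at that site.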
5
Parallelization for asynchronous dynamics
The paradoxical task: (algorithmically) parallelize (physically) non-parallel dynamics
Difficulties:
- Discrete events (updates) are not synchronized by a global clock
- Traditional algorithms appear inherently serial (e.g., Glauber dynamics attempts one site/spin update at a time)
- However, these algorithms are not inherently serial (B.D. Lubachevsky ’87)
6
Parallel discrete-event simulation
- Spatial decomposition on lattice/grid (for systems with short-range interactions; only local synchronization between subsystems)
- Changes/updates: independent Poisson arrivals
- Each subsystem/block of sites, carried by a processing element (PE), must have its own local simulated time, {τi} (“virtual time”)
- Synchronization scheme
- PEs must concurrently advance their own Poisson streams, without violating causality
7
Two approaches
- Conservative
  - PE “idles” if causality is not guaranteed
  - utilization, 〈u〉: fraction of non-idling PEs
- Optimistic (or speculative)
  - PEs assume no causality violations
  - Rollbacks to previous states once a causality violation is found (extensive state saving or reverse simulation)
  - Rollbacks can cascade (“avalanches”)
[Figure: time horizon τi versus site index i, d=1]
8
Basic conservative approach
“Worst-case” analysis:
- One site per PE, $N_{PE} = L^d$
- t = 0, 1, 2, … parallel steps
- τi(t): fluctuating time horizon
- Local time increments are iid exponential random variables
- Advance only if $\tau_i \le \min\{\tau_{nn}\}$ (nn: nearest neighbors)
Scalability modeling
- utilization (efficiency) 〈u(t)〉 (fraction of non-idling PEs): density of local minima
- width (spread) of time surface, with $\bar{\tau}(t)$ the mean of the local times over all PEs:
$$w^2(t) = \frac{1}{N_{PE}} \sum_{i=1}^{N_{PE}} \left[\tau_i(t) - \bar{\tau}(t)\right]^2$$
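The basic conservative scheme on this slide is small enough to simulate directly. The sketch below is an illustration under stated assumptions, not the authors' code: it takes d=1 with periodic boundaries, unit-mean exponential increments, and the hypothetical function names `step`, `width2` and `simulate`. At each parallel step a PE advances only if its local time does not exceed that of either neighbor, so the measured utilization is exactly the density of local minima of the time horizon.

```python
import random


def step(tau, rng):
    """One parallel step on a ring of PEs; returns (new_tau, utilization)."""
    n = len(tau)
    new_tau = list(tau)
    active = 0
    for i in range(n):
        left, right = tau[i - 1], tau[(i + 1) % n]
        # Conservative rule: advance only if tau_i <= min of the neighbours.
        if tau[i] <= min(left, right):
            new_tau[i] = tau[i] + rng.expovariate(1.0)
            active += 1
    return new_tau, active / n


def width2(tau):
    """Squared width of the time surface: mean squared deviation from the mean."""
    mean = sum(tau) / len(tau)
    return sum((x - mean) ** 2 for x in tau) / len(tau)


def simulate(n_pe, n_steps, seed=0):
    """Run n_steps parallel steps from a flat horizon; returns (tau, utilizations)."""
    rng = random.Random(seed)
    tau = [0.0] * n_pe
    utilizations = []
    for _ in range(n_steps):
        tau, u = step(tau, rng)
        utilizations.append(u)
    return tau, utilizations


tau, utilizations = simulate(n_pe=100, n_steps=2000)
late = utilizations[1000:]
print("late-time utilization:", sum(late) / len(late))
print("w^2 at the end:", width2(tau))
```

All PEs read the old horizon `tau` and write into `new_tau`, which mimics a synchronous parallel step; updating in place would turn this into a different (sequential) dynamics.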
9
Coarse graining for the stochastic time surface evolution
Kardar-Parisi-Zhang equation:
$$\partial_t \tau = \frac{\partial^2 \tau}{\partial x^2} - \lambda \left(\frac{\partial \tau}{\partial x}\right)^2 + \eta(x,t)$$
Steady state (d=1): Edwards-Wilkinson Hamiltonian
$$P[\tau(x)] \propto \exp\left[-\frac{1}{2D} \int dx \left(\frac{\partial \tau}{\partial x}\right)^2\right]$$
- Random-walk profile: short-range correlated local slopes
- G. K., Toroczkai, Novotny, Rikvold, ‘00
$$\tau_i(t+1) - \tau_i(t) = \Theta\big(\tau_{i-1}(t) - \tau_i(t)\big)\,\Theta\big(\tau_{i+1}(t) - \tau_i(t)\big)\,\eta_i(t)$$
- Θ(…) is the Heaviside step function
- ηi(t): iid exponential random numbers
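10
Universality/roughness
$\beta \approx 0.33$, $\alpha \approx 0.5$ (exact KPZ: $\beta = 1/3$, $\alpha = 1/2$)
$$\langle w^2(t) \rangle_L \sim \begin{cases} t^{2\beta}, & \text{if } t \ll t_\times \\ L^{2\alpha}, & \text{if } t \gg t_\times \end{cases}$$
$t_\times \sim L^z$, $z = \alpha/\beta$
Utilization/efficiency
$\langle u \rangle_\infty \approx 0.2464$
$\langle u \rangle_L \cong \langle u \rangle_\infty + \mathrm{const.}/L$
$\sigma_L = \sqrt{\langle u^2 \rangle_L - \langle u \rangle_L^2} \sim 1/L^{1/2}$
(d=1)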
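The finite-size form 〈u〉_L ≅ 〈u〉_∞ + const./L suggests a simple way to extrapolate measured utilizations to the infinite-system limit: fit a straight line in 1/L and read off the intercept. The sketch below is a hedged illustration only. The helper name `fit_finite_size` is hypothetical, and the input values are synthetic, generated from the scaling form itself with an arbitrarily chosen amplitude of 0.3; they are not data from the presentation.

```python
def fit_finite_size(sizes, utilizations, exponent=1.0):
    """Least-squares fit of u_L = u_inf + c / L**exponent; returns (u_inf, c)."""
    xs = [size ** (-exponent) for size in sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(utilizations) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, utilizations))
    c = sxy / sxx
    u_inf = mean_y - c * mean_x
    return u_inf, c


# Synthetic, noise-free input built from the scaling form (amplitude 0.3 is made up).
sizes = [10, 20, 40, 80, 160]
synthetic_u = [0.2464 + 0.3 / size for size in sizes]
u_inf, c = fit_finite_size(sizes, synthetic_u)
print("extrapolated utilization:", u_inf, "amplitude:", c)

# Dynamic exponent from the exact KPZ values quoted on the slide.
alpha, beta = 1.0 / 2.0, 1.0 / 3.0
z = alpha / beta
print("z =", z)
```

The `exponent` argument is there so the same fit can be reused with a correction exponent other than 1 when the roughness exponent differs from 1/2.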
11
Higher-d simulations (one site per PE)
$N_{PE} = L^d$
d=1: $\langle u \rangle_\infty \approx 0.246$
d=2: $\langle u \rangle_\infty \approx 0.12$
d=3: $\langle u \rangle_\infty \approx 0.075$
12
Implications for scalability
Simulation reaches steady state for $t \gg L^z$ (arbitrary d)
- Simulation phase: scalable
  $\langle u \rangle_L \cong \langle u \rangle_\infty + \mathrm{const.}/L^{2(1-\alpha)}$
  $\langle u \rangle_\infty$, the asymptotic average growth rate (simulation speed or utilization), is non-zero (Krug and Meakin, ‘90)
- Measurement (data management) phase: not scalable
  $\langle w^2 \rangle_L \sim L^{2\alpha}$
  measurement at $\tau_{meas}$ (e.g., simple averages)
- But CAN be made scalable by considering complex underlying communication topologies among PEs
13
Actual implementation
l×l blocks, $N_{PE} = (L/l)^2$
1. Local time increment: $\Delta\tau = -\ln(r)$, $r \in U(0,1)$
2. If the chosen site is on the boundary, the PE must wait until $\tau \le \min\{\tau_{nn}\}$
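The two steps above can be sketched for a square grid of PEs, each carrying an l×l block of sites. This is a simplified illustration under several assumptions that are not stated on the slide: periodic boundaries, a site chosen uniformly inside the block, a blocked PE retrying the same site on the next step, and the wait condition checked only against the neighboring PEs that share an edge with the chosen site. The function name `simulate_blocks` is hypothetical.

```python
import math
import random


def simulate_blocks(pe_side, block, n_steps, seed=0):
    """Conservative scheme on a pe_side x pe_side grid of PEs, each carrying a
    block x block patch of sites. Returns the utilization at each parallel step."""
    rng = random.Random(seed)
    tau = [[0.0] * pe_side for _ in range(pe_side)]
    # Site each PE is currently trying to update; None means "pick a new one".
    pending = [[None] * pe_side for _ in range(pe_side)]
    utilizations = []
    for _ in range(n_steps):
        new_tau = [row[:] for row in tau]
        active = 0
        for a in range(pe_side):
            for b in range(pe_side):
                if pending[a][b] is None:
                    pending[a][b] = (rng.randrange(block), rng.randrange(block))
                x, y = pending[a][b]
                # Local times of the neighbouring PEs bordering the chosen site.
                neighbours = []
                if x == 0:
                    neighbours.append(tau[a - 1][b])
                if x == block - 1:
                    neighbours.append(tau[(a + 1) % pe_side][b])
                if y == 0:
                    neighbours.append(tau[a][b - 1])
                if y == block - 1:
                    neighbours.append(tau[a][(b + 1) % pe_side])
                # Interior sites have no neighbours to check and always advance.
                if all(tau[a][b] <= t for t in neighbours):
                    # Step 1: delta_tau = -ln(r); 1 - random() lies in (0, 1].
                    new_tau[a][b] += -math.log(1.0 - rng.random())
                    pending[a][b] = None
                    active += 1
                # Otherwise the PE idles and retries the same site next step.
        tau = new_tau
        utilizations.append(active / (pe_side * pe_side))
    return utilizations


for block in (1, 8):
    u = simulate_blocks(pe_side=8, block=block, n_steps=300)
    print("block size", block, "late-time utilization", sum(u[100:]) / 200)
```

With `block=1` every site is a boundary site and the sketch reduces to the one-site-per-PE scheme; larger blocks make boundary picks rarer, so fewer PEs idle.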
14
Application: metastability and dynamic phase transition in spatially extended bistable systems
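- $\langle \tau(T,H) \rangle$: metastable lifetime
- $t_{1/2}$: half-period of the oscillating field
- $Q_i = \frac{1}{2 t_{1/2}} \int s_i(t)\,dt$: period-averaged spin
[Figure: spin configurations $\{s_i\}$ and period-averaged spins $\{Q_i\}$ for $t_{1/2} < \langle\tau\rangle$, $t_{1/2} \approx \langle\tau\rangle$, and $t_{1/2} > \langle\tau\rangle$]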
15
Application: metastability and hysteresis
Kinetic Ising model
- L×L lattice with periodic boundary conditions
- Single-spin-flip Glauber dynamics
- Periodic square-wave field of amplitude H
$$\mathcal{H} = -J \sum_{\langle i,j \rangle} s_i s_j - H(t) \sum_{i=1}^{L^2} s_i, \qquad J > 0, \quad s_i = \pm 1$$
Half-period: $t_{1/2}$
Magnetization: $m(t) = \frac{1}{L^2} \sum_i s_i(t)$
T < Tc, H → −H
t=0: m=1
Escape from metastable well: t=τ: m=0
Lifetime: 〈τ〉 = 〈τ(T,H)〉
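A minimal serial sketch of this kinetic Ising model may help fix the definitions. It is an illustration, not the parallel code behind the presentation: it assumes k_B = 1, the Glauber acceptance probability 1/(1 + exp(ΔE/T)), time measured in Monte Carlo steps per spin, and hypothetical function names. The field opposes the initial magnetization during the first half-period and is reversed for the second; the returned value is the average of m(t) over that single period.

```python
import math
import random


def glauber_sweep(spins, field, temperature, rng, coupling=1.0):
    """One Monte Carlo step per spin (L*L single-spin-flip attempts)."""
    size = len(spins)
    for _ in range(size * size):
        i, j = rng.randrange(size), rng.randrange(size)
        nn_sum = (spins[i - 1][j] + spins[(i + 1) % size][j]
                  + spins[i][j - 1] + spins[i][(j + 1) % size])
        # Energy change if s_ij is flipped, from H = -J sum s s - H(t) sum s.
        delta_e = 2.0 * spins[i][j] * (coupling * nn_sum + field)
        x = delta_e / temperature
        # Glauber probability 1 / (1 + exp(dE / T)); guard against overflow.
        prob = 0.0 if x > 700.0 else 1.0 / (1.0 + math.exp(x))
        if rng.random() < prob:
            spins[i][j] = -spins[i][j]


def period_averaged_magnetization(size, amplitude, half_period, temperature, seed=0):
    """Average of m(t) over one full period of the square-wave field."""
    rng = random.Random(seed)
    spins = [[1] * size for _ in range(size)]  # t = 0: m = 1
    total = 0.0
    for step in range(2 * half_period):
        # Square wave: -amplitude in the first half-period, +amplitude in the second.
        field = -amplitude if step < half_period else amplitude
        glauber_sweep(spins, field, temperature, rng)
        total += sum(map(sum, spins)) / size ** 2
    return total / (2 * half_period)


print(period_averaged_magnetization(size=8, amplitude=0.1, half_period=5, temperature=0.5))
```

In practice the period average would be recorded over many consecutive periods to build up statistics; a single period is used here only to keep the example short.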
16
Hysteresis and dynamic response
- Periodic square-wave field of amplitude $H_o$
- Half-period $t_{1/2}$; $\Theta = t_{1/2} / \langle \tau(T,H_o) \rangle$
$$\mathcal{H} = -J \sum_{\langle i,j \rangle} s_i s_j - H(t) \sum_i s_i$$
- Θ >> 1: symmetric limit cycle
- Θ << 1: asymmetric limit cycle
17
Dynamic Phase Transition (DPT)
- Θ >> Θc: |Q| ≈ 0, symmetric dynamic phase
- Θ << Θc: |Q| ≈ 1, symmetry-broken dynamic phase
- Θ = Θc ~ 1 ($t_{1/2} \sim \langle\tau\rangle$): large fluctuations in Q → DPT
$$Q = \frac{1}{2 t_{1/2}} \int m(t)\,dt, \qquad \Theta = \frac{t_{1/2}}{\langle \tau(T,H) \rangle}$$
Sides et al., PRL ’98, PRE ’99; G.K. et al., PRE ’01: finite-size scaling evidence for a continuous (dynamic) phase transition
[Figure panels: order parameter, fluctuations, 4th-order cumulant]
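The three quantities named in the figure (order parameter, its fluctuations, and the 4th-order cumulant) can be computed from a list of period-averaged magnetizations Q. The slide does not give their formulas, so the sketch below assumes the forms commonly used in finite-size scaling: 〈|Q|〉, L²(〈Q²〉 − 〈|Q|〉²), and the Binder-type cumulant 1 − 〈Q⁴〉/(3〈Q²〉²). The helper name `dpt_observables` and the sample values are hypothetical.

```python
def dpt_observables(q_samples, size):
    """Return (<|Q|>, L^2 (<Q^2> - <|Q|>^2), 1 - <Q^4> / (3 <Q^2>^2)).

    Assumed definitions, see the lead-in; requires at least one nonzero sample.
    """
    n = len(q_samples)
    mean_abs = sum(abs(q) for q in q_samples) / n
    mean_q2 = sum(q ** 2 for q in q_samples) / n
    mean_q4 = sum(q ** 4 for q in q_samples) / n
    fluctuations = size ** 2 * (mean_q2 - mean_abs ** 2)
    cumulant = 1.0 - mean_q4 / (3.0 * mean_q2 ** 2)
    return mean_abs, fluctuations, cumulant


# Made-up samples, for illustration only.
order_parameter, fluctuations, cumulant = dpt_observables([0.5, -0.5, 1.0, -1.0], size=4)
print(order_parameter, fluctuations, cumulant)
```

With these definitions, samples sharply peaked at ±q give a cumulant of 2/3, the value expected deep in a symmetry-broken phase.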
18
Large-scale finite-size analysis of the dynamic phase transition: absence of the tri-critical point
$Q = \frac{1}{2 t_{1/2}} \int m(t)\,dt$: period-averaged magnetization (dynamic order parameter)
19
Summary and outlook
- The tools and machinery of non-equilibrium statistical physics (coarse-graining, finite-size scaling, universality, etc.) can be applied to scalability modeling and algorithm engineering
- Conservative schemes can be made scalable
- Optimistic schemes: rollbacks (avalanches in virtual time): self-organized criticality ???
- Non-Poisson asynchrony (e.g., in “fat-tail” internet traffic)
- Applications: metastability, nucleation, and