Path Specialization: Reducing Phased Execution Overheads
Filip Pizlo, Erez Petrank, Bjarne Steensgaard
Purdue, Technion/Microsoft, Microsoft
ISMM’08 - Tucson, AZ

• Real-time, concurrent, and incremental garbage collectors are becoming mainstream techniques.
• But these collectors require barriers to be inserted, which causes execution to slow down.

• Barriers slow down execution of programs.
• This talk focuses on increasing the throughput of programs that use expensive barriers.

Types of Barriers
(a non-exhaustive list of expensive barriers that we’re familiar with)

• Stopless (ISMM’07)
• Brooks read barrier (both lazy and eager)
• Yuasa barrier for concurrent or incremental mark-sweep

Stopless Barriers
• “The write barrier from heck” - anonymous
• Stopless barriers may require multiple branches, loads, stores, and CASes, even on primitive reads and writes.
• But the barriers are only active during the (short) copying phase.

Brooks read barriers
• Useful when the mutator may see the same object in both to-space and from-space.
• Idea: each object has a pointer in its header to the “correct” version of the object.
• This pointer may be self-pointing.

Brooks Forwarding Pointer

“Lazy” Brooks
Original:
    object a = b.f
    use a
    use a
With lazy Brooks barriers:
    object a = b.forward.f
    use a.forward
    use a.forward

These barriers are only needed when copying is ongoing.
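The forwarding-pointer idea can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the `Obj` class, the `fields` dictionary, and the helpers `copy_object`, `read_lazy`, and `write_lazy` are hypothetical names introduced here.

```python
class Obj:
    """Heap object with a Brooks-style forwarding pointer in its header."""

    def __init__(self, **fields):
        self.forward = self          # self-pointing until the object is copied
        self.fields = dict(fields)   # hypothetical field storage


def copy_object(obj):
    """Create the to-space copy and redirect the from-space header to it."""
    new = Obj(**obj.fields)
    obj.forward = new
    return new


def read_lazy(obj, name):
    """Lazy Brooks read: follow the forwarding pointer at the point of use."""
    return obj.forward.fields[name]


def write_lazy(obj, name, value):
    """Lazy Brooks write: store into the "correct" version of the object."""
    obj.forward.fields[name] = value
```

With this sketch, a mutator that still holds the from-space reference reads and writes the to-space copy after `copy_object` runs, while an object that was never copied simply reaches itself through `forward`.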

Yuasa Write Barrier
Original:
    a.f = b
With the barrier:
    if barrier active
        mark a.f
    a.f = b
We use this barrier in concurrent and incremental mark-sweep collectors.
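As a rough Python sketch of the barrier above: before a reference field is overwritten, the value currently stored there is marked, but only while the barrier is active. The `Node` and `Collector` classes, the `mark_stack` list, and the `write_f` helper are assumptions made for this example.

```python
class Node:
    """Object with a single reference field f and a mark bit."""

    def __init__(self, f=None):
        self.f = f
        self.marked = False


class Collector:
    """Hypothetical collector state: a phase flag and a mark stack."""

    def __init__(self):
        self.barrier_active = False
        self.mark_stack = []

    def mark(self, obj):
        # Mark each object at most once and queue it for later scanning.
        if obj is not None and not obj.marked:
            obj.marked = True
            self.mark_stack.append(obj)


def write_f(gc, a, b):
    """Perform a.f = b with a Yuasa-style barrier."""
    if gc.barrier_active:
        gc.mark(a.f)   # mark the value that is about to be overwritten
    a.f = b
```

Outside the marking phase, `write_f` costs one flag check plus the store, which is the kind of cheap phase check that path specialization tries to hoist out of straight-line code.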

• Barriers for concurrent and incremental collectors tend to be active only during some phase of collector execution.
• Even if the collector is always running, the barriers are only active a fraction of the time.
• Concurrent mark-sweep: only active during the marking phase.
• Metronome: Brooks only active during the (rare) copying phase.
• Stopless: only active during the (rare and short) copying phase.

What we want:
• Make code run faster when the barriers are not needed.
• Make code run not much slower when the barriers are needed.
• Result: get better throughput.

Path Specialization

Simple Example
[Figure: Simple Example. Panels: Original (with barriers), Fast, Slow.]

How It Really Works
• We wish to provide the best throughput while still being sound.
• Thus, code must be able to switch from one version of the barrier to another when there is a phase change in the collector.
• This is the crucial difference from previous work on specialization.

GC points
• Typically, concurrent and incremental collectors require that each mutator acknowledge changes in phase at GC points.
• A GC point may be:
    • a memory allocation
    • a back branch (to ensure that GC points are reached in a timely fashion)
    • by proxy, any method call
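One way to picture this acknowledgement is a simple handshake, sketched below in Python. The collector publishes the phase it wants, and each mutator adopts it only when it reaches a GC point, so a mutator's view of the phase cannot change between two GC points. The `Collector` and `Mutator` classes and their method names are hypothetical; real collectors implement the handshake in their own ways.

```python
class Collector:
    """Hypothetical collector that publishes the phase it wants mutators in."""

    def __init__(self):
        self.phase = "idle"
        self.mutators = []

    def request_phase(self, phase):
        self.phase = phase

    def all_acknowledged(self):
        # The phase change takes full effect once every mutator has seen it.
        return all(m.phase == self.phase for m in self.mutators)


class Mutator:
    """Hypothetical mutator thread that polls for phase changes at GC points."""

    def __init__(self, gc):
        self.gc = gc
        self.phase = gc.phase        # the phase this mutator last acknowledged
        gc.mutators.append(self)

    def gc_point(self):
        # Reached at allocations, back branches, and (by proxy) method calls.
        self.phase = self.gc.phase
```

Because `Mutator.phase` is updated only inside `gc_point`, code that runs between two GC points can rely on a single phase check made after the first one.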

How It Really Works
• Three versions of code:
    • Unspecialized: code where we don’t care about GC phase
    • Fast: code where we know that we don’t need barriers
    • Slow: code where we need barriers

The approach:
• The “Unspecialized” code is the original code; at every barrier it checks the phase and switches to either Fast or Slow.
• Fast and Slow switch back to Unspecialized at GC points (e.g. a method call).

A better example (Lazy Brooks)
    int foo(object o) {
        int x = 2+2;
        o.f = x;         // needs barrier
        o.g = null;      // needs barrier
        o.bar();         // GC point
        return o.f;      // needs barrier
    }

Lazy Brooks: Without Specialization
    int foo(object o) {
        int x = 2+2;
        o.forward.f = x;       // needs barrier
        o.forward.g = null;    // needs barrier
        o.bar();               // GC point
        return o.forward.f;    // needs barrier
    }

What happens with path specialization?

    int foo(object o) {
        int x = 2+2;
        o.f = x;
        o.g = null;
        o.bar();
        return o.f;
    }

Unspecialized           Fast                    Slow
int foo(object o) {     int foo(object o) {     int foo(object o) {
    int x = 2+2;            int x = 2+2;            int x = 2+2;
    o.f = x;                o.f = x;                o.forward.f = x;
    o.g = null;             o.g = null;             o.forward.g = null;
    o.bar();                o.bar();                o.bar();
    return o.f;             return o.f;             return o.forward.f;
}                       }                       }

Unspecialized           Fast                    Slow
int foo(object o) {     int foo(object o) {     int foo(object o) {
    int x = 2+2;
    o.f = x;                o.f = x;                o.forward.f = x;
    o.g = null;             o.g = null;             o.forward.g = null;
    o.bar();
    return o.f;             return o.f;             return o.forward.f;
}                       }

Lazy Brooks: With Specialization
    int foo(object o) {
        int x = 2+2;                // Unspecialized
        if need barrier
            o.forward.f = x;        // Slow
            o.forward.g = null;
        else
            o.f = x;                // Fast
            o.g = null;
        o.bar();                    // Unspecialized
        if need barrier
            return o.forward.f;     // Slow
        else
            return o.f;             // Fast
    }
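The specialized `foo` above can be mimicked in Python to show two properties: the three barriers cost only two phase checks, and a phase change at the GC point is still handled correctly. This is a sketch under assumptions, not the Bartok implementation: `Runtime`, `check_phase`, and the `bar` callback parameter (standing in for `o.bar()`) are hypothetical.

```python
class Obj:
    """Object with a Brooks-style forwarding pointer (self-pointing until copied)."""

    def __init__(self):
        self.forward = self
        self.f = 0
        self.g = None


class Runtime:
    """Hypothetical runtime state: a phase flag plus a counter of phase checks."""

    def __init__(self):
        self.need_barrier = False
        self.checks = 0

    def check_phase(self):
        self.checks += 1
        return self.need_barrier


def foo(rt, o, bar):
    x = 2 + 2
    # Unspecialized: one phase check covers both stores up to the GC point.
    if rt.check_phase():
        o.forward.f = x          # Slow
        o.forward.g = None
    else:
        o.f = x                  # Fast
        o.g = None
    bar(o)                       # GC point: the phase may change in here
    # Unspecialized again: re-check, because the phase may have changed.
    if rt.check_phase():
        return o.forward.f       # Slow
    return o.f                   # Fast
```

If `bar` copies `o` and sets `rt.need_barrier`, the second check sends the final load down the Slow path and it reads the to-space copy. If the phase never changes, both checks take the Fast path and no forwarding pointer is ever dereferenced. Checking the phase at each barrier separately would cost three checks instead of two.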

Summary
• Our algorithm aims to introduce the smallest number of “needs barrier” phase checks along any path...
• ... while ensuring that code is not duplicated unnecessarily (example: any path from a GC point to a check is not duplicated).
• See the paper for the complete algorithm.

Implementation

• We have implemented Path Specialization in the Microsoft Bartok Research Compiler.
• Path specialization exists as an optional pass that can be applied to any barrier that has a phase check.
• We have tested this with our Yuasa barrier, our lazy and eager Brooks barriers, and our Stopless barriers.

Results

• We test four internal MSR benchmarks (large PL-type programs) and three smaller traditional benchmarks ported to .NET.
• Five barriers are used: CMS (Yuasa-type barrier), Brooks (lazy), Brooks (sunk eager), Stopless, and Stopless without any copying activity.

Without Specialization

Conclusion
• For heavy barriers (Stopless), path specialization reduces code size and improves performance.
• For barriers that are cheap but already have phase checks (like CMS), path specialization increases performance a bit without affecting code size.
• For Brooks barriers, performance improves, but at the cost of a large code blow-up.
• Performance improves for every barrier we tried.

Questions/Comments
