
DeAliaser: Alias Speculation Using Atomic Region Support
Wonsun Ahn*, Yuelu Duan, Josep Torrellas
University of Illinois at Urbana-Champaign
http://iacoma.cs.illinois.edu

Memory Aliasing Prevents Good Code Generation
• Many popular compiler optimizations require code motion
– Loop Invariant Code Motion (LICM): Body → Preheader
– Redundancy elimination: Redundant expr. → First expr.
[Code figure: r1 = a + b … r2 = a + b … c = r2, shown before and after optimization; the redundant r2 = a + b is replaced by a reuse of r1]
• Memory aliasing prevents code motion
[Code figure: r1 = a + b; *p = …; r2 = a + b; c = r2, where the intervening store *p = … blocks the optimization]
• Problem: compiler alias analysis is notoriously difficult
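The effect of the intervening store can be illustrated with a minimal sketch. Everything here is a hypothetical model rather than anything from the slides: memory is a Python dict keyed by location name, `p` names the location the store writes, and `original` / `optimized` are made-up function names.

```python
def original(mem, p, value):
    """Unoptimized code: recomputes a + b after the store through p."""
    r1 = mem["a"] + mem["b"]
    mem[p] = value                # *p = ...
    r2 = mem["a"] + mem["b"]      # recomputed, so it sees the store if p aliases a or b
    mem["c"] = r2
    return mem["c"]


def optimized(mem, p, value):
    """Redundancy elimination applied as if p aliased neither a nor b."""
    r1 = mem["a"] + mem["b"]
    mem[p] = value                # *p = ...
    r2 = r1                       # reuse of the first expression
    mem["c"] = r2
    return mem["c"]


start = {"a": 1, "b": 2, "c": 0, "x": 0}

# p points at an unrelated location: both versions agree.
print(original(dict(start), "x", 10), optimized(dict(start), "x", 10))

# p aliases a: the optimized version returns a stale sum.
print(original(dict(start), "a", 10), optimized(dict(start), "a", 10))
```

When `p` aliases `a`, the two versions disagree, which is why a compiler that cannot rule out the alias has to keep the recomputation.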

Alias Speculation
• Compile time: optimize assuming certain alias relationships
• Run time: check those assumptions
– Recover if assumptions are incorrect
• Enables further optimizations beyond what's provable statically

Contribution: Repurpose Transactions for Alias Speculation
• Atomic Regions (a.k.a. transactions) are here:
– Intel TSX, AMD ASF, IBM Blue Gene/Q, IBM Power
• HW for Atomic Regions performs:
– Memory alias detection across threads
– Buffering of speculative state
• DeAliaser: Repurpose it to detect aliasing within a thread as we move accesses
• How?
– Cover the code motion span in an Atomic Region
– Speculate that may-aliases in the span are no-aliases
– Check speculated aliases using transactional HW
– Recover from failure by rolling back transaction

Repurposing Transactional Hardware
[Figure: cache line with SR and SW bits alongside Tag and Data; ISA Extensions]
• Repurpose SR (Speculatively Read) bits to mark load locations that need monitoring due to code motion
– Do not mark SR bits for regular loads inside the atomic region
– Atomic region cannot be used for conventional TM
• SW (Speculatively Written) bits are still set by all the stores
– Record all the transaction's speculative data for rollback
• Add ISA extensions to manipulate and check SR and SW bits

Instructions to Mark Atomic Regions
• begin_atomic_opt PC / end_atomic_opt
– Starts / ends optimization atomic region
– PC is the address of the Safe-Version of atomic region
– Safe-Version: atomic region code without speculative optimizations
– Execution jumps to Safe-Version after rollback
→ Same as regular atomic regions in TM systems except that SR bit marking by regular loads is turned off
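The control flow around begin_atomic_opt can be sketched in software. This is a hedged model, not the hardware mechanism: `Rollback`, `run_atomic_opt`, and the callback arguments are hypothetical names, and a dict snapshot stands in for the buffering of speculative state.

```python
class Rollback(Exception):
    """Hypothetical signal that a speculated alias assumption failed."""


def run_atomic_opt(mem, speculative, safe):
    """Model of begin_atomic_opt PC ... end_atomic_opt.

    `speculative` plays the optimized region body; `safe` plays the
    Safe-Version located at PC. Both take the memory dict.
    """
    snapshot = dict(mem)          # stands in for buffered speculative state
    try:
        speculative(mem)
        return "committed"        # end_atomic_opt reached
    except Rollback:
        mem.clear()
        mem.update(snapshot)      # discard speculative writes
        safe(mem)                 # execution jumps to the Safe-Version
        return "rolled back"


def spec_ok(mem):
    mem["c"] = 1


def spec_fail(mem):
    mem["c"] = 99                 # speculative write, later discarded
    raise Rollback()


def safe_body(mem):
    mem["c"] = 2


m1, m2 = {"c": 0}, {"c": 0}
print(run_atomic_opt(m1, spec_ok, safe_body), m1)
print(run_atomic_opt(m2, spec_fail, safe_body), m2)
```

In this model the speculative write of 99 never becomes visible after a rollback; only the Safe-Version's result does.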

Extensions to the ISA (for Recording Monitored Locations)
• load.r r1, addr
– Loads location addr to r1 just like a regular load
– Marks SR bit in cache line containing addr
– Used for marking monitored loads
• clear.r addr
– Clears SR bit in cache line containing addr
– Used to mark end of load monitoring
→ Repurposing of SR bits allows selective monitoring of the loaded location between load.r and clear.r
→ Recall: all stored locations monitored until end of atomic region

Extensions to the ISA (for Checking Monitored Locations)
• storechk.(r/w/rw) r1, addr
– Stores r1 to location addr just like a regular store
– r : If SR bit is set → rollback
– w : If SW bit is set → rollback
– rw : If either SR or SW set → rollback
• loadchk.(r/w/rw) r1, addr
– Loads location addr to r1 just like a regular load
– r : If SR bit is set → rollback
– w : If SW bit is set → rollback
– rw : If either SR or SW set → rollback
– r, rw: set SR bit after checking
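The bit semantics listed above can be collected into a small software model. This is only a sketch under assumptions: the class and method names (`SpecCache`, `load_r`, `storechk`, ...) are hypothetical, the 64-byte line size is assumed, and rollback is modeled as a Python exception rather than a hardware abort.

```python
LINE_SIZE = 64  # assumed cache line size in bytes


class Rollback(Exception):
    """Hypothetical stand-in for a hardware rollback."""


class SpecCache:
    """Tracks SR/SW bits per cache line inside one atomic region."""

    def __init__(self):
        self.mem = {}
        self.sr = set()   # lines whose SR bit is set
        self.sw = set()   # lines whose SW bit is set

    def _line(self, addr):
        return addr // LINE_SIZE

    def _check(self, addr, mode):
        line = self._line(addr)
        if "r" in mode and line in self.sr:
            raise Rollback("SR bit set")
        if "w" in mode and line in self.sw:
            raise Rollback("SW bit set")

    def load(self, addr):                 # regular load: no SR marking
        return self.mem.get(addr, 0)

    def store(self, addr, value):         # regular store: always sets SW
        self.sw.add(self._line(addr))
        self.mem[addr] = value

    def load_r(self, addr):               # load.r: load and set SR
        self.sr.add(self._line(addr))
        return self.load(addr)

    def clear_r(self, addr):              # clear.r: clear SR
        self.sr.discard(self._line(addr))

    def storechk(self, mode, addr, value):  # storechk.(r/w/rw)
        self._check(addr, mode)
        self.store(addr, value)

    def loadchk(self, mode, addr):        # loadchk.(r/w/rw)
        self._check(addr, mode)
        if "r" in mode:                   # r, rw: set SR after checking
            self.sr.add(self._line(addr))
        return self.load(addr)


cache = SpecCache()
cache.load_r(0x100)                 # monitor the line containing 0x100
cache.storechk("r", 0x200, 5)       # different line: the store proceeds
try:
    cache.storechk("r", 0x108, 7)   # same line as 0x100: check fails
except Rollback as err:
    print("rollback:", err)
```

Note that the check in this model works at cache-line granularity, so 0x108 conflicts with 0x100 even though the addresses differ.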

How are these Instructions Used?
• Four code motions are supported
– Hoisting / sinking loads
– Hoisting / sinking stores
• Some color coding before going into details
– Green: moved instructions
– Red: instructions "alias-checked" against moved instructions
– Orange: instructions "alias-checked" against moved instructions unnecessarily (checks due to imprecision)

Code Motion 1: Hoisting Loads
Original code:
    begin_atomic_opt
    store X
    load A
    end_atomic_opt
After hoisting load A above store X (load.r B is an earlier monitored load):
    begin_atomic_opt
    load.r B
    loadchk.r A
    storechk.r X
    clear.r A
    end_atomic_opt
1. Change load A to load.r A to set up monitoring of A
2. Change store X to storechk.r X to check monitor
3. Insert clear.r A to turn off monitoring at end of motion span
4. If overlapping monitor, loadchk.r A is used instead of load.r A
– Checks whether load.r B set up monitor in same cache line
– Prevents clear.r A from clearing monitor set up by load.r B
• Alias check is precise
– Selectively check against only stores in code motion span
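Steps 1 to 3 can be traced with a short sketch. It is a simplified model under assumptions, not the authors' implementation: addresses are integers, the cache line size is assumed to be 64 bytes, SR bits are a Python set, and the function names are hypothetical. The overlapping-monitor case of step 4 is left out.

```python
LINE_SIZE = 64  # assumed cache line size in bytes


class Rollback(Exception):
    """Hypothetical stand-in for a hardware rollback."""


def line(addr):
    return addr // LINE_SIZE


def safe_version(mem, A, X, value):
    """Original order: store X, then load A."""
    mem[X] = value
    return mem.get(A, 0)


def optimized_version(mem, A, X, value):
    """load A hoisted above store X, guarded by SR monitoring."""
    sr = set()
    r = mem.get(A, 0)
    sr.add(line(A))               # load.r A: load and start monitoring
    if line(X) in sr:             # storechk.r X: check the monitor
        raise Rollback()
    mem[X] = value
    sr.discard(line(A))           # clear.r A: end of motion span
    return r


def run(mem, A, X, value):
    snapshot = dict(mem)
    try:
        return optimized_version(mem, A, X, value), "speculative"
    except Rollback:
        mem.clear()
        mem.update(snapshot)      # discard speculative state
        return safe_version(mem, A, X, value), "safe"


print(run({0: 1, 128: 2}, A=0, X=128, value=9))   # no alias
print(run({0: 1}, A=0, X=0, value=9))             # true alias
print(run({0: 1, 8: 2}, A=0, X=8, value=9))       # same line, different address
```

The third call falls back to the safe version even though A and X differ, because the monitor in this model is per cache line; the result is still correct, only the speculation is wasted.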

Code Motion 2: Sinking Stores
Original code:
    begin_atomic_opt
    load.r W
    store X
    store A
    load Y
    store Z
    end_atomic_opt
After sinking store A below store Z:
    begin_atomic_opt
    load.r W
    store X
    load Y
    store Z
    storechk.rw A
    end_atomic_opt
1. Change store A to storechk.rw A to check preceding reads and writes
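The check performed by storechk.rw A in step 1 can be sketched as follows. This is a hedged model: addresses are integers, the 64-byte line size is assumed, and the function name is hypothetical. The sketch also marks load Y in the SR set, which is an assumption; the source shows only step 1, and without some marking of Y the rw check could not observe that load.

```python
LINE_SIZE = 64  # assumed cache line size in bytes


class Rollback(Exception):
    """Hypothetical stand-in for a hardware rollback."""


def line(addr):
    return addr // LINE_SIZE


def sunk_store_region(mem, W, X, Y, Z, A, value):
    """Region body with store A sunk below load Y and store Z."""
    sr, sw = set(), set()

    mem.get(W, 0)
    sr.add(line(W))               # load.r W (monitor from an earlier motion)
    mem[X] = 1
    sw.add(line(X))               # store X: every store sets SW
    y = mem.get(Y, 0)
    sr.add(line(Y))               # load Y, assumed to be marked for monitoring
    mem[Z] = 2
    sw.add(line(Z))               # store Z

    if line(A) in sr or line(A) in sw:    # storechk.rw A
        raise Rollback()
    mem[A] = value
    return y


mem = {}
print(sunk_store_region(mem, W=0, X=64, Y=128, Z=192, A=256, value=7), mem[256])

for label, a in [("A aliases Z", 192), ("A aliases W", 0)]:
    try:
        sunk_store_region({}, W=0, X=64, Y=128, Z=192, A=a, value=7)
    except Rollback:
        print(label, "-> rollback")
```

The rollback for "A aliases W" is a check that the motion itself does not need, since W was read before the store's original position; it arises because the SR bit for W is still set when the sunk store is checked.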
