An Analysis of Call-site Patching Without Strong Hardware Support for Self-Modifying-Code

Tim Hartley, Foivos Zakkak, Christos Kotselidis, Mikel Lujan
first.last@manchester.ac.uk
MPLR’19, 2019-10-22

Call-Sites

§ Direct branching
    Method A:  call/jmp <offset>
§ Indirect branching
    Method A:  ld target, 0xabcd    ; target is read from memory
               call/jmp target

[Figure: in both variants the call-site in Method A can reach Method B or Method C.]

2019-10-22 MPLR’19 @foivoszakkak

Call-Site Patching

§ Tiered compilation
§ De-optimization
§ Etc.

JIT compilation and Caches

Code-stream vs Data-stream

1. Code gets fetched to I-Cache
2. Data get fetched to D-Cache
3. CPU executes code from I-Cache
4. CPU writes data to D-Cache
5. D-Cache writes back to memory
6. D-Cache fetches code to be edited
7. CPU writes code to D-Cache
8. D-Cache writes back code

[Figure: Main Memory, I-CACHE, D-CACHE and CPU, with arrows numbered 1 to 8 matching the steps above.]
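The eight steps can be mimicked with a toy model. The sketch below is only an illustration under strong assumptions: the class `ToyMachine` and its method names are hypothetical, caches are modelled as plain dictionaries, and real hardware behaves in far more subtle ways. It shows only why a patch written through the data stream may stay invisible to the code stream until the two are synchronised.

```python
class ToyMachine:
    """Toy model: one memory plus an I-cache and a D-cache that are not coherent."""

    def __init__(self, memory):
        self.memory = dict(memory)   # address -> instruction text
        self.icache = {}             # lines fetched for execution
        self.dcache = {}             # lines fetched or written as data

    def fetch_instruction(self, addr):
        # Steps 1 and 3: code reaches the CPU through the I-cache.
        if addr not in self.icache:
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def write_code(self, addr, new_insn):
        # Steps 6 and 7: the patch is an ordinary store, so it lands in the D-cache.
        self.dcache[addr] = new_insn

    def clean_dcache(self, addr):
        # Step 8: write the dirty line back to main memory.
        if addr in self.dcache:
            self.memory[addr] = self.dcache[addr]

    def invalidate_icache(self, addr):
        # Drop the stale line so the next fetch re-reads main memory.
        self.icache.pop(addr, None)


machine = ToyMachine({0x1000: "BL old_target"})
before = machine.fetch_instruction(0x1000)
machine.write_code(0x1000, "BL new_target")
stale = machine.fetch_instruction(0x1000)    # I-cache still holds the old line
machine.clean_dcache(0x1000)
machine.invalidate_icache(0x1000)
fresh = machine.fetch_instruction(0x1000)    # now the patched instruction
```

In this model the patched instruction becomes visible only after both the write-back and the I-cache invalidation have happened.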

Low-power architectures and call-site patching

§ Fixed-size instructions
  – Limit the range of direct branches/calls
    • ±128MiB on AArch64
    • ±1MiB on RISC-V
  – Require multiple instructions to perform long-range calls

[Figure labels: AArch64 128MiB, x86-64 240MiB]
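The two ranges follow from the width of the branch immediate. The sketch below assumes the AArch64 `B`/`BL` encoding (26-bit signed immediate counted in 4-byte words) and the RISC-V `JAL` encoding (20-bit signed immediate counted in 2-byte units); the helper names `branch_range` and `reachable` are made up for illustration.

```python
MIB = 1024 * 1024


def branch_range(imm_bits, scale):
    """Return (min, max) byte offsets for a signed immediate of
    `imm_bits` bits that is multiplied by `scale` bytes."""
    lo = -(1 << (imm_bits - 1)) * scale
    hi = ((1 << (imm_bits - 1)) - 1) * scale
    return lo, hi


AARCH64_B = branch_range(26, 4)   # B/BL: imm26, scaled by 4
RISCV_JAL = branch_range(20, 2)   # JAL: 20-bit immediate, scaled by 2


def reachable(call_site, target, rng):
    """True if a single direct branch at call_site can reach target."""
    lo, hi = rng
    return lo <= target - call_site <= hi
```

A callee 64MiB away from the call-site is therefore reachable with one direct branch on AArch64 but not on RISC-V, which is where the multi-instruction long-range sequences come in.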

Low-power architectures and call-site patching (cont.)

§ Weak memory models and self-modifying-code (SMC) support
  – SW explicitly issues memory barriers
  – Code-stream handled separately from data-stream (need to sync them)
§ Not all instructions are safe to patch
  – ARM (armv7 and armv8) and IBM (Power) limit the instructions that are safe to be patched while executing
    • Even if using atomic writes

Patchable call-site implementations in AArch64

Direct Branching (short-range only):
            B    TARGET

Relative-Load Indirect Branching:
    CALLEE_1: .quad 0x0123456789ABCDEF
              ...
    CALLEE_N: .quad 0x01234ABCDEF56789
    START:    ...
              LDR  X16, CALLEE_1
              BLR  X16

Absolute-Load Indirect Branching:
            MOVZ X16, #0xABCD             ; Craft the address
            MOVK X16, #0xEF89, lsl #16    ; holding
            MOVK X16, #0x7654, lsl #32    ; the
            MOVK X16, #0x0213, lsl #48    ; target
            LDR  X16, [X16]
            BLR  X16

Trampolines (OpenJDK approach):
    L:      LDR  X16, CALLEE
            BR   X16                      ; Don't link
    CALLEE: .quad 0x0123456789ABCDEF
    START:  ...
            BL   SHORT_TARGET             ; or L
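To make the Absolute-Load Indirect Branching sequence concrete, the sketch below splits a 64-bit address into the four 16-bit immediates used by `MOVZ`/`MOVK` and packs them into instruction words. It assumes the standard 64-bit A64 move-wide encodings; the function names are hypothetical and the address is simply the one implied by the immediates on the slide.

```python
MOVZ_X = 0xD2800000   # MOVZ Xd, #imm16, LSL #(16 * hw), 64-bit variant
MOVK_X = 0xF2800000   # MOVK Xd, #imm16, LSL #(16 * hw), 64-bit variant


def encode_mov_wide(base, rd, imm16, hw):
    """Pack one move-wide instruction: hw in bits 22:21, imm16 in 20:5, Rd in 4:0."""
    return base | (hw << 21) | (imm16 << 5) | rd


def absolute_load_sequence(address, rd=16):
    """Return the 16-bit chunks of `address` and the MOVZ + 3x MOVK words."""
    chunks = [(address >> (16 * hw)) & 0xFFFF for hw in range(4)]
    words = [encode_mov_wide(MOVZ_X, rd, chunks[0], 0)]
    words += [encode_mov_wide(MOVK_X, rd, chunks[hw], hw) for hw in range(1, 4)]
    return chunks, words


def decode_address(words):
    """Rebuild the 64-bit value that the four words leave in the register."""
    address = 0
    for word in words:
        hw = (word >> 21) & 0x3
        address |= ((word >> 5) & 0xFFFF) << (16 * hw)
    return address


chunks, words = absolute_load_sequence(0x02137654EF89ABCD)
```

Crafting the address takes four instruction words, so pointing this kind of call-site at a different address may mean rewriting several instructions, whereas the load-based variants change a single `.quad` data slot.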

Comparison of call-site implementation approaches

Evaluation Setup

§ Odroid-C2
  – Quad-core Cortex-A53 @ 1.54GHz (pinned)
    • 8-stage, 2-way superscalar, in-order pipeline
  – 2 GB DDR3 RAM
  – Ubuntu 18.04.02 LTS
  – Kernel: Odroid 3.16.68-41
  – GCC 8.3.0
  – MaxineVM 2.8.0
  – OpenJDK 8u212

Microbenchmark

§ Generates inline call-sites
§ Callees are ret-only methods
§ To patch, we call a patcher method instead of a ret-only method
§ Patcher always patches the next call-site (this lets us control the number of patches)
§ Patcher performs the necessary barriers, as it would in a real system

Microbenchmark results

DaCapo and MaxineVM

§ We take the two best-performing approaches (Direct and Relative-Load Indirect) and evaluate them with DaCapo using MaxineVM
§ We had to tweak Relative-Load Indirect to make it work with MaxineVM
  – Due to its metacircular nature, MaxineVM can only operate with offsets (relative branches), since the absolute targets are not yet known at boot image creation

Indirect-Maxine:
            ADR   X17, CALL         ; Get address of BLR
            LDRSW X16, OFFSET       ; Load 32-bit signed offset
            ADD   X16, X16, X17     ; Add them
            B     #8                ; Jump over inline offset
    OFFSET: .int  CALLEE_1 - CALL
    CALL:   BLR   X16
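The address arithmetic of Indirect-Maxine can be sketched as follows. This is a model, not MaxineVM code: it assumes the inline slot holds the signed 32-bit distance from the `BLR` to the callee, so that adding it to the `BLR` address yields the callee. All function names and addresses are hypothetical.

```python
def to_int32(value):
    """Wrap to a signed 32-bit value, as a 4-byte .int slot would store it."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def inline_offset(call_addr, callee_addr):
    """Value to store in the inline slot for a given callee."""
    offset = callee_addr - call_addr
    if to_int32(offset) != offset:
        raise ValueError("callee is out of range for a 32-bit offset")
    return offset


def resolve_target(call_addr, stored_offset):
    """What ends up in X16: address of the BLR plus the loaded offset."""
    return call_addr + stored_offset


CALL = 0x40001014                          # hypothetical address of the BLR
slot = inline_offset(CALL, 0x40000000)     # callee located before the call-site
first_target = resolve_target(CALL, slot)
slot = inline_offset(CALL, 0x48002000)     # patching rewrites only the data slot
second_target = resolve_target(CALL, slot)
```

Because only the data slot changes, the instructions of the call-site stay untouched when it is patched, and no absolute address has to be known when the boot image is created.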

Indirect-Maxine in Microbenchmark results

DaCapo Results

Conclusions

§ OpenJDK’s method seems the best for AArch64, since it penalizes only long-range branches and avoids explicit instruction cache invalidations on callers.
  – If you have a higher #long-range calls / #short-range calls ratio, then Relative-Load may be better
§ The most promising approach in theory would be combining the following gadgets

    Direct (short-range only):
            B    TARGET

    Indirect (long-range):
            ADRP X16, CALLEE
            ADD  X16, X16, :lo12:CALLEE
            BLR  X16

  – On AArch64 this is not possible though, since ADRP and ADD cannot be safely overwritten if they are being executed concurrently with the modifications.
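The `ADRP`/`ADD` gadget splits the target into a 4KiB page delta and a 12-bit low part. The sketch below shows that split under the usual A64 semantics (`ADRP` adds a signed 21-bit page count to the page of the PC); the function names and the example addresses are hypothetical.

```python
PAGE = 4096


def adrp_add_operands(pc, target):
    """Return (page_delta, lo12): the ADRP immediate in pages and the ADD immediate."""
    page_delta = (target >> 12) - (pc >> 12)
    if not -(1 << 20) <= page_delta < (1 << 20):
        raise ValueError("target is outside the ADRP range of roughly 4GiB")
    return page_delta, target & 0xFFF


def adrp_add_result(pc, page_delta, lo12):
    """Value of X16 after ADRP X16, CALLEE and ADD X16, X16, :lo12:CALLEE."""
    page_base = (pc & ~0xFFF) + page_delta * PAGE   # result of ADRP
    return page_base + lo12                         # result of ADD


pc, target = 0x00400123, 0x7FFFF456
page_delta, lo12 = adrp_add_operands(pc, target)
rebuilt = adrp_add_result(pc, page_delta, lo12)
```

Retargeting such a call-site can change both immediates, which means two instruction words would have to be rewritten; that is the step the last bullet rules out on AArch64.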
