Presenting: Michal Paszkowski
Research: Michal Paszkowski, Radoslaw Drabinski
Special thanks to: Julia Koval

It is all about the semantic differences!
Desired
• Semantics
• Instruction order
• Value names
Distracting
• Basic Block names
• Redundant code

How could we automate that?
We needed a simple tool that would make comparing semantic differences between two modules easier. The tool should…
• “Reduce” non-semantic differences
• Process modules independently
• Leverage existing diff tools

How could we automate that?
Therefore, the tool should transform a module into a canonical form. How will that “canonical form” help us?
• Two semantically identical canonicalized modules should show no differences when diffed.
And more importantly…
• When the modules are not identical the semantic differences should stand out.

How do we arrive at this canonical form?
Let’s start with instruction ordering…
Assumptions for comparing an identical module after two different transformations:
• Side-Effects should be roughly the same
• Control-Flow Graphs should be similar

Instruction reordering
• def-use distance reduction
1. Collect instructions with side-effects and “ret” instructions
2. Walk the instructions with side-effects (top-down) and, on each instruction, its operands (left-right)
3. For each operand, bring its definition as close as possible to the using instruction
%1 = ...
%2 = ...
%T1 = call float @inputV.f32(i32 15, i32 2)
%A = fsub float %T1, %T1
%B = fmul float %A, %T1
%C = fsub float %T1, %B
%X = fadd float %C, %A
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void
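The three steps above can be sketched on a toy model of the example block. This is a minimal illustration in Python, not the tool's implementation: the `Instr` class, the hand-set `side_effect` flag, and the function name `reduce_def_use_distance` are all hypothetical, and the call to @inputV.f32 is treated as free of side effects only because the slides move it.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare by identity so list.index/remove find the exact object
class Instr:
    text: str                                   # printed form of the instruction
    name: str = ""                              # value it defines ("" for store/ret)
    operands: list = field(default_factory=list)
    side_effect: bool = False                   # assumption: set by hand in this toy model


def reduce_def_use_distance(block):
    """Move each definition directly before its first user, never moving anchors."""
    order = list(block)
    defs = {i.name: i for i in order if i.name}
    moved = set()

    def pull_operands(user):
        for op in user.operands:                # step 2: operands left to right
            d = defs.get(op)
            if d is None or d.side_effect or d.name in moved:
                continue                        # argument/constant, anchor, or already moved
            moved.add(d.name)
            order.remove(d)
            order.insert(order.index(user), d)  # step 3: directly before the user
            pull_operands(d)                    # then handle the moved instruction's operands

    # step 1: side-effecting instructions and ret, visited top-down
    for anchor in [i for i in order if i.side_effect]:
        pull_operands(anchor)
    return order


block = [
    Instr("%T1 = call float @inputV.f32(i32 15, i32 2)", "%T1"),
    Instr("%A = fsub float %T1, %T1", "%A", ["%T1", "%T1"]),
    Instr("%B = fmul float %A, %T1", "%B", ["%A", "%T1"]),
    Instr("%C = fsub float %T1, %B", "%C", ["%T1", "%B"]),
    Instr("%X = fadd float %C, %A", "%X", ["%C", "%A"]),
    Instr("store float %C, float addrspace(479623)* %1",
          operands=["%C", "%1"], side_effect=True),
    Instr("store float %X, float addrspace(283111)* %2",
          operands=["%X", "%2"], side_effect=True),
    Instr("ret void", side_effect=True),
]

reordered = reduce_def_use_distance(block)
for instr in reordered:
    print(instr.text)
```

On this block the sketch leaves %T1, %A, %B and %C in front of the first store and places %X directly before the second store. The stores and the ret keep their relative order because they are never moved.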

Step-by-step reorder walkthrough
%T1 = call float @inputV.f32(i32 15, i32 2)
%A = fsub float %T1, %T1
%B = fmul float %A, %T1
%C = fsub float %T1, %B
%X = fadd float %C, %A
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void

Step-by-step reorder walkthrough
Select the 1st side-effecting instruction. Don’t move it, otherwise semantics may not be preserved!
%T1 = call float @inputV.f32(i32 15, i32 2)
%A = fsub float %T1, %T1
%B = fmul float %A, %T1
%C = fsub float %T1, %B
%X = fadd float %C, %A
Canonicalized instructions:
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void

Step-by-step reorder walkthrough
Take the 1st operand of the side-effecting instruction.
%T1 = call float @inputV.f32(i32 15, i32 2)
%A = fsub float %T1, %T1
%B = fmul float %A, %T1
%C = fsub float %T1, %B
%X = fadd float %C, %A
Canonicalized instructions:
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void

Step-by-step reorder walkthrough
Move it closer to the user. Def-Use sequence may be temporarily broken.
%T1 = call float @inputV.f32(i32 15, i32 2)
%A = fsub float %T1, %T1
%B = fmul float %A, %T1
%C = fsub float %T1, %B
%X = fadd float %C, %A
Canonicalized instructions:
%C = fsub float %T1, %B
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void

Step-by-step reorder walkthrough
Select the 1st operand of the moved instruction.
%T1 = call float @inputV.f32(i32 15, i32 2)
%A = fsub float %T1, %T1
%B = fmul float %A, %T1
%X = fadd float %C, %A
Canonicalized instructions:
%C = fsub float %T1, %B
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void

Step-by-step reorder walkthrough
Move it closer to the user. Def-Use sequence may be temporarily broken.
%T1 = call float @inputV.f32(i32 15, i32 2)
%A = fsub float %T1, %T1
%B = fmul float %A, %T1
%X = fadd float %C, %A
Canonicalized instructions:
%T1 = call float @inputV.f32(i32 15, i32 2)
%C = fsub float %T1, %B
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void

Step-by-step reorder walkthrough
Select the 2nd operand of the previously moved instruction.
%A = fsub float %T1, %T1
%B = fmul float %A, %T1
%X = fadd float %C, %A
Canonicalized instructions:
%T1 = call float @inputV.f32(i32 15, i32 2)
%C = fsub float %T1, %B
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void

Step-by-step reorder walkthrough
Move it closer to the user. Def-Use sequence is being repaired.
%A = fsub float %T1, %T1
%B = fmul float %A, %T1
%X = fadd float %C, %A
Canonicalized instructions:
%T1 = call float @inputV.f32(i32 15, i32 2)
%B = fmul float %A, %T1
%C = fsub float %T1, %B
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void

Step-by-step reorder walkthrough
Select the 1st operand of the moved instruction.
%A = fsub float %T1, %T1
%X = fadd float %C, %A
Canonicalized instructions:
%T1 = call float @inputV.f32(i32 15, i32 2)
%B = fmul float %A, %T1
%C = fsub float %T1, %B
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void

Step-by-step reorder walkthrough
Move it closer to the user. Def-Use sequence is being repaired.
%A = fsub float %T1, %T1
%X = fadd float %C, %A
Canonicalized instructions:
%T1 = call float @inputV.f32(i32 15, i32 2)
%A = fsub float %T1, %T1
%B = fmul float %A, %T1
%C = fsub float %T1, %B
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void

Step-by-step reorder walkthrough
Select the 1st operand of the moved instruction. Don’t move it! %T1 has already been moved!
Select the 2nd operand of the moved instruction. Don’t move it! %T1 has already been moved!
%X = fadd float %C, %A
Canonicalized instructions:
%T1 = call float @inputV.f32(i32 15, i32 2)
%A = fsub float %T1, %T1
%B = fmul float %A, %T1
%C = fsub float %T1, %B
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void

Step-by-step reorder walkthrough
Select the 2nd operand of the previously moved instruction. Don’t move it; %T1 has been moved before.
%X = fadd float %C, %A
Canonicalized instructions:
%T1 = call float @inputV.f32(i32 15, i32 2)
%A = fsub float %T1, %T1
%B = fmul float %A, %T1
%C = fsub float %T1, %B
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void

Step-by-step reorder walkthrough
We repeat the process for all operands in all side-effecting instructions.
%X = fadd float %C, %A
Canonicalized instructions:
%T1 = call float @inputV.f32(i32 15, i32 2)
%A = fsub float %T1, %T1
%B = fmul float %A, %T1
%C = fsub float %T1, %B
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void

How do we arrive at this canonical form?
Desired
• Semantics
• Instruction order
• Value names
Distracting
• Basic Block names
• Redundant code

Naming instructions: Linear
• Numbers all instructions top-down after reordering (v0, v1, v2, and so on)
• We were hoping that maybe the reordering mechanism could be used as a ‘seed’ for instruction naming
Before naming:
%T1 = @inputV.f32(i32 15, i32 2)
%A = fsub float %T1, %T1
%B = fmul float %A, %T1
%C = fsub float %T1, %T1
%X = fadd float %C, %A
store float %C, float addrspace(479623)* %1
store float %X, float addrspace(283111)* %2
ret void
After naming:
%v0 = @inputV.f32(i32 15, i32 2)
%v1 = fsub float %v0, %v0
%v2 = fmul float %v1, %v0
%v3 = fsub float %v0, %v0
%v4 = fadd float %v3, %v1
store float %v3, float addrspace(479623)* %a3
store float %v4, float addrspace(283111)* %a4
ret void
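Linear naming can be sketched as a plain text rewrite. The Python below is an illustrative assumption, not the tool's code: `name_linearly` is a hypothetical helper that numbers each defined value top-down as %v0, %v1, and so on, and it leaves values without a definition in the listing (such as %1 and %2) untouched, so it does not reproduce the %a3/%a4 renaming shown on the slide. The second listing, `right`, uses made-up value names to show that two blocks differing only in naming produce an empty diff once renamed.

```python
import difflib
import re

VALUE = re.compile(r"%[\w.]+")  # an IR value token such as %T1 or %sum


def name_linearly(lines):
    """Rename each defined value to %v<N>, numbering definitions top-down."""
    mapping = {}
    for line in lines:
        m = re.match(r"\s*(%[\w.]+)\s*=", line)
        if m and m.group(1) not in mapping:
            mapping[m.group(1)] = f"%v{len(mapping)}"
    # Values with no definition in the listing keep their original name.
    return [VALUE.sub(lambda t: mapping.get(t.group(0), t.group(0)), line)
            for line in lines]


left = [
    "%T1 = @inputV.f32(i32 15, i32 2)",
    "%A = fsub float %T1, %T1",
    "%B = fmul float %A, %T1",
    "%C = fsub float %T1, %T1",
    "%X = fadd float %C, %A",
    "store float %C, float addrspace(479623)* %1",
    "store float %X, float addrspace(283111)* %2",
    "ret void",
]

# Same instructions with different (hypothetical) value names.
right = [
    "%in = @inputV.f32(i32 15, i32 2)",
    "%d = fsub float %in, %in",
    "%m = fmul float %d, %in",
    "%s = fsub float %in, %in",
    "%sum = fadd float %s, %d",
    "store float %s, float addrspace(479623)* %1",
    "store float %sum, float addrspace(283111)* %2",
    "ret void",
]

renamed_left = name_linearly(left)
renamed_right = name_linearly(right)
diff = list(difflib.unified_diff(renamed_left, renamed_right, lineterm=""))

print("\n".join(renamed_left))
print("diff lines:", len(diff))
```

Because the numbering depends only on the order of definitions, running the reordering first makes the names stable across inputs that differ only in instruction order.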
