

SLIDE 1

Decompilation is an information-flow problem
(Or, information flow meets program transformation)

Boris Feigin, Computer Laboratory, University of Cambridge

PLID 2008; joint work with Alan Mycroft

1 / 22

SLIDE 2

Motivation

“Given suitable tools we can present the [cryptographic] key as a constant in the computation which is carried out using that key and then we can optimise the code given that constant. This will cause the key to be intimately intertwined with the code which uses it.”

Playing ‘Hide and Seek’ with Stored Keys, Shamir and van Someren (1999)

2 / 22

SLIDE 3

Typical source and target languages

v ∈ Value = Z        r ∈ Register = {r0, r1, . . . , r31}

while-language (source):

    e ::= v | x | op e1, . . . , en
    c ::= x := e | skip | c0; c1 | if e then c0 else c1 | while e do c

RISC assembly (target):

    ι ::= movi rd, v | mov rd, rs | ld rd, [rs] | st [rd], rs
        | op rd, r1, . . . , rn | jz r, l | jnz r, l | nop | ι0; ι1

3 / 22
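As a concrete (hypothetical) rendering of the source grammar, the expression fragment can be encoded and evaluated in a few lines of Python. This is a sketch only; the names `Val`, `Var`, `Op` and the two primitive operators are mine, not the paper's:

```python
from dataclasses import dataclass

# Hypothetical encoding of the while-language expression grammar:
#   e ::= v | x | op e1, ..., en

@dataclass(frozen=True)
class Val:
    v: int          # v ∈ Value = Z

@dataclass(frozen=True)
class Var:
    x: str          # program variable

@dataclass(frozen=True)
class Op:
    name: str       # primitive operator, e.g. 'add' or 'mul'
    args: tuple     # operand expressions e1, ..., en

PRIMS = {'add': lambda a, b: a + b, 'mul': lambda a, b: a * b}

def eval_expr(e, env):
    """Evaluate an expression under an environment mapping names to values."""
    if isinstance(e, Val):
        return e.v
    if isinstance(e, Var):
        return env[e.x]
    return PRIMS[e.name](*(eval_expr(a, env) for a in e.args))
```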


SLIDE 5

Definitions

◮ C(−) is a compiler from source language S to target language T.

◮ The observational equivalence relations of S and T are (respectively) ∼S and ∼T.

◮ Decompilation recovers a source program semantically equivalent to the original: D(−) is a decompiler iff D(C(e)) ∈ [e]∼S. This is the weakest possible definition of decompilation.

◮ In certain cases there is a trivial solution for D(−): emit an interpreter for T, written in S, incorporating the text of the program (in T) to be decompiled.

◮ How well can a decompiler do in principle?

◮ In other words, how much information about the source program can be inferred from the output of the compiler?

4 / 22

SLIDE 6

Example

C(“x := 42”) = C(“y := 42; x := y”) = C(“z := 6; y := 7; x := z × y”) = “movi r0, 42”

C(−) does constant folding, constant propagation, etc.

5 / 22
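A sketch of how a constant-folding, constant-propagating compiler collapses all three sources to the same target instruction (the immediate load, written `movi` in the target grammar). The function `compile_stmts` and its statement encoding are invented for illustration, not taken from the paper:

```python
def compile_stmts(stmts):
    """Toy compiler: constant propagation + constant folding over
    straight-line assignments, then emit code for the final value of 'x'.
    Each statement is (dst, expr) where expr is an int literal, a variable
    name (a copy), or ('*', a, b) for multiplication."""
    env = {}
    for dst, expr in stmts:
        if isinstance(expr, tuple):            # fold: operands are constants by now
            _, a, b = expr
            env[dst] = env.get(a, a) * env.get(b, b)
        elif isinstance(expr, str):            # propagate a copied variable
            env[dst] = env[expr]
        else:                                  # integer literal
            env[dst] = expr
    return f"movi r0, {env['x']}"
```

All three source programs from the slide land in the same output, so a decompiler seeing only `movi r0, 42` cannot tell them apart.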

SLIDE 7

Program equivalence

◮ ≡ (“bit-for-bit” equality of programs):

e ≡ e′ ⇐⇒ strcmp(e, e′) == 0

◮ ∼α (α-equivalence)

6 / 22


SLIDE 9

Program equivalence

◮ Recall: two expressions are contextually equivalent (e ∼ e′) whenever

e ∼ e′ ⇐⇒ ∀Ctx[−]. Ctx[e] ≅ Ctx[e′]

where Ctx[−] ranges over contexts of the language and ≅ is some observation (say, convergence).

◮ Restriction to programs (d ranges over inputs):

e ∼ e′ ⇐⇒ ∀d ∈ D. [[e]](d) = [[e′]](d)

7 / 22
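For programs over a finite input domain, the restricted equivalence is directly checkable by enumeration. A minimal sketch, with programs modelled as plain Python functions:

```python
def equivalent_on(D, e1, e2):
    """e ~ e'  iff  [[e]](d) = [[e']](d) for every input d in the
    (finite) domain D: the restriction-to-programs reading of equivalence."""
    return all(e1(d) == e2(d) for d in D)

# Two syntactically different programs denoting the same function ...
assert equivalent_on(range(100), lambda d: d * 2, lambda d: d + d)
# ... while these are distinguished by some input, hence inequivalent.
assert not equivalent_on(range(100), lambda d: d * 2, lambda d: d * 3)
```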

SLIDE 10

Example: size_t strlen(const char *str)

Two implementations:

    const char *s = str;
    while (*s)
        s++;
    return (s - str);

    size_t len = 0;
    for (; str[len]; len++)
        ;
    return len;

8 / 22



SLIDE 13

Intuition

Define the relation f⁻¹(Q), the kernel of f w.r.t. Q (Clark et al., 2005):

x f⁻¹(Q) x′ ⇐⇒ (f x) Q (f x′)

E.g. “x := 42” C⁻¹(≡) “y := 42; x := y”

Programs compiled by “less normalizing” compilers are more susceptible to decompilation. We tend to have:

∼α ⊂ C1⁻¹(≡) ⊂ C2⁻¹(≡) ⊂ · · · ⊂ Cn⁻¹(≡) ⊂ ∼S

where C1(−) to Cn(−) are progressively more optimizing compilers.

9 / 22
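The kernel construction is a one-liner when relations and functions are ordinary Python values. The "compiler" in the demo below is deliberately extreme, full evaluation, so that its kernel coincides with semantic equivalence on arithmetic programs; it is a sketch of the construction, not any particular C(−) from the paper:

```python
def kernel(f, Q):
    """f^-1(Q): x related to x' iff (f x) Q (f x')  (Clark et al., 2005)."""
    return lambda x, x2: Q(f(x), f(x2))

# An extreme "compiler" that normalizes an arithmetic program all the way
# down to its value; its kernel then relates any two programs whose results
# are equal, i.e. the whole semantic equivalence class.
normalize = lambda src: eval(src, {"__builtins__": {}})
related = kernel(normalize, lambda a, b: a == b)

assert related("6 * 7", "40 + 2")       # same value: same equivalence class
assert not related("6 * 7", "6 + 7")    # 42 vs 13: distinguishable outputs
```

A less normalizing `f` (say, the identity on program text) yields a finer kernel, illustrating the chain of inclusions above.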


SLIDE 15

Compiler correctness

C(−) is fully abstract (Abadi, 1998) iff

e ∼S e′ ⇐⇒ C(e) ∼T C(e′) (1)

Abadi observes that the forward implication “means that the translation does not introduce information leaks”.

10 / 22

SLIDE 16

Non-interference

e ∼S e′ ⇒ C(e) ∼T C(e′) (2)

Zero information flow (from high-security inputs to low-security outputs) for a program M:

σ ∼low σ′ ⇒ [[M]](σ) ≈ [[M]](σ′) (3)

where two states are equivalent up to ∼low when their low-security parts are equal.

11 / 22
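Equation (3) can be checked exhaustively for small state spaces. In this sketch a state is modelled as a (high, low) pair and the observation is simply the program's result; all names are mine:

```python
def non_interfering(M, highs, lows):
    """Check equation (3) by enumeration: whenever two states agree on
    their low-security parts, the observable result must agree too."""
    return all(M(h1, l) == M(h2, l)
               for l in lows for h1 in highs for h2 in highs)

# The low output depends only on the low input: no flow from high to low.
assert non_interfering(lambda h, l: l + 1, range(4), range(4))
# Returning the high input leaks it completely, so (3) fails.
assert not non_interfering(lambda h, l: h, range(4), range(4))
```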


SLIDE 18

Relating non-interference and software protection

Let P and Q be binary relations over domains D and E respectively. Then, given f : D → E, say that f : P ⇒ Q whenever

∀x, x′ ∈ D. x P x′ ⇒ (f x) Q (f x′)

The correspondence is explicit:

[[M]](−) : ∼low ⇒ ≈
C(−) : ∼S ⇒ ∼T

The substitution {C / [[M]], ∼S / ∼low, ∼T / ≈} unifies the equations nicely.

12 / 22
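The relation-mapping judgement f : P ⇒ Q is likewise testable by enumeration over a finite domain. A sketch with invented names, using a toy pair of relations rather than ∼S and ∼T:

```python
def maps_relation(f, P, Q, D):
    """f : P => Q  iff  for all x, x' in D:  x P x'  implies  (f x) Q (f x')."""
    return all(Q(f(x), f(y)) for x in D for y in D if P(x, y))

same_magnitude = lambda x, y: abs(x) == abs(y)
equal = lambda x, y: x == y

# abs maps "same magnitude" to plain equality ...
assert maps_relation(abs, same_magnitude, equal, range(-3, 4))
# ... but the identity does not (e.g. -1 vs 1 have equal magnitude).
assert not maps_relation(lambda x: x, same_magnitude, equal, range(-3, 4))
```

Instantiating P with ∼low and Q with ≈ gives non-interference; instantiating with ∼S and ∼T gives the compiler property, which is exactly the correspondence on the slide.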


SLIDE 20

Parallels

◮ Programs are secret (high-security) inputs. Compiled binaries are the public (low-security) outputs (≡).

◮ Attackers attempt to infer (as much as possible about) the inputs from the outputs. (Decompilation.)

Caveat: in practice, the goal of decompilation is to recover any readable source program.

13 / 22


SLIDE 22

Secure information flow for compilers?

We would like to have zero information flow compilers: C(−) : ∼S ⇒ ≡

◮ Relational reading: C(−) may leak only the equivalence class of its input programs.

◮ C(−) must be perfectly optimizing (undecidable for Turing-complete languages).

◮ Though, cf. superoptimization (Massalin, 1987).

14 / 22
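Superoptimization makes the "perfectly optimizing" idea concrete on tiny instruction sets: exhaustively search for the shortest program with the desired behaviour. A minimal brute-force sketch in the spirit of Massalin (1987); the two-instruction set, the accumulator model, and all names are invented:

```python
from itertools import product

def superoptimize(target, ops, max_len, tests):
    """Enumerate instruction sequences by increasing length and return the
    first one agreeing with `target` on all test inputs.
    ops maps an instruction name to a unary function on the accumulator."""
    for n in range(1, max_len + 1):
        for seq in product(ops, repeat=n):
            def run(x, seq=seq):
                for name in seq:
                    x = ops[name](x)
                return x
            if all(run(t) == target(t) for t in tests):
                return seq
    return None

ops = {'inc': lambda x: x + 1, 'dbl': lambda x: x * 2}
# x -> 2x + 2 needs only two instructions: increment, then double.
assert superoptimize(lambda x: 2 * x + 2, ops, 3, range(8)) == ('inc', 'dbl')
```

The exhaustive search is exactly why this does not scale: it is feasible for short straight-line sequences, not for Turing-complete languages.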

SLIDE 23

Implications

In general, a compiler must leak more than just the equivalence class of its input programs. We are interested in applying techniques from quantitative information flow to deriving concrete bounds on the leakage. E.g.: the identity “compiler” (λx.x) leaks its input completely.

15 / 22
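One standard quantitative measure bounds what a function reveals by the log of the number of distinguishable outputs. Applied to a "compiler" viewed as a function of its input program, it recovers the slide's extreme case of the identity. A sketch assuming a uniform distribution over a finite set of input programs; the function name is mine:

```python
from math import log2

def leakage_bits(f, inputs):
    """Upper bound (in bits) on what f reveals about its input: log2 of
    the number of distinct outputs over the finite input set."""
    return log2(len({f(x) for x in inputs}))

programs = range(16)   # 16 equally likely "programs", i.e. 4 bits of input
# The identity "compiler" leaks its input completely ...
assert leakage_bits(lambda x: x, programs) == 4.0
# ... while a constant function leaks nothing.
assert leakage_bits(lambda x: 0, programs) == 0.0
```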

SLIDE 24

Possible applications

◮ Randomized compilation and information-flow security for non-deterministic languages (cf. non-deterministic encryption schemes)

◮ Obfuscation (more generally: software protection)

16 / 22



SLIDE 27

Virtualization

Essentially, fast whole-system emulation. Examples: KVM, VMware, Xen, . . . (virtual machine) transparency n. making virtual and native hardware indistinguishable under close scrutiny by a dedicated adversary (Garfinkel et al., 2007) e ∼x86 e′ ⇐ ⇒ [ [vm] ](e) ≈ [ [vm] ](e′)

17 / 22


SLIDE 29

From compilers to interpreters and back again

◮ Partial evaluation:

[[e]](d) = [[sint]](e, d) = [[ [[mix]](sint, e) ]](d)

◮ Non-interference?

e ∼S e′ ⇐⇒ ∀d. [[int]](e, d) ≈ [[int]](e′, d)
e ∼S e′ ⇐⇒ [[mix]](int, e) ≈ [[mix]](int, e′)

18 / 22
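The first Futamura projection can be demonstrated with a deliberately trivial specializer: a `mix` that just closes over its program argument, so the "compiled" result is really interpreter-plus-program-text (the same shape as the trivial decompiler mentioned earlier). Both `sint` and the list-of-stages program encoding are invented for this sketch:

```python
def sint(e, d):
    """A tiny 'source interpreter': the program e is a list of unary stage
    functions, applied to the input d in order."""
    for f in e:
        d = f(d)
    return d

def mix(p, s):
    """A trivial specializer: fix p's first argument, yielding a residual
    one-argument program.  The projection
        [[e]](d) = [[sint]](e, d) = [[ [[mix]](sint, e) ]](d)
    then holds by construction."""
    return lambda d: p(s, d)

prog = [lambda x: x + 1, lambda x: x * 2]   # the "source program" e
compiled = mix(sint, prog)                  # [[mix]](sint, e)
assert compiled(20) == sint(prog, 20) == 42
```

A real `mix` would unfold the interpreter over the program text and emit residual code; this closure-based version shows only the equation, not the speedup.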

SLIDE 30

Overview

◮ Optimizing compilers obey a “non-interference”-like property

◮ Perfect optimization is impossible, so information leaks are inevitable

◮ An information-flow approach to program transformation?

19 / 22

SLIDE 31

Challenges

◮ Probability distributions over programs

◮ Shannon information theory / Kolmogorov complexity / Scott’s information systems

◮ “Real” compilers don’t come with formalized equational theories

20 / 22

SLIDE 32

Related work

◮ Decompilation: Mycroft (1999), Katsumata and Ohori (2001), Ager et al. (2002).

◮ Full abstraction: Mitchell (1993), Abadi (1998), Kennedy (2006).

◮ Reverse engineering by power analysis etc.: Vermoen (2007).

◮ Randomized compilation: Cohen (1993), Forrest et al. (1997).

◮ Nullspace of compilers: Veldhuizen and Lumsdaine (2002).

◮ Obfuscation: Barak et al. (2001), Dalla Preda and Giacobazzi (2005).

◮ Virtual machines and partial evaluation: Feigin and Mycroft (2008).

21 / 22

SLIDE 33

Questions?

22 / 22