
Be a Binary Rockstar: An Introduction to Program Analysis with Binary Ninja



  1. Be a Binary Rockstar: An Introduction to Program Analysis with Binary Ninja

  2. Agenda
     ● Motivation
     ● Current State of Program Analysis
     ● Design Goals of Binja Program Analysis
     ● Building Tools

  3. Motivation

  4. ● Tooling - Concrete -> Symbolic
        ○ Increase speed & effectiveness of reverse engineering (RE) / vulnerability research (VR)
     ● Make Program Analysis more accessible & useful

  5. Foundations
     • Need to understand code semantics
     • Could be done directly on the assembly
     • An Intermediate Language (IL) is needed

  6. Why IL?
     • Architecture Abstraction
     • Smaller number of instructions

  7. Easy to lift
     • Simple flags calculation
     • As close to native instructions as possible
     • Typeless - types inferred later

  8. Easy to read
     • Intuitive to read
     • Tree-based infix notation
     • No register abstraction
     • Flags calculation only when necessary
     • Avoid excessive temporaries

  9. IL Instruction Set Size
     [Figure: a spectrum along the "Instruction Set Size" axis, with "Easier to analyze" at one end and "Easier to lift" at the other]

  10. The Options

  11. Existing Options for IL
      • BAP
      • VEX
      • REIL
      • LLVM
      • IDA

  12. BAP
      • Tree based :)
      • Flags are explicit and inhibit readability :(
      • Written in OCaml :(

  13. BAP IL for: add ebx, eax
        addr 0x0 @asm "add %eax,%ebx"
        t:u32 = R_EBX:u32
        R_EBX:u32 = R_EBX:u32 + R_EAX:u32
        R_CF:bool = R_EBX:u32 < t:u32
      BAP IL for: shl ebx, cl
        addr 0x2 @asm "shl %cl,%ebx"
        t1:u32 = R_EBX:u32 >> 0x20:u32 - (R_ECX:u32 & 0x1f:u32)
        R_CF:bool = ((R_ECX:u32 & 0x1f:u32) = 0:u32) & R_CF:bool | ~((R_ECX:u32 & 0x1f:u32) = 0:u32) & low:bool(t1:u32)

  14. VEX
      • Register names are abstracted :(
      • Single assignment :(
      • Over 1000 instructions! :(
      • Yet they call it "RISC-like"
      • Even angr is planning a move away from it

  15. VEX IR for: subs R2, R2, #8
        t0 = GET:I32(16)
        t1 = 0x8:I32
        t3 = Sub32(t0,t1)
        PUT(16) = t3
        PUT(68) = 0x59FC8:I32

  16. REIL
      • Tiny instruction set
      • Horrible readability
      • Makes abstractions nearly impossible
      • Flags are explicit and inhibit readability :(

  17. REIL for: test eax, eax
        00000000.00 STR R_EAX:32, , V_00:32
        00000000.01 STR 0:1, , R_CF:1
        00000000.02 AND V_00:32, ff:8, V_01:8
        00000000.03 SHR V_01:8, 7:8, V_02:8
        00000000.04 SHR V_01:8, 6:8, V_03:8
        00000000.05 XOR V_02:8, V_03:8, V_04:8
        00000000.06 SHR V_01:8, 5:8, V_05:8
        00000000.07 SHR V_01:8, 4:8, V_06:8
        00000000.08 XOR V_05:8, V_06:8, V_07:8
        00000000.09 XOR V_04:8, V_07:8, V_08:8
        00000000.0a SHR V_01:8, 3:8, V_09:8
        00000000.0b SHR V_01:8, 2:8, V_10:8
        00000000.0c XOR V_09:8, V_10:8, V_11:8
        00000000.0d SHR V_01:8, 1:8, V_12:8
        00000000.0e XOR V_12:8, V_01:8, V_13:8
        00000000.0f XOR V_11:8, V_13:8, V_14:8
        00000000.10 XOR V_08:8, V_14:8, V_15:8
        00000000.11 AND V_15:8, 1:1, V_16:1
        00000000.12 NOT V_16:1, , R_PF:1
        00000000.13 STR 0:1, , R_AF:1
        00000000.14 EQ V_00:32, 0:32, R_ZF:1
        00000000.15 SHR V_00:32, 1f:32, V_17:32
        00000000.16 AND 1:32, V_17:32, V_18:32
        00000000.17 EQ 1:32, V_18:32, R_SF:1
        00000000.18 STR 0:1, , R_OF:1
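
Most of the REIL listing for `test eax, eax` is spent on the parity flag: instructions `.02` through `.12` XOR-fold the low byte of EAX down to a single bit. As a rough illustration of why explicit flags hurt readability, the sketch below mirrors that chain in Python and checks it against a one-line definition of PF. The function names are made up for this example, and the variable names only loosely follow the `V_xx` temporaries.

```python
def reil_style_parity(eax: int) -> int:
    """Mirror the XOR/SHR chain from the REIL listing (illustrative only)."""
    v01 = eax & 0xFF                      # AND V_00, ff -> V_01
    v04 = (v01 >> 7) ^ (v01 >> 6)         # bits 7 and 6
    v07 = (v01 >> 5) ^ (v01 >> 4)         # bits 5 and 4
    v08 = v04 ^ v07
    v11 = (v01 >> 3) ^ (v01 >> 2)         # bits 3 and 2
    v13 = (v01 >> 1) ^ v01                # bits 1 and 0
    v14 = v11 ^ v13
    v15 = v08 ^ v14
    v16 = v15 & 1                         # keep only the folded bit
    return v16 ^ 1                        # NOT on a 1-bit value -> PF


def readable_parity(eax: int) -> int:
    """PF is 1 when the low byte has an even number of set bits."""
    return 1 if bin(eax & 0xFF).count("1") % 2 == 0 else 0


# Both formulations agree on every possible low byte.
all_match = all(reil_style_parity(b) == readable_parity(b) for b in range(256))
```

The two functions compute the same value, yet only the second one reads like the flag's definition, which is the readability gap the slide points at.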

  18. LLVM
      ● Easy to analyze and has great tools already available
      ● It's a compiler!
         ○ Reversers want a decompiler.
         ○ Cannot be the only goal

  19. LLVM Challenges
      ● Hard to lift well from compiled binaries
         ○ Designed for compiler output
      ● Expects type information in the instructions
      ● SSA form - assembly is not
      ● Stack in assembly looks like a structure, but structures lose many advantages of SSA

  20. IDA?

  21. Binary Ninja's Answer
      • Binary Ninja Intermediate Language (BNIL)

  22. IL Goals & Design

  23. Why Another IL?
      ● Popular existing ILs for compiled binaries are not very human readable. They are extremely low level and verbose.
      ● Existing ILs are single stage. Heavyweight analysis must be performed to get anywhere close to decompiled output.
      ● Writing a lifter for a new architecture is usually very time consuming.

  24. Binary Ninja IL
      ● Create a family of ILs with multiple stages of analysis
      ● Lowest level is close to assembly
      ● After analysis and transformations, higher levels are closer to decompiled output and would be much easier to translate to good LLVM code
      ● Analysis involved in each transformation is easy to understand, fast, and directly aids further analysis

  25. IL Design Goals
      ● Human readable
      ● Computer understandable (SSA, 3AF, etc.)
      ● Plugin understandable
      ● Easy to lift native architectures
      ● Translation to other ILs such as LLVM

  26. Human Readable
      ● Reads like pseudocode, even in lowest level form
      ● Flags are resolved into readable expressions

  27. Low Level IL Example
        x86-64 Assembly        | Low Level IL
        lea rax, [0x201047]    | rax = 0x201047
        lea rdi, [0x201040]    | rdi = 0x201040
        push rbp               | push(rbp)
        sub rax, rdi           | rax = rax - rdi
        mov rbp, rsp           | rbp = rsp
        cmp rax, 0xe           | if (rax u> 0xe) then 6 @ 0x68d else 8 @ 0x68b
        ja 0x68d               |

  28. Low Level IL Example
        MIPS Assembly            | Low Level IL
        addiu $sp, $sp, -0x18    | $sp = $sp - 0x18
        sw $ra, 0x14($sp)        | [$sp + 0x14].d = $ra
        lw $a0, ($a1)            | $a0 = [$a1].d
        jal atoi                 | call(atoi)
        nop                      |
        sltiu $at, $v0, 0x20     | $at = $v0 u< 0x20 ? 1 : 0
        beqz $at, 0x4002d8       | if ($at == 0) then 7 @ 0x4002d8 else 12 @ 0x400290
        nop                      |

  29. Computer Understandable
      ● Multiple IL forms
      ● Pick the right IL for the task at hand

  30. IL Forms
      [Figure: pipeline of IL stages, reconstructed from the slide labels]
      ● Lifted IL: ASM -> IL
      ● Low Level IL: Flags use resolved, SSA / 3AF
      ● Medium Level IL: Stack usage resolved, SSA / 3AF, Type propagation
      ● High Level IL: Calls in high level form, Expression folding, Like decompiled output

  31. Plugin Understandable
      ● All IL forms directly accessible from API
      ● Analysis performed on IL also accessible by API

  32. Easy to Lift
      ● Expression tree
      ● Designed for quick, modular lifter implementations
      ● Semantic flags ease the burden of describing flag effects during lifting
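
To make the "expression tree" and "modular lifter" points concrete, here is a minimal sketch of a toy lifter with one small handler per mnemonic that builds a tree and renders it in infix form. Every name in it (`Reg`, `Const`, `BinOp`, `SetReg`, `LIFTERS`, `lift`, `render`) is hypothetical and chosen for illustration; it is not Binary Ninja's lifter API.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical node types for this sketch; not Binary Ninja's real classes.
@dataclass(frozen=True)
class Reg:
    name: str


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class SetReg:
    dest: str
    src: "Expr"


Expr = Union[Reg, Const, BinOp]


def operand(text: str) -> Expr:
    """Parse an operand as an integer constant if possible, else a register."""
    try:
        return Const(int(text, 0))
    except ValueError:
        return Reg(text)


# One small handler per mnemonic keeps the lifter modular.
def lift_mov(dst: str, src: str) -> SetReg:
    return SetReg(dst, operand(src))


def lift_sub(dst: str, src: str) -> SetReg:
    return SetReg(dst, BinOp("-", Reg(dst), operand(src)))


LIFTERS = {"mov": lift_mov, "sub": lift_sub}


def lift(line: str) -> SetReg:
    """Lift one two-operand instruction such as 'sub rax, rdi'."""
    mnemonic, rest = line.split(None, 1)
    dst, src = (part.strip() for part in rest.split(","))
    return LIFTERS[mnemonic](dst, src)


def render(node) -> str:
    """Render the tree in infix notation, parenthesizing nested operations."""
    if isinstance(node, Reg):
        return node.name
    if isinstance(node, Const):
        return hex(node.value)
    if isinstance(node, BinOp):
        parts = [
            f"({render(child)})" if isinstance(child, BinOp) else render(child)
            for child in (node.left, node.right)
        ]
        return f"{parts[0]} {node.op} {parts[1]}"
    if isinstance(node, SetReg):
        return f"{node.dest} = {render(node.src)}"
    raise TypeError(f"unknown node: {node!r}")
```

With these definitions, `render(lift("sub rax, rdi"))` gives `rax = rax - rdi` and `render(lift("mov rbp, rsp"))` gives `rbp = rsp`, the same text as the Low Level IL column in the x86-64 example. Supporting another instruction only requires adding one handler to `LIFTERS`.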

  33. Semantic Flags
      ● Architecture plugins define the set of flags and their semantic roles
      ● Instructions can define a set of flags they write
      ● Data flow analysis is performed to link flag uses to flag writes

  34. Semantic Flags
      ● In most compiled code, flags are resolved to simple comparison expressions with no effort from the architecture plugin
      ● Special cases fall back to emitting concrete flag write expressions

  35. Semantic Flags Example
        sub.q{*}(rax, 0xe)               <- "Writes to all ALU flags"
        if (u>) then … else …            <- "Flag state representing unsigned greater than"
      Folded expression describing use of flags:
        if (rax u> 0xe) then … else …
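
The folding on this slide can be sketched in a few lines. The example below assumes straight-line code, where the flag write that reaches a flag use is simply the most recent one; the real analysis uses data flow across the whole function. `FlagWrite`, `FlagUse`, `resolve_flag_uses`, and the `flag:` fallback spelling are hypothetical names for illustration, not Binary Ninja API.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class FlagWrite:
    """An operation that writes flags, e.g. sub.q{*}(rax, 0xe)."""
    op: str
    left: str
    right: str


@dataclass
class FlagUse:
    """A branch on a semantic flag condition, e.g. if (u>)."""
    cond: str


Instr = Union[FlagWrite, FlagUse]


def resolve_flag_uses(block: List[Instr]) -> List[str]:
    """Link each flag use to the latest flag write and fold it if possible."""
    resolved = []
    last_write: Optional[FlagWrite] = None
    for ins in block:
        if isinstance(ins, FlagWrite):
            last_write = ins
        elif isinstance(ins, FlagUse):
            if last_write is not None and last_write.op == "sub":
                # A condition after a subtract/compare becomes a comparison
                # of the subtract's two operands.
                resolved.append(
                    f"if ({last_write.left} {ins.cond} {last_write.right})"
                )
            else:
                # Assumed fallback: keep the condition as an explicit flag test.
                resolved.append(f"if (flag:{ins.cond})")
    return resolved


block = [FlagWrite("sub", "rax", "0xe"), FlagUse("u>")]
folded = resolve_flag_uses(block)  # ["if (rax u> 0xe)"]
```

The lifter only had to say "this subtract writes the flags" and "this branch tests unsigned greater than"; the readable comparison falls out of linking the two.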

  36. Translating Upwards
      ● Semantic flags analysis gives Low Level IL with flag usage fully resolved
      ● Stack is represented as memory accesses, so data flow can be difficult to compute on stack variables in Low Level IL
      ● Need to analyze and translate to Medium Level IL

  37. Low Level IL to Medium Level IL
      ● Low Level IL is translated to SSA form
      ● Use implicit data flow from SSA to resolve stack layout
      ● Data flow based stack layout resolution avoids problems with nonstandard frame pointer behavior
      ● Translate loads and stores on stack to stack variable uses and assignments

  38. Medium Level IL Example
        Low Level IL           | Medium Level IL
        push(ebp)              | var_4 = ebp
        ebp = esp              |
        esp = esp - 0x18       |
        eax = [ebp + 8].d      | eax = arg_4
        [esp].d = eax          | var_1c = eax
        call(free)             | free(var_1c)
        esp = ebp              |
        ebp = pop              | ebp = var_4
        <return> jump(pop)     | return
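
The first half of this example can be reproduced with a small sketch that tracks `esp` and `ebp` as offsets from the stack pointer at function entry and rewrites stack loads and stores as named variables. Note the substitution: this uses simple forward offset tracking over straight-line code rather than the SSA-based data flow the slides describe. The tuple encoding of instructions and the names `to_stack_vars`, `stack_name`, and `__return_addr` are assumptions made for this illustration.

```python
def stack_name(offset: int) -> str:
    """Name a stack slot by its offset from the entry stack pointer."""
    if offset < 0:
        return f"var_{-offset:x}"
    if offset > 0:
        return f"arg_{offset:x}"
    return "__return_addr"  # hypothetical name for the slot at offset 0


def to_stack_vars(block, word=4):
    """Rewrite stack accesses as variables; drop pure stack-pointer math."""
    offsets = {"esp": 0}  # registers whose value is a known stack offset
    out = []
    for ins in block:
        kind = ins[0]
        if kind == "push":
            offsets["esp"] -= word
            out.append(f"{stack_name(offsets['esp'])} = {ins[1]}")
        elif kind == "mov":
            _, dst, src = ins
            if src in offsets:
                offsets[dst] = offsets[src]      # e.g. ebp = esp
            else:
                offsets.pop(dst, None)
                out.append(f"{dst} = {src}")
        elif kind == "sub":
            _, dst, imm = ins
            if dst in offsets:
                offsets[dst] -= imm              # e.g. esp = esp - 0x18
            else:
                out.append(f"{dst} = {dst} - {imm:#x}")
        elif kind == "load":
            _, dst, base, disp = ins
            if base in offsets:
                text = f"{dst} = {stack_name(offsets[base] + disp)}"
            else:
                text = f"{dst} = [{base} + {disp:#x}].d"
            offsets.pop(dst, None)               # dst no longer a stack offset
            out.append(text)
        elif kind == "store":
            _, base, disp, src = ins
            if base in offsets:
                out.append(f"{stack_name(offsets[base] + disp)} = {src}")
            else:
                out.append(f"[{base} + {disp:#x}].d = {src}")
        else:
            raise ValueError(f"unsupported instruction: {ins!r}")
    return out


low_level = [
    ("push", "ebp"),            # push(ebp)
    ("mov", "ebp", "esp"),      # ebp = esp
    ("sub", "esp", 0x18),       # esp = esp - 0x18
    ("load", "eax", "ebp", 8),  # eax = [ebp + 8].d
    ("store", "esp", 0, "eax"), # [esp].d = eax
]
medium_level = to_stack_vars(low_level)
```

`medium_level` comes out as `var_4 = ebp`, `eax = arg_4`, `var_1c = eax`, matching the variable names on the slide. Because slots are named by offset from the entry stack pointer rather than by which register addressed them, the same approach works whether code goes through `ebp` or `esp`.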

  39. Medium Level IL
      ● Registers and stack usage are now both treated as variables
      ● Stack variables no longer use explicit memory access
      ● Translate to SSA form to obtain implicit data flow on both registers and stack variables
      ● Type propagation is performed on SSA form
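
As a rough picture of what "implicit data flow" from SSA means, the sketch below renames variables in a single straight-line block so that every assignment creates a new version, written `name#n` as in the jump table slides. It deliberately omits phi nodes and control flow, so it is a simplification of real SSA construction. `to_ssa` and the token-list encoding are hypothetical.

```python
def to_ssa(block):
    """Rename variables in straight-line code; version 0 is the entry value."""
    version = {}
    out = []
    for dest, tokens in block:
        # Uses refer to the current version of each variable.
        rhs = " ".join(
            f"{tok}#{version.get(tok, 0)}" if tok.isidentifier() else tok
            for tok in tokens
        )
        # Each assignment defines a fresh version of the destination.
        version[dest] = version.get(dest, 0) + 1
        out.append(f"{dest}#{version[dest]} = {rhs}")
    return out


ssa = to_ssa([
    ("eax", ["arg_4"]),
    ("eax", ["eax", "+", "1"]),
    ("var_1c", ["eax"]),
])
```

After renaming, each use names exactly one definition (`var_1c#1` reads `eax#2`, which reads `eax#1`), so finding where a value came from is a lookup rather than a search, for registers and stack variables alike.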

  40. Using Medium Level IL - Jump Tables

  41. Using Medium Level IL - Jump Tables
      ● Jump table resolution based on path-sensitive data flow
      ● SSA conversion process also tracks control flow dependence for every block
      ● Data flow computations allow disjoint sets of possible values
      ● Reads from memory are simulated
      ● At jump site, possible values are the possible jump targets

  42. Jump Table Example (Medium Level IL, SSA Form)
        x8#1 = zx.q(x0#2.d)
        if (x0#2.d u> 0x1f) then … else …
        …
        x8#2 = sx.q([table + (x8#1 << 2)].d)
        x8#3 = x8#2 + table
        jump(x8#3)                              <- Solve for this to get jump targets

  43. Jump Table Example
        x8#1 = zx.q(x0#2.d)
        if (x0#2.d u> 0x1f) then … else …
        …
        x8#2 = sx.q([table + (x8#1 << 2)].d)
        x8#3 = x8#2 + table
        jump(x8#3)
      Track flow backwards with SSA to find definitions

  44. Jump Table Example
        x8#1 = zx.q(x0#2.d)
        if (x0#2.d u> 0x1f) then … else …
        …
        x8#2 = sx.q([table + (x8#1 << 2)].d)    <- Memory read depends on value of x8#1
        x8#3 = x8#2 + table
        jump(x8#3)

  45. Jump Table Example
        x8#1 = zx.q(x0#2.d)                     <- Value used in branch comparison
        if (x0#2.d u> 0x1f) then … else …
        …
        x8#2 = sx.q([table + (x8#1 << 2)].d)
        x8#3 = x8#2 + table
        jump(x8#3)

  46. Jump Table Example
        x8#1 = zx.q(x0#2.d)
        if (x0#2.d u> 0x1f) then … else …       <- Branch condition must be false to reach jump site
        …
        x8#2 = sx.q([table + (x8#1 << 2)].d)
        x8#3 = x8#2 + table
        jump(x8#3)
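
The reasoning across the jump table slides can be sketched end to end: the branch condition must be false at the jump site, so `x0` is at most the bound; each possible index is pushed through a simulated memory read; the resulting values are the jump targets. The sketch below assumes little-endian 32-bit signed table entries and a flat `bytes` image of memory. The function names, the table address, and the table contents are invented for illustration, and the example uses a bound of 3 instead of the slide's 0x1f to stay short.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def read_s32(memory: bytes, mem_base: int, addr: int) -> int:
    """Simulated sign-extended 32-bit read, i.e. sx.q([addr].d)."""
    (value,) = struct.unpack_from("<i", memory, addr - mem_base)
    return value


def jump_targets(memory: bytes, mem_base: int, table: int, bound: int):
    """Enumerate jump targets for every index allowed by the branch."""
    targets = set()
    # The branch `x0 u> bound` must be false, so x0 is in [0, bound].
    for x0 in range(bound + 1):
        x8_1 = x0 & 0xFFFFFFFF                                   # x8#1 = zx.q(x0#2.d)
        x8_2 = read_s32(memory, mem_base, table + (x8_1 << 2))   # x8#2
        x8_3 = (x8_2 + table) & MASK64                           # x8#3 = x8#2 + table
        targets.add(x8_3)                                        # jump(x8#3)
    return sorted(targets)


# Hypothetical table at 0x1000 holding four offsets relative to the table.
TABLE = 0x1000
memory = struct.pack("<4i", -0x100, -0x80, -0x100, 0x40)
targets = jump_targets(memory, mem_base=TABLE, table=TABLE, bound=3)
```

Here `targets` is `[0xf00, 0xf80, 0x1040]`: two indices share a target, so the set has three entries rather than four. Collecting values into a set mirrors the slide's point that the data flow tracks disjoint sets of possible values rather than a single range.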
