Implementing an LLVM based Dynamic Binary Instrumentation


  1. Implementing an LLVM based Dynamic Binary Instrumentation framework. Charles Hubain, Cédric Tessier

  2. Introduction to Instrumentation 34c3 - Implementing an LLVM based DBI framework 2

  3. What is Instrumentation?
     • “Transformation of a program into its own measurement tool”
     • Observe any state of a program anytime during runtime
     • Automate the data collection and processing

  4. Use Cases
     • Finding memory bugs:
       • Track memory allocations / deallocations
       • Track memory accesses
     • Fuzzing:
       • Measure code coverage
       • Build symbolic representation of code
     • Recording execution traces
       • Replay them for “timeless” debugging
     • Software side-channel attacks against crypto

  5. “Why not … debuggers?”
     • Debuggers are awesome but slooooooooow
     [Diagram: Target, Kernel and Debugger; a trap interrupt in the Target goes through the Kernel (signal + schedule) to the Debugger, and the resume goes back through the Kernel (schedule)]

  6. https://asciinema.org/a/17nynlopg5a18e1qps3r9ou7g

  7. “Why not … debuggers?”
     • Debuggers are awesome but slooooooooow
     [Diagram: Target, Kernel and Debugger; a trap interrupt in the Target goes through the Kernel (signal + schedule) to the Debugger, and the resume goes back through the Kernel (schedule)]
     • Solution? Get rid of the kernel
     • How? Run the instrumentation inside the target

  8. Instrumentation Techniques
     • From source code:
       • Manually, you know … printf(…) (boring)
       • At compile time
     • From binary:
       • Static binary patching & hooking (crude and barbaric)
       • Dynamic Binary Instrumentation (this talk)

  9. Existing Frameworks
     • Valgrind, since 2000
       • Open source, only *nix platforms, very complex
     • DynamoRIO, since 2002
       • Open source, cross-platform, very raw
     • Intel Pin, since 2004
       • Closed source, only Intel platforms, user friendly

  10. “Why we made our own”
      What we wanted from a DBI framework in 2015:
      • Cross-platform and cross-architecture
      • Mobile and embedded targets support
      • Simpler and modular design
      • Focus on “heavy” instrumentation

  11. Introduction to DBI

  12. Dynamic Binary Instrumentation
      • Dynamically insert the instrumentation at runtime
      [Diagram: stages Disassemble, Generate Instrumentation, Insert and Execute, applied to the original binary code; captioned “PAC-MAN for scale”]

  13. Disassembling
      • Which parts of the binary are code is unknown
      ➡ Disassembling the whole binary in advance is impossible
      • We need to discover the code as we go

  14. Code Discovery
      • How?
        • Execute a block of code
        • Discover where the execution flows after the block
        • Execute the next block of code
      • This forms a short execution cycle
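The cycle above can be sketched as a small loop. This is a toy model under loud assumptions, not the framework's implementation: the guest program is a hypothetical dictionary mapping addresses to `(mnemonic, operand)` tuples, and `discover_block` and `run` are names invented for this illustration.

```python
# Toy guest program: address -> (mnemonic, operand). Hypothetical format.
PROGRAM = {
    0: ("add", 1),
    1: ("add", 2),
    2: ("jmp", 5),      # ends the first block
    3: ("add", 100),    # never executed, so never discovered
    4: ("jmp", 5),
    5: ("add", 4),
    6: ("halt", None),  # ends the second block
}

BLOCK_ENDERS = {"jmp", "halt"}


def discover_block(program, start):
    """Decode instructions from `start` up to and including the first branch."""
    block = []
    addr = start
    while True:
        insn = program[addr]
        block.append((addr, insn))
        if insn[0] in BLOCK_ENDERS:
            return block
        addr += 1


def run(program, entry):
    """Execute block by block, discovering code only when execution reaches it."""
    acc = 0
    discovered = []          # start addresses of blocks, in discovery order
    next_addr = entry
    while next_addr is not None:
        block = discover_block(program, next_addr)
        discovered.append(next_addr)
        next_addr = None
        for addr, (op, arg) in block:
            if op == "add":
                acc += arg
            elif op == "jmp":
                next_addr = arg   # where execution flows after the block
            # "halt" leaves next_addr as None, which stops the cycle
    return acc, discovered


acc, discovered = run(PROGRAM, 0)
print(acc, discovered)
```

Only the blocks starting at addresses 0 and 5 are ever decoded; the instructions at 3 and 4 are never reached, which is the point of discovering code lazily instead of disassembling everything up front.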

  15. No Free Space
      • The instrumented code is larger than the original code
      • Binaries are usually tightly packed with little free space
      ➡ The instrumentation cannot be inserted in-place
      ➡ It needs to be “relocated”
      [Diagram: basic blocks of instructions linked by a COND JUMP (TRUE / FALSE branches) and JUMPs]

  16. Relocating
      • Code contains relative references to memory addresses
      • These become invalid once we move the code
      • We need to completely rewrite the code to fix those references
      ➡ This is what we call “patching”
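A worked example of why relative references break, using the x86-64 `jmp rel32` encoding (opcode `0xE9`, displacement relative to the address of the next instruction). This is only a sketch: the helper names `jmp_target` and `relocate_jmp_rel32` and both addresses are assumptions chosen for illustration, and a real DBI would typically redirect the jump back to itself rather than simply re-encode it.

```python
import struct

JMP_REL32_OPCODE = 0xE9
JMP_REL32_SIZE = 5  # 1 opcode byte + 4 displacement bytes


def jmp_target(code, address):
    """Absolute target of an x86-64 `jmp rel32` located at `address`."""
    assert code[0] == JMP_REL32_OPCODE
    (disp,) = struct.unpack("<i", code[1:5])
    return address + JMP_REL32_SIZE + disp


def relocate_jmp_rel32(code, old_address, new_address):
    """Re-encode the jump so it reaches the same target from `new_address`."""
    target = jmp_target(code, old_address)
    new_disp = target - (new_address + JMP_REL32_SIZE)
    # struct.error is raised if the target is no longer within +/- 2 GiB
    return bytes([JMP_REL32_OPCODE]) + struct.pack("<i", new_disp)


original = bytes([0xE9, 0x0B, 0x00, 0x00, 0x00])   # jmp +0x0b
old_addr, new_addr = 0x410000, 0x7F10000           # illustrative addresses

moved_verbatim_target = jmp_target(original, new_addr)
patched = relocate_jmp_rel32(original, old_addr, new_addr)

print(hex(jmp_target(original, old_addr)))   # intended target
print(hex(moved_verbatim_target))            # where a verbatim copy would land
print(hex(jmp_target(patched, new_addr)))    # target after re-encoding
```

Copying the five bytes unchanged sends execution to an address near the new location, while the re-encoded jump still reaches the original target.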

  17. The “Cycle of Life”

  18. Designing a DBI: 1. Low Level Abstractions

  19. Basic Blocks
      [Diagram: the program shown as several separate blocks of instructions]

  20. Control Flow
      [Diagram: the same blocks of instructions, each ending in a JUMP that links it to the next block]

  21. Under Control Flow
      [Diagram: Guest and Host sides; the Guest blocks of instructions each end in a JUMP, and the DBI sits between them]

  22. Under Control
      DBI is all about keeping control of the execution

  23. Under Control
      • Keeping control of the execution
        • requires modifying original instructions…
        • …without modifying original behaviour

  24. What We Need
      • A multi-architecture disassembler
      • A multi-architecture assembler
      • A generic intermediate representation to apply modifications on

  25. We Don't Want
      Actually we don’t have 10 years and unlimited resources
      • To implement a multi-architecture disassembler and assembler
      • To abstract every single instruction's semantics
      • Architecture developer manuals are not that fun…

  26. Here Be Dragons
      This has nothing to do with 26C3

  27. To the rescue
      • LLVM already has everything
        • It supports all major architectures
        • It provides a disassembler and an assembler…
        • …and both work on the same intermediate representation
      • LLVM Machine Code (aka MC) to the rescue

  28. LLVM MC
      Instruction: movq rax, 42
      Binary: [0x48,0x89,0x04,0x25,0x2a,0x00,0x00,0x00]
      LLVM MC: <MCInst #1670 MOV64mr <MCOperand Reg:0> <MCOperand Imm:1> <MCOperand Reg:0> <MCOperand Imm:42> <MCOperand Reg:0> <MCOperand Reg:35>>
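To see where the operands on this slide come from, the eight bytes can be decoded by hand. The sketch below handles only this one encoding (REX.W, opcode `0x89`, a ModRM byte selecting a SIB byte, and a SIB byte selecting an absolute 32-bit displacement); `decode_mov_store` and the field names are made up for this illustration and are not LLVM APIs. The five memory fields appear to line up with the first five `MCOperand` entries (base, scale, index, displacement, segment), with `Reg:0` standing for "no register"; the final `Reg:35` is presumably LLVM's internal number for RAX in the build used for the slide.

```python
import struct

ENCODED = bytes([0x48, 0x89, 0x04, 0x25, 0x2A, 0x00, 0x00, 0x00])


def decode_mov_store(code):
    """Decode the one form used on the slide: REX.W + 89 /r with a
    SIB byte that selects an absolute 32-bit displacement."""
    rex, opcode, modrm, sib = code[0], code[1], code[2], code[3]
    assert rex == 0x48        # REX.W: 64-bit operand size, no extension bits
    assert opcode == 0x89     # MOV r/m64, r64
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    assert (mod, rm) == (0, 4)        # memory operand described by a SIB byte
    scale_bits, index, base = sib >> 6, (sib >> 3) & 7, sib & 7
    assert (index, base) == (4, 5)    # no index register, no base register
    (disp,) = struct.unpack("<i", code[4:8])
    return {
        "base": None,
        "scale": 1 << scale_bits,
        "index": None,
        "disp": disp,
        "segment": None,
        "src_reg": reg,       # hardware register number; 0 is RAX
    }


fields = decode_mov_store(ENCODED)
print(fields)
```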

  29. LLVM MC
      • It’s minimalist
      • It’s totally generic
        • still encodes a lot of things about an instruction
      • But very raw
        • genericness means some heavy compromises
        • doesn’t encode everything about an instruction

  30. Creation
      Every instruction is encoded using the same representation… …but in a different way
      movq [rip+0x2600], rax
      <MCInst #1139 MOV64mr <MCOperand Reg:41> <MCOperand Imm:1> <MCOperand Reg:0> <MCOperand Imm:0x2600> <MCOperand Reg:0> <MCOperand Reg:35>>

  31. Modification
      jmp 0x41424242
      <MCInst #1141 JMP_1 <MCOperand Imm: 0x41424242>>
      jmp [rip+0x2600]
      <MCInst #1139 JMP64m <MCOperand Reg:41> <MCOperand Imm:1> <MCOperand Reg:0> <MCOperand Imm:0x2600> <MCOperand Reg:0>>

  32. Patch
      0x410000: mov r0, [r0+pc] ; Load a value relative to PC

  33. Patch
      mov [pc+0x2600], r1 ; Backup R1
      mov r1, 0x410000 ; Set original instruction address
      0x7f10000: mov r0, [r0+r1] ; Load a value relative to R1
      mov r1, [pc+0x2600] ; Restore R1
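The effect of this patch can be checked with a tiny simulation. Everything here is a simplified model rather than real machine semantics: registers are dictionary entries, memory is a dictionary, `pc` is taken to be the instruction's own address (real architectures define it differently), and the shadow slot at `[pc+0x2600]` is modelled as a local variable. The function names are hypothetical.

```python
ORIGINAL_PC = 0x410000      # address of the instruction in the guest binary
RELOCATED_PC = 0x7F10000    # address of its relocated copy


def run_original(memory, regs):
    """mov r0, [r0+pc], executed at its original address."""
    regs = dict(regs)
    regs["r0"] = memory[regs["r0"] + ORIGINAL_PC]
    return regs


def run_naive_copy(memory, regs):
    """The same instruction copied verbatim: pc now has the relocated value."""
    regs = dict(regs)
    regs["r0"] = memory.get(regs["r0"] + RELOCATED_PC)
    return regs


def run_patched(memory, regs):
    """The four-instruction patch from the slide."""
    regs = dict(regs)
    shadow = regs["r1"]                           # mov [pc+0x2600], r1
    regs["r1"] = ORIGINAL_PC                      # mov r1, 0x410000
    regs["r0"] = memory[regs["r0"] + regs["r1"]]  # mov r0, [r0+r1]
    regs["r1"] = shadow                           # mov r1, [pc+0x2600]
    return regs


memory = {ORIGINAL_PC + 8: 0xCAFE}
start = {"r0": 8, "r1": 0x1234}

expected = run_original(memory, start)
patched = run_patched(memory, start)
naive = run_naive_copy(memory, start)
print(expected == patched, naive["r0"])
```

The patched sequence leaves every register exactly as the original instruction would, including `r1`, while the verbatim copy reads from the wrong address.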

  34. Abstractions
      • MCInst encoding makes transformations painful
      • Patches can be really complex
      • Many transformations are composed of generic steps
      ➡ We need abstractions

  35. Patch Engine
      [Diagram: an MCInst goes into the Patch Engine (“Abstractions Inside™”) and several MCInst come out]

  37. Patch DSL
      Abstractions you said?
      • Identify transformation steps required to patch instructions
      • Regroup and integrate them as a domain-specific language
      • Instructions are architecture specific…
      • …the DSL should be generic (as much as possible)

  38. Patch DSL
      [Diagram labels: Program, QBDI, Registry, Reg, Temp, Copy, Load/Save, Get/Set, Write, Memory, Shadows, Context, Metadata]

  39. Patch DSL
      mov [pc+0x2600], r1
      mov r1, 0x410000
      […]
      mov r1, [pc+0x2600]
      DSL: Temp(0)

  40. Patch DSL
      mov [pc+0x2600], r1
      mov r1, 0x410000
      mov r0, [r0+r1]
      mov r1, [pc+0x2600]
      DSL: SubstituteWithTemp(Reg(REG_PC), Temp(0))

  41. Patch DSL
      • Modifications are defined in rules
      • A rule is composed of
        • one (or several) condition(s)
        • one (or several) action(s)
      • Actions can modify or replace an instruction

  42. Patch DSL
      /* Rule #3: Generic RIP patching.
       * Target: Any instruction with RIP as operand, e.g. LEA RAX, [RIP + 1]
       * Patch:  Temp(0) := rip
       *         LEA RAX, [RIP + IMM] --> LEA RAX, [Temp(0) + IMM]
       */
      PatchRule(
          UseReg(Reg(REG_PC)),
          {
              GetPCOffset(Temp(0), Constant(0)),
              ModifyInstruction({
                  SubstituteWithTemp(Reg(REG_PC), Temp(0))
              })
          }
      );
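The condition-plus-actions shape of such a rule can be modelled in a few lines. This is a toy re-creation, not the framework's code: the Python names (`Inst`, `PatchRule`, `use_reg`, `get_pc_offset`, `substitute_with_temp`) only echo the slide's vocabulary, instructions are reduced to a mnemonic and an operand list, and the sketch ignores both temp register allocation (backup and restore) and the fact that x86-64 RIP-relative addressing is relative to the next instruction.

```python
from dataclasses import dataclass, field

REG_PC = "rip"


@dataclass
class Inst:
    mnemonic: str
    operands: list            # register names (str) or immediates (int)
    address: int = 0


@dataclass
class PatchRule:
    condition: object                             # callable: Inst -> bool
    actions: list = field(default_factory=list)   # callables: (Inst, out) -> None

    def apply(self, inst):
        """Return the patched instruction sequence, or None if the rule does not match."""
        if not self.condition(inst):
            return None
        out = []
        for action in self.actions:
            action(inst, out)
        return out


def use_reg(reg):
    # Condition: the instruction has `reg` among its operands.
    return lambda inst: reg in inst.operands


def get_pc_offset(temp, offset):
    # Action: emit an instruction loading the original PC (plus offset) into the temp.
    def action(inst, out):
        out.append(Inst("mov", [temp, inst.address + offset]))
    return action


def substitute_with_temp(reg, temp):
    # Action: emit the original instruction with every use of `reg` replaced by the temp.
    def action(inst, out):
        operands = [temp if op == reg else op for op in inst.operands]
        out.append(Inst(inst.mnemonic, operands, inst.address))
    return action


rip_rule = PatchRule(
    use_reg(REG_PC),
    [get_pc_offset("temp0", 0), substitute_with_temp(REG_PC, "temp0")],
)

lea = Inst("lea", ["rax", REG_PC, 1], address=0x410000)
patched = rip_rule.apply(lea)
untouched = rip_rule.apply(Inst("add", ["rax", 1]))
print(patched, untouched)
```

Keeping conditions and actions as small composable pieces is what lets one generic rule cover every instruction that uses the program counter, instead of one hand-written patch per opcode.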
