
Analyzing Memory Accesses in x86 Executables Gogul Balakrishnan - PowerPoint PPT Presentation



  1. Analyzing Memory Accesses in x86 Executables
     Gogul Balakrishnan, Thomas Reps
     University of Wisconsin

  2. Motivation
     • Basic infrastructure for language-based security
       – buffer-overrun detection
       – information-flow vulnerabilities
       – . . .
     • What if we do not have source code?
       – viruses, worms, mobile code, etc.
       – legacy code (w/o source)
     • Limitations of existing tools
       – over-conservative treatment of memory accesses ⇒ many false positives
       – unsafe treatment of pointer arithmetic ⇒ many false negatives

  3. Goal (1)
     • Create an intermediate representation (IR) that is similar to the IR used in a compiler
       – CFGs
       – used, killed, may-killed variables for CFG nodes
       – points-to sets
       – call-graph
     • Why?
       – a tool for a security analyst
       – a general infrastructure for binary analysis

  4. Goal (2)
     • Scope: programs that conform to a “standard compilation model”
       – data layout determined by compiler
       – some variables held in registers
       – global variables → absolute addresses
       – local variables → offsets in esp-based stack frame
     • Report violations
       – violations of stack protocol
       – return address modified within procedure

  5. CodeSurfer/x86 Architecture
     [Diagram: Binary → IDA Pro (Parse Binary, Build CFGs) → Connector (Value-set Analysis) → CodeSurfer (Build SDG, Browse) → Client Applications]
     Callout:
     • CFGs
     • basic blocks
     • used, killed, may-killed variables for CFG nodes
     • points-to sets

  6. CodeSurfer/x86 Architecture
     [Diagram: Binary → IDA Pro (Parse Binary, Build CFGs) → Connector (Value-set Analysis) → CodeSurfer (Build SDG, Browse) → Client Applications]
     Callout: Whole-program analysis
     • stubs are ok
     Callout: Initial estimate of
     • code vs. data
     • procedures and call sites
     • malloc sites

  7. Outline
     • Example
     • Challenges
     • Value-set analysis
     • Performance
     • [Future work]

  8. Running Example
     C source:
         int arrVal=0, *pArray2;
         int main() {
           int i, a[10], *p;
           /* Initialize pointers */
           pArray2=&a[2];
           p=&a[0];
           /* Initialize Array */
           for(i=0; i<10; ++i) {
             *p=arrVal;
             p++;
           }
           /* Return a[2] */
           return *pArray2;
         }
     Assembly:
         ; ebx ⇔ i
         ; ecx ⇔ variable p
         sub esp, 40        ;adjust stack
         lea edx, [esp+8]   ;
         mov [4], edx       ;pArray2=&a[2]
         lea ecx, [esp]     ;p=&a[0]
         mov edx, [0]       ;
         loc_9:
         mov [ecx], edx     ;*p=arrVal
         add ecx, 4         ;p++
         inc ebx            ;i++
         cmp ebx, 10        ;i<10?
         jl short loc_9     ;
         mov edi, [4]       ;
         mov eax, [edi]     ;return *pArray2
         add esp, 40
         retn

  9. Running Example
     C source and assembly listing unchanged from slide 8, with a “?” marker added beside the loop.

  10. Running Example – Address Space
      [Diagram: address space from 0ffffh (top) down to 0h (bottom)]
      • Data local to main (Activation Record)
        – return_address
        – a (40 bytes)
      • Global data
        – pArray2 (4 bytes), at 4h
        – arrVal (4 bytes), at 0h
      Assembly listing unchanged from slide 8.

  11. Running Example – Address Space
      [Diagram: address space from 0ffffh (top) down to 0h (bottom), variable names removed]
      • Data local to main (Activation Record)
        – return_address
      • Global data
      • No debugging information
      Assembly listing unchanged from slide 8.

  12. Challenges (1)
      • No debugging/symbol-table information
      • Explicit memory addresses
        – need something similar to C variables
        – a-locs
      • Only have an initial estimate of
        – code, data, procedures, call sites, malloc sites
        – extend IR on-the-fly
          • disassemble data, add to CFG, . . .
          • similar to elaboration of CFG/call-graph in a compiler because of calls via function pointers

  13. Challenges (2)
      • Indirect-addressing mode
        – need “pointer analysis”
        – value-set analysis
      • Pointer arithmetic
        – need numeric analysis (e.g., range analysis)
        – value-set analysis
      • Checking for non-aligned accesses
        – pointer forging?
        – keep stride information in value-sets
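
The last bullet can be made concrete with a small sketch. It is not taken from the slides: it assumes a pointer's possible offsets are summarized as a stride plus a range, written `stride[lo, hi]`, and the helper names `offsets` and `all_aligned` are hypothetical. A range alone cannot tell whether every access falls on a 4-byte boundary; the stride can.

```python
def offsets(stride, lo, hi):
    """Concrete offsets described by stride[lo, hi] (stride 0 means a single value)."""
    if stride == 0:
        return [lo]
    return list(range(lo, hi + 1, stride))


def all_aligned(stride, lo, hi, width):
    """Conservative check: True only if every offset in stride[lo, hi]
    is a multiple of `width` bytes."""
    return lo % width == 0 and stride % width == 0


# Pointer p from the running example: offsets -40, -36, ..., -4 in main's frame.
p_aligned = all_aligned(4, -40, -4, 4)

# The same range without stride information must be treated as stride 1,
# so the analysis can no longer rule out a misaligned 4-byte access.
range_only_aligned = all_aligned(1, -40, -4, 4)
```

With the stride kept, `p_aligned` is true; with the range only, `range_only_aligned` is false even though the program never performs a misaligned access.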

  14. Not Everything is Bad News!
      • Multiple source languages OK
      • Some optimizations make our task easier
        – optimizers try to use registers, not memory
        – deciphering memory operations is the hard part

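  15. Memory-regions
      • An abstraction of the address space
      • Idea: group similar runtime addresses
        – collapse the runtime ARs for each procedure
      [Diagram: a runtime stack holding several activation records of f and g above the global data, collapsed into one region for f, one region for g, and one global region]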

  16. Memory-regions
      • An abstraction of the address space
      • Idea: group similar runtime addresses
        – collapse the runtime ARs for each procedure
      • Similarly,
        – one region for all global data
        – one region for each malloc site
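
A minimal sketch of the collapsing idea, under assumptions that are not on the slides: each activation record is described by a hypothetical `Frame` (owning procedure, runtime address of its return-address slot, size of its locals), global data occupies addresses below `global_limit`, and malloc regions are left out. The function `abstract` maps a runtime address to a (region, offset) pair.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    proc: str   # procedure that owns this activation record
    base: int   # runtime address of the return-address slot (offset 0)
    size: int   # bytes of local data below the return-address slot


def abstract(addr, frames, global_limit):
    """Map a runtime address to (region, offset); one region per procedure."""
    if 0 <= addr < global_limit:
        return ("GL", addr)
    for fr in frames:
        # A frame covers its locals plus the 4-byte return-address slot.
        if fr.base - fr.size <= addr < fr.base + 4:
            return (fr.proc, addr - fr.base)
    raise ValueError(f"address {addr:#x} is in no known region")


# Runtime stack f -> g -> f: two activation records of f at different addresses.
stack = [Frame("f", 0xFF00, 16), Frame("g", 0xFEE0, 8), Frame("f", 0xFEC0, 16)]

outer = abstract(0xFF00 - 8, stack, global_limit=8)
inner = abstract(0xFEC0 - 8, stack, global_limit=8)
```

Both `outer` and `inner` come out as `("f", -8)`: the two runtime addresses differ, but they name the same location in the single region for f.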

  17. Example – Memory-regions
      [Diagram: the two memory-regions of the running example]
      • Region for main: from (main, -40) up to ret_main at (main, 0)
      • Global Region: from (GL,0) up to (GL,8)
      Assembly listing unchanged from slide 8.

  18. “Need Something Similar to C Variables”
      • Standard compilation model
        – some variables held in registers
        – global variables → absolute addresses
        – local variables → offsets in stack frame
      • A-locs
        – locations between consecutive addresses
        – locations between consecutive offsets
        – registers
      • Use a-locs instead of variables in static analysis
        – e.g., killed a-loc ≈ killed variable
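
The “locations between consecutive addresses/offsets” rule can be sketched as follows. This is an illustration only: the function name `alocs` and its `end` parameter (the first address past the region) are hypothetical, and the naming scheme, a prefix followed by the hexadecimal absolute offset, is an assumption chosen because it reproduces the names mem_0, mem_4, mainv_28 and mainv_20 that appear on the following slides.

```python
def alocs(prefix, offsets, end):
    """Split a region into a-locs at the statically known offsets.

    Each a-loc runs from one known offset up to the next one (or to `end`).
    Returns (name, offset, size_in_bytes) triples, lowest offset first.
    """
    bounds = sorted(set(offsets)) + [end]
    return [
        (f"{prefix}{abs(lo):x}", lo, hi - lo)
        for lo, hi in zip(bounds, bounds[1:])
    ]


# Global region: the instructions mention the absolute addresses [0] and [4].
global_alocs = alocs("mem_", [0, 4], end=8)

# Region for main: the instructions mention [esp] and [esp+8],
# i.e. offsets -40 and -32 relative to the return address at offset 0.
main_alocs = alocs("mainv_", [-32, -40], end=0)
```

Note that the 40-byte array `a` is not recovered as one variable here: it is split into an 8-byte a-loc and a 32-byte a-loc, because only `a[0]` and `a[2]` are addressed directly.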

  19. Example – A-locs
      [Diagram: the memory operands of the listing placed in their regions]
      • Region for main
        – ret_main at (main, 0)
        – [esp+8] at (main, -32)
        – [esp] at (main, -40)
      • Global Region, up to (GL,8)
        – [4] at (GL,4)
        – [0] at (GL,0)
      Assembly listing unchanged from slide 8.

  20. Example – A-locs
      [Diagram: a-locs named in each region]
      • Region for main
        – ret_main at (main, 0)
        – mainv_20 starts at (main, -32)
        – mainv_28 starts at (main, -40)
      • Global Region, up to (GL,8)
        – mem_4 starts at (GL,4)
        – mem_0 starts at (GL,0)
      Assembly listing unchanged from slide 8.

  21. Example – A-locs
      [Diagram: regions and a-locs as on slide 20]
      Assembly, with direct memory operands replaced by a-locs:
          ; ebx ⇔ i
          ; ecx ⇔ variable p
          sub esp, 40        ;adjust stack
          lea edx, &mainv_2  ;
          mov mem_4, edx     ;pArray2=&a[2]
          lea ecx, &mainv_2  ;p=&a[0]
          mov edx, mem_0     ;
          loc_9:
          mov [ecx], edx     ;*p=arrVal
          add ecx, 4         ;p++
          inc ebx            ;i++
          cmp ebx, 10        ;i<10?
          jl short loc_9     ;
          mov edi, mem_4     ;
          mov eax, [edi]     ;return *pArray2
          add esp, 40
          retn

  22. Example – A-locs
      • locals: mainv_28, mainv_20 {a[0], a[2]}
      • globals: mem_0, mem_4 {arrVal, pArray2}
      Assembly, with direct memory operands replaced by a-locs:
          ; ebx ⇔ i
          ; ecx ⇔ variable p
          sub esp, 40        ;adjust stack
          lea edx, &mainv_20 ;
          mov mem_4, edx     ;pArray2=&a[2]
          lea ecx, &mainv_28 ;p=&a[0]
          mov edx, mem_0     ;
          loc_9:
          mov [ecx], edx     ;*p=arrVal
          add ecx, 4         ;p++
          inc ebx            ;i++
          cmp ebx, 10        ;i<10?
          jl short loc_9     ;
          mov edi, mem_4     ;
          mov eax, [edi]     ;return *pArray2
          add esp, 40
          retn
      [Diagram: edx, mem_4 and edi drawn next to mainv_20; ecx drawn next to mainv_28]

  23. Example – A-locs
      A-loc lists, assembly listing, and diagram unchanged from slide 22.

  24. Value-Set Analysis
      • Resembles a pointer-analysis algorithm
        – interprets pointer-manipulation operations
        – pointer arithmetic, too
      • Resembles a numeric-analysis algorithm
        – over-approximate the set of values/addresses held by an a-loc
          • range information
          • stride information
        – interprets arithmetic operations on sets of values/addresses
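
One way to combine range and stride information is a strided interval, a set of the form {lo, lo+stride, ..., hi}. The sketch below is a simplified illustration, not the algorithm from the talk: the class `StridedInterval`, the function `ecx_at_store`, and the fixed iteration count (taken from the `i<10` loop bound instead of being derived by the analysis) are all assumptions made for this example. It tracks the offsets that ecx (pointer p) can hold, within the region for main, at the store `mov [ecx], edx`.

```python
from dataclasses import dataclass
from math import gcd


@dataclass(frozen=True)
class StridedInterval:
    stride: int  # 0 for a single value
    lo: int
    hi: int

    def add(self, c):
        """Interpret `add reg, c`: shift every value by the constant c."""
        return StridedInterval(self.stride, self.lo + c, self.hi + c)

    def join(self, other):
        """Smallest strided interval of this form covering both operands."""
        stride = gcd(gcd(self.stride, other.stride), abs(self.lo - other.lo))
        return StridedInterval(stride, min(self.lo, other.lo), max(self.hi, other.hi))

    def concretize(self):
        """Enumerate the values the interval stands for."""
        if self.stride == 0:
            return [self.lo]
        return list(range(self.lo, self.hi + 1, self.stride))


def ecx_at_store(iterations=10):
    """Offsets of ecx at `mov [ecx], edx`, joined over all loop iterations."""
    v = StridedInterval(0, -40, -40)      # lea ecx, [esp]: offset -40 in main
    for _ in range(iterations - 1):       # each back edge executes add ecx, 4
        v = v.join(v.add(4))
    return v


# A value-set maps each memory-region to the offsets the a-loc may hold in it.
value_set = {"main": ecx_at_store()}
```

The result is stride 4 over [-40, -4]: ten 4-byte-aligned offsets, all below the return address at (main, 0), and including offset -32, the location that pArray2 points to.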
