Binary-level program analysis: A discussion of x86-64
Gang Tan, CSE 597, Spring 2019, Penn State University


  1. Binary-level program analysis: A discussion of x86-64
     Gang Tan, CSE 597, Spring 2019, Penn State University
     * These slides follow Sec 3.13 of the book CSAPP (“Computer Systems: A Programmer’s Perspective”); figures and slides are borrowed/adapted from that book.

  2. Intel’s 64-Bit History
     • 2001: Intel Attempts Radical Shift from IA32 to IA64
       – Totally different architecture (Itanium)
       – Executes IA32 code only as legacy
       – Performance disappointing
     • 2003: AMD Steps in with Evolutionary Solution
       – x86-64 (now called “AMD64”)
     • Intel Felt Obligated to Focus on IA64
       – Hard to admit mistake or that AMD is better
     • 2004: Intel Announces EM64T extension to IA32
       – Extended Memory 64-bit Technology
       – Almost identical to x86-64!
     • All but low-end x86 processors support x86-64
       – But lots of code still runs in 32-bit mode

  3. Overview of x86-64
     • Pointers and long integers are 64 bits long
       – Integer arithmetic operations support 8, 16, 32, and 64 bits
     • 16 general-purpose registers, each 64 bits wide
     • Calling conventions pass more parameters via registers
       – System V AMD64 ABI: passes the first 6 integer or pointer parameters in registers
       – As a result, some procedures do not need to access the stack at all
     • Conditional operations are implemented using conditional move instructions when possible
       – Often better performance than using branches
     • Floating-point operations are implemented using the register-oriented instruction set in SSE version 2
       – Rather than the stack-based approach in IA32
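The size and conditional-move points above can be probed from C. This is a minimal sketch, assuming an LP64 target such as Linux on x86-64 (on other platforms `long` may be 32 bits); the helper names `pointer_size`, `long_size`, and `max_long` are hypothetical and not from the slides.

```c
#include <stddef.h>

/* Hypothetical helpers: both are expected to return 8 on an LP64
   x86-64 target (an assumption about the platform, not a guarantee). */
size_t pointer_size(void) { return sizeof(void *); }
size_t long_size(void)    { return sizeof(long); }

/* A simple selection between two values. An optimizing compiler may
   (but is not required to) translate this into cmp + cmov rather than
   a conditional jump. */
long max_long(long a, long b) {
    return a > b ? a : b;
}
```

Compiling `max_long` with optimizations enabled and inspecting the assembly is one way to see whether a conditional move was actually chosen.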

  4. x86-64 Data Types (Fig 3.34 of CSAPP)

  5. 16 64-bit GP Registers (Fig 3.35 of CSAPP)

  6. Instruction Operands
     • Similar to IA32
       – Except that the base and index registers must use the r-version of registers (e.g., rax rather than eax)
     • In addition, PC-relative addressing
       – “add rax, 0x200ad1[rip]” accesses memory at address rip+0x200ad1, where rip holds the address of the next instruction
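The address arithmetic behind PC-relative addressing can be written out as a small worked example. This is a sketch only: the function name `rip_relative_target` and the sample instruction address are hypothetical, and it assumes the usual encoding in which the displacement is a signed 32-bit value added to the address of the next instruction.

```c
#include <stdint.h>

/* Hypothetical helper: computes the memory address referenced by a
   rip-relative operand. next_insn is the address of the instruction
   following the one being executed; disp32 is the signed displacement. */
uint64_t rip_relative_target(uint64_t next_insn, int32_t disp32) {
    /* Sign-extend the displacement to 64 bits before adding. */
    return next_insn + (uint64_t)(int64_t)disp32;
}
```

For instance, if the next instruction were at 0x400536 (an invented address), the operand 0x200ad1[rip] would refer to 0x400536 + 0x200ad1 = 0x601007.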

  7. Function Calling: Argument Passing
     • The following slides assume the System V AMD64 ABI
     • Arguments (up to the first six) are passed to procedures via registers
       – For integer and pointer arguments, in order: rdi, rsi, rdx, rcx, r8, r9
       – This reduces the overhead of storing and retrieving values on the stack
     • callq stores a 64-bit return address on the stack

  8. Example of Argument Passing

        long myfunc(long a, long b, long c, long d,
                    long e, long f, long g, long h)
        {
            long xx = a * b * c * d * e * f * g * h;
            long yy = a + b + c + d + e + f + g + h;
            long zz = utilfunc(xx, yy, xx % yy);
            return zz + 20;
        }

     * Example from https://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64/
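A self-contained version of the example above, annotated with where each argument travels under the System V AMD64 ABI. The slide does not show `utilfunc`, so the definition here is a hypothetical stand-in chosen only to make the sketch compile and run.

```c
/* Hypothetical stand-in for the helper the slide calls; the real
   definition is not shown in the slides. */
long utilfunc(long a, long b, long c) {
    return a + b + c;
}

/* Under the System V AMD64 ABI:
     a -> rdi, b -> rsi, c -> rdx, d -> rcx, e -> r8, f -> r9
     g and h do not fit in the six argument registers and are passed
     on the stack by the caller. */
long myfunc(long a, long b, long c, long d,
            long e, long f, long g, long h)
{
    long xx = a * b * c * d * e * f * g * h;
    long yy = a + b + c + d + e + f + g + h;
    long zz = utilfunc(xx, yy, xx % yy);  /* yy must be nonzero */
    return zz + 20;
}
```

Since `utilfunc` takes only three arguments, the call inside `myfunc` needs no stack-passed arguments at all.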

  9. Function Calling: Stack Frame
     • A function may not require a stack frame, if
       – all local variables can be held in registers, and
       – there are no array/structure local variables, and
       – no address-of operator (&) is used on local variables, and
       – it does not call another function that requires argument passing on the stack, and
       – it does not need to save any callee-save regs
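The conditions above can be illustrated with two contrasting functions. This is a sketch with hypothetical names (`add3`, `fill_squares`, `sum_squares`); whether a frame is actually allocated depends on the compiler and optimization level, and inlining can change the outcome.

```c
/* Likely needs no stack frame when optimized: no arrays, no
   address-of, no calls, and all values fit in registers. */
long add3(long a, long b, long c) {
    long t = a + b;
    return t + c;
}

/* Hypothetical callee that writes through a pointer. */
void fill_squares(long *buf, long n) {
    for (long i = 0; i < n; i++)
        buf[i] = i * i;
}

/* Typically needs stack space: it has a local array whose address is
   passed to another function, so the array must live in memory. */
long sum_squares(void) {
    long buf[8];
    fill_squares(buf, 8);
    long s = 0;
    for (long i = 0; i < 8; i++)
        s += buf[i];
    return s;
}
```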

  10. Function Calling: Red-Zone Optimization
     • Red-zone optimization for leaf functions (functions that do not call other functions)
       – The 128 bytes below rsp can be used by a leaf function without stack allocation
       – The red zone will not be asynchronously clobbered by signal or interrupt handlers, so the function can use it for scratch data
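A leaf function with a small in-memory local is the kind of code where a compiler may use the red zone. This is a sketch under assumptions: the function `pick` is hypothetical, and whether its table ends up at negative offsets from rsp (with no adjustment of rsp) depends on the compiler and flags; GCC's `-mno-red-zone` option disables the optimization.

```c
/* Hypothetical leaf function: calls nothing, and needs 32 bytes of
   scratch memory because the array is indexed by a runtime value.
   32 bytes fits comfortably within the 128-byte red zone, so a
   compiler may place the table below rsp without moving rsp. */
long pick(long i) {
    long table[4] = {10, 20, 30, 40};
    return table[i & 3];  /* mask keeps the index within 0..3 */
}
```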

  11. Function Calling: the Base Pointer Optimization
     • Two options for functions that need a stack frame
     • Option 1: the traditional approach (default for gcc without optimizations)
       – Function prologue: save the base pointer; create the new base pointer
       – Function body: references to stack locations are made relative to the base pointer
       – Function epilogue: restore the base pointer
     • Option 2: faster (default for gcc with optimizations)
       – Do not save/restore the base pointer; rbp is used as a GP register
       – References to stack locations are made relative to the stack pointer
       – Stack allocation happens at the beginning; rsp then remains at a fixed position for the rest of the function body

  12. Example C source code

        long int simple_l(long int *xp, long int y)
        {
            long int t = *xp + y;
            *xp = t;
            return t;
        }

  13. Example Optimized x86-32 Assembly

        simple_l:
            pushl %ebp              # Save frame pointer
            movl  %esp, %ebp        # New frame pointer
            movl  8(%ebp), %edx     # Retrieve xp
            movl  12(%ebp), %eax    # Retrieve y
            addl  (%edx), %eax      # Add *xp to get t
            movl  %eax, (%edx)      # Store t at xp
            popl  %ebp              # Restore frame pointer
            ret

  14. Example: Unoptimized vs. Optimized x86-64 Assembly

     Unoptimized x86-64 Assembly

        simple_l:
            pushq %rbp
            movq  %rsp, %rbp
            movq  %rdi, -24(%rbp)
            movq  %rsi, -32(%rbp)
            movq  -24(%rbp), %rax
            movq  (%rax), %rax
            addq  -32(%rbp), %rax
            movq  %rax, -8(%rbp)
            movq  -24(%rbp), %rax
            movq  -8(%rbp), %rdx
            movq  %rdx, (%rax)
            movq  -8(%rbp), %rax
            leave
            ret

     Optimized x86-64 Assembly

        simple_l:
            movq  %rsi, %rax        # Copy y
            addq  (%rdi), %rax      # Add *xp to get t
            movq  %rax, (%rdi)      # Store t at xp
            ret

  15. Function Calling: Caller/Callee-Save Registers
     • Callee-saved regs: rbx, rbp, and r12 to r15
     • Caller-saved regs: r10 and r11, in addition to rax and the six argument registers (rdi, rsi, rdx, rcx, r8, r9)

  16. x86-64 Assembly Code Example

     C source code

        long plus(long x, long y);

        void sumstore(long x, long y, long *dest)
        {
            long t = plus(x, y);
            *dest = t;
        }

     Optimized x86-64 Assembly

        sumstore:
            pushq %rbx
            movq  %rdx, %rbx
            call  plus
            movq  %rax, (%rbx)
            popq  %rbx
            ret
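A runnable version of the C source above, with comments tying each statement to the assembly. The slide only declares `plus`, so the definition here is an assumption made for illustration.

```c
/* Assumed definition; the slide shows only the prototype. */
long plus(long x, long y) {
    return x + y;
}

/* x arrives in rdi, y in rsi, dest in rdx. Because rdx is caller-saved
   and may be overwritten by the call to plus, the assembly on the slide
   copies dest into rbx first. rbx is callee-saved, so sumstore itself
   must push it on entry and pop it before returning. */
void sumstore(long x, long y, long *dest)
{
    long t = plus(x, y);  /* x and y are already in rdi and rsi */
    *dest = t;            /* movq %rax, (%rbx) */
}
```

This shows both halves of the register-saving convention in one short function: the value that must survive the call is moved to a callee-saved register, and that register is in turn preserved for sumstore's own caller.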
