x86 basics

ISA context and x86 history
Translation tools: C --> assembly <--> machine code
x86 Basics:
    Registers
    Data movement instructions
    Memory addressing modes
    Arithmetic instructions

Software:
    Program, Application
    Programming Language
    Compiler/Interpreter           } next few weeks
    Operating System               }
Instruction Set Architecture
Hardware:
    Microarchitecture              } past few weeks
    Digital Logic                  }
    Devices (transistors, etc.)
    Solid-State Physics

Microarchitecture (Implementation of ISA)

Computer components: Instruction Fetch and Decode, ALU, Registers, Memory

Instruction Set Architecture (HW/SW Interface)

Computer: processor (Instruction Logic, Registers) and memory (Encoded Instructions, Data)

Instructions
    • Names, Encodings
    • Effects
    • Arguments, Results
Local storage (Registers)
    • Names, Size
    • How many
Large storage (Memory)
    • Addresses, Locations

A brief history of x86 (CISC, vs. RISC)

ISA       First          Year     Word size   Notes
8086      Intel 8086     1978     16          Intel's first 16-bit processor. Basis for IBM PC & DOS. 1MB address space.
IA32      Intel 386      1985     32          First 32-bit ISA. Flat addressing, improved OS support. (240 now)
x86-64    AMD Opteron    2003*    64          Slow AMD/Intel conversion, slow adoption. Mainstream only after ~10 years. 2015: most laptops, desktops, servers. (240 soon)

*Not actually x86-64 until a few years later.

Turning C into Machine Code

C code (code.c):

    int sum(int x, int y) {
        int t = x+y;
        return t;
    }

compiler: gcc -O1 -S code.c

Generated IA32 assembly code (code.s). Assembly is a human-readable language close to machine code.

    sum:
        pushl %ebp
        movl  %esp,%ebp
        movl  12(%ebp),%eax
        addl  8(%ebp),%eax
        movl  %ebp,%esp
        popl  %ebp
        ret

assembler

Object code (code.o):

    01010101100010011110010110
    00101101000101000011000000
    00110100010100001000100010
    01111011000101110111000011

Linker: create full executable. Resolve references between object files and libraries, (re)locate data.

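To connect the C source to the two instructions that do the real work, here is a minimal sketch that rewrites sum with a local variable named after %eax. It assumes, as the offsets in the listing suggest, that x sits at 8(%ebp) and y at 12(%ebp), and that the return value is left in %eax. The variable name eax is only an illustration; it is an ordinary C int, not a register.

```c
/* sum, rewritten so each statement mirrors one instruction of the body. */
int sum(int x, int y) {
    int eax = y;    /* movl 12(%ebp),%eax : load y into "eax"     */
    eax += x;       /* addl 8(%ebp),%eax  : add x to "eax"        */
    return eax;     /* the caller finds the result in %eax        */
}
```

Calling sum(2, 3) in this sketch returns 5, the same as the original C version.
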
Disassembling Object Code (objdump)

Object code (code.o):

    01010101100010011110010110
    00101101000101000011000000
    00110100010100001000100010
    01111011000101110111000011

Disassembler: objdump -d p

Disassembled by objdump:

    00401040 <_sum>:
       0:   55          push   %ebp
       1:   89 e5       mov    %esp,%ebp
       3:   8b 45 0c    mov    0xc(%ebp),%eax
       6:   03 45 08    add    0x8(%ebp),%eax
       9:   89 ec       mov    %ebp,%esp
       b:   5d          pop    %ebp
       c:   c3          ret

Disassembling Object Code (gdb)

    > gdb p
    (gdb) disassemble sum       (disassemble function)
    (gdb) x/13b sum             (examine the 13 bytes starting at sum)

Disassembled by GDB:

    0x401040 <sum>:       push   %ebp
    0x401041 <sum+1>:     mov    %esp,%ebp
    0x401043 <sum+3>:     mov    0xc(%ebp),%eax
    0x401046 <sum+6>:     add    0x8(%ebp),%eax
    0x401049 <sum+9>:     mov    %ebp,%esp
    0x40104b <sum+11>:    pop    %ebp
    0x40104c <sum+12>:    ret

Object code, starting at 0x401040:

    0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3

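The 13 bytes and the <sum+N> offsets are two views of the same data. As a rough sketch, the bytes can be written as a C array with one group per instruction; adding up the instruction lengths reproduces the offsets GDB prints. The names sum_code, insn_len, and insn_offset are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Machine-code bytes of sum, copied from the disassembly. */
const unsigned char sum_code[] = {
    0x55,               /* push %ebp            */
    0x89, 0xe5,         /* mov  %esp,%ebp       */
    0x8b, 0x45, 0x0c,   /* mov  0xc(%ebp),%eax  */
    0x03, 0x45, 0x08,   /* add  0x8(%ebp),%eax  */
    0x89, 0xec,         /* mov  %ebp,%esp       */
    0x5d,               /* pop  %ebp            */
    0xc3                /* ret                  */
};

/* Length in bytes of each of the seven instructions, in order. */
const size_t insn_len[] = {1, 2, 3, 3, 2, 1, 1};

/* Offset of instruction i from the start of sum (hypothetical helper). */
size_t insn_offset(size_t i) {
    size_t off = 0;
    for (size_t k = 0; k < i; k++) {
        off += insn_len[k];
    }
    return off;
}
```

In this sketch insn_offset(5) is 11, matching <sum+11> for pop %ebp, and insn_offset(7) is 13, the total number of bytes examined by x/13b.
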
Integer Registers (IA32)

Register   Origin (mostly obsolete)   Use
%eax       accumulate                 general purpose
%ecx       counter                    general purpose
%edx       data                       general purpose
%ebx       base                       general purpose
%esi       source index               general purpose
%edi       destination index          general purpose
%esp       stack pointer              special purpose
%ebp       base pointer               special purpose

Each register is 32 bits wide.
Some have special uses for particular instructions.

Integer Registers (historical artifacts)

32-bit   16-bit   High byte   Low byte   Origin
%eax     %ax      %ah         %al        accumulate
%ecx     %cx      %ch         %cl        counter
%edx     %dx      %dh         %dl        data
%ebx     %bx      %bh         %bl        base
%esi     %si                             source index
%edi     %di                             destination index
%esp     %sp                             stack pointer
%ebp     %bp                             base pointer

The 16-bit names are virtual registers (backwards compatible): each is the low half of the 32-bit register.
%ah/%al, %ch/%cl, %dh/%dl, %bh/%bl are the high/low bytes of the old 16-bit registers.

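One way to picture the nested names is as masks and shifts on a single 32-bit value. The sketch below models %eax as a uint32_t and extracts the parts that %ax, %ah, and %al name. The helper names are hypothetical; real code reaches these sub-registers by name in assembly, not through function calls.

```c
#include <stdint.h>

/* %ax: low 16 bits of %eax. */
uint16_t view_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* %ah: high byte of %ax, i.e. bits 8..15 of %eax. */
uint8_t view_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }

/* %al: low byte of %ax, i.e. bits 0..7 of %eax. */
uint8_t view_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }
```

For a value of 0x12345678 in this model, the %ax view is 0x5678, %ah is 0x56, and %al is 0x78. The top 16 bits (0x1234) have no name of their own.
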
IA32: Three Basic Kinds of Instructions

1. Data movement between memory and register
       Load data from memory into register:    %reg = Mem[address]
       Store register data into memory:        Mem[address] = %reg
   Memory is an array[] of bytes!
2. Arithmetic/logic on register or memory data
       c = a + b;    z = x << y;    i = h & g;
3. Comparisons and control flow to choose next instruction
       Unconditional jumps to/from procedures
       Conditional branches

Data movement instructions

movx Source, Dest
    x is one of { b, w, l } and gives the size of the data.

movl Source, Dest:    Move 4-byte “long word”
movw Source, Dest:    Move 2-byte “word”
movb Source, Dest:    Move 1-byte “byte”

“Word” and “long word” are historical terms from the 16-bit days, not the current machine word size.

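The effect of the size suffix is easiest to see on a store to memory. The following is a minimal sketch, not a model of the real hardware: memory is a plain byte array, and store_bytes (a hypothetical helper) writes the low 1, 2, or 4 bytes of a value, least significant byte first. The byte order is an addition to the slide; x86 is little-endian.

```c
#include <stddef.h>
#include <stdint.h>

/* Write the low `size` bytes of `value` to mem[addr], mem[addr+1], ...
   size is 1 for movb, 2 for movw, 4 for movl.
   Bytes are stored least significant first (little-endian, as on x86). */
void store_bytes(uint8_t *mem, size_t addr, uint32_t value, int size) {
    for (int i = 0; i < size; i++) {
        mem[addr + (size_t)i] = (uint8_t)(value >> (8 * i));
    }
}
```

With value 0x11223344, a size of 4 writes the bytes 0x44 0x33 0x22 0x11, a size of 2 writes 0x44 0x33, and a size of 1 writes only 0x44. Bytes beyond the given size are left untouched.
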
Data movement instructions

movl Source, Dest

Operand Types:
    Immediate: Literal integer data
        Examples: $0x400, $-533
    Register: One of 8 integer registers (%eax, %ecx, %edx, %ebx, %esi, %edi, %esp, %ebp)
        Examples: %eax, %edx
    Memory: 4 consecutive bytes in memory, at address held by register
        Simplest example: (%eax)
        Various other “address modes”

movl Operand Combinations

Source   Dest   Src,Dest               C Analog
Imm      Reg    movl $0x4,%eax         var_a = 0x4;
Imm      Mem    movl $-147,(%eax)      *p_a = -147;
Reg      Reg    movl %eax,%edx         var_d = var_a;
Reg      Mem    movl %eax,(%edx)       *p_d = var_a;
Mem      Reg    movl (%eax),%edx       var_d = *p_a;

Cannot do memory-memory transfer with a single instruction. How would you do it?

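One possible answer to the question above is to go through a register: a Mem-to-Reg move followed by a Reg-to-Mem move. The sketch below shows the same two steps in C. The function name copy_word and the choice of %ecx as the temporary register in the comments are assumptions made for illustration.

```c
/* Copy the int at *src to *dst in two steps, mirroring:
       movl (%eax),%ecx    # load:  temp = Mem[src]
       movl %ecx,(%edx)    # store: Mem[dst] = temp
   Here %eax is assumed to hold src and %edx to hold dst. */
void copy_word(const int *src, int *dst) {
    int temp = *src;    /* Mem -> Reg */
    *dst = temp;        /* Reg -> Mem */
}
```

The local variable temp plays the role of the register that carries the value between the two memory locations.
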
Basic Memory Addressing Modes

Indirect        (R)     Mem[Reg[R]]
    Register R specifies the memory address.
        movl (%ecx),%eax

Displacement    D(R)    Mem[Reg[R]+D]
    Register R specifies a memory address (e.g. the start of an object).
    Constant displacement D specifies the offset from that address (e.g. a field in the object).
        movl 8(%ebp),%edx

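The notation Mem[Reg[R]+D] can be read as an array lookup. As a rough sketch under simplifying assumptions, the code below treats memory as a small byte array and a register as a plain integer holding an address. load32 is a hypothetical helper that reads the 4 bytes at base + disp in little-endian order; indirect mode is the special case disp = 0.

```c
#include <stdint.h>

/* Read the 4-byte value at Mem[base + disp].
   base models Reg[R]; disp models the constant displacement D.
   Address arithmetic wraps modulo 2^32, so a negative disp works. */
uint32_t load32(const uint8_t *mem, uint32_t base, int32_t disp) {
    uint32_t addr = base + (uint32_t)disp;
    return (uint32_t)mem[addr]
         | ((uint32_t)mem[addr + 1] << 8)
         | ((uint32_t)mem[addr + 2] << 16)
         | ((uint32_t)mem[addr + 3] << 24);
}
```

In this model, movl (%ecx),%eax corresponds to load32(mem, ecx, 0), and movl 8(%ebp),%edx corresponds to load32(mem, ebp, 8).
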
Using Basic Addressing Modes

    void swap(int *xp, int *yp) {
        int t0 = *xp;
        int t1 = *yp;
        *xp = t1;
        *yp = t0;
    }

    swap:
        pushl %ebp              # Set Up
        movl  %esp,%ebp
        pushl %ebx

        movl  12(%ebp),%ecx     # Body
        movl  8(%ebp),%edx
        movl  (%ecx),%eax
        movl  (%edx),%ebx
        movl  %eax,(%edx)
        movl  %ebx,(%ecx)

        movl  -4(%ebp),%ebx     # Finish
        movl  %ebp,%esp
        popl  %ebp
        ret

Understanding Swap

    void swap(int *xp, int *yp) {
        int t0 = *xp;
        int t1 = *yp;
        *xp = t1;
        *yp = t0;
    }

Stack (in memory), higher addresses at the top, lower addresses at the bottom:

    Offset   Contents
             •
             •
             •
    12       yp
     8       xp
     4       Return addr
     0       Old %ebp       <-- %ebp
    -4       Old %ebx

Register <-> variable mapping:

    Register   Value
    %ecx       yp
    %edx       xp
    %eax       t1
    %ebx       t0

    movl 12(%ebp),%ecx    # ecx = yp
    movl 8(%ebp),%edx     # edx = xp
    movl (%ecx),%eax      # eax = *yp (t1)
    movl (%edx),%ebx      # ebx = *xp (t0)
    movl %eax,(%edx)      # *xp = eax
    movl %ebx,(%ecx)      # *yp = ebx

Understanding Swap: step by step

Initial memory, higher addresses at the top, lower addresses at the bottom (%ebp = 0x104):

    Address   Contents       Offset from %ebp
    0x124     123
    0x120     456
    0x11c
    0x118
    0x114
    0x110     0x120          12   (yp)
    0x10c     0x124           8   (xp)
    0x108     Return addr     4
    0x104                     0   <-- %ebp
    0x100                    -4

State change after each instruction of the body:

    Step   Instruction              Comment             Result
    1      movl 12(%ebp),%ecx       # ecx = yp          %ecx = 0x120
    2      movl 8(%ebp),%edx        # edx = xp          %edx = 0x124
    3      movl (%ecx),%eax         # eax = *yp (t1)    %eax = 456
    4      movl (%edx),%ebx         # ebx = *xp (t0)    %ebx = 123
    5      movl %eax,(%edx)         # *xp = eax         Mem[0x124] = 456
    6      movl %ebx,(%ecx)         # *yp = ebx         Mem[0x120] = 123

Final memory: address 0x124 holds 456 and address 0x120 holds 123. %esi and %edi are not used.

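The same sequence can be followed in C by naming one local variable per register. This is only a sketch of the six body instructions; the set-up and finish code (saving and restoring %ebp and %ebx) has no counterpart here, and the variable names are ordinary C locals chosen to match the register mapping.

```c
/* swap, with one statement per instruction of the body. */
void swap(int *xp, int *yp) {
    int *ecx = yp;      /* movl 12(%ebp),%ecx : ecx = yp       */
    int *edx = xp;      /* movl 8(%ebp),%edx  : edx = xp       */
    int eax = *ecx;     /* movl (%ecx),%eax   : eax = *yp (t1) */
    int ebx = *edx;     /* movl (%edx),%ebx   : ebx = *xp (t0) */
    *edx = eax;         /* movl %eax,(%edx)   : *xp = eax      */
    *ecx = ebx;         /* movl %ebx,(%ecx)   : *yp = ebx      */
}
```

Starting from *xp = 123 and *yp = 456, as in the walk-through, the sketch ends with *xp = 456 and *yp = 123.
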