CS356 Unit 7 Data Layout & Intermediate Stack Frames 7.2 - PowerPoint PPT Presentation

7.1 CS356 Unit 7 Data Layout & Intermediate Stack Frames

7.2 Structs CS:APP 3.9.1
• Structs are just collections of heterogeneous data
• Each member is laid out in consecutive memory locations, with some padding inserted to ensure alignment
  – Intel machines don't require alignment but perform better when it is used
  – Reordering can reduce size! www.catb.org/esr/structure-packing
  – “Each type aligned at a multiple of its size”

    struct Data1 {
        int x;
        char y;
    };

    struct Data2 {
        short w;
        char *p;
    };

    struct Data3 {
        struct Data1 f;
        int g;
    };

Layout (byte offsets):
    Data1:                 x at 0, y at 4
    Data2 (w/o padding):   w at 0, p at 2
    Data2 (w/ padding):    w at 0, padding, p at 8
    Data3:                 f.x at 0, f.y at 4, padding from 5, g at 8

7.3 Structs: Offsets in assembly
Assume 4-byte int / float, 8-byte long / double. Can you figure out the offsets for %rdi?

    struct record_t {
        char a[2];
        int b;
        long c;
        int d[3];
        short e;
    };

    void initialize(struct record_t *x) {
        x->a[1] = 1;
        x->b = 2;
        x->c = 3;
        x->d[1] = 4;
        x->e = 5;
    }

    initialize:
        movb $1, 1(%rdi)
        movl $2, 4(%rdi)
        movq $3, 8(%rdi)
        movl $4, 20(%rdi)
        movw $5, 28(%rdi)
        ret

Layout (byte offsets): a at 0-1, padding at 2-3, b at 4-7, c at 8-15, d[0] at 16-19, d[1] at 20-23, d[2] at 24-27, e at 28-29

7.4
We still want each member of the nested struct to start at a multiple of its size, but where should the nested struct itself start? Its start/end should have the largest alignment required by its members.

    struct B {      // this struct must start/end at a multiple of 4, because that's required by 'y'
        char x;     // 1 byte
        int y;      // 4 bytes (needs 3 bytes of padding before to start at a multiple of 4)
        char z;     // 1 byte (needs 3 bytes of padding after to end at a multiple of 4)
    };

    struct A {
        char a;     // 1 byte
        struct B b; // has 4-byte alignment: 3 bytes of padding before 'b'
        char c;     // also 3 bytes of padding before 'c', so that 'b' ends at a multiple of 4
    };

    void init(struct A *a) {
        a->a = 1;
        a->b.x = 2;
        a->b.y = 3;
        a->b.z = 4;
        a->c = 5;
    }

    $ gcc -fomit-frame-pointer -mno-red-zone -Og -S align.c; cat align.s | grep mov
        movb $1, (%rdi)
        movb $2, 4(%rdi)
        movl $3, 8(%rdi)
        movb $4, 12(%rdi)
        movb $5, 16(%rdi)

Layout (byte offsets): a at 0, b.x at 4, b.y at 8-11, b.z at 12, c at 16

7.5 Unions CS:APP 3.9.2
• Unions allow you to read/write the same memory region as variables with different types
  – All elements start at offset 0
  – The size of the union is the size of the biggest member (rounded up, if needed, to the union's alignment)
  – Elements must be POD (plain old data) or at least default-constructible

    union Data1 {
        int x;
        char y;
    };

    union Data2 {
        short w;
        char *p;
    };

    int main() {
        union Data1 item;
        item.x = 0x356;
        item.y = 'a';
    }

Layout (byte offsets):
    Data1: x and y both start at offset 0
    Data2: w and p both start at offset 0 (w/o padding)

Recall x86 uses little-endian. Bytes of item at offsets 0 1 2 3:
    after item.x = 0x356:   56 03 00 00
    after item.y = 'a':     61 03 00 00

7.6 Unions: Revealing Endianness
• 4-byte union
• x reads/writes an int
• bytes reads/writes 4 consecutive char
Note that bytes are stored in reversed order.

    #include <stdio.h>

    union int_bytes {
        int x;
        char bytes[4];
    };

    int main() {
        union int_bytes ib;
        ib.x = 256;
        printf("%08X is %02X %02X %02X %02X\n", ib.x,
               ib.bytes[3], ib.bytes[2], ib.bytes[1], ib.bytes[0]);
    }

    // prints:
    // 00000100 is 00 00 01 00

7.7 Unions: hex encoding of a float
• 4-byte union
• i reads/writes an int
• f reads/writes a float
Endianness not noticeable: members have same size.

    #include <stdio.h>

    union float_int {
        float f;
        int i;
    };

    int main() {
        union float_int fi;
        fi.f = 1.0;
        printf("%.2f is %08X\n", fi.f, fi.i);
    }

    // prints:
    // 1.00 is 3F800000

7.8 Buffer "overrun"/"overflow" attacks EXPLOITS VIA THE STACK AND THEIR PREVENTION

7.9 Array Bounds: Java, Python, C

Java:

    class Bounds {
        public static void main(String[] args) {
            int[] x = new int[10];
            for (int i = 0; i <= x.length; i++) {
                x[i] = i;
            }
        }
    }

    $ javac Bounds.java
    $ java Bounds
    Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 10
        at Bounds.main(Bounds.java:7)

Python:

    x = [0] * 10
    # not pythonic! but still...
    for i in range(len(x) + 1):
        x[i] = i

    $ python3 bounds.py
    Traceback (most recent call last):
      File "bounds.py", line 5, in <module>
        x[i] = i
    IndexError: list assignment index out of range

C:

    #include <stdio.h>

    int main() {
        int x[10];
        for (int i = 0; i <= 10; i++) {
            x[i] = i;
        }
    }

    $ gcc bounds.c -o bounds
    $ ./bounds
    $

No failure! Why?

7.10 Arrays and Bounds Check CS:APP 3.10.3
• Many functions, especially those related to strings, may not check the bounds of an array
• User or other input may overflow a fixed size array
  – Suppose the user types or passes "Tommy" to greet() or func1()
  – Note: gets() receives input from 'stdin' until the user enters '\n' and places the string in the given array (no bound checks!)

    void greet() {
        char name[10];
        gets(name);
        ...
    }

    void func1(char *str) {
        char copy[10];
        strcpy(copy, str);
        ...
    }

"Tommy" = 54 6f 6d 6d 79 00

Memory:
    name (0x7fffffef0, indices 0-9):   'T' 'o' 'm' 'm' 'y' 00 ...
    str  (0x7fffffa80, indices 0-5):   'T' 'o' 'm' 'm' 'y' 00
    copy (0x7fffffef0, indices 0-9):   'T' 'o' 'm' 'm' 'y' 00 ...

7.11 Arrays and Bounds Check
• Many functions, especially those related to strings, may not check the bounds of an array
• User or other input may overflow a fixed size array
  – Suppose the user types or passes "Tommy" to greet() or func1()
  – Now suppose the user types or passes "Bartholomew"

    void greet() {
        char name[10];
        gets(name);
        ...
    }

    void func1(char *str) {
        char copy[10];
        strcpy(copy, str);
        ...
    }

Memory:
    name (0x7fffffef0, indices 0-9):   'B' 'a' 'r' 't' 'h' 'o' ... 'e' 'w' 00
    str  (0x7fffffa80, indices 0-11):  'B' 'a' 'r' 't' 'h' 'o' ... 'e' 'w' 00
    copy (0x7fffffef0, indices 0-9):   ...

What are we overwriting?

7.12 Buffer Overflow
• Now recall these local arrays are stored on the stack where the return address is also stored
• gets() will copy as much as the user types (until they enter the '\n' = 0x0a), overwriting anything on the stack

    void greet() {
        char name[12];
        gets(name);
        printf("Hello %s\n", name);
    }

    greet:
        subq $24, %rsp
        movq %rsp, %rdi
        movl $0, %eax
        call gets
        movl $.LC0, %esi
        movl $1, %edi
        movl $0, %eax
        call __printf_chk
        addq $24, %rsp
        ret

"Tommy" = 54 6f 6d 6d 79 00

Memory / RAM (4-byte words, highest address first):
    ...          0xfffffffc
    0000 0000                  Return Address
    0004 a048    0x7ffff0f8    Return Address
    0000 0000    0x7ffff0f4
    0000 0000    0x7ffff0f0
    0000 0000    0x7ffff0ec
    0000 0000    0x7ffff0e8
    0000 0079    0x7ffff0e4
    6d6d 6f54    0x7ffff0e0    name
    ...          0x0

Processor:
    rip  0000 0000 0004 d8c4
    rsp  0000 0000 7fff f0e0
    rdi  0000 0000 0000 0000

7.13 Overwriting the Return Address
• An intelligent user could carefully craft a "long" input array and overwrite the return address with a desired value
• How could this be exploited?

    void greet() {
        char name[12];
        gets(name);
        printf("Hello %s\n", name);
    }

    greet:
        subq $24, %rsp
        movq %rsp, %rdi
        movl $0, %eax
        call gets
        movl $.LC0, %esi
        movl $1, %edi
        movl $0, %eax
        call __printf_chk
        addq $24, %rsp
        ret

User string:
    54 6f 6d 6d 79 1e ac 5f 47 80 81
    62 37 48 31 92 54 93 61 72 39 72
    41 20 e8 73 32 3c 68 92 14 43

Memory / RAM (4-byte words, highest address first):
    ...          0xfffffffc
    4314 9268                  Overwritten Return Address
    3c32 73e8    0x7ffff0f8    Overwritten Return Address
    2041 7239    0x7ffff0f4
    7261 9354    0x7ffff0f0
    9231 4837    0x7ffff0ec
    6281 8047    0x7ffff0e8
    5fac 1e79    0x7ffff0e4
    6d6d 6f54    0x7ffff0e0    name
    ...          0x0

Processor:
    rip  0000 0000 0004 d8c4
    rsp  0000 0000 7fff f0e0
    rdi  0000 0000 0000 0000

7.14 Executing Code CS:APP 3.10.4
• We could determine the desired machine code for some sequence we want to execute on the machine and enter that as our string
• We can then craft a return address to go to the starting location of our code

    void greet() {
        char name[12];
        gets(name);
        printf("Hello %s\n", name);
    }

    greet:
        subq $24, %rsp
        movq %rsp, %rdi
        movl $0, %eax
        call gets
        movl $.LC0, %esi
        movl $1, %edi
        movl $0, %eax
        call __printf_chk
        addq $24, %rsp
        ret

User string:
    54 6f 6d 6d 79 1e ac 5f 47 80 81
    62 37 48 31 92 54 93 61 72 39 72
    41 20 e8 f0 ff 7f 00 00 00 00

Memory / RAM (4-byte words, highest address first):
    ...          0xfffffffc
    0000 0000                  Overwritten Return Address
    7fff f0e8    0x7ffff0f8    Overwritten Return Address
    2041 7239    0x7ffff0f4
    7261 9354    0x7ffff0f0
    9231 4837    0x7ffff0ec
    6281 8047    0x7ffff0e8
    5fac 1e79    0x7ffff0e4
    6d6d 6f54    0x7ffff0e0    name
    ...          0x0

Processor:
    rip  0000 0000 0004 d8c4
    rsp  0000 0000 7fff f0e0
    rdi  0000 0000 0000 0000

7.15 Exploits
• Common code that we try to inject on the stack would start a shell so that we can now type any other commands
• We can enter specific binary codes when a program prompts for a string
  – Typing "\x54\x6f\x5d..." lets you enter arbitrary byte values by writing each byte in hex representation with the \x prefix

7.16 Methods of Prevention
• Various methods have been devised to prevent or make it harder to exploit this code
  – Better libraries that do not allow an overrun
      strcpy(char* dest, char* src)
      strncpy(char* dest, char* src, size_t len)
  – Add a stack protector (e.g., canary values)
  – Address space layout randomization (ASLR) techniques
  – Privilege/access control bits
