CS241 Computer Organization, Spring 2015: Buffer Overflow (4-02-2015)



slide-1
SLIDE 1

CS241
 Computer Organization
 Spring 2015

Buffer Overflow 4-02–2015

slide-2
SLIDE 2

Outline

Linking & Loading, continued
Buffer Overflow

Read:

■ CSAPP2: section 3.12: out-of-bounds memory references & buffer overflow
■ K&R: Chapter 5, section 5.11
■ C Traps & Pitfalls (course website, on-line references)

Quiz today on IA32 (HW4)
Quiz Tuesday, April 7th on run-time stack (HW5)
Lab#3 BufferLab goes live tomorrow
HW#7 due today
HW#6 due: Tuesday, April 7th

slide-3
SLIDE 3

Carnegie Mellon

Linker Symbols

⬛ Global symbols

▪ Symbols defined by module m that can be referenced by other modules.
▪ E.g.: non-static C functions and non-static global variables.

⬛ External symbols

▪ Global symbols that are referenced by module m but defined by some other module.

⬛ Local symbols

▪ Symbols that are defined and referenced exclusively by module m.
▪ E.g.: C functions and variables defined with the static attribute.
▪ Local linker symbols are not local program variables.

slide-4
SLIDE 4

Resolving Symbols

main.c:

int buf[2] = {1, 2};            /* buf: Global */

int main()
{
    swap();                     /* swap: External */
    return 0;
}

swap.c:

extern int buf[];               /* buf: External */

static int *bufp0 = &buf[0];    /* bufp0: Local */
static int *bufp1;              /* bufp1: Local */

void swap()                     /* swap: Global */
{
    int temp;                   /* Linker knows nothing of temp */

    bufp1 = &buf[1];
    temp = *bufp0;
    *bufp0 = *bufp1;
    *bufp1 = temp;
}

slide-5
SLIDE 5

Relocating Code and Data

Relocatable Object Files:

  System code                  .text
  System data                  .data

  main.o
    main()                     .text
    int buf[2]={1,2}           .data

  swap.o
    swap()                     .text
    int *bufp0=&buf[0]         .data
    int *bufp1                 .bss

Executable Object File:

  Headers
  .text:   System code, main(), swap(), More system code
  .data:   System data, int buf[2]={1,2}, int *bufp0=&buf[0]
  .bss:    Uninitialized data (int *bufp1)
  .symtab, .debug

slide-6
SLIDE 6

Relocation Info (main)

main.c:

int buf[2] = {1,2};

int main()
{
    swap();
    return 0;
}

main.o:

0000000 <main>:
   0: 55               push   %ebp
   1: 89 e5            mov    %esp,%ebp
   3: 83 ec 08         sub    $0x8,%esp
   6: e8 fc ff ff ff   call   7 <main+0x7>
                       7: R_386_PC32 swap
   b: 31 c0            xor    %eax,%eax
   d: 89 ec            mov    %ebp,%esp
   f: 5d               pop    %ebp
  10: c3               ret

Disassembly of section .data:

00000000 <buf>:
   0: 01 00 00 00 02 00 00 00

Source: objdump

slide-7
SLIDE 7

Relocation Info (swap, .text)

swap.c:

extern int buf[];

static int *bufp0 = &buf[0];
static int *bufp1;

void swap()
{
    int temp;

    bufp1 = &buf[1];
    temp = *bufp0;
    *bufp0 = *bufp1;
    *bufp1 = temp;
}

swap.o:

Disassembly of section .text:

00000000 <swap>:
   0: 55                     push   %ebp
   1: 8b 15 00 00 00 00      mov    0x0,%edx
                             3: R_386_32 bufp0
   7: a1 04 00 00 00         mov    0x4,%eax
                             8: R_386_32 buf
   c: 89 e5                  mov    %esp,%ebp
   e: c7 05 00 00 00 00 04   movl   $0x4,0x0
  15: 00 00 00
                             10: R_386_32 bufp1
                             14: R_386_32 buf
  18: 89 ec                  mov    %ebp,%esp
  1a: 8b 0a                  mov    (%edx),%ecx
  1c: 89 02                  mov    %eax,(%edx)
  1e: a1 00 00 00 00         mov    0x0,%eax
                             1f: R_386_32 bufp1
  23: 89 08                  mov    %ecx,(%eax)
  25: 5d                     pop    %ebp
  26: c3                     ret

slide-8
SLIDE 8

Relocation Info (swap, .data)

swap.c:

extern int buf[];

static int *bufp0 = &buf[0];
static int *bufp1;

void swap()
{
    int temp;

    bufp1 = &buf[1];
    temp = *bufp0;
    *bufp0 = *bufp1;
    *bufp1 = temp;
}

Disassembly of section .data:

00000000 <bufp0>:
   0: 00 00 00 00
      0: R_386_32 buf

slide-9
SLIDE 9

Executable After Relocation (.text)

080483b4 <main>:
 80483b4: 55                     push   %ebp
 80483b5: 89 e5                  mov    %esp,%ebp
 80483b7: 83 ec 08               sub    $0x8,%esp
 80483ba: e8 09 00 00 00         call   80483c8 <swap>
 80483bf: 31 c0                  xor    %eax,%eax
 80483c1: 89 ec                  mov    %ebp,%esp
 80483c3: 5d                     pop    %ebp
 80483c4: c3                     ret

080483c8 <swap>:
 80483c8: 55                     push   %ebp
 80483c9: 8b 15 5c 94 04 08      mov    0x804945c,%edx
 80483cf: a1 58 94 04 08         mov    0x8049458,%eax
 80483d4: 89 e5                  mov    %esp,%ebp
 80483d6: c7 05 48 95 04 08 58   movl   $0x8049458,0x8049548
 80483dd: 94 04 08
 80483e0: 89 ec                  mov    %ebp,%esp
 80483e2: 8b 0a                  mov    (%edx),%ecx
 80483e4: 89 02                  mov    %eax,(%edx)
 80483e6: a1 48 95 04 08         mov    0x8049548,%eax
 80483eb: 89 08                  mov    %ecx,(%eax)
 80483ed: 5d                     pop    %ebp
 80483ee: c3                     ret

slide-10
SLIDE 10

Executable After Relocation (.data)

Disassembly of section .data:

08049454 <buf>:
 8049454: 01 00 00 00 02 00 00 00

0804945c <bufp0>:
 804945c: 54 94 04 08
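The patched bytes in the executable can be recomputed from the relocation entries shown for main.o and swap.o. The following is a minimal sketch of that arithmetic, assuming the usual IA32 ELF rules: R_386_PC32 stores S + A - P and R_386_32 stores S + A, where S is the symbol's final address, A is the addend already present in the instruction bytes, and P is the address of the 4-byte field being patched. The function names are illustrative only and do not belong to any toolchain.

```c
#include <stdint.h>

/* R_386_PC32: PC-relative reference (used for the call to swap). */
uint32_t reloc_pc32(uint32_t sym_addr, int32_t addend, uint32_t field_addr)
{
    return sym_addr + (uint32_t)addend - field_addr;
}

/* R_386_32: absolute reference (used for buf, bufp0 and bufp1). */
uint32_t reloc_abs32(uint32_t sym_addr, int32_t addend)
{
    return sym_addr + (uint32_t)addend;
}
```

For the call in main, the field sits at 0x80483bb (one byte past the e8 opcode at 0x80483ba), the addend is 0xfffffffc (that is, -4), and swap ends up at 0x80483c8, so reloc_pc32(0x80483c8, -4, 0x80483bb) gives 0x9, matching the bytes e8 09 00 00 00. For &buf[1], reloc_abs32(0x8049454, 4) gives 0x8049458, the constant stored by the movl instruction.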

slide-11
SLIDE 11

Strong and Weak Symbols

⬛ Program symbols are either strong or weak

▪ Strong: procedures and initialized globals
▪ Weak: uninitialized globals

p1.c:

int foo=5;      /* strong */
p1() { }        /* strong */

p2.c:

int foo;        /* weak */
p2() { }        /* strong */

slide-12
SLIDE 12

Linker’s Symbol Rules

⬛ Rule 1: Multiple strong symbols are not allowed

▪ Each item can be defined only once
▪ Otherwise: Linker error

⬛ Rule 2: Given a strong symbol and multiple weak symbols, choose the strong symbol

▪ References to the weak symbol resolve to the strong symbol

⬛ Rule 3: If there are multiple weak symbols, pick an arbitrary one

▪ Can override this with gcc -fno-common

slide-13
SLIDE 13

Linker Puzzles

Each puzzle shows two modules, left and right, that are linked together.

int x; p1() {}                    |  p1() {}
▪ Link time error: two strong symbols (p1)

int x; p1() {}                    |  int x; p2() {}
▪ References to x will refer to the same uninitialized int. Is this what you really want?

int x; int y; p1() {}             |  double x; p2() {}
▪ Writes to x in p2 might overwrite y! Evil!

int x=7; int y=5; p1() {}         |  double x; p2() {}
▪ Writes to x in p2 will overwrite y! Nasty!

int x=7; p1() {}                  |  int x; p2() {}
▪ References to x will refer to the same initialized variable.

Nightmare scenario: two identical weak structs, compiled by different compilers with different alignment rules.
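The "writes to x in p2 will overwrite y" puzzle can be imitated inside a single file. This is only a sketch under stated assumptions: a 4-byte int, an 8-byte double, and y placed directly after x in memory, none of which the language guarantees. The struct and function names are made up for illustration, and memcpy stands in for the 8-byte store that p2 would perform through its double x.

```c
#include <string.h>

/* Stand-in for p1's two adjacent 4-byte globals. */
struct p1_globals {
    int x;
    int y;
};

/* Imitates p2 executing "x = value" while believing x is a double:
   the bytes of the double are stored starting at the address of x.
   The copy is capped at the struct size so the demo stays in bounds. */
void p2_store_double(struct p1_globals *g, double value)
{
    size_t n = sizeof value < sizeof *g ? sizeof value : sizeof *g;
    memcpy(g, &value, n);
}
```

Starting from x = 7 and y = 5, calling p2_store_double(&g, 1.0) leaves y holding part of the bit pattern of 1.0 rather than 5, even though p2 never mentions y.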

slide-14
SLIDE 14

Global Variables

⬛ Avoid if you can
⬛ Otherwise

▪ Use static if you can
▪ Initialize if you define a global variable
▪ Use extern if you use an external global variable

slide-15
SLIDE 15

Packaging Commonly Used Functions

⬛ How to package functions commonly used by programmers?

▪ Math, I/O, memory management, string manipulation, etc.

⬛ Awkward, given the linker framework so far:

▪ Option 1: Put all functions into a single source file
  ▪ Programmers link big object file into their programs
  ▪ Space and time inefficient

▪ Option 2: Put each function in a separate source file
  ▪ Programmers explicitly link appropriate binaries into their programs
  ▪ More efficient, but burdensome on the programmer

slide-16
SLIDE 16

Solution: Static Libraries

⬛ Static libraries (.a archive files)

▪ Concatenate related relocatable object files into a single file with an index (called an archive).
▪ Enhance linker so that it tries to resolve unresolved external references by looking for the symbols in one or more archives.
▪ If an archive member file resolves a reference, link it into the executable.

slide-17
SLIDE 17

Creating Static Libraries

atoi.c   -> Translator -> atoi.o
printf.c -> Translator -> printf.o
...
random.c -> Translator -> random.o

atoi.o, printf.o, ..., random.o -> Archiver (ar) -> libc.a (C standard library)

unix> ar rs libc.a \
         atoi.o printf.o … random.o

⬛ Archiver allows incremental updates
⬛ Recompile function that changes and replace .o file in archive.

slide-18
SLIDE 18

Commonly Used Libraries

libc.a (the C standard library)
▪ 8 MB archive of 900 object files.
▪ I/O, memory allocation, signal handling, string handling, date and time, random numbers, integer math

libm.a (the C math library)
▪ 1 MB archive of 226 object files.
▪ floating point math (sin, cos, tan, log, exp, sqrt, …)

% ar -t /usr/lib/libc.a | sort
…
fork.o
…
fprintf.o
fpu_control.o
fputc.o
freopen.o
fscanf.o
fseek.o
fstab.o
…

% ar -t /usr/lib/libm.a | sort
…
e_acos.o
e_acosf.o
e_acosh.o
e_acoshf.o
e_acoshl.o
e_acosl.o
e_asin.o
e_asinf.o
e_asinl.o
…

slide-19
SLIDE 19

Linking with Static Libraries

main2.c, vector.h -> Translators (cpp, cc1, as) -> main2.o (relocatable object files)

addvec.o, multvec.o -> Archiver (ar) -> libvector.a (static libraries: libvector.a, libc.a)

Linker (ld) inputs:
  main2.o
  addvec.o (from libvector.a)
  printf.o and any other modules called by printf.o (from libc.a)

Linker (ld) output: p2 (fully linked executable object file)

slide-20
SLIDE 20

Using Static Libraries

⬛ Linker’s algorithm for resolving external references:

▪ Scan .o files and .a files in the command line order.
▪ During the scan, keep a list of the current unresolved references.
▪ As each new .o or .a file, obj, is encountered, try to resolve each unresolved reference in the list against the symbols defined in obj.
▪ If any entries remain in the unresolved list at the end of the scan, then error.

⬛ Problem:

▪ Command line order matters!
▪ Moral: put libraries at the end of the command line.

unix> gcc -L. libtest.o -lmine
unix> gcc -L. -lmine libtest.o
libtest.o: In function `main':
libtest.o(.text+0x4): undefined reference to `libfun'
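The left-to-right scan can be imitated with a toy model to see why the second gcc command fails. This is a sketch, not how ld is implemented: each input is reduced to a list of defined and referenced symbols, an archive is treated as a single member, and all names (struct input_file, count_unresolved, the symbol lists) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 32

/* Toy model of one command-line input; symbol lists are NULL-terminated. */
struct input_file {
    bool is_archive;        /* archives are linked in only when needed */
    const char *defs[4];    /* symbols this file defines */
    const char *refs[4];    /* symbols this file references */
};

static bool in_set(const char *const *set, size_t n, const char *sym)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(set[i], sym) == 0)
            return true;
    return false;
}

/* Scans files left to right and returns how many references remain
   unresolved at the end (0 means the link would succeed). */
size_t count_unresolved(const struct input_file files[], size_t nfiles)
{
    const char *defined[MAX_SYMS];
    const char *unresolved[MAX_SYMS];
    size_t ndef = 0, nunres = 0;

    for (size_t f = 0; f < nfiles; f++) {
        const struct input_file *in = &files[f];

        if (in->is_archive) {
            /* An archive is used only if it defines something that is
               unresolved right now; otherwise it is skipped for good. */
            bool needed = false;
            for (size_t d = 0; d < 4 && in->defs[d]; d++)
                if (in_set(unresolved, nunres, in->defs[d]))
                    needed = true;
            if (!needed)
                continue;
        }

        for (size_t d = 0; d < 4 && in->defs[d]; d++) {
            if (ndef < MAX_SYMS)
                defined[ndef++] = in->defs[d];
            /* Drop the newly defined symbol from the unresolved list. */
            for (size_t u = 0; u < nunres; u++) {
                if (strcmp(unresolved[u], in->defs[d]) == 0) {
                    unresolved[u] = unresolved[--nunres];
                    break;
                }
            }
        }

        /* References that nothing has defined so far become unresolved. */
        for (size_t r = 0; r < 4 && in->refs[r]; r++)
            if (!in_set(defined, ndef, in->refs[r]) &&
                !in_set(unresolved, nunres, in->refs[r]) &&
                nunres < MAX_SYMS)
                unresolved[nunres++] = in->refs[r];
    }
    return nunres;
}
```

With an object file that defines main and references libfun, followed by an archive that defines libfun, the function returns 0. With the archive first, the archive is skipped because nothing is unresolved yet, and libfun is left over at the end, which corresponds to the undefined reference error above.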

slide-21
SLIDE 21

Loading Executable Object Files

Executable Object File:

  ELF header
  Program header table (required for executables)
  .init section
  .text section
  .rodata section
  .data section
  .bss section
  .symtab
  .debug
  .line
  .strtab
  Section header table (required for relocatables)

Memory image (high addresses at top):

  Kernel virtual memory (memory invisible to user code)
  0xc0000000
  User stack (created at runtime)                 <- %esp (stack pointer)
  Memory-mapped region for shared libraries       (0x40000000)
  Run-time heap (created by malloc)               <- brk
  Read/write segment (.data, .bss)                loaded from the executable file
  Read-only segment (.init, .text, .rodata)       loaded from the executable file
  0x08048000
  Unused

slide-22
SLIDE 22
slide-24
SLIDE 24

Internet Worm and IM War

⬛ November, 1988

▪ Internet Worm attacks thousands of Internet hosts.
▪ How did it happen?

⬛ July, 1999

▪ Microsoft launches MSN Messenger (instant messaging system).
▪ Messenger clients can access popular AOL Instant Messaging Service (AIM) servers

[Figure: AIM and MSN clients and servers]

slide-25
SLIDE 25

Internet Worm and IM War (cont.)

⬛ August 1999

▪ Mysteriously, Messenger clients can no longer access AIM servers.
▪ Microsoft and AOL begin the IM war:
  ▪ AOL changes server to disallow Messenger clients
  ▪ Microsoft makes changes to clients to defeat AOL changes.
  ▪ At least 13 such skirmishes.
▪ How did it happen?

⬛ The Internet Worm and AOL/Microsoft War were both based on stack buffer overflow exploits!

▪ Many Unix functions do not check argument sizes.
▪ This allows target buffers to overflow.

slide-26
SLIDE 26

String Library Code

⬛ Implementation of Unix function gets()

▪ No way to specify limit on number of characters to read

⬛ Similar problems with other Unix functions

▪ strcpy: Copies string of arbitrary length
▪ scanf, fscanf, sscanf, when given %s conversion specification

/* Get string from stdin */
char *gets(char *dest)
{
    int c = getchar();
    char *p = dest;
    while (c != EOF && c != '\n') {
        *p++ = c;
        c = getchar();
    }
    *p = '\0';
    return dest;
}

slide-27
SLIDE 27

Vulnerable Buffer Code

int main()
{
    printf("Type a string:");
    echo();
    return 0;
}

/* Echo Line */
void echo()
{
    char buf[4]; /* Way too small! */
    gets(buf);
    puts(buf);
}

unix>./bufdemo
Type a string:1234567
1234567

unix>./bufdemo
Type a string:12345678
Segmentation Fault

unix>./bufdemo
Type a string:123456789ABC
Segmentation Fault

slide-28
SLIDE 28

Buffer Overflow Disassembly

080484f0 <echo>:
 80484f0: 55               push   %ebp
 80484f1: 89 e5            mov    %esp,%ebp
 80484f3: 53               push   %ebx
 80484f4: 8d 5d f8         lea    0xfffffff8(%ebp),%ebx
 80484f7: 83 ec 14         sub    $0x14,%esp
 80484fa: 89 1c 24         mov    %ebx,(%esp)
 80484fd: e8 ae ff ff ff   call   80484b0 <gets>
 8048502: 89 1c 24         mov    %ebx,(%esp)
 8048505: e8 8a fe ff ff   call   8048394 <puts@plt>
 804850a: 83 c4 14         add    $0x14,%esp
 804850d: 5b               pop    %ebx
 804850e: c9               leave
 804850f: c3               ret

Call site in main:

 80485f2: e8 f9 fe ff ff   call   80484f0 <echo>
 80485f7: 8b 5d fc         mov    0xfffffffc(%ebp),%ebx
 80485fa: c9               leave
 80485fb: 31 c0            xor    %eax,%eax
 80485fd: c3               ret

slide-29
SLIDE 29

Buffer Overflow Stack

/* Echo Line */
void echo()
{
    char buf[4]; /* Way too small! */
    gets(buf);
    puts(buf);
}

echo:
    pushl %ebp              # Save %ebp on stack
    movl  %esp, %ebp
    pushl %ebx              # Save %ebx
    leal  -8(%ebp),%ebx     # Compute buf as %ebp-8
    subl  $20, %esp         # Allocate stack space
    movl  %ebx, (%esp)      # Push buf on stack
    call  gets              # Call gets
    . . .

Before call to gets (high addresses at top):

  Stack Frame for main
  Return Address
  Saved %ebp              <- %ebp
  [3] [2] [1] [0]         buf
  Stack Frame for echo

slide-30
SLIDE 30

Buffer Overflow Stack Example

unix> gdb bufdemo
(gdb) break echo
Breakpoint 1 at 0x8048583
(gdb) run
Breakpoint 1, 0x8048583 in echo ()
(gdb) print /x $ebp
$1 = 0xffffc638
(gdb) print /x *(unsigned *)$ebp
$2 = 0xffffc658
(gdb) print /x *((unsigned *)$ebp + 1)
$3 = 0x80485f7

 80485f2: call 80484f0 <echo>
 80485f7: mov  0xfffffffc(%ebp),%ebx   # Return Point

Before call to gets (high addresses at top):

  Stack Frame for main    <- 0xffffc658
  Return Address
  Saved %ebp              <- 0xffffc638
  [3] [2] [1] [0]         buf
  Stack Frame for echo

Before call to gets, with values:

  Stack Frame for main    <- 0xffffc658
  08 04 85 f7             Return Address
  ff ff c6 58             Saved %ebp   <- 0xffffc638
  xx xx xx xx             buf
  Stack Frame for echo

slide-31
SLIDE 31

Buffer Overflow Example #1

Overflow buf, but no problem

Before call to gets:

  Stack Frame for main    <- 0xffffc658
  08 04 85 f7             Return Address
  ff ff c6 58             Saved %ebp   <- 0xffffc638
  xx xx xx xx             buf
  Stack Frame for echo

Input 1234567:

  Stack Frame for main    <- 0xffffc658
  08 04 85 f7             Return Address
  ff ff c6 58             Saved %ebp   <- 0xffffc638
  00 37 36 35
  34 33 32 31             buf
  Stack Frame for echo

slide-32
SLIDE 32

Buffer Overflow Example #2

Base pointer corrupted

Before call to gets:

  Stack Frame for main    <- 0xffffc658
  08 04 85 f7             Return Address
  ff ff c6 58             Saved %ebp   <- 0xffffc638
  xx xx xx xx             buf
  Stack Frame for echo

Input 12345678:

  Stack Frame for main    <- 0xffffc658
  08 04 85 f7             Return Address
  ff ff c6 00             Saved %ebp   <- 0xffffc638
  38 37 36 35
  34 33 32 31             buf
  Stack Frame for echo

 . . .
 804850a: 83 c4 14   add   $0x14,%esp   # deallocate space
 804850d: 5b         pop   %ebx         # restore %ebx
 804850e: c9         leave              # movl %ebp, %esp; popl %ebp
 804850f: c3         ret                # Return

slide-33
SLIDE 33

Buffer Overflow Example #3

Return address corrupted

Before call to gets:

  Stack Frame for main    <- 0xffffc658
  08 04 85 f7             Return Address
  ff ff c6 58             Saved %ebp   <- 0xffffc638
  xx xx xx xx             buf
  Stack Frame for echo

Input 123456789ABC:

  Stack Frame for main    <- 0xffffc658
  08 04 85 00             Return Address
  43 42 41 39             Saved %ebp   <- 0xffffc638
  38 37 36 35
  34 33 32 31             buf
  Stack Frame for echo

 80485f2: call 80484f0 <echo>
 80485f7: mov  0xfffffffc(%ebp),%ebx   # Return Point
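The three examples can be replayed safely by copying the input into a 16-byte array laid out like echo's frame instead of onto the real stack. This is a sketch only: the offsets are read off the disassembly (buf at %ebp-8, the pushed %ebx at %ebp-4, saved %ebp at %ebp, return address at %ebp+4), the initial values come from the gdb session, and the function and type names are invented for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Byte offsets in the model frame, lowest address first. */
enum { BUF_OFF = 0, EBX_OFF = 4, EBP_OFF = 8, RET_OFF = 12, FRAME_SIZE = 16 };

struct frame_result {
    uint32_t saved_ebp;
    uint32_t ret_addr;
};

/* Store a 32-bit value in little-endian byte order, as IA32 does. */
static void put_le32(unsigned char *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

static uint32_t get_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Copies input plus its terminating NUL into the model frame, as gets()
   would, and reports the saved %ebp and return address afterwards.
   Input that would run past the model is truncated, so the simulation
   itself never overflows. */
struct frame_result simulate_echo(const char *input)
{
    unsigned char frame[FRAME_SIZE] = {0};
    put_le32(frame + EBP_OFF, 0xffffc658u);
    put_le32(frame + RET_OFF, 0x080485f7u);

    size_t n = strlen(input) + 1;   /* gets() also stores '\0' */
    if (n > FRAME_SIZE)
        n = FRAME_SIZE;
    memcpy(frame + BUF_OFF, input, n);

    struct frame_result r = { get_le32(frame + EBP_OFF),
                              get_le32(frame + RET_OFF) };
    return r;
}
```

For "1234567" both values are unchanged, since the extra bytes land only in the saved %ebx slot. For "12345678" the terminating NUL turns the saved %ebp into 0xffffc600, and for "123456789ABC" the saved %ebp becomes 0x43424139 and the return address 0x08048500, matching the byte diagrams.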

slide-34
SLIDE 34

Malicious Use of Buffer Overflow

⬛ Input string contains byte representation of executable code
⬛ Overwrite return address with address of buffer
⬛ When bar() executes ret, will jump to exploit code

void foo(){
    bar();
    ...           /* return address A */
}

int bar() {
    char buf[64];
    gets(buf);
    ...
    return ...;
}

Stack after call to gets() (high addresses at top):

  foo stack frame
  B                 (return address A overwritten with B)
  pad               data written by gets()
  exploit code      data written by gets()   <- B
  bar stack frame

slide-35
SLIDE 35

Exploits Based on Buffer Overflows

⬛ Buffer overflow bugs allow remote machines to execute arbitrary code on victim machines

⬛ Internet worm

▪ Early versions of the finger server (fingerd) used gets() to read the argument sent by the client:
  ▪ finger droh@cs.cmu.edu
▪ Worm attacked fingerd server by sending phony argument:
  ▪ finger “exploit-code padding new-return-address”
  ▪ exploit code: executed a root shell on the victim machine with a direct TCP connection to the attacker.

slide-36
SLIDE 36

Exploits Based on Buffer Overflows

⬛ Buffer overflow bugs allow remote machines to execute arbitrary code on victim machines

⬛ IM War

▪ AOL exploited existing buffer overflow bug in AIM clients
▪ exploit code: returned 4-byte signature (the bytes at some location in the AIM client) to server.
▪ When Microsoft changed code to match signature, AOL changed signature location.

slide-37
SLIDE 37

Date: Wed, 11 Aug 1999 11:30:57 -0700 (PDT)
From: Phil Bucking <philbucking@yahoo.com>
Subject: AOL exploiting buffer overrun bug in their own software!
To: rms@pharlap.com

Mr. Smith,

I am writing you because I have discovered something that I think you might find interesting because you are an Internet security expert with experience in this area. I have also tried to contact AOL but received no response.

I am a developer who has been working on a revolutionary new instant messaging client that should be released later this year.
...
It appears that the AIM client has a buffer overrun bug. By itself this might not be the end of the world, as MS surely has had its share. But AOL is now *exploiting their own buffer overrun bug* to help in its efforts to block MS Instant Messenger.
....
Since you have significant credibility with the press I hope that you can use this information to help inform people that behind AOL's friendly exterior they are nefariously compromising peoples' security.

Sincerely,
Phil Bucking
Founder, Bucking Consulting
philbucking@yahoo.com

It was later determined that this email originated from within Microsoft!

slide-38
SLIDE 38

Code Red Worm

⬛ History

▪ June 18, 2001. Microsoft announces buffer overflow vulnerability in IIS Internet server
▪ July 19, 2001. Over 250,000 machines infected by new virus in 9 hours
▪ White House must change its IP address. Pentagon shut down public WWW servers for a day

⬛ When We Set Up CS:APP Web Site

▪ Received strings of form

GET /default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN....NNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u9090%u8190%u00c3%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a
HTTP/1.0" 400 325 "-" "-"

slide-39
SLIDE 39

Code Red Exploit Code

⬛ Starts 100 threads running

⬛ Spread self

▪ Generate random IP addresses & send attack string
▪ Between 1st & 19th of month

⬛ Attack www.whitehouse.gov

▪ Send 98,304 packets; sleep for 4-1/2 hours; repeat
  ▪ Denial of service attack
▪ Between 21st & 27th of month

⬛ Deface server’s home page

▪ After waiting 2 hours

slide-40
SLIDE 40

Code Red Effects

⬛ Later Version Even More Malicious

▪ Code Red II
▪ As of April, 2002, over 18,000 machines infected
▪ Still spreading

⬛ Paved Way for NIMDA

▪ Variety of propagation methods
▪ One was to exploit vulnerabilities left behind by Code Red II

⬛ ASIDE (security flaws start at home)

▪ .rhosts used by Internet Worm
▪ Attachments used by MyDoom (1 in 6 emails Monday morning!)

slide-41
SLIDE 41

Avoiding Overflow Vulnerability

⬛ Use library routines that limit string lengths

▪ fgets instead of gets
▪ strncpy instead of strcpy
▪ Don’t use scanf with %s conversion specification
  ▪ Use fgets to read the string
  ▪ Or use %ns where n is a suitable integer

/* Echo Line */
 void echo()
 {
 char buf[4]; /* Way too small! */
 fgets(buf, 4, stdin);
 puts(buf);
 }
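The two substitutions recommended above can be wrapped in small helpers. This is a minimal sketch, not a prescribed fix: the helper names read_line and copy_bounded are hypothetical, and the second one exists because strncpy on its own leaves the destination unterminated when the source is too long.

```c
#include <stdio.h>
#include <string.h>

/* Reads at most size-1 characters of one line into buf and strips the
   trailing newline if it fit. Returns buf, or NULL on EOF or error. */
char *read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}

/* strncpy does not add '\0' when src holds size-1 or more characters,
   so the terminator is written explicitly. */
char *copy_bounded(char *dst, const char *src, size_t size)
{
    if (size == 0)
        return dst;
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';
    return dst;
}
```

With a 4-byte buffer, either helper keeps only the first three characters of 123456789ABC; when reading, the remaining characters stay in the input stream for the next read instead of spilling onto the stack.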

slide-42
SLIDE 42

System-Level Protections

unix> gdb bufdemo
(gdb) break echo
(gdb) run
(gdb) print /x $ebp
$1 = 0xffffc638
(gdb) run
(gdb) print /x $ebp
$2 = 0xffffbb08
(gdb) run
(gdb) print /x $ebp
$3 = 0xffffc6a8

⬛ Randomized stack offsets

▪ At start of program, allocate random amount of space on stack
▪ Makes it difficult for hacker to predict beginning of inserted code

⬛ Nonexecutable code segments

▪ In traditional x86, can mark region of memory as either “read-only” or “writeable”
  ▪ Can execute anything readable
▪ Add explicit “execute” permission
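Stack randomization can also be observed without a debugger by printing the address of a local variable on each run. A small sketch, with hypothetical function names; whether the value actually changes between runs depends on the system's settings, so no particular output is claimed.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Returns the address of one of its own locals as an integer. The value
   is only printed or compared, never dereferenced. */
uintptr_t stack_probe(void)
{
    volatile char marker = 0;
    return (uintptr_t)&marker;
}

/* Prints the probe so that several runs can be compared by eye. */
void print_stack_probe(void)
{
    printf("stack local at 0x%" PRIxPTR "\n", stack_probe());
}
```

Calling print_stack_probe() from main and running the program several times plays the same role as the repeated print /x $ebp commands in the gdb session.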

slide-43
SLIDE 43

Worms and Viruses

⬛ Worm: A program that

▪ Can run by itself
▪ Can propagate a fully working version of itself to other computers

⬛ Virus: Code that

▪ Adds itself to other programs
▪ Cannot run independently

⬛ Both are (usually) designed to spread among computers and to wreak havoc