Memory corruption public enemy number 1 Erik Poll Digital Security - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Software Security

Memory corruption

public enemy number 1

Erik Poll

Digital Security Radboud University Nijmegen

1

slide-4
SLIDE 4

Security in the development lifecycle

Memory Corruption

  • week 3 (exercise): static analysis with PREfast
  • week 4 (group project): fuzzing with afl and memory sanitizers (ASan, MSan)

More foundational improvements later:

  • Safe(r) programming languages (week 5)
  • LangSec for safer input languages (week 6)
slide-5
SLIDE 5

Overview (next 2 weeks)

1. How do memory corruption flaws work?
2. What can be the impact?
3. How can we spot such problems in C(++) code?

Next weeks: tool support for this

  • SAST: PREfast (individual project)
  • DAST: fuzzing (group project)

4. What can ‘the platform’ do about it?

  • i.e. the compiler, system libraries, hardware, OS, ...

5. What can the programmer do about it?

slide-6
SLIDE 6

Reading material

  • SoK article: ‘Eternal War in Memory’ S&P 2013

– Excl. Section VII.

– This article is quite dense. You are not expected to be able to reproduce or remember all the discussion here. It’s good enough if you can follow the article, with a steady supply of coffee while googling if the terminology is not clear.

  • Chapter 3.1 & 3.2 in lecture notes on memory-safety

We’ll revisit safe programming languages – incl. other safety features – and rest of Chapter 3 in later lecture

4

slide-8
SLIDE 8

Essence of the problem

Suppose in a C program you have an array of length 4

char buffer[4];

What happens if the statement below is executed?

buffer[4] = 'a';

We don’t know! The C standard defines this as undefined behaviour: ANYTHING can happen.

slide-9
SLIDE 9

undefined behaviour: anything can happen

slide-11
SLIDE 11

undefined behaviour: anything can happen

Suppose in a C program you have an array of length 4

char buffer[4];

What happens if the statement below is executed?

buffer[4] = 'a';

If the attacker can control the value 'a', then anything that the attacker wants may happen

  • If you are lucky: a SEGMENTATION FAULT
    – and you’ll know that something went wrong
  • If you are unlucky: remote code execution (RCE)
    – and you won’t know

slide-12
SLIDE 12

undefined behaviour: anything can happen

Suppose in a C program you have an array of length 4

char buffer[4];

What happens if the statement below is executed?

buffer[4] = 'a';

A compiler could remove the statement above, i.e. do nothing

  • This would be correct compilation by the C standard, because anything includes nothing
  • Compilers may actually do this (as part of optimisation), and this has caused security problems; examples later & in the lecture notes.

slide-17
SLIDE 17

Solution to this problem

  • Check array bounds at runtime
    – Algol 60 proposed this back in 1960!
  • Unfortunately, C and C++ have not adopted this solution.
  • Why?
  • For efficiency

Regrettably, people often choose performance over security

  • As a result, buffer overflows have been the no. 1 security problem in software ever since.
  • Fortunately, Perl, Python, Java, C#, PHP, JavaScript, and Visual Basic do check array bounds
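What a runtime bounds check amounts to can be sketched by hand in C. This is only an illustration: checked_store is a hypothetical helper, not a standard library function, and the caller has to pass the array length explicitly because C does not track it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of the C library): stores value at
   buf[index] only if index lies within the len elements of buf. */
bool checked_store(char *buf, size_t len, size_t index, char value)
{
    if (buf == NULL || index >= len) {
        return false;          /* refuse the out-of-bounds write */
    }
    buf[index] = value;
    return true;
}
```

With char buffer[4], checked_store(buffer, sizeof buffer, 4, 'a') returns false instead of writing past the array. The price is one comparison per access, which is the efficiency argument above.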

slide-18
SLIDE 18

Tony Hoare on design principles of ALGOL 60

In his Turing Award lecture in 1980 “The first principle was security : ... every subscript was checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished an option to switch off these checks in the interests of efficiency. Unanimously, they urged us not to - they knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980, language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law.”

[C.A.R. Hoare, The Emperor’s Old Clothes, Communications of the ACM, 1980]

11

slide-19
SLIDE 19

Buffer overflow

  • The most common security problem in (machine code compiled from) C and C++
  • ever since the first Morris Worm in 1988
  • Check out CVEs mentioning buffer (or buffer%20overflow)
    https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=buffer
  • Ongoing arms race of attacks & defences: attacks are getting cleverer, defeating ever better countermeasures

slide-25
SLIDE 25

More memory corruption problems

Errors with pointers and with dynamic memory (the heap)

  • Have you ever written a C(++) program that uses pointers?
  • Have you ever had such a program crashing?
  • Have you ever written a C(++) program that uses dynamic memory, i.e. malloc() and free()?
  • Have you ever had such a program crashing?

In C/C++, the programmer is responsible for memory management, and this is very error-prone

  – Technical term: C and C++ do not offer memory-safety (see lecture notes, §3.1-3.2)

slide-26
SLIDE 26

Memory corruption problems

Typical causes

  • access outside array bounds
  • buggy pointer arithmetic
  • dereferencing null pointer
  • using a dangling pointer or stale pointer, caused by
    – use-after-free
    – double-free
  • forgetting to check for failures in allocation
  • forgetting to de-allocate, aka memory leaks
    – not a memory corruption issue, but a memory availability issue

slide-32
SLIDE 32

Spot all (potential) defects

void f() {
  char *buf, *buf1;
  buf = malloc(100);
  buf[0] = 'a';        // possible null dereference (if malloc failed)
  ...
  free(buf1);
  buf[0] = 'b';        // potential use-after-free if buf & buf1 are aliased
  ...
  free(buf);
  buf[0] = 'c';        // use-after-free; buf[0] points to de-allocated memory
  buf1 = malloc(100);  // memory leak; pointer buf1 to this memory is lost & memory is never freed
  buf[0] = 'd';        // use-after-free, but now buf[0] might point to memory that has now been re-allocated
}
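One possible way to avoid the defects listed above is sketched below. The function f_fixed and its return convention are assumptions made for illustration; the elided parts of the original function are not reconstructed.

```c
#include <stdlib.h>

/* Hypothetical rewrite of f(): returns 0 on success, -1 if an
   allocation fails. */
int f_fixed(void)
{
    char *buf = malloc(100);
    if (buf == NULL) {
        return -1;            /* check for allocation failure before use */
    }
    buf[0] = 'a';

    char *buf1 = malloc(100);
    if (buf1 == NULL) {
        free(buf);            /* do not leak buf on the error path */
        return -1;
    }
    buf1[0] = 'b';

    free(buf1);
    buf1 = NULL;              /* free(NULL) is a no-op, so an accidental
                                 second free of buf1 becomes harmless */
    free(buf);
    buf = NULL;               /* no write to buf happens after this point */
    return 0;
}
```

Every allocation is checked, every pointer is freed exactly once, and nothing is accessed after its free.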

slide-33
SLIDE 33

How does classic buffer overflow work? aka smashing the stack

16

slide-34
SLIDE 34

Process memory layout

Memory layout, from high addresses down to low addresses:

  • Arguments / Environment
  • Stack (grows down, by procedure calls)
  • Unused Memory
  • Heap (dynamic data; grows up, e.g. by malloc and new)
  • Static Data (.data)
  • Program Code (.text)

slide-39
SLIDE 39

Stack layout

The stack consists of Activation Records:

void f(int x) {
  char buf[8];
  gets(buf);
}
void main() {
  f(…); …
}
void format_hard_disk(){…}

Stack (grows downwards):

  AR main()
  AR f():
    x
    return address
    buf[4..7]
    buf[0..3]

Buffer grows upwards

slide-41
SLIDE 41

Stack overflow attack - case 1

What if gets() reads more than 8 bytes?

The bytes written beyond buf[7] overwrite the return address, so the attacker can jump to an arbitrary point in the code!

void f(int x) {
  char buf[8];
  gets(buf);
}
void main() {
  f(…); …
}
void format_hard_disk(){…}

Stack:

  AR main()
  AR f():
    x
    return address
    buf[4..7]
    buf[0..3]

slide-43
SLIDE 43

Stack overflow attack - case 2

What if gets() reads more than 8 bytes?

Attacker can jump to his own code (aka shell code)

void f(int x) {
  char buf[8];
  gets(buf);
}
void main() {
  f(…); …
}
void format_hard_disk(){…}

Stack:

  AR main()
  AR f():
    x
    return address
    /bin/sh
    exec

never use gets!

gets was removed from the C standard in 2011

slide-44
SLIDE 44

Code injection vs code reuse

The two attack scenarios in these examples:

(2) is a code injection attack

  • attacker inserts his own shell code in a buffer and corrupts the return address to point to this code
  • In the example, exec('/bin/sh')
  • This is the classic buffer overflow attack [Smashing the stack for fun and profit, Aleph One, 1996]

(1) is a code reuse attack

  • attacker corrupts the return address to point to existing code
  • In the example, format_hard_disk

Lots of details to get right!

  • knowing precise location of return address and other data on stack, knowing address of code to jump to, ...

slide-46
SLIDE 46

What to attack? More fun on the stack

Suppose the attacker can overflow username

void f(void(*error_handler)(int),...) {
  int diskquota = 200;
  bool is_super_user = false;
  char* filename = "/tmp/scratchpad";
  char username[8];
  int j = 12;
  ...
}

In addition to corrupting the return address, this might corrupt

  • pointers, e.g. filename
  • other data on the stack, e.g. is_super_user, diskquota
  • function pointers, e.g. error_handler

But not j, unless the compiler chooses to allocate variables in a different order, which the compiler is free to do.

slide-48
SLIDE 48

What to attack? Fun on the heap

struct BankAccount {
  int number;
  char username[20];
  int balance;
};

Suppose attacker can overflow username

This can corrupt other fields in the struct. Which field(s) can be corrupted depends on the order of the fields in memory. In C, struct fields are laid out in declaration order (possibly with padding in between), so here the overflow can reach balance, which follows username, but not number.

slide-49
SLIDE 49

Spotting the problem

slide-50
SLIDE 50

Reminder: C chars & strings

  • A char in C is always exactly one byte
  • A string is a sequence of chars terminated by a null byte
  • String variables are pointers of type char*

char* str = "hello"; // a string str

str → h e l l o \0

strlen(str) = 5

slide-51
SLIDE 51

Example: gets

char buf[20];
gets(buf); // read user input until first EoL or EoF character

  • Never use gets
  • gets has been removed from the C standard (in C11)
  • Use fgets(buf, size, file) instead
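A minimal sketch of the fgets replacement follows. The wrapper read_line and its return convention are hypothetical; only fgets and strcspn are standard library calls.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads at most size-1 chars of one line into buf,
   strips the trailing newline if present, and always null-terminates.
   Returns 0 on success, -1 on EOF, read error or bad arguments. */
int read_line(char *buf, size_t size, FILE *in)
{
    if (buf == NULL || in == NULL || size == 0 || size > INT_MAX) {
        return -1;
    }
    if (fgets(buf, (int)size, in) == NULL) {
        return -1;
    }
    buf[strcspn(buf, "\n")] = '\0';   /* drop the newline, if any */
    return 0;
}
```

Unlike gets, fgets is told the buffer size, so an overlong input line is cut off rather than written past the end of buf; the rest of the line stays in the stream for the next read.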

slide-53
SLIDE 53

Example: strcpy

char dest[20];
strcpy(dest, src); // copies string src to dest

  • strcpy assumes dest is long enough and src is null-terminated
  • Use strncpy(dest, src, size) instead

Beware of difference between sizeof and strlen

sizeof(dest) = 20 // size of an array
strlen(dest) = number of chars up to first null byte // length of a string

slide-56
SLIDE 56

Spot the defect! (1)

char buf[20];
char prefix[] = "http://";
char* path;
...
strcpy(buf, prefix);              // copies the string prefix to buf
strncat(buf, path, sizeof(buf));  // concatenates path to the string buf

strncat’s 3rd parameter is the number of chars to copy, not the buffer size

So this should be sizeof(buf)-7-1: 7 bytes are already taken by the prefix, and strncat also appends a null terminator after the copied chars
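Because strncat always appends a null terminator after the characters it copies, the count passed to it must leave one byte spare. A minimal sketch of the computation, where build_url is a hypothetical helper name and truncating an overlong path is an assumed policy:

```c
#include <string.h>

/* Hypothetical helper: builds "http://" + path in out (capacity out_size),
   truncating path if it does not fit. Returns 0 on success, -1 if out
   cannot even hold the prefix. */
int build_url(char *out, size_t out_size, const char *path)
{
    static const char prefix[] = "http://";
    if (out == NULL || path == NULL || out_size < sizeof prefix) {
        return -1;
    }
    strcpy(out, prefix);   /* safe: out_size >= sizeof prefix, checked above */
    /* Space left for chars, keeping one byte for the null terminator
       that strncat always appends. */
    strncat(out, path, out_size - strlen(out) - 1);
    return 0;
}
```

For a 20-byte buffer this allows 20 - 7 - 1 = 12 characters of path.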

slide-59
SLIDE 59

Spot the defect! (2)

char src[9];
char dest[9];
char* base_url = "www.ru.nl";
strncpy(src, base_url, 9);  // copies base_url to src
strcpy(dest, src);          // copies src to dest

base_url is 10 chars long, incl. its null terminator, so src is now not null-terminated

so strcpy will overrun the buffer dest, because src is not null-terminated

slide-60
SLIDE 60

Example: strcpy and strncpy

Don’t replace

strcpy(dest, src)

with

strncpy(dest, src, sizeof(dest))

but with

strncpy(dest, src, sizeof(dest)-1);
dest[sizeof(dest)-1] = '\0';

if dest should be null-terminated!

NB: a strongly typed programming language would guarantee that strings are always null-terminated, without the programmer having to worry about this...
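The pattern above can be wrapped in a small function. copy_string is a hypothetical name; the capacity is passed explicitly because sizeof(dest) only yields the buffer size when dest is an array, not when it is a char* parameter.

```c
#include <string.h>

/* Hypothetical helper: copies src into dest (capacity dest_size),
   truncating if necessary, and always null-terminates dest. */
void copy_string(char *dest, size_t dest_size, const char *src)
{
    if (dest == NULL || src == NULL || dest_size == 0) {
        return;
    }
    strncpy(dest, src, dest_size - 1);   /* copy at most dest_size-1 chars */
    dest[dest_size - 1] = '\0';          /* terminate even when truncated */
}
```

Copying "www.ru.nl" into a 9-byte buffer this way yields the truncated but properly terminated string "www.ru.n".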

slide-62
SLIDE 62

Spot the defect! (3)

char *buf;
int len;
...
buf = malloc(MAX(len,1024)); // allocate buffer
read(fd,buf,len);            // read len bytes into buf

What happens if len is negative?

The length parameter of read is unsigned! So a negative len is interpreted as a big positive one!

(At the exam, you’re not expected to remember that read treats its 3rd argument as an unsigned int)

slide-63
SLIDE 63

Spot the defect! (3)

char *buf;
int len;
...
if (len < 0) { error("negative length"); return; }
buf = malloc(MAX(len,1024));
read(fd,buf,len);

Note that buf is not guaranteed to be null-terminated; we ignore this for now.

slide-65
SLIDE 65

Spot the defect! (3)

char *buf;
int len;
...
if (len < 0) { error("negative length"); return; }
buf = malloc(MAX(len,1024));
read(fd,buf,len);

What if the malloc() fails, because we ran out of memory?

slide-66
SLIDE 66

Spot the defect! (3)

char *buf;
int len;
...
if (len < 0) { error("negative length"); return; }
buf = malloc(MAX(len,1024));
if (buf==NULL) { exit(-1); } // or something a bit more graceful
read(fd,buf,len);

slide-67
SLIDE 67

Better still

char *buf;
int len;
...
if (len < 0) { error("negative length"); return; }
buf = calloc(MAX(len,1024), 1); // calloc initialises the allocated memory to 0
if (buf==NULL) { exit(-1); }    // or something a bit more graceful
read(fd,buf,len);

slide-73
SLIDE 73

Spot the defect!

#define MAX_BUF 256
void BadCode (char* in) {
  short len;
  char buf[MAX_BUF];
  len = strlen(in);
  if (len < MAX_BUF) strcpy(buf,in);
}

What if in is longer than 32K?

len may be a negative number, due to integer overflow

hence: potential buffer overflow

The integer overflow is the root problem, the (stack) buffer overflow it causes makes it exploitable

See https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=integer+overflow
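One way to repair this is to keep the length in size_t, the type strlen returns, so that it cannot turn negative. The sketch below is an illustration only; copy_input and its bool result are assumptions, not part of the original code.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_BUF 256

/* Hypothetical fixed variant: the length stays in size_t, so the
   comparison against the buffer size is never fooled by a wrapped
   or negative value. */
bool copy_input(char *buf, size_t buf_size, const char *in)
{
    size_t len = strlen(in);
    if (len >= buf_size) {
        return false;          /* too long: refuse instead of overflowing */
    }
    memcpy(buf, in, len + 1);  /* copy including the null terminator */
    return true;
}
```

A caller would declare char buf[MAX_BUF] and pass sizeof buf; an input of 40000 characters is then rejected rather than copied.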

slide-76
SLIDE 76

Spot the defect!

bool CopyStructs(InputFile* f, long count) {
  structs = new Structs[count];
  for (long i = 0; i < count; i++) {
    if (!(ReadFromFile(f,&structs[i]))) break;
  }
}

new Structs[count] effectively does a malloc(count*sizeof(type)), which may cause integer overflow

And this integer overflow can lead to a (heap) buffer overflow

Since 2005 the Visual Studio C++ compiler adds a check to prevent this
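In plain C the same multiplication has to be guarded by hand before calling malloc. A minimal sketch, where alloc_array is a hypothetical helper name:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates room for count elements of elem_size
   bytes each, returning NULL if count * elem_size would overflow size_t
   or if the allocation itself fails. */
void *alloc_array(size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size) {
        return NULL;           /* the multiplication would wrap around */
    }
    return malloc(count * elem_size);
}
```

The division-based test never overflows itself, which is why it is used instead of computing the product first and inspecting the result.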

slide-77
SLIDE 77

NB absence of language-level security

In a safer programming language than C/C++, the programmer would not have to worry about

  • writing past array bounds (because you'd get an IndexOutOfBoundsException instead)
  • implicit conversions from signed to unsigned integers (because the type system/compiler would forbid this or warn)
  • malloc possibly returning null (because you'd get an OutOfMemoryException instead)
  • malloc not initialising memory (because the language could always ensure default initialisation)
  • integer overflow (because you'd get an IntegerOverflowException instead)
  • ...

slide-85
SLIDE 85

Spot the defect!

1. void* f(int start) {
2.   if (start+100 < start) return SOME_ERROR;
3.   // checks for overflow
4.   for (int i=start; i < start+100; i++) {
5.     . . . // i will not overflow
6. } }

Signed integer overflow is undefined behaviour! This means

  • You cannot assume that overflow produces a negative number; so line 2 is not a good check for integer overflow.
  • Worse still, if integer overflow occurs, behaviour is undefined, and ANY compilation is ok
  • So compiled code can do anything if start+100 overflows
  • So compiled code can do nothing if start+100 overflows
  • This means the compiler may remove line 2

Modern C compilers are clever enough to know x+100 < x is always false, and optimise code accordingly
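A check that stays within defined behaviour compares against INT_MAX before doing the addition. A minimal sketch, with add_100_overflows as a hypothetical helper name:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true if start + 100 would exceed INT_MAX.
   INT_MAX - 100 cannot overflow, so the test itself is well defined
   and the compiler has no licence to remove it. */
bool add_100_overflows(int start)
{
    return start > INT_MAX - 100;
}
```

Line 2 of the example could then read if (add_100_overflows(start)) return SOME_ERROR; so the overflowing addition is never evaluated.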

slide-90
SLIDE 90

Spot the defect! (code from Linux kernel)

1. unsigned int tun_chr_poll( struct file *file,
2.                            poll_table *wait)
3. { ...
4.   struct sock *sk = tun->sk; // take sk field of tun
5.   if (!tun) return POLLERR;  // return if tun is NULL
6.   ...
7. }

If tun is a null pointer, then tun->sk is undefined

What this function does if tun is null is undefined: ANYTHING may happen then.

So the compiler can remove line 5, as the behaviour when tun is NULL is undefined anyway, so this check is 'redundant'.

Standard compilers (gcc, clang) do this 'optimisation'!

This is actually code from the Linux kernel, and removing line 5 led to a security vulnerability [CVE-2009-1897]
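The repair is to test the pointer before the first dereference. The sketch below uses simplified stand-in types, not the real kernel definitions; struct tun, struct sock and tun_get_sock are all hypothetical here.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types, for illustration only. */
struct sock { int id; };
struct tun  { struct sock *sk; };

/* Returns the sk field of tun, or NULL if tun itself is NULL.
   The null check comes before any dereference of tun, so the compiler
   cannot treat it as redundant. */
struct sock *tun_get_sock(const struct tun *tun)
{
    if (tun == NULL) {
        return NULL;
    }
    return tun->sk;
}
```

In terms of the example above, this corresponds to swapping lines 4 and 5.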

slide-92
SLIDE 92

Spot the defect! (code from Windows kernel)

// TCHAR is 1 byte ASCII or multiple byte UNICODE
#ifdef UNICODE
# define TCHAR wchar_t
# define _sntprintf _snwprintf
#else
# define TCHAR char
# define _sntprintf _snprintf
#endif

TCHAR buf[MAX_SIZE];
_sntprintf(buf, sizeof(buf), input);

sizeof(buf) is the size in bytes, but this parameter gives the number of characters that will be copied

For code handling ASCII: 1 character is one byte
For code handling UNICODE: 1 character is several bytes

Lots of code written under the assumption that characters are one byte contained overflows after the switch from ASCII to Unicode

The CodeRed worm exploited such a mismatch.

[slide from presentation by Jon Pincus]

slide-94
SLIDE 94

Spot the defect!

    #include <stdio.h>

    int main(int argc, char* argv[])
    {
        if (argc > 1)
            printf(argv[1]);
        return 0;
    }

This program is vulnerable to format string attacks: calling the program with a string containing special characters such as %x or %n lets an attacker read or corrupt memory, with effects similar to a buffer overflow attack.

48.2

slide-95
SLIDE 95

Format string attacks

New type of memory corruption discovered in 2000

  • Strings can contain special characters, eg %s in
        printf("Cannot find file %s", filename);
    Such strings are called format strings
  • What happens if we execute the code below?
        printf("Cannot find file %s");
  • What can happen if we execute
        printf(string)
    where string is user-supplied?
  • Esp. if it contains special characters, eg %s, %x, %n, %hn?

49

slide-99
SLIDE 99

Format string attacks

Suppose attacker can feed malicious input string s to printf(s). This can

  • read the stack
    %x reads and prints a value from the stack, so the input
        %x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x...
    dumps the stack, including passwords, keys, … stored on the stack
  • corrupt the stack
    %n writes the number of characters printed so far to an address that printf takes from the stack, so input 12345678%n writes value 8 to that address
  • read arbitrary memory
    a carefully crafted format string of the form
        \xEF\xCD\xCD\xAB %x%x...%x%s
    prints the string at memory address ABCDCDEF
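To see what %n does, here is a sketch of its legitimate use, where the caller supplies the address to write to. count_via_percent_n is a hypothetical name; note that some C libraries restrict or disable %n, so this assumes a libc that still supports it for literal format strings.

```c
#include <stdio.h>

/* Legitimate use of %n: the caller supplies the address to write to.
   Returns the value that %n stored, ie. the number of characters
   produced before the %n was reached. */
int count_via_percent_n(void)
{
    char out[32];
    int written = -1;

    /* The format string is a literal; %n stores the count into 'written'. */
    snprintf(out, sizeof out, "12345678%n", &written);
    return written;
}
```

In an attack there is no such argument: printf simply takes whatever value sits at the corresponding stack position and treats it as the address to write to.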

50.4

slide-101
SLIDE 101

Preventing format string attacks is EASY

  • Always replace printf(str) with printf("%s", str)
  • Compiler or static analysis tool could warn if the number of arguments does not match the format string, eg in
        printf("x is %i and y is %i", x);
    Eg gcc has (far too many?) command line options for this:
        -Wformat -Wformat-nonliteral -Wformat-security -Wformat-overflow ...
  • If the format string is not a compile-time constant, we cannot decide this at compile time
    Would you want your compiler or SAST tool to give false positives or false negatives?

Check https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=format+string to see how depressingly common format string vulnerabilities still are

51.2
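A small sketch of the first rule, assuming a hypothetical helper format_untrusted: the untrusted text is passed as an argument to a constant "%s" format, so any % characters in it are copied as data.

```c
#include <stdio.h>
#include <stddef.h>

/* Copy untrusted text into out via a constant "%s" format, so that any
   '%' characters in the input are treated as data, not as directives.
   Returns the value of snprintf: the length the full output would have. */
int format_untrusted(char *out, size_t out_len, const char *untrusted)
{
    return snprintf(out, out_len, "%s", untrusted);
}
```

With input "%x%x%n" the output buffer contains those six characters literally; nothing is read from or written to the stack on the input's behalf.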
slide-102
SLIDE 102

Recap: buffer overflows

  • Buffer overflow is #1 weakness in C and C++ programs
    – because these languages are not memory-safe
  • Tricky to spot
  • Typical cause: programming with arrays, pointers, and strings
    – esp. library functions for null-terminated strings
  • Related attacks
    • Format string attack: another way of corrupting the stack
    • Integer overflows: often a stepping stone to getting a buffer to overflow
      • just the integer overflow can already have a security impact, eg think of banking software
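As an illustration of the integer overflow stepping stone: if count * elem_size wraps around, malloc returns a buffer that is far too small for the loop that follows. A sketch of a guard, with alloc_array as a hypothetical helper name:

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate an array of count elements of elem_size bytes each.
   Returns NULL if count * elem_size would wrap around, instead of
   silently allocating a buffer that is too small. */
void *alloc_array(size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return NULL;               /* multiplication would overflow */
    return malloc(count * elem_size);
}
```

The division-based test avoids performing the overflowing multiplication in the first place.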

52

slide-103
SLIDE 103

Platform-level defences

slide-104
SLIDE 104

Platform-level defences

  • Defenses the compiler, hardware, OS, … can take, without the programmer having to know
  • Some defenses may need OS & hardware support
  • Some defenses cause overhead
    – if the overhead is unacceptable in production code, we can still use them when testing
  • Some defenses may break binary compatibility
    – eg if a compiler adds extra book-keeping & checks, then all libraries may need to be re-compiled with that compiler

54

slide-105
SLIDE 105

Platform-level defenses

  1. Stack canaries
  2. Non-executable memory (NX, W^X)
  3. Address space layout randomization (ASLR)
  (now standard on many platforms)

More advanced defenses

  1. More randomisation: eg. pointer & memory encryption
  2. More memory safety checks:
     • eg. checks on bounds (spatial) or on allocation (temporal)
  3. Checks on control flow
  4. Execution-aware memory protection

History shows that all new defenses are eventually defeated...

55

slide-106
SLIDE 106
1. Stack canaries

  • A dummy value - stack canary or cookie - is written on the stack in front of the return address and checked when the function returns
  • A careless stack overflow will overwrite the canary, which can then be detected
  • first introduced as StackGuard in gcc
  • only very small runtime overhead

56

slide-107
SLIDE 107

Stack canaries

Stack without canary        Stack with canary

    x                           x
    return address              return address
    buf[4..7]                   canary value
    buf[0..3]                   buf[4..7]
                                buf[0..3]

57
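The check can be simulated without real undefined behaviour by modelling the frame as one flat byte array. This is only a sketch: the layout, the sizes, the canary bytes and the name copy_and_check_canary are all assumptions for illustration, not how a compiler lays out a real frame.

```c
#include <stddef.h>
#include <string.h>

enum { BUF_SIZE = 8, CANARY_SIZE = 4, RET_SIZE = 8,
       FRAME_SIZE = BUF_SIZE + CANARY_SIZE + RET_SIZE };

/* Simulated frame layout: [ buf | canary | return address ].
   One flat byte array, so "overflowing" buf stays within the array
   and the simulation itself has well-defined behaviour. */
static const unsigned char CANARY[CANARY_SIZE] = { 0x00, 0xde, 0xad, 0x42 };

/* Copy len input bytes to the start of the frame, as an unchecked copy
   into buf would, and report whether the canary survived (1) or not (0).
   len is clamped to the frame so the demo never leaves the array. */
int copy_and_check_canary(const unsigned char *input, size_t len)
{
    unsigned char frame[FRAME_SIZE] = { 0 };

    memcpy(frame + BUF_SIZE, CANARY, CANARY_SIZE);   /* function prologue */

    if (len > FRAME_SIZE)
        len = FRAME_SIZE;
    memcpy(frame, input, len);                       /* the unchecked copy */

    /* function epilogue: compare the canary before "returning" */
    return memcmp(frame + BUF_SIZE, CANARY, CANARY_SIZE) == 0;
}
```

An input of at most BUF_SIZE bytes leaves the canary intact; one byte more already changes it, so the overflow is detected before the return address is used.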

slide-109
SLIDE 109

Further improvements

  • More variation in canary values: eg not a fixed value hardcoded in the binary but a random value chosen for each execution
  • Better still, XOR the return address into the canary value
  • Include a null byte in the canary value, because C string functions cannot write nulls inside strings

A careful attacker can still defeat canaries, by
  • overwriting the canary with the correct value
  • corrupting a pointer to point to the return address, to then change the return address without killing the canary

[Figure: two stack layouts (return address, canary value, buf[4..7], buf[0..3], char* ptr), before and after ptr is changed to point to the return address]

58.2

slide-110
SLIDE 110

Further improvements

  • Re-order elements on the stack to reduce the potential impact of overruns
    • swapping parameters buf and fp on the stack changes whether overrunning buf can corrupt fp
    • which is especially dangerous if fp is a function pointer
    • hence it is safer to allocate array buffers ‘above’ all other local variables
    First introduced by IBM’s ProPolice.
  • A separate shadow stack
    • with copies of return addresses, used to check for corrupted return addresses
    • Of course, the attacker should not be able to corrupt the shadow stack
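The shadow stack idea can be sketched as a tiny simulation. The names shadow_push and shadow_check_and_pop and the fixed depth are assumptions for illustration; a real implementation lives in the compiler or hardware and must keep this memory out of the attacker's reach.

```c
#include <stddef.h>
#include <stdint.h>

enum { SHADOW_DEPTH = 64 };

static uintptr_t shadow[SHADOW_DEPTH];
static size_t shadow_top = 0;

/* On a call: record a copy of the return address. Returns 0 if full. */
int shadow_push(uintptr_t return_address)
{
    if (shadow_top == SHADOW_DEPTH)
        return 0;
    shadow[shadow_top++] = return_address;
    return 1;
}

/* On a return: pop the recorded copy and compare it with the return
   address found on the ordinary stack. Returns 1 if they match. */
int shadow_check_and_pop(uintptr_t return_address_on_stack)
{
    if (shadow_top == 0)
        return 0;
    return shadow[--shadow_top] == return_address_on_stack;
}
```

A return address that was overwritten on the ordinary stack no longer matches its copy, so the mismatch is caught at the moment of return.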

60

slide-116
SLIDE 116

Windows 2003 Stack Protection

Nice example of the ways in which things can go wrong...

  • Enabled with /GS command line option in Visual Studio
  • When canary is corrupted, control is transferred to an exception handler
  • Exception handler information is stored ... on the stack!
  • Attacker can corrupt the exception handler info on the stack, in the process corrupt the canaries, and then let the Stack Protection mechanism transfer control to a malicious exception handler
    [http://www.securityfocus.com/bid/8522/info]
  • Countermeasure: only allow transfer of control to registered exception handlers

61.6

slide-122
SLIDE 122
2. ASLR (Address Space Layout Randomisation)

  • Attacker needs detailed info about memory layout
    – eg to jump to a specific piece of code
    – or to corrupt a pointer at a known position on the stack
  • Attacks become harder if we randomise the memory layout every time we start a program
    • ie. change the offset of the heap, stack, etc, in memory by some random value
  • Attackers can still analyse memory layout on their own laptop, but will have to determine the offsets used on the victim’s machine to carry out an attack
  • NB security by obscurity, despite its bad reputation, is a really great defense mechanism to annoy attackers!
  • Once the offset leaks, we’re back to square one…
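Why one leak is enough can be shown with plain arithmetic. In this sketch the two offsets are made-up values and image_base and target_address are hypothetical names; the assumption is that the whole image is shifted by a single random base, so relative offsets stay the same as in the attacker's own copy.

```c
#include <stdint.h>

/* Offsets of two symbols within a library image, as an attacker could
   read them from their own copy of the binary. Values are made up. */
#define OFFSET_LEAKED_SYMBOL  UINT64_C(0x21000)
#define OFFSET_TARGET_SYMBOL  UINT64_C(0x4f000)

/* Recover the randomised base from one leaked run-time address. */
uint64_t image_base(uint64_t leaked_address)
{
    return leaked_address - OFFSET_LEAKED_SYMBOL;
}

/* With the base known, every other address in the image follows. */
uint64_t target_address(uint64_t leaked_address)
{
    return image_base(leaked_address) + OFFSET_TARGET_SYMBOL;
}
```

For a leaked address 0x7f3a12021000 this gives base 0x7f3a12000000 and target 0x7f3a1204f000, without the attacker ever guessing the random value.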

62.6

slide-124
SLIDE 124
3. Non-eXecutable memory (NX, W^X, DEP)

Distinguish
  • X: executable memory (for storing code)
  • W: writeable, non-executable memory (for storing data)
and let the processor refuse to execute non-executable code.

Attackers can then no longer jump to their own attack code, as any input provided as attack code will be non-executable.

Aka DEP (Data Execution Prevention). Intel calls it eXecute-Disable (XD); AMD calls it Enhanced Virus Protection.

Limitation: this technique does not work for JIT (Just In Time) compilation, where e.g. JavaScript is compiled to machine code at run time.

63.2

slide-125
SLIDE 125

Defeating NX: return-to-libc attacks

With NX, code injection attacks are no longer possible, but code reuse attacks still are...

  • Attackers can no longer corrupt code or insert their own code, but can still corrupt code pointers
  • Called control-flow hijack in SoK paper

So instead of jumping to their own attack code, attackers corrupt the return address to jump to existing code
  • esp. library code in libc

libc is a rich library that offers lots of functionality,
  • eg. system(), exec(),
which provides attackers with all they need...

64

slide-126
SLIDE 126

Return-Oriented Programming (ROP)

Next stage in evolution of attacks, as people removed or protected dangerous libc calls such as system()

Instead of using an entire library call, attackers can
  • look for gadgets, small snippets of code which end with a return, in the existing code base
        ...; ins1 ; ins2 ; ins3 ; ret
  • chain these gadgets together as subroutines to form a program that does what they want

This turns out to be doable
  • Most libraries contain enough gadgets to provide a Turing complete programming language
  • ROP compilers can then translate arbitrary code to a string of these gadgets

A newer variant is Jump-Oriented Programming (JOP), which uses a different kind of code fragment as gadgets

65

slide-127
SLIDE 127

More advanced defences

[See SoK Eternal War in Memory paper]

66

slide-128
SLIDE 128

Types of (building blocks for) attacks

  • Code corruption attack
    Overwrite the original program code in memory; impossible with W^X
  • Control-flow hijack attack
    Overwrite a code pointer, eg return address, jump address, function pointer, or pointer in vtable of C++ object
  • Data-only attack
    Overwrite some data, eg bool isAdmin;
  • Information leak
    Only reading some data; recall Heartbleed attack on TLS

67

slide-131
SLIDE 131

Control flow hijack via code pointers

  • A compiler translates function calls in source code to
        call <address>   or   JSR <address>
    in machine code, where <address> is the location of the code for the function.
  • For a function call f(...) in C a static address (or offset) of the code for f may be known at compile time.
    If the compiler can hard-code this static address in the binary, W^X can prevent attackers from corrupting this address
  • For a virtual function call o->m(...) in C++ the address of the code for m typically has to be determined at runtime, by inspecting the virtual function table (vtable)
    W^X does not prevent attackers from corrupting code pointers in these tables

68.3

slide-132
SLIDE 132

Classification of defences [SoK paper]

  • Probabilistic methods
    Basic idea: add randomness to make attacks harder
    – in the location where certain data is stored (eg ASLR),
    – or in the way data is represented in memory (eg pointer encryption)
  • Memory Safety
    Basic idea: do additional bookkeeping & add runtime checks to prevent some illegal memory accesses
  • Control-Flow Hijack Defenses
    Basic idea: do additional bookkeeping & add runtime checks to prevent strange control flow

69

slide-136
SLIDE 136

More randomness: Pointer Encryption (PointGuard)

  • Many buffer overflow attacks involve corrupting pointers, pointers to data or code pointers
  • To complicate this: store pointers encrypted in main memory, unencrypted in registers
    – simple & fast encryption scheme: eg. XOR with a fixed value, randomly chosen when a process starts
  • Attacker can still corrupt encrypted pointers in memory, but these will not decrypt to predictable values
    – This uses encryption to ensure integrity. Normally NOT a good idea, but here it works.
  • More extreme variant: Data Space Randomisation (DSR)
    – store not just pointers encrypted in main memory, but store all data encrypted in memory
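The XOR scheme can be sketched in a few lines. ptr_key, ptr_encrypt and ptr_decrypt are hypothetical names, and the key is passed in explicitly so that the sketch stays deterministic; a real scheme would choose it at random when the process starts and keep it out of attacker-readable memory.

```c
#include <stdint.h>

/* Per-process key; a real scheme would pick it at random at start-up.
   It is a parameter here so the sketch stays deterministic. */
typedef struct { uintptr_t key; } ptr_key;

/* Value stored in memory: pointer XOR key. */
uintptr_t ptr_encrypt(ptr_key k, void *p)
{
    return (uintptr_t)p ^ k.key;
}

/* Value loaded into a register before use: stored XOR key. */
void *ptr_decrypt(ptr_key k, uintptr_t stored)
{
    return (void *)(stored ^ k.key);
}
```

If an attacker overwrites the stored value with a chosen address, decryption turns it into that address XOR key, which the attacker cannot predict without knowing the key.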

70

slide-137
SLIDE 137

More memory safety

Additional book-keeping of meta-data & extra runtime checks to prevent illegal memory access

Different possibilities
  • add information to pointers about the size of the memory chunk they point to (fat pointers)
  • add information to memory chunks about their size (Spatial safety with object bounds)

71

slide-138
SLIDE 138

Fat pointers

The compiler
  • records size information for all pointers
  • adds runtime checks for pointer arithmetic & array indexing

[Figure: a pointer p pointing to "some data", next to a fat pointer that stores p together with the size of the chunk]

Downsides
  • Considerable execution time overhead
  • Not binary compatible – ie all code needs to be compiled to add this book-keeping for all pointers

72
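What the compiler-inserted checks amount to can be sketched by hand. struct fat_ptr, fat_read and fat_write are hypothetical names; a real fat-pointer compiler generates this bookkeeping itself and may also track a lower bound.

```c
#include <stddef.h>

/* A fat pointer: the raw pointer plus the size of the chunk it points to. */
struct fat_ptr {
    unsigned char *base;
    size_t size;
};

/* Bounds-checked read. Returns 1 and stores the byte in *out if index
   is inside the chunk, otherwise returns 0 and leaves *out untouched. */
int fat_read(struct fat_ptr p, size_t index, unsigned char *out)
{
    if (index >= p.size)
        return 0;
    *out = p.base[index];
    return 1;
}

/* Bounds-checked write, same convention. */
int fat_write(struct fat_ptr p, size_t index, unsigned char value)
{
    if (index >= p.size)
        return 0;
    p.base[index] = value;
    return 1;
}
```

The struct is twice the size of a plain pointer, which is why code compiled without this scheme cannot exchange pointers with code compiled with it.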

slide-139
SLIDE 139

More memory safety

Additional book-keeping of meta-data & extra runtime checks to prevent illegal memory access

Different possibilities
  • add information to pointers about the size of the memory chunk they point to (fat pointers)
  • add information to memory chunks about their size (Spatial safety with object bounds)
  • keep a shadow administration of this meta-data, separate from the pointers & the existing memory (SoftBound)
  • keep a shadow administration of which memory cells have been allocated (Valgrind, Memcheck, AddressSanitizer or ASan)
    – to also spot temporal bugs, ie. malloc/free bugs

73

slide-140
SLIDE 140

Object-based temporal safety (Valgrind, Memcheck, ASan)

Shadow admin of allocated memory to keep track of which memory is allocated, to generate a runtime error when code tries to read/write unallocated memory

[Figure: memory cells holding "some data", "old junk", "XYZ" and "hello\0", with a shadow bitmap 11111111 0000000000 111111 marking each cell as allocated (1) or unallocated (0)]

  • Can also catch spatial bugs, ie. small buffer overruns, by keeping empty space between allocated chunks (unless overrun is huge)
    – a small overrun will end up in this unallocated space
  • Cannot spot illegal access via a stale pointer if the data chunk it points to has been re-allocated
    • Eg the last bug, line 3004, on slide 15

74
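A toy version of such a shadow administration, with one shadow byte per memory cell. The arena size and the names shadow_mark and checked_store are assumptions for illustration; real tools use far more compact encodings and instrument every load and store automatically.

```c
#include <stddef.h>

enum { ARENA_SIZE = 24 };

static unsigned char arena[ARENA_SIZE];
static unsigned char allocated[ARENA_SIZE];  /* shadow: 1 = allocated */

/* Mark the cells [start, start+len) as allocated (1) or freed (0). */
void shadow_mark(size_t start, size_t len, int is_allocated)
{
    for (size_t i = start; i < start + len && i < ARENA_SIZE; i++)
        allocated[i] = (unsigned char)(is_allocated != 0);
}

/* Checked store: refuses to touch a cell the shadow map says is
   unallocated. Returns 1 on success, 0 on a detected illegal access. */
int checked_store(size_t index, unsigned char value)
{
    if (index >= ARENA_SIZE || !allocated[index])
        return 0;
    arena[index] = value;
    return 1;
}
```

A store just past an allocated chunk, or into a freed chunk, is rejected; but once the freed cells are marked allocated again for a new object, a store through the old stale index passes the check, which is the limitation noted above.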

slide-141
SLIDE 141

Guard pages to improve memory safety

Allocate chunks with the end at a page boundary, with a non-readable, non-writeable page between them.

Buffer overwrite or overread will cause a memory fault.

[Figure: chunks "some data" and "hello\0", pointed to by p and q, each ending at a page boundary and followed by a guard page]

Small execution overhead, but big memory overhead

75

slide-142
SLIDE 142

Control Flow Integrity (CFI)

Extra bookkeeping & checks to spot unexpected control flow

  • Dynamic return integrity
    Stack canaries, or a shadow stack that keeps copies of all return addresses, providing an extra check against corruption of return addresses
  • Static control flow integrity
    Idea: determine the control flow graph (cfg) and monitor jumps in the control flow to spot deviant behavior
    If f() never calls g(), because g() does not even occur in the code of f(), then a call from f() to g() is suspicious, as is a return from g() to f()
    We could interrupt execution when this happens
    This can detect return-to-libc and ROP attacks

76

slide-143
SLIDE 143

Static control flow integrity: example code & CFG

    void f() { ... ; g(); ... ; g(); ... ; h(); ... }
    void g() { ... ; h(); }
    void h() { ... }

[Figure: control flow graph of f(), g() and h(), with "call g", "call h" and "return" edges]

Before and/or after every control transfer (function call or return) we could check if it is legal – ie. allowed by the cfg

Some weird returns would still be allowed
  • eg if we call h() from g(), and the return is to f(), this would be allowed by the static cfg
  • Additional dynamic return integrity check can narrow this down to the actual call site – using the recorded call site on the shadow stack

77
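The static check for this example can be sketched as a lookup in the call graph. The enum, the may_call table and the two function names are assumptions for illustration; a real CFI scheme instruments the machine code rather than consulting a table of function identifiers.

```c
enum func { F, G, H, FUNC_COUNT };

/* Static CFG of the example: f calls g and h, g calls h, h calls nothing. */
static const unsigned char may_call[FUNC_COUNT][FUNC_COUNT] = {
    /* callee:   F  G  H */
    /* F */    { 0, 1, 1 },
    /* G */    { 0, 0, 1 },
    /* H */    { 0, 0, 0 },
};

/* Is a call from caller to callee an edge of the cfg? */
int call_allowed(enum func caller, enum func callee)
{
    return may_call[caller][callee];
}

/* A return is statically allowed if it goes back to any function that
   may call the returning one; the cfg alone cannot tell which call
   site was actually used. */
int return_allowed(enum func from, enum func to)
{
    return may_call[to][from];
}
```

Note that return_allowed(H, F) holds even when h() was actually called from g(): this is the weird return that only the dynamic check with a shadow stack can rule out.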

slide-144
SLIDE 144

Downsides of static control flow integrity checks

  • Requires a whole program analysis
  • Use of function pointers in C or virtual functions in C++ (that both result in so-called indirect control transfers) complicates compile-time analysis of the cfg: we’d need
    • a points-to analysis to determine where such code pointers can point to
      eg in C++, Animal->eat() can resolve to Cat->eat() or Dog->eat(), so both these addresses are valid targets for transferring control
    • or: simply allow transfer to any function entry point

78

slide-145
SLIDE 145

Typical input problem

Input problems always follow the same pattern:
  1) attacker supplies some malicious input
  2) application 'processes' the input
     a) by itself and/or
     b) using external tools (OS, file system, SQL database, …)
  3) processing 'goes off the rails', which unintentionally exposes dangerous functionality to the attacker

New(er) features of main OS [not exam material]

  • Pointer encryption in iOS (2018)
  • Hardware-enforced Stack Protection in Windows 10 (2020)
    • with a shadow stack, using Intel Control-flow Enforcement Technology (CET)
      https://techcommunity.microsoft.com/t5/windows-kernel-internals/understanding-hardware-enforced-stack-protection/ba-p/1247815
  • Evolution of CFI at Microsoft discussed by Joe Bialek
    The Evolution of CFI Attacks and Defenses @ OffensiveCON 18
    https://www.youtube.com/watch?v=oOqpl-2rMTw

79

slide-146
SLIDE 146

Exam questions: you should be able to

  • Explain how simple buffer overflows work & what root causes are
  • Spot a simple buffer overflow, memory-allocation problem, format string attack, or integer overflow in some C code
  • Explain how countermeasures - such as stack canaries, non-executable memory, ASLR, CFI, bounds checkers, pointer encryption, … - work
  • Explain why they might not always work