Reconciling High-Level Optimizations and Low-Level Code in LLVM

OOPSLA’18, Boston. Juneyoung Lee and Chung-Kil Hur (Seoul National Univ.), Ralf Jung (MPI-SWS), Zhengyang Liu and John Regehr (University of Utah), Nuno P. Lopes (Microsoft Research)


slide-1
SLIDE 1

Reconciling High-Level Optimizations and Low-Level Code in LLVM

OOPSLA’18 Boston

Juneyoung Lee, Chung-Kil Hur (Seoul National Univ.)
Ralf Jung (MPI-SWS)
Zhengyang Liu, John Regehr (University of Utah)
Nuno P. Lopes (Microsoft Research)

slide-6
SLIDE 6

Overview

  • C program (P_C): Low-level Code
    Allows access via int-to-ptr cast
  • LLVM IR (P_IR → P’_IR): High-level Optimizations
    Assumes no one can access my local vars

slide-7
SLIDE 7

Finding a Good Memory Model

  • A memory model specifies the behavior of memory operations
  • As a result, it determines
      1. Which low-level programs are valid
      2. Which high-level assumptions are valid
  • A good memory model should make both of the following valid
      1. Common low-level programs
      2. Common high-level assumptions

slide-13
SLIDE 13

Memory ≠ Byte Array

(We use C syntax for LLVM IR code for readability.)

Source program:

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;
    if (iq == ip) {
      *(p+1) = 10;
      print(q[0]);
    }

After constant prop., print(q[0]) becomes print(0):

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;
    if (iq == ip) {
      *(p+1) = 10;
      print(0);
    }

slide-22
SLIDE 22

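Memory ≠ Byte Array

Source program:

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;
    if (iq == ip) {
      *(p+1) = 10;
      print(q[0]);
    }

After constant prop., the last statement becomes print(0).

Execution of the source program on a byte-array memory (addresses start at 0x0):

  • p is placed at 0x100 and q at 0x101
  • ip = (int)(p+1) = 0x101 and iq = (int)q = 0x101, so iq == ip is true
  • *(p+1) = 10 stores 10 at 0x101, which is q[0]
  • print(q[0]) prints 10, whereas the optimized program prints 0

Problem: q can be accessed from p by pointer arithmetic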
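The byte-array reading of this slide can be replayed in a few lines of C. This is only a sketch: the array `mem` and the function `run_flat` are illustrative names, not part of the talk, and the addresses 0x100 and 0x101 are the ones shown on the slide. Memory is modeled as an ordinary array, so the sketch itself contains no out-of-bounds access.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flat memory: every address is just an index into one array. */
static uint8_t mem[0x200];

/* Replays the slide's program with p at 0x100 and q at 0x101.
   Returns the value that print(q[0]) would receive, or -1 if the
   branch is not taken. */
int run_flat(void) {
    const uint32_t p = 0x100, q = 0x101;   /* addresses taken from the slide */
    memset(mem, 0, sizeof mem);            /* q[0] starts as 0 */
    uint32_t ip = p + 1;                   /* (int)(p+1) */
    uint32_t iq = q;                       /* (int)q     */
    if (iq == ip) {
        mem[p + 1] = 10;                   /* *(p+1) = 10 lands on q[0] */
        return mem[q];                     /* print(q[0]) */
    }
    return -1;
}
```

In this model `run_flat()` yields 10, which is why replacing print(q[0]) with print(0) cannot be justified if memory is a plain byte array.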

slide-29
SLIDE 29

Abstract Memory Explains Optimizations

Source program:

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;
    if (iq == ip) {
      *(p+1) = 10;
      print(q[0]);
    }

After constant prop., the last statement becomes print(0).

Each pointer is a pair (provenance, address):

  • p = (p,0x100) and q = (q,0x101)
  • Memory (starting at 0x0): p: - at 0x100, q: 0 at 0x101
  • ip and iq are both 0x101, so iq == ip is true
  • p+1 = (p,0x101), so *(p+1) = 10 is Undefined Behavior because p ≠ q: the address lies in q, but the provenance is p

Since the store cannot legally change q[0], replacing print(q[0]) with print(0) is justified.

Principles of UB

  1. Compilers assume input programs never raise UB
  2. Programmers should not write programs raising UB
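A minimal sketch of the (provenance, address) check, assuming each allocation is a block with its own bounds. The types `Block` and `Ptr` and the functions `store` and `run_abstract` are hypothetical names chosen for illustration; a return value of 0 or -1 stands in for undefined behavior.

```c
#include <stddef.h>
#include <stdint.h>

/* One allocation: the address range [base, base + size) with its own storage. */
typedef struct { uint32_t base, size; uint8_t data[4]; } Block;

/* A pointer is a (provenance, address) pair, as on the slide. */
typedef struct { Block *prov; uint32_t addr; } Ptr;

/* Returns 1 if the store is defined, 0 if it would be UB:
   the address must fall inside the block named by the provenance. */
int store(Ptr ptr, uint8_t v) {
    Block *b = ptr.prov;
    if (b == NULL || ptr.addr < b->base || ptr.addr >= b->base + b->size)
        return 0;
    b->data[ptr.addr - b->base] = v;
    return 1;
}

/* Replays *(p+1) = 10 with p at 0x100 and q at 0x101.
   Returns q[0] if the store was defined, or -1 if it was UB. */
int run_abstract(void) {
    Block bp = { 0x100, 1, {0} };
    Block bq = { 0x101, 1, {0} };
    Ptr p_plus_1 = { &bp, bp.base + 1 };   /* (p, 0x101) */
    return store(p_plus_1, 10) ? bq.data[0] : -1;
}
```

Here `run_abstract()` reports UB (-1): the address 0x101 is numerically inside q, but the bounds check uses the block of p.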
slide-42
SLIDE 42

Miscompilation with Int-Ptr Casting

All four programs share the same declarations and condition:

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;
    if (iq == ip) { ... }

They differ only in the body of the if, which three optimizations rewrite step by step:

    *(char*)iq = 10;            print(q[0]);    // original, prints 10
        ↓ int. eq. prop.
    *(char*)(int)(p+1) = 10;    print(q[0]);
        ↓ cast elim.
    *(p+1) = 10;                print(q[0]);
        ↓ constant prop.
    *(p+1) = 10;                print(0);       // optimized, prints 0

We found this miscompilation bug in both LLVM & GCC.

Goal of this paper: Finding a good memory model for pointer ↔ integer casting

slide-43
SLIDE 43

Problems & Our Solutions

slide-44
SLIDE 44

Problem 1

slide-51
SLIDE 51

Pointer → Integer Casting?

What is the result of (int)p when p = (p,0x100)?

  1. Carry Provenance: the integer is (p,0x100)
  2. Drop Provenance: the integer is 0x100

slide-64
SLIDE 64

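Carry Provenance: Integer Optimization Problem

Option 1, Carry Provenance, applied to an integer optimization:

    k = (i==j ? i : j)    →    k = j

On plain integers the rewrite is valid: if i==j is true then k = i = j, and if it is false then k = j.

With carried provenance, take i = (p,0x100) and j = (q,0x100):

  • i==j is true, so the source program yields k = i = (p,0x100)
  • the optimized program yields k = j = (q,0x100)

Problem: Integer optimizations may change provenance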
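The effect can be sketched with a hypothetical integer type that carries a block id next to its value. `PInt`, `select_src`, `select_tgt`, and `provenance_changed` are illustrative names; equality is assumed to compare only the numeric value, as integer comparison does.

```c
#include <stdint.h>

/* Hypothetical "carry provenance" integer: the cast keeps the block id. */
typedef struct { int prov; uint32_t val; } PInt;

/* Integer equality only looks at the numeric value. */
static int pint_eq(PInt a, PInt b) { return a.val == b.val; }

/* Source: k = (i == j ? i : j) */
PInt select_src(PInt i, PInt j) { return pint_eq(i, j) ? i : j; }

/* Target after the integer optimization: k = j */
PInt select_tgt(PInt i, PInt j) { (void)i; return j; }

/* Returns 1 when source and target agree on the value
   but disagree on the provenance. */
int provenance_changed(void) {
    enum { P = 1, Q = 2 };                    /* block ids for p and q */
    PInt i = { P, 0x100 }, j = { Q, 0x100 };
    PInt a = select_src(i, j), b = select_tgt(i, j);
    return a.val == b.val && a.prov != b.prov;
}
```

`provenance_changed()` returns 1: the two results are numerically equal, yet a later cast back to a pointer would grant access to different blocks.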

slide-66
SLIDE 66

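Carry Provenance: Integer Optimization Problem

Option 2, Drop Provenance, applied to the same optimization:

    k = (i==j ? i : j)    →    k = j

The casts turn (p,0x100) and (q,0x100) into the plain integer 0x100, so i = 0x100 and j = 0x100:

  • i==j is true, and the source program yields k = 0x100
  • the optimized program also yields k = 0x100

No provenance is left to change, so the integer optimization stays valid.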

slide-67
SLIDE 67

Problem 2

slide-75
SLIDE 75

Integer → Pointer Casting?

What is the provenance of (int*)0x100 when memory holds char p[1] at 0x100?

  1. Always Empty: (Ø,0x100)
  2. Depending on the Memory Layout: (p,0x100) while p occupies 0x100, (Ø,0x100) otherwise
  3. Always Full: (∗,0x100)

slide-81
SLIDE 81

Empty Provenance: Pointer – Integer Round Trip

Option 1, Always Empty, with char p[1] at 0x100:

    i = (int)p          // p = (p,0x100), i = 0x100
    p2 = (char*)i       // p2 = (∅,0x100)
    *p2 = 10            // UB

Problem: Common program patterns raise UB

slide-84
SLIDE 84

Empty Provenance: Pointer – Integer Round Trip

Option 3, Always Full, with char p[1] at 0x100:

    i = (int)p          // p = (p,0x100), i = 0x100
    p2 = (char*)i       // p2 = (∗,0x100) instead of (∅,0x100)
    *p2 = 10            // no UB: 10 is stored at 0x100
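The round trip can be compared under the Always Empty and Always Full options with a small sketch. The enum values, `int_to_ptr`, `can_access`, and `round_trip_store` are hypothetical names; block ids are assumed to be positive, with 0 for empty and -1 for full provenance.

```c
#include <stdint.h>

enum { PROV_EMPTY = 0, PROV_FULL = -1, BLOCK_P = 1 };  /* block ids are positive */
enum Policy { ALWAYS_EMPTY, ALWAYS_FULL };

typedef struct { int prov; uint32_t addr; } Ptr;

/* Integer -> pointer cast under the chosen policy. */
static Ptr int_to_ptr(uint32_t i, enum Policy pol) {
    Ptr r = { pol == ALWAYS_FULL ? PROV_FULL : PROV_EMPTY, i };
    return r;
}

/* May ptr access the one-byte block `id` located at `base`? */
static int can_access(Ptr ptr, int id, uint32_t base) {
    if (ptr.addr != base) return 0;                   /* outside the block */
    return ptr.prov == PROV_FULL || ptr.prov == id;   /* empty never matches */
}

/* i = (int)p; p2 = (char*)i; *p2 = 10;
   Returns 1 if the final store is defined, 0 if it is UB. */
int round_trip_store(enum Policy pol) {
    Ptr p = { BLOCK_P, 0x100 };
    uint32_t i = p.addr;              /* ptr -> int keeps only the address */
    Ptr p2 = int_to_ptr(i, pol);
    return can_access(p2, BLOCK_P, 0x100);
}
```

Under `ALWAYS_EMPTY` the store is rejected, which matches the UB on the previous slide; under `ALWAYS_FULL` it is accepted.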

slide-86
SLIDE 86

Integer → Pointer Casting?

Options for (int*)0x100 when memory holds char p[1] at 0x100:

  1. Always Empty: (Ø,0x100)
  2. Depending on the Memory Layout: (p,0x100)
  3. Always Full: (∗,0x100)

slide-92
SLIDE 92

Depending on the Memory Layout: Reordering

Option 2, Depending on the Memory Layout, when the two statements are reordered:

    char *p = malloc(1)
    q = (int*)0x100          // q = (p,0x100): p occupies 0x100

    q = (int*)0x100          // q = (Ø,0x100): nothing is allocated at 0x100 yet
    char *p = malloc(1)

Problem: Movement of casts, or functions including them, is restricted
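A sketch of why a layout-dependent cast blocks reordering, assuming a toy heap that places its single one-byte block at 0x100. `Heap`, `toy_malloc`, `cast_by_layout`, and the two driver functions are hypothetical names used only for this illustration.

```c
#include <stdint.h>

enum { PROV_EMPTY = 0, BLOCK_P = 1 };
typedef struct { int prov; uint32_t addr; } Ptr;

/* Toy heap with room for one live one-byte block. */
typedef struct { int live; uint32_t base; } Heap;

/* Assumption for the sketch: the allocator always picks 0x100. */
static void toy_malloc(Heap *h) { h->live = 1; h->base = 0x100; }

/* Layout-dependent cast: provenance is whatever block lives at `a` right now. */
static Ptr cast_by_layout(const Heap *h, uint32_t a) {
    Ptr r = { (h->live && h->base == a) ? BLOCK_P : PROV_EMPTY, a };
    return r;
}

/* p = malloc(1); q = (int*)0x100;  returns the provenance of q */
int prov_malloc_then_cast(void) {
    Heap h = { 0, 0 };
    toy_malloc(&h);
    return cast_by_layout(&h, 0x100).prov;
}

/* q = (int*)0x100; p = malloc(1);  returns the provenance of q */
int prov_cast_then_malloc(void) {
    Heap h = { 0, 0 };
    Ptr q = cast_by_layout(&h, 0x100);
    toy_malloc(&h);
    return q.prov;
}
```

The two orders give `BLOCK_P` and `PROV_EMPTY` respectively, so swapping the cast and the allocation changes the meaning of the program.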

slide-94
SLIDE 94
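
Depending on the Memory Layout: Reordering

Option 3, Always Full, on the same reordering:

    char *p = malloc(1)
    q = (int*)0x100          // q = (∗,0x100) instead of (p,0x100)

    q = (int*)0x100          // q = (∗,0x100) instead of (Ø,0x100)
    char *p = malloc(1)

Both orders give q the same value, so the cast can be moved.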

slide-95
SLIDE 95

Problem 3

slide-102
SLIDE 102

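Problems with Full Provenance

    char p[1] = {0};
    f();
    print(p[0]);

Constant prop. rewrites print(p[0]) to print(0). With full provenance this is wrong if p = (p,0x100) and f() executes

    *(char*)(0x100)=1;

By guessing the address, f() obtains (∗,0x100) with full provenance, so the store writes 1 into p[0]: the source program prints 1, the optimized program prints 0.

Anyone can modify others’ local variables by

  1. Guessing their addresses &
  2. Acquiring full provenance via casting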

slide-106
SLIDE 106

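Our Solution

Basic Idea: Exploit Nondeterministic Allocation

    char p[1] = {0};
    f();                 // executes *(char*)(0x100)=1; through (∗,0x100)
    print(p[0]);         // constant prop. rewrites this to print(0)

  • Exec. 1: p = (p,0x100), so the store hits p[0] and the program would print 1
  • Exec. 2: p = (p,0x200), so the store is UB in Exec. 2: no object at 0x100

Because one possible execution raises UB and compilers assume input programs never raise UB, constant prop. to print(0) is justified.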

slide-112
SLIDE 112

More Formally, Twin Allocation

    char p[1] = {0};
    *(char*)(0x100) = 1;
    print(p[0]);

Memory diagram: p may be placed at 0x100 or at 0x200.

  • Exec. 1: p = (p,0x100)
  • Exec. 2: p = (p,0x200), so the store is UB in Exec. 2: inaccessible at 0x100

N.B. This argument works only for unobserved addresses
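The two-execution argument can be sketched by running the program once per candidate placement of p. This is a simplification under stated assumptions, not the formal model from the paper: `run_guess` and `program_has_ub` are hypothetical names, -1 encodes UB, and a store through a guessed address is treated as UB whenever no accessible object lives there.

```c
#include <stdint.h>

/* Outcome of one execution: -1 encodes UB, otherwise the printed value. */
int run_guess(uint32_t p_base) {
    uint8_t p0 = 0;                       /* char p[1] = {0};        */
    uint32_t guess = 0x100;               /* *(char*)(0x100) = 1;    */
    if (guess != p_base) return -1;       /* nothing accessible there: UB */
    p0 = 1;                               /* full provenance: the store hits p[0] */
    return p0;                            /* print(p[0]) */
}

/* The program counts as UB if any allowed placement of p makes it UB. */
int program_has_ub(void) {
    const uint32_t placements[2] = { 0x100, 0x200 };  /* Exec. 1, Exec. 2 */
    for (int k = 0; k < 2; k++)
        if (run_guess(placements[k]) == -1) return 1;
    return 0;
}
```

Exec. 1 would print 1, but Exec. 2 is UB, so `program_has_ub()` returns 1 and the compiler may assume the guessed store never happens.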

slide-118
SLIDE 118

Example with Observed Address

    char p[1] = {0};
    if (p == 0x100) {
      *(char*)(0x100) = 1;
    }
    print(p[0]);

  • Exec. 1: p = (p,0x100), the branch is taken and the store hits p[0]
  • Exec. 2: p = (p,0x200), the branch is not taken: No UB in Exec. 2

Consistent with common compilers’ assumption: Observed variables can be modified by others
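For contrast, the same kind of sketch for the observed-address program. `run_observed` and `observed_has_ub` are hypothetical names, and -1 again encodes UB; the comparison with 0x100 guards the store, so neither placement of p reaches the UB case.

```c
#include <stdint.h>

/* Returns the printed value, or -1 if the execution hits UB. */
int run_observed(uint32_t p_base) {
    uint8_t p0 = 0;                          /* char p[1] = {0};      */
    if (p_base == 0x100) {                   /* if (p == 0x100)       */
        uint32_t target = 0x100;             /* *(char*)(0x100) = 1;  */
        if (target != p_base) return -1;     /* unreachable: guarded by the test */
        p0 = 1;
    }
    return p0;                               /* print(p[0]) */
}

/* Returns 1 if either placement of p leads to UB, 0 otherwise. */
int observed_has_ub(void) {
    return run_observed(0x100) == -1 || run_observed(0x200) == -1;
}
```

Both executions are defined (they print 1 and 0 respectively), so the compiler cannot assume p[0] is still 0 once the address of p has been observed.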

slide-119
SLIDE 119

Miscompilation Revisited

The four programs share

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;
    if (iq == ip) { ... }

and differ only in the body of the if:

    *(char*)iq = 10;            print(q[0]);
        ↓ int. eq. prop.
    *(char*)(int)(p+1) = 10;    print(q[0]);
        ↓ cast elim.
    *(p+1) = 10;                print(q[0]);
        ↓ constant prop.
    *(p+1) = 10;                print(0);

slide-123
SLIDE 123

Miscompilation Revisited

The four programs share

    char p[1], q[1] = {0};
    int ip = (int)(p+1);
    int iq = (int)q;
    if (iq == ip) { ... }

and differ only in the body of the if:

    *(char*)iq = 10;            print(q[0]);
    *(char*)(int)(p+1) = 10;    print(q[0]);    // Can Access q[0] due to Full Prov.
        ↓ cast elim.
    *(p+1) = 10;                print(q[0]);    // Cannot Access q[0] due to Prov. p
    *(p+1) = 10;                print(0);

Ptr → Int → Ptr Cast Elimination Is Unsound: A Potential Performance Issue
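The difference between the two sides of cast elimination can be sketched as an access check on q. The names `can_access_q`, `access_via_round_trip`, and `access_via_p_plus_1` are hypothetical, and the layout (p at 0x100, q at 0x101) is the one used earlier in the talk.

```c
#include <stdint.h>

enum { PROV_FULL = -1, BLOCK_P = 1, BLOCK_Q = 2 };
typedef struct { int prov; uint32_t addr; } Ptr;

/* q is a one-byte block at 0x101, right after p at 0x100. */
static int can_access_q(Ptr ptr) {
    return ptr.addr == 0x101 && (ptr.prov == PROV_FULL || ptr.prov == BLOCK_Q);
}

/* (char*)(int)(p+1): the round trip yields full provenance. */
int access_via_round_trip(void) {
    Ptr p_plus_1 = { BLOCK_P, 0x101 };
    uint32_t i = p_plus_1.addr;          /* (int)(p+1) */
    Ptr back = { PROV_FULL, i };         /* (char*)i   */
    return can_access_q(back);
}

/* p+1 after cast elimination: the provenance is still p. */
int access_via_p_plus_1(void) {
    Ptr p_plus_1 = { BLOCK_P, 0x101 };
    return can_access_q(p_plus_1);
}
```

The round-tripped pointer may write q[0] while plain p+1 may not, so removing the cast pair turns a defined store into UB.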

slide-125
SLIDE 125

Solution to the Cast Elim. Problem

Reducing # of Int ↔ Ptr Casts

  • Most casts are introduced by compilers for convenience
  • We recovered performance by reducing unnecessary casts
      • Int → Ptr: 95% removed
      • Ptr → Int: 75% removed

The paper includes more details & a formal specification

slide-126
SLIDE 126

Implementation & Evaluation

  • We fixed LLVM 6.0 to be sound in our memory model
  • We had to change only 1.7K LOC in total
  • Benchmark Results
      • SPEC CPU2017: <0.1% avg, <0.5% max slowdown
      • LLVM Nightly Tests: <0.1% avg, <3% max slowdown
  • We verified key properties of our memory model in Coq

slide-127
SLIDE 127

Conclusion

  • We develop a memory model for IR which supports both low-level code & high-level optimizations
  • We use full provenance & twin allocation to reconcile them
  • Applying our model to LLVM has little impact on performance