Formal C semantics: CompCert and the C standard

SLIDE 1

Formal C semantics: CompCert and the C standard

Robbert Krebbers (1), Xavier Leroy (2), Freek Wiedijk (1)

(1) ICIS, Radboud University Nijmegen, The Netherlands
(2) Inria Paris-Rocquencourt, France

July 17, 2014 @ ITP, Vienna, Austria

SLIDE 3

Underspecification in C

◮ Unspecified behavior: two or more behaviors are allowed
  For example: order of evaluation in expressions
  Formal treatment: non-determinism

◮ Implementation-defined behavior: like unspecified behavior, but the compiler has to document its choice
  For example: size and endianness of integers
  Formal treatment: parametrization

◮ Undefined behavior: the standard imposes no requirements at all; the program is even allowed to crash
  For example: dereferencing a NULL or dangling pointer, signed integer overflow, ...
  Formal treatment: no semantics/crash state
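The three categories can be observed from inside a C program. The following is a minimal sketch, not part of the talk; the helper names (`note`, `first_operand_evaluated`, `is_little_endian`, `add_would_overflow`) are made up for illustration. It only asserts properties that should hold on any conforming implementation.

```c
#include <assert.h>
#include <limits.h>

/* Records the order in which calls to note() actually run. */
static int trace[2];
static int trace_len;

static int note(int id) {
    trace[trace_len++] = id;
    return id;
}

/* Unspecified behavior: the operands of + may be evaluated in either
   order, so this returns 1 or 2 depending on the compiler. */
static int first_operand_evaluated(void) {
    trace_len = 0;
    (void)(note(1) + note(2));
    return trace[0];
}

/* Implementation-defined behavior: the byte order of unsigned int. */
static int is_little_endian(void) {
    unsigned int one = 1;
    return *(unsigned char *)&one == 1;
}

/* Undefined behavior avoided: test for signed overflow before adding
   instead of performing the overflowing addition. */
static int add_would_overflow(int a, int b) {
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}

/* Checks that do not depend on the choices made by the implementation. */
static void check_underspecification(void) {
    int first = first_operand_evaluated();
    assert(first == 1 || first == 2);
    assert(is_little_endian() == 0 || is_little_endian() == 1);
    assert(add_would_overflow(INT_MAX, 1));
    assert(!add_would_overflow(1, 2));
}
```

A formal semantics has to account for both outcomes of `first_operand_evaluated`, fix a parameter for `is_little_endian`, and assign no behavior to the addition that `add_would_overflow` guards against.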

SLIDE 4

Pros and cons of underspecification

Pros for optimizing compilers:

◮ More optimizations are possible
◮ High run-time efficiency
◮ Easy to support multiple architectures

Cons for programmers/formal methods people:

◮ Portability and maintenance problems
◮ Hard to formally reason about

SLIDE 5

Approaches to underspecification

CompCert (Leroy et al.)

◮ Main goal: verified optimizing compiler in Coq
◮ Specific choices for unspecified/implementation-defined behavior
  For example: 32-bit ints
◮ Describes some undefined behavior
  For example: dereferencing NULL is undefined, but integer overflow is defined
◮ Compiler correctness proof only for programs without undefined behavior

Formalin (Krebbers & Wiedijk)

◮ Main goal: compiler-independent separation logic in Coq
◮ Describes some implementation-defined behavior
  For example: no legacy architectures with 1's complement
◮ Aims to describe all unspecified and undefined behavior

SLIDE 7

Defined behaviors in C11, Formalin and CompCert C

[Figure: diagram relating the behaviors defined in C11, Formalin and CompCert C; items shown:]

◮ comparing with end-of-array pointers
◮ byte-wise pointer copy
◮ subtle casts
◮ subtle type punning
◮ integer overflow
◮ aliasing violations
◮ sequence point violations
◮ use of dangling block scope pointers
◮ arithmetic on pointer bytes

This talk: add defined behaviors to CompCert so that we get Formalin ⊆ CompCert

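SLIDE 18

Comparing with end-of-array pointers (problem)

Useful:

    void inc_array(int *p, int n) {
        int *end = p + n;
        while (p < end) (*p++)++;
    }

[Figure: array x0, x1, ..., xn−1; p starts at x0 and end points just past xn−1. The loop increments each element in turn, giving x0 + 1, x1 + 1, ..., xn−1 + 1, and stops when p reaches end.]

Bizarre:

    int x, y;
    if (&x + 1 == &y) printf("x and y are adjacent\n");

[Figure: memory cells &x and &y, with &x + 1 pointing just past x. Is &x + 1 == &y?]

Both are undefined behavior in CompCert (1.12 and before).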
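To make the "useful" case concrete, here is a small harness around `inc_array`. This is a sketch added for illustration; `inc_array_demo` is a made-up name. The point is that the end pointer is formed and compared, but never dereferenced.

```c
#include <assert.h>

/* Increments each of the n elements starting at p. */
static void inc_array(int *p, int n) {
    int *end = p + n;   /* one past the last element: may be formed */
    while (p < end)     /* and compared with pointers into the same array */
        (*p++)++;       /* but end itself is never dereferenced */
}

/* Runs inc_array on a three-element array and checks the result. */
static void inc_array_demo(void) {
    int xs[3] = {10, 20, 30};
    inc_array(xs, 3);
    assert(xs[0] == 11 && xs[1] == 21 && xs[2] == 31);
}
```

A semantics that makes every comparison with `end` undefined would reject this loop, even though it is standard C programming practice.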

SLIDE 21

Comparing with end-of-array pointers (solution)

Solution: comparison of pointers is defined if:

◮ Same block: both should be within block bounds (the end-of-block pointer is allowed)
◮ Different blocks: both should be strictly within block bounds (the end-of-block pointer is excluded)

This is stable under compilation and gives a semantics to common programming practice with end-of-array pointers.
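The rule can be sketched as a predicate over a simplified memory model. The code below is an illustration under stated assumptions, not CompCert's definition: pointers are modeled as a block identifier plus a byte offset, and the names `model_ptr`, `mem_t`, `weakly_valid`, `strictly_valid` and `comparison_defined` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a pointer is a block identifier plus a byte offset. */
typedef struct { int block; size_t ofs; } model_ptr;

/* Hypothetical model: the sizes of the allocated blocks, by identifier. */
typedef struct { const size_t *size; int nblocks; } mem_t;

static bool block_exists(const mem_t *m, int b) {
    return b >= 0 && b < m->nblocks;
}

/* Strictly within bounds: ofs < size, so the pointer can be dereferenced. */
static bool strictly_valid(const mem_t *m, model_ptr p) {
    return block_exists(m, p.block) && p.ofs < m->size[p.block];
}

/* Within bounds: ofs <= size, which also admits the end-of-block pointer. */
static bool weakly_valid(const mem_t *m, model_ptr p) {
    return block_exists(m, p.block) && p.ofs <= m->size[p.block];
}

/* The rule from the slide: same block needs both pointers within bounds,
   different blocks need both pointers strictly within bounds. */
static bool comparison_defined(const mem_t *m, model_ptr p, model_ptr q) {
    if (p.block == q.block)
        return weakly_valid(m, p) && weakly_valid(m, q);
    return strictly_valid(m, p) && strictly_valid(m, q);
}

/* Blocks 0 and 1 model two 4-byte objects x and y. */
static void comparison_demo(void) {
    const size_t sizes[] = {4, 4};
    const mem_t m = {sizes, 2};
    const model_ptr x = {0, 0}, x_end = {0, 4}, y = {1, 0};
    assert(comparison_defined(&m, x, x_end));   /* p < end in inc_array */
    assert(!comparison_defined(&m, x_end, y));  /* &x + 1 == &y */
    assert(comparison_defined(&m, x, y));
}
```

Under this reading, the loop condition in `inc_array` becomes defined, while the "bizarre" comparison of `&x + 1` with `&y` stays undefined because an end-of-block pointer is compared across blocks.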

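SLIDE 25

Byte-wise copying of objects (problem)

    struct { short x; short *r; } s1 = {10, &s1.x}, s2;
    unsigned char *p = (unsigned char *)&s1, *q = (unsigned char *)&s2;
    unsigned char *end = p + sizeof(s1);
    while (p < end) *q++ = *p++;   /* copy s1 into s2, byte by byte */

[Figure: bytes of s1: 0x0a 0x00, followed by the four pointer bytes (b_s1, 0)_0 (b_s1, 0)_1 (b_s1, 0)_2 (b_s1, 0)_3, with p and end marking the range being copied. Bytes of s2 copied so far: 0x0a 0x00, with q marking the next byte to write.]

Previously undefined: need to allow copying indeterminate bytes (such as padding between x and r)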

SLIDE 31

Byte-wise copying of objects (problem)

    struct { short x; short *r; } s1 = {10, &s1.x}, s2;
    unsigned char *p = (unsigned char *)&s1, *q = (unsigned char *)&s2;
    unsigned char *end = p + sizeof(s1);
    while (p < end) *q++ = *p++;   /* copy s1 into s2, byte by byte */

[Figure: bytes of s1: 0x0a 0x00, followed by the four pointer bytes (b_s1, 0)_0 (b_s1, 0)_1 (b_s1, 0)_2 (b_s1, 0)_3, with p having reached end. Bytes of s2 after the copy: 0x0a 0x00, followed by the same four pointer bytes (b_s1, 0)_0 ... (b_s1, 0)_3.]

Previously undefined: need to allow copying symbolic pointer bytes
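A runnable version of the byte-wise copy, as a sketch for illustration; `struct S`, `copy_bytes` and `copy_demo` are made-up names. It checks that the pointer stored in the copy still refers to `s1.x`, which is the behavior the semantics needs to justify.

```c
#include <assert.h>
#include <stddef.h>

struct S { short x; short *r; };

/* Copies n bytes from src to dst, one unsigned char at a time. */
static void copy_bytes(void *dst, const void *src, size_t n) {
    unsigned char *q = dst;
    const unsigned char *p = src;
    const unsigned char *end = p + n;   /* end-of-object pointer */
    while (p < end)
        *q++ = *p++;
}

/* Copies a struct that contains a pointer and padding, then checks
   that both fields of the copy behave like the original. */
static void copy_demo(void) {
    struct S s1 = {10, NULL}, s2;
    s1.r = &s1.x;
    copy_bytes(&s2, &s1, sizeof s1);
    assert(s2.x == 10);
    assert(s2.r == &s1.x);   /* the pointer survives the byte-wise copy */
    assert(*s2.r == 10);
}
```

The loop reads every byte of `s1` through an `unsigned char` pointer, including any padding bytes and the individual bytes of the pointer `r`, so a semantics with abstract pointer values has to give a meaning to each of those reads and writes.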

SLIDE 32

Byte-wise copying of objects (solution)

Solution: extend values with pointer fragment values

    Inductive val: Type :=
      | Vundef: val
      | Vint: int -> val
      | Vlong: int64 -> val
      | Vfloat: float -> val
      | Vptr: block -> int -> val
      | Vptrfrag: block -> int -> nat -> val.

Subtleties:

◮ Dealing with arithmetic on pointer fragments
◮ Dealing with implicit casts (at assignments)
◮ More values possible, need to extend static analysis
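The idea behind `Vptrfrag` can be mimicked in C with a tagged record. This is a rough sketch and not CompCert's implementation: the type `val`, the constant `PTR_BYTES` (4 bytes, matching the four fragments in the diagrams) and the functions `ptr_to_frags` and `frags_to_ptr` are assumptions, and the `Vlong` and `Vfloat` cases are left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { V_UNDEF, V_INT, V_PTR, V_PTRFRAG } val_tag;

/* Simplified value: which fields are meaningful depends on the tag. */
typedef struct {
    val_tag tag;
    int32_t n;       /* V_INT: the integer */
    int block;       /* V_PTR, V_PTRFRAG: block identifier */
    int32_t ofs;     /* V_PTR, V_PTRFRAG: offset within the block */
    unsigned index;  /* V_PTRFRAG: which byte of the pointer this is */
} val;

#define PTR_BYTES 4u  /* assumed pointer size in this model */

/* Splits a pointer value (tag V_PTR) into one fragment per byte. */
static void ptr_to_frags(val p, val frags[PTR_BYTES]) {
    for (unsigned i = 0; i < PTR_BYTES; i++) {
        frags[i].tag = V_PTRFRAG;
        frags[i].n = 0;
        frags[i].block = p.block;
        frags[i].ofs = p.ofs;
        frags[i].index = i;
    }
}

/* Reassembles a pointer only if all fragments belong to the same
   pointer and appear in order; otherwise the result is undefined. */
static val frags_to_ptr(const val frags[PTR_BYTES]) {
    val undef = {V_UNDEF, 0, 0, 0, 0};
    for (unsigned i = 0; i < PTR_BYTES; i++) {
        if (frags[i].tag != V_PTRFRAG || frags[i].index != i ||
            frags[i].block != frags[0].block || frags[i].ofs != frags[0].ofs)
            return undef;
    }
    val p = {V_PTR, 0, frags[0].block, frags[0].ofs, 0};
    return p;
}

/* Round trip succeeds; swapping two fragments yields an undefined value. */
static void ptrfrag_demo(void) {
    val p = {V_PTR, 0, 7, 0, 0};
    val bytes[PTR_BYTES];
    ptr_to_frags(p, bytes);
    val r = frags_to_ptr(bytes);
    assert(r.tag == V_PTR && r.block == 7 && r.ofs == 0);
    val tmp = bytes[0];
    bytes[0] = bytes[1];
    bytes[1] = tmp;
    assert(frags_to_ptr(bytes).tag == V_UNDEF);
}
```

Keeping the byte index in each fragment is what lets a byte-wise copy reconstruct the original pointer, while a shuffled or partial copy yields no pointer at all.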

SLIDE 33

Conclusion and future work

◮ Semantics for useful behaviors that were previously undefined
  ◮ Comparing with end-of-array pointers
  ◮ Byte-wise pointer copy
◮ CompCert proofs adapted for these extensions
  ◮ Small changes to the semantics
  ◮ Involves proofs of many compilation passes
◮ Needed for cross validation of CompCert and Formalin
◮ Future work: call-by-reference passing of struct values

SLIDE 34

Questions

Sources: http://github.com/robbertkrebbers