The C Language: It Is Not What You Think It Is!

slide-1
SLIDE 1

The C Language: It Is Not What You Think It Is!

Mark Batty and Peter Sewell

University of Cambridge Joint work of Jade Alglave, Mark Batty, Luc Maranget, Robin Morisset, Justus Matthiesen, Kayvan Memarian, Paul McKenney, Scott Owens, Pankaj Pawan, Susmit Sarkar, Peter Sewell, Francesco Zappa Nardelli, and others.

Linux Collaboration Summit, April 2013

  • p. 1
slide-2
SLIDE 2
  • p. 2
slide-3
SLIDE 3

C is?

  • What the Standards say: ISO/IEC 9899:1999, ISO/IEC 9899:2011, endless DRs and internet discussions
  • What the existing compilers provide
  • ...and what future compilers will provide: “the newer version of Clang is stricter about certain things” [Behan Webster, 2.15pm]
  • What the corpus of C code requires
  • What programmers believe
  • What analysis and verification tools assume

  • p. 3
slide-4
SLIDE 4

A Simple Message-Passing Example (MP)

Initial state: data=0 flag=0

Thread 0        Thread 1
data=1          while (flag != 1) {}    (if this reads 1...)
flag=1          print data              (...could this read 0?)

Test MP (execution diagram):
  Thread 0: a: W[x]=1 --po--> b: W[y]=1
  Thread 1: c: R[y]=1 --po--> d: R[x]=0
  rf: b to c; d reads x from the initial state

  • p. 4
slide-5
SLIDE 5

A Simple Message-Passing Example (MP)


x86: ./runMP.sh

  • p. 4
slide-6
SLIDE 6

A Simple Message-Passing Example (MP)


x86: ./runMP.sh

POWER and ARM:

              POWER                            ARM
       Kind   PowerG5    Power6    Power7      Tegra2    Tegra3    APQ8060   A5X
MP     Allow  10M/4.9G   6.5M/29G  1.7G/167G   40M/3.8G  138k/16M  61k/552M  437k/185M

  • p. 4
slide-7
SLIDE 7

A Simple Message-Passing Example (MP)


GCC and HotSpot: observable (in some contexts)

  • p. 4
slide-8
SLIDE 8

A Simple Message-Passing Example (MP)


GCC and HotSpot: observable (in some contexts)
C/C++11 standards: undefined program

  • p. 4
slide-9
SLIDE 9

Explaining MP

Have to deal with h/w relaxations (for multiple architectures) and compiler optimisations

Initial state: data=0 flag=0

Thread 0        Thread 1
data=1          while (flag != 1) ;     (if this reads 1...)
flag=1          print data              (...could this read 0?)

Test MP (execution diagram):
  Thread 0: a: W[x]=1 --po--> b: W[y]=1
  Thread 1: c: R[y]=1 --po--> d: R[x]=0
  rf: b to c; d reads x from the initial state

  • p. 5
slide-10
SLIDE 10

Explaining MP

Have to deal with h/w relaxations (for multiple architectures) and compiler optimisations

Initial state: data=0 flag=0

Thread 0        Thread 1
                r = data
data=1          while (flag != 1) ;     (if this reads 1...)
flag=1          print data              (...could this read 0?)

Test MP (execution diagram):
  Thread 0: a: W[x]=1 --po--> b: W[y]=1
  Thread 1: c: R[y]=1 --po--> d: R[x]=0
  rf: b to c; d reads x from the initial state

  • p. 5
slide-11
SLIDE 11

A Less Simple Example

               POWER                            ARM
       Kind    PowerG5    Power6    Power7      Tegra2    Tegra3    APQ8060   A5X
PPOCA  Allow   1.1k/3.4G  0/49G     175k/157G   0/24G     0/39G     233/743M  0/2.2G
PPOAA  Forbid  0/3.4G     0/46G     0/209G      0/24G     0/39G     0/26G     0/2.2G

Test PPOCA (execution diagram):
  Thread 0: a: W[z]=1 --dmb/sync--> b: W[y]=1
  Thread 1: c: R[y]=1 --ctrl--> d: W[x]=1 --rf--> e: R[x]=1 --addr--> f: R[z]=0
  rf: b to c; f reads z from the initial state

  • p. 6
slide-12
SLIDE 12

Options

  • 1. program w.r.t. particular h/w architecture and compiler optimisation knowledge
  • 2. define macros that try to abstract (de facto Linux std)
  • 3. incorporate into language semantics (C/C++11)
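
Option 2 might look roughly like the sketch below. The macro and function names (MY_ACCESS_ONCE, my_smp_wmb, my_smp_rmb, mp_writer, mp_reader) are hypothetical, chosen only to echo the Linux style. The definitions are assumptions too: they use the GCC/Clang __typeof__ and __atomic_thread_fence extensions as a portable stand-in for per-architecture barrier instructions. Under C/C++11 the volatile accesses are still formally racy when used across threads, so this relies on what compilers and hardware do in practice.

```c
/* Hypothetical macros in the style of the Linux kernel's abstractions.
 * Assumes GCC or Clang: __typeof__ and __atomic builtins are extensions. */
#define MY_ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))
#define my_smp_wmb() __atomic_thread_fence(__ATOMIC_RELEASE)
#define my_smp_rmb() __atomic_thread_fence(__ATOMIC_ACQUIRE)

static int data = 0;
static int flag = 0;

/* Writer side of MP: publish data, then raise the flag. */
void mp_writer(void) {
    MY_ACCESS_ONCE(data) = 1;
    my_smp_wmb();                 /* keep the data write before the flag write */
    MY_ACCESS_ONCE(flag) = 1;
}

/* Reader side of MP: returns 1 and stores data in *out if the flag
 * was seen set, otherwise returns 0 and leaves *out untouched. */
int mp_reader(int *out) {
    if (MY_ACCESS_ONCE(flag) != 1)
        return 0;
    my_smp_rmb();                 /* keep the flag read before the data read */
    *out = MY_ACCESS_ONCE(data);
    return 1;
}
```

In a real program mp_writer and mp_reader would run on different threads, with the reader retrying until it returns 1.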
  • p. 7
slide-13
SLIDE 13

A data-race-free model

H/W memory models must define the behaviour of all programs.

For C/C++, we can deem racy programs to be bad, saying:
  • programs that are race-free in sequentially consistent (SC) semantics have SC behaviour
  • programs that have a race in some execution in SC semantics can behave in any way at all

...and then add several tiers of low-level atomics, each with more performance (and more relaxed behaviour) than the last: SC, release/acquire, release/consume, relaxed.
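
As an illustration of the release/acquire tier, the MP example can be written with C11 atomics so that it is race-free. This is a minimal sketch, not taken from the talk: it assumes POSIX threads are available, and the names writer, reader and run_mp_once are made up for the example. Only flag is atomic; data stays an ordinary int.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int data = 0;            /* ordinary, non-atomic location */
static atomic_int flag = 0;     /* synchronisation variable */

static void *writer(void *arg) {
    (void)arg;
    data = 1;
    /* Release store: the write to data is ordered before it. */
    atomic_store_explicit(&flag, 1, memory_order_release);
    return NULL;
}

static void *reader(void *arg) {
    int *seen = arg;
    /* Acquire load: once it reads 1, the writer's data=1 is visible. */
    while (atomic_load_explicit(&flag, memory_order_acquire) != 1) {
        /* spin */
    }
    *seen = data;
    return NULL;
}

/* Runs MP once and returns the value of data observed by the reader. */
int run_mp_once(void) {
    pthread_t t0, t1;
    int seen = -1;
    data = 0;
    atomic_store(&flag, 0);
    pthread_create(&t1, NULL, reader, &seen);
    pthread_create(&t0, NULL, writer, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return seen;
}
```

With the release/acquire pair in place the reader cannot observe data=0. Weakening both accesses to memory_order_relaxed would make the read of data racy again, and the program undefined.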

  • p. 8
slide-14
SLIDE 14

What we’ve done

  • clarify multicore h/w behaviour: x86, ARM, IBM Power
  • clarify C/C++11 concurrency semantics and standards

Central artifacts: mathematical reference models
  • describe the set of all allowed behaviours of a program
  • mathematically precise (might be wrong, but not ambiguous)
  • as clear as possible
  • executable as test oracle (for small programs)
  • sound (w.r.t. implementations or standards, variously)

  • p. 9
slide-15
SLIDE 15

Using the models

  • cppmem tool
  • C/C++11 concurrency compilation scheme
  • (research) compiler verification
  • (production) compiler testing

  • p. 10
slide-16
SLIDE 16

Testing Compilers vs Semantics

[Robin Morisset, Pankaj Pawan, Francesco Zappa Nardelli]

Reduce problem to testing compilers on sequential programs:

[Diagram: a random sequential program is run under the reference semantics (GCC -O0), giving a reference trace, and is compiled with GCC -O2 to an opt executable, which is executed to give an opt trace. The two traces are then compared: sound-transformation?]

A transformation is sound if it does not introduce any behavior in any non-racy context. [PLDI 2013]

  • p. 11
slide-17
SLIDE 17

Compiler Concurrency Bug

int g1 = 1;
int g2 = 0;

int func1(void) {
  int l;
  for (l = 0; (l != 4); l++) {
    if (g1)
      return l;
    for (g2 = 0; (g2 >= 26); ++g2)
      ;
  }
}

int main (int argc, char* argv[]) {
  func1();
}

miscompiled by GCC 4.7.0 with -O2
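
To see why such a miscompilation only shows up under concurrency, here is a hand-written sketch. It is not actual GCC output, and the assumption that the problem is an introduced store is ours, not stated on the slide. func1_source follows the slide, with a final return added so the result is always defined. func1_transformed is a hypothetical rewrite that gives the same results when run alone, but writes g2 even on the path where the source never touches it.

```c
int g1 = 1;
int g2 = 0;

/* As on the slide (plus a final return): when g1 != 0 this returns
 * immediately and performs no access to g2 at all. */
int func1_source(void) {
    int l;
    for (l = 0; l != 4; l++) {
        if (g1)
            return l;
        for (g2 = 0; g2 >= 26; ++g2)
            ;
    }
    return l;
}

/* Hypothetical transformed version (illustration only, NOT compiler
 * output): same sequential results, but it introduces a load and a
 * store of g2 on the g1 != 0 path. */
int func1_transformed(void) {
    int saved = g2;       /* introduced load */
    if (g1) {
        g2 = saved;       /* introduced store: writes back the old value */
        return 0;
    }
    g2 = 0;
    return 4;
}
```

If another thread writes g2 while g1 is non-zero, the source program is race-free but the transformed one can overwrite that thread's value. That is new behaviour in a non-racy context.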

  • p. 12
slide-18
SLIDE 18

Csem: Scaling up to more of C?

Remember, we want a usable reference model. Formalise the Standard?

  • p. 13
slide-19
SLIDE 19

Csem: Scaling up to more of C?

Remember, we want a usable reference model. Formalise the Standard? Maybe as a starting point — but immediately hit cases where ISO standard and de facto standard differ.

  • p. 13
slide-20
SLIDE 20

Simple Questions

PC.1 In practice, can one compare (with <, >, <=, or >=) two pointers to separately allocated objects (of compatible object types)?

  • p. 14
slide-21
SLIDE 21

Simple Questions

PC.1 In practice, can one compare (with <, >, <=, or >=) two pointers to separately allocated objects (of compatible object types)?

Asked 11 experts:
  • 4: used and supported in practice
  • 2: not useful, but compilers do support it
  • 4: should not be used; compilers might well not support it
  • 1: don’t know

PC.1 (standard) Is it allowed by the C standard?
  • 1: allowed
  • 6: not allowed
  • 3: don’t know

  • p. 14
slide-22
SLIDE 22

Simple Questions

PC.1 In practice, can one compare (with <, >, <=, or >=) two pointers to separately allocated objects (of compatible object types)? “Again, memory allocator implementations do this a lot. Also, this is used in practice to avoid deadlock – if a pair of locks to be acquired have the same address, only acquire one of them. (In fact, I am currently working on a patch that does just this.)”
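
The lock-ordering idiom in the quote can be sketched as below. This is an assumed illustration, not the respondent's patch: order_locks_by_address is a hypothetical helper. It compares addresses after converting them to uintptr_t rather than applying < directly to unrelated pointers. That conversion is implementation-defined and uintptr_t is an optional type, so the sketch assumes a flat address space.

```c
#include <stddef.h>
#include <stdint.h>

/* Puts the two lock addresses into out[] in a globally consistent order
 * and returns how many distinct locks there are (1 or 2). Acquiring
 * locks in this order on every thread avoids lock-order deadlock. */
size_t order_locks_by_address(void *a, void *b, void *out[2]) {
    if (a == b) {                 /* == on pointers is always permitted */
        out[0] = a;
        out[1] = NULL;
        return 1;                 /* same lock: acquire it only once */
    }
    /* Compare integer values of the addresses (implementation-defined). */
    if ((uintptr_t)a < (uintptr_t)b) {
        out[0] = a;
        out[1] = b;
    } else {
        out[0] = b;
        out[1] = a;
    }
    return 2;
}
```

A caller would lock out[0] and then, if the function returned 2, lock out[1].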

  • p. 14
slide-23
SLIDE 23

Simple Questions

PC.1 In practice, can one compare (with <, >, <=, or >=) two pointers to separately allocated objects (of compatible object types)? “May produce inconsistent results in practice if p and q straddle the exact middle of the address space. We’ve run into practical problems with this.”
  • p. 14
slide-24
SLIDE 24

Simple Basic Questions

PR.1.a Given two separately allocated objects of the same type, can a pointer to the first be turned into a usable pointer to the second, by pointer arithmetic together with a check that the result has an identical representation to that of another usable pointer to the second?

#include <stdio.h>
#include <string.h>
#include <stdint.h>

int y = 2, x = 1;

int main() {
  intptr_t offset = &y - &x;                  // ?
  int *p = &x + offset, *q = &y;
  if (memcmp(&p, &q, sizeof(p)) == 0) {
    *p = 11;                                  // ?
    printf("x=%d y=%d *p=%d *q=%d",x,y,*p,*q);
  }
}

  • p. 15
slide-25
SLIDE 25

Simple Basic Questions


PR.1.a, compiled and run:

$gcc-4.7 -std=c11 -pedantic -Wall -O1: x=1 y=11 *p=11 *q=11
$gcc-4.7 -std=c11 -pedantic -Wall -O2: x=1 y=2 *p=11 *q=2

  • p. 15
slide-26
SLIDE 26

moral: what matters is not just what happens to the bits at runtime, but what the compiler analysis assumes

  • p. 16
slide-27
SLIDE 27

Basic Questions

ET.1 If one uses a pointer to a struct to initialise a single member of the struct in a malloc’d region, does

  • 1. the footprint of the whole struct take on the struct type as its effective type, or
  • 2. the footprint of the member take on an effective type of that struct member, or
  • 3. the footprint of the member take on an effective type as the type of that struct member?

  • p. 17
slide-28
SLIDE 28

Basic Questions

ET.1: 11 don’t know

ET.1 And according to the C standard? 11 don’t know

“-fno-strict-aliasing, the flag that makes CPU-bound code 1-5% faster in exchange for making it not work...”
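
The situation ET.1 asks about can be written down concretely. The sketch below only sets up the scenario and does not answer the question. struct pair and init_one_member are hypothetical names, and the code sticks to the part everyone agrees on: reading back the member that was just written.

```c
#include <stdlib.h>

struct pair {
    int a;
    float b;
};

/* Allocates an untyped region, initialises a single member through a
 * struct pointer, and returns that member (or -1.0f if malloc fails). */
float init_one_member(void) {
    struct pair *p = malloc(sizeof *p);
    if (p == NULL)
        return -1.0f;
    p->b = 2.5f;      /* ET.1: which bytes now have which effective type? */
    float r = p->b;   /* reading the member just written is uncontroversial */
    free(p);
    return r;
}
```

The three options differ in what later accesses to the same region are allowed, for example whether the bytes of member a may still be written through a pointer of some unrelated type.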

  • p. 17
slide-29
SLIDE 29

Basic Questions

Compiler writers want to exploit unspecified values – but how much?

TR.1 Where (if at all) do any current implementations of C have trap representations?

UV.1 In the absence of any writes, is an unspecified value stable? I.e., will multiple reads of it always return the same value?

UV.2 Is computation strict with respect to unspecified values? I.e., if an argument of a unary or binary operator is an unspecified value, is the result also an unspecified value?

UV.3 Are the representation bytes of an unspecified value also unspecified values?

  • p. 18
slide-30
SLIDE 30

#include <stdio.h>

int main() {
  int a[5];
  int i;
  int x=0;
  for (i=0; i<10; i++) {
    x = x+a[i];
  };
  printf("x = %d\n",x);
}

[example from Greta Yorsh]

  • p. 19
slide-31
SLIDE 31

.L branch L

  • p. 20
slide-32
SLIDE 32

How can we investigate the de facto standard?

  • What the Standards say: ISO/IEC 9899:1999, ISO/IEC 9899:2011, endless DRs and internet discussions
  • What the corpus of C code requires
  • What the existing compilers provide
  • ...and what future compilers will provide
  • What programmers believe
  • What analysis and verification tool writers assume

  • p. 21
slide-33
SLIDE 33

Survey (first draft)

www.cl.cam.ac.uk/users/pes20/csurvey/viewform2.html (Remember to click “Submit” at the end. Partial responses are ok)
  • p. 22
slide-34
SLIDE 34

How can we investigate the de facto standard?

What the corpus of C code requires

  • p. 23
slide-35
SLIDE 35

Conclusion

  • A lot of mileage from usable reference models
  • Clarified subtle concurrency behaviour in x86, ARM, Power, C/C++11 (arising from h/w and s/w optimisations)
  • Industry engagement with semantic models for these key abstractions

Questions:
  • de facto standard concurrency models?
  • scaling up to full languages?
  • portability testing?

  • p. 24
slide-36
SLIDE 36

The End

Thanks:

Jade Alglave, Luc Maranget, Justus Matthiesen, Kayvan Memarian, Robin Morisset, Scott Owens, Pankaj Pawan, Jean Pichon, Susmit Sarkar, Tjark Weber, Derek Williams, Francesco Zappa Nardelli,... (variously Cambridge, INRIA, IBM, UCL, UPenn,...)

P.S. We’re recruiting: Rigorous Engineering for Mainstream Systems, 2013-2019

  • p. 25