The CompCert Memory Model
Because we need some way to understand how C works
Outline
The general idea
Version 1
Limitations
Version 2
Problems solved?
Blocks and memory states
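The idea: blocks (version 1)
A block b has a lower bound l = low_bound m b and an upper bound h = high_bound m b
A pointer is a pair (b, i): a block b and an offset i within it
A memory state m is a collection of separate blocks b, b′, b′′, ...
Operations take one memory state to the next: m ⇒ store m τ b i v ⇒ m′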
Values, alignment and operations
τ – the type
τ ∷= Mint8signed | Mint8unsigned | Mint16signed | Mint16unsigned | Mint32 | Mfloat32 | Mfloat64
|τ| is the size of τ in bytes
⟨τ⟩ is the natural alignment of τ
v – the value
v ∷= Vint i | Vfloat f | Vptr (b, i) | Vundef
store m τ b i v = Some m′
mem → memory_chunk → block → Z → val → option mem
Stores a value v with type τ in block b at offset i
Returns Some m′ if it succeeds
Returns None if it fails
So it can fail! When?
store m τ b i v = Some m′ when all three of these hold:
⟨τ⟩ divides i (i is aligned correctly)
low_bound m b ≤ i (it doesn’t start too early)
i + |τ| ≤ high_bound m b (it doesn’t end too late)
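The three conditions can be sketched as a small C predicate. This is only an illustration, not CompCert code: the names `chunk_size`, `chunk_align` and `valid_access` are hypothetical, the sizes are the usual ones for these chunk types, and the alignment is simply assumed to equal the size.

```c
#include <assert.h>
#include <stdbool.h>

/* The chunk types listed on the slide. */
typedef enum {
    Mint8signed, Mint8unsigned, Mint16signed, Mint16unsigned,
    Mint32, Mfloat32, Mfloat64
} memory_chunk;

/* Size of a chunk in bytes. */
long chunk_size(memory_chunk t) {
    switch (t) {
    case Mint8signed:  case Mint8unsigned:  return 1;
    case Mint16signed: case Mint16unsigned: return 2;
    case Mint32:       case Mfloat32:       return 4;
    case Mfloat64:                          return 8;
    }
    assert(0 && "unknown chunk");
    return 0;
}

/* Natural alignment. ASSUMPTION: taken equal to the size. */
long chunk_align(memory_chunk t) { return chunk_size(t); }

/* True when an access of chunk t at offset i fits a block with bounds [lo, hi). */
bool valid_access(long lo, long hi, memory_chunk t, long i) {
    return i % chunk_align(t) == 0      /* i is aligned correctly     */
        && lo <= i                      /* it doesn't start too early */
        && i + chunk_size(t) <= hi;     /* it doesn't end too late    */
}
```

For a block with bounds 0 and 8, `valid_access(0, 8, Mint32, 4)` holds, while offsets 6 (misaligned) and 8 (ends too late) are rejected.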
load m τ b i = Some v
mem → memory_chunk → block → Z → option val
Reads a value of type τ from block b at offset i
Returns Some v if it succeeds
Returns None if it fails
So it can fail! When?
Firstly: it must satisfy the bounds and alignment constraints
Secondly...
alloc m l h = (m′, b)
mem → Z → Z → mem × block
Creates a block b with boundaries l and h
Leaves all other blocks unchanged, so:
load m′ τ′ b′ i′ = load m τ′ b′ i′ when b′ ≠ b
If a load from the new block succeeds after an alloc, the value is Vundef:
load m′ τ b i = Some v → v = Vundef
free m b = m′
mem → block → mem
Deallocates block b, giving it bounds l = 0 and h = 0
Returns the updated memory state m′
Leaves all other blocks unchanged, so:
load m′ τ′ b′ i′ = load m τ′ b′ i′ when b′ ≠ b
A load from the freed block will not succeed after a free:
load m′ τ b i = None
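The alloc and free properties above can be tried out on a toy model. The sketch below is an assumption-laden illustration, not the CompCert definitions: blocks are numbered array slots, memory states are passed by value, stores and alignment are left out (so every successful load sees Vundef), and all names (`empty_mem`, `free_block`, `LOAD_VUNDEF`, ...) are hypothetical.

```c
#include <assert.h>

#define MAX_BLOCKS 16   /* ASSUMPTION: a small fixed number of blocks */

typedef struct { long lo, hi; } bounds;                    /* block covers [lo, hi) */
typedef struct { bounds blk[MAX_BLOCKS]; int next; } mem;  /* next = first unused id */

typedef enum { LOAD_NONE, LOAD_VUNDEF } load_result;

mem empty_mem(void) { mem m = { .next = 0 }; return m; }

/* alloc m l h = (m', b): returns m' and writes the fresh block id to *b. */
mem alloc(mem m, long l, long h, int *b) {
    assert(m.next < MAX_BLOCKS);
    *b = m.next;
    m.blk[m.next].lo = l;
    m.blk[m.next].hi = h;
    m.next++;
    return m;
}

/* free m b = m': the block keeps its id, its bounds become 0 and 0. */
mem free_block(mem m, int b) {
    assert(0 <= b && b < m.next);
    m.blk[b].lo = 0;
    m.blk[b].hi = 0;
    return m;
}

/* Load of `size` bytes at (b, i). Only the bounds check is modelled;
   nothing is ever stored here, so a successful load yields Vundef. */
load_result load(mem m, long size, int b, long i) {
    if (b < 0 || b >= m.next) return LOAD_NONE;
    if (m.blk[b].lo <= i && i + size <= m.blk[b].hi) return LOAD_VUNDEF;
    return LOAD_NONE;
}
```

Because a freed block has bounds 0 and 0, no access of positive size can fit in it, which is exactly why a load after a free returns None in this model.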
store m τ b i v = Some m′
mem → memory_chunk → block → Z → val → option mem
Stores stuff, returns new state or None
Again, leaves all other blocks unchanged, so:
load m′ τ′ b′ i′ = load m τ′ b′ i′ when b′ ≠ b
...but also leaves the rest of the block unchanged!
And there is more!
When something is actually loaded from the right place, with the right type, it all goes as expected!
load m′ τ′ b i = Some v
...well, almost. The stored value is converted to the chunk type on the way:
load m′ τ′ b i = Some (convert τ′ v) if τ′ = τ
And there is more!
When something is loaded from the right place, but with the wrong type...
load m′ τ′ b i = Some v and τ′ ≠ τ → v = Vundef
And there is more!
When something is just loaded from the wrong place, overlapping the stored value...
load m′ τ′ b i′ = Some v and i′ ≠ i and i′ + |τ′| > i and i + |τ| > i′ → v = Vundef
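The load-after-store cases within one block come down to comparing two byte ranges. A minimal sketch, with hypothetical names (`classify`, `SAME_OFFSET`, ...) and with sizes passed in directly instead of chunk types:

```c
#include <assert.h>

typedef enum { SAME_OFFSET, DISJOINT, OVERLAPPING } relation;

/* Relates a load of n2 bytes at offset i2 to an earlier store of
   n1 bytes at offset i1 in the same block. */
relation classify(long i1, long n1, long i2, long n2) {
    assert(n1 > 0 && n2 > 0);
    if (i2 == i1)
        return SAME_OFFSET;   /* right place: result depends on the type  */
    if (i2 + n2 <= i1 || i1 + n1 <= i2)
        return DISJOINT;      /* rest of the block: old contents are kept */
    return OVERLAPPING;       /* wrong place: the load yields Vundef      */
}
```

For a 4-byte store at offset 4, a 2-byte load at offset 6 is classified as `OVERLAPPING`, while a 4-byte load at offset 8 is `DISJOINT`.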
alloc m l h = (m′, b)       mem → Z → Z → mem × block
store m τ b i v = Some m′   mem → memory_chunk → block → Z → val → option mem
load m τ b i = Some v       mem → memory_chunk → block → Z → option val
free m b = m′               mem → block → mem
Why it kinda sucks
What it can do:
Type compatibility and type casting

    int x = 3;
    *((int *) (double *) &x) = 4;
Second line: same as “x = 4;”

    struct {int x, y, z;} s;
    s.y = 42;
    ((int *) &s)[1] = 42;
    *((int *) ((char *) &s + sizeof(int))) = 42;
Bottom 3 lines: equivalent

    union point3d {
        struct {double x, y, z;} s;
        double d[3];
    } p;
p.s.x, p.s.y and p.s.z are also reachable as p.d[0], p.d[1] and p.d[2]
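The struct and union equivalences can be checked on an ordinary C compiler. The sketch below is an illustration only: the function names are hypothetical, and it assumes the usual layout in which `y` sits exactly `sizeof(int)` bytes into the struct. The `double *` cast example is left out because its behaviour depends on alignment.

```c
#include <assert.h>
#include <stddef.h>

struct triple { int x, y, z; };

union point3d {
    struct { double x, y, z; } s;
    double d[3];
};

/* Writes s.y by the three routes from the slide; returns 1 if every
   write is visible through s.y. */
int struct_routes_agree(void) {
    struct triple s = {0, 0, 0};
    /* The pointer arithmetic below presumes this layout. */
    assert(offsetof(struct triple, y) == sizeof(int));

    s.y = 1;
    int ok = (s.y == 1);
    ((int *) &s)[1] = 2;
    ok = ok && (s.y == 2);
    *((int *) ((char *) &s + sizeof(int))) = 3;
    ok = ok && (s.y == 3);
    return ok;
}

/* Writes through the struct view, reads back through the array view. */
int union_views_agree(void) {
    union point3d p;
    p.s.x = 1.0;
    p.s.y = 2.0;
    p.s.z = 3.0;
    return p.d[0] == 1.0 && p.d[1] == 2.0 && p.d[2] == 3.0;
}
```

Both functions are expected to return 1 on mainstream compilers, which is the behaviour the memory model has to account for.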
What it can’t do:
Access in-memory data representation
    /* Reverses the byte order of x by going through a union. */
    unsigned int bswap (unsigned int x) {
        union { unsigned int i; char c[4]; } src, dst;
        src.i = x;
        dst.c[3] = src.c[0];
        dst.c[2] = src.c[1];
        dst.c[1] = src.c[2];
        dst.c[0] = src.c[3];
        return dst.i;
    }
What it can do:
Invariance by memory transformations
Program A:
    int x = 10; int y = 20; /* code */
x gets saved in block b1
y gets saved in block b2
Program B:
    int y = 20; int x = 10; /* same code */
x gets saved in block b2
y gets saved in block b1
With blocks named after their variables, the two programs look the same:
x gets saved in block bx
y gets saved in block by
There’s more...
What it can’t do:
Fine-grained access control
load and store always succeed (assuming correct alignment and bounds)
A store should fail on a const
A load should fail on something being written to
free always succeeds (even on a “free” block)
A free should fail on something already “free”
A free should fail on global variables
What it can do:
Type compatibility and type casting
Invariance by memory transformations
What it can’t do:
Access in-memory data representation
Fine-grained access control
All of the pros, none of the cons
alloc m l h           mem → Z → Z → mem × block
store m τ b i v       mem → memory_chunk → block → Z → val → option mem
load m τ b i          mem → memory_chunk → block → Z → option val
free m b              mem → block → mem
loadbytes m b i n     mem → block → Z → Z → option (list memval)
storebytes m b i B    mem → block → Z → list memval → option mem
drop_perm m b l h p   mem → block → Z → Z → permission → option mem
Freeable: free, store, load, compare pointers
Writable: store, load, compare pointers
Readable: load, compare pointers
Nonempty: compare pointers
perm m b i k p
mem → block → Z → perm_kind → permission → Prop
perm_kind is an enumeration (Max | Cur)
Cur permissions cannot exceed Max
Operation               Effect on permissions
alloc m l h = (m′, b)   Locations [l, h) in block b get Freeable permissions
free m b l h            Locations [l, h) in block b lose all permissions
drop_perm m b l h p     Locations [l, h) in block b get permissions p

Operation               Succeeds iff
load m τ b i            Range [i, i + |τ|) is (at least) Readable
store m τ b i v         Range [i, i + |τ|) is (at least) Writable
free m b l h            Range [l, h) is Freeable
drop_perm m b l h p     Range [l, h) is Freeable
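The two tables can be exercised on a toy block of per-offset permissions. This is a sketch under loud assumptions, not CompCert's definition: a single block of 16 offsets, one permission per offset (the Max/Cur distinction is dropped), in-place updates instead of new memory states, and hypothetical names throughout (`block_perms`, `alloc_range`, `free_range`, ...).

```c
#include <assert.h>
#include <stdbool.h>

/* Ordered so that a larger value allows everything a smaller one does. */
typedef enum { NoPerm = 0, Nonempty, Readable, Writable, Freeable } permission;

#define BLOCK_SIZE 16   /* ASSUMPTION: one block with offsets 0..15 */

typedef struct { permission perm[BLOCK_SIZE]; } block_perms;

/* True when every offset in [lo, hi) has at least permission p. */
bool range_perm(const block_perms *b, long lo, long hi, permission p) {
    if (lo < 0 || hi > BLOCK_SIZE) return false;
    for (long i = lo; i < hi; i++)
        if (b->perm[i] < p) return false;
    return true;
}

void set_range(block_perms *b, long lo, long hi, permission p) {
    assert(0 <= lo && hi <= BLOCK_SIZE);
    for (long i = lo; i < hi; i++) b->perm[i] = p;
}

/* alloc: locations [l, h) get Freeable, everything else has no permission. */
void alloc_range(block_perms *b, long l, long h) {
    set_range(b, 0, BLOCK_SIZE, NoPerm);
    set_range(b, l, h, Freeable);
}

bool load_ok(const block_perms *b, long i, long size) {
    return range_perm(b, i, i + size, Readable);
}

bool store_ok(const block_perms *b, long i, long size) {
    return range_perm(b, i, i + size, Writable);
}

/* drop_perm: succeeds iff [l, h) is Freeable, then sets it to p. */
bool drop_perm(block_perms *b, long l, long h, permission p) {
    if (!range_perm(b, l, h, Freeable)) return false;
    set_range(b, l, h, p);
    return true;
}

/* free: succeeds iff [l, h) is Freeable, then removes all permissions. */
bool free_range(block_perms *b, long l, long h) {
    if (!range_perm(b, l, h, Freeable)) return false;
    set_range(b, l, h, NoPerm);
    return true;
}
```

In this model, dropping a range to `Readable` makes later stores to it fail (the "const" case), and a second `free_range` on the same range fails because the first one removed the Freeable permission.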
low_bound and high_bound no longer necessary
In version 1:
(b, i) is valid in m ≝ low_bound m b ≤ i < high_bound m b
In version 2:
(b, i) is valid in m ≝ perm m b i Max Nonempty
loadbytes m b i n     mem → block → Z → Z → option (list memval)
storebytes m b i B    mem → block → Z → list memval → option mem
memval ∷=
| Undef
| Byte n (n is a concrete integer in 0..255)
| Pointer b i n (the n-th byte of pointer (b, i))
Again, these can both fail!
encode_val   memory_chunk → val → list memval
decode_val   memory_chunk → list memval → val
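To see how a byte-level representation makes something like bswap expressible, here is a minimal sketch of encoding and decoding for a single 32-bit integer chunk. It is not CompCert's encode_val or decode_val: the Pointer case, floats and the other chunk types are omitted, the byte order is assumed to be little-endian, and the names `encode_int32`, `decode_int32` and `bswap_val` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { MV_UNDEF, MV_BYTE } memval_kind;     /* Pointer case omitted */
typedef struct { memval_kind kind; uint8_t byte; } memval;

typedef enum { V_UNDEF, V_INT } val_kind;           /* Vfloat, Vptr omitted */
typedef struct { val_kind kind; uint32_t n; } val;

/* Encodes v as four memvals. ASSUMPTION: little-endian byte order. */
void encode_int32(val v, memval out[4]) {
    assert(v.kind == V_INT || v.kind == V_UNDEF);   /* only cases modelled */
    for (int k = 0; k < 4; k++) {
        if (v.kind == V_INT) {
            out[k].kind = MV_BYTE;
            out[k].byte = (uint8_t)(v.n >> (8 * k));
        } else {
            out[k].kind = MV_UNDEF;
            out[k].byte = 0;
        }
    }
}

/* Decodes four memvals; anything other than four concrete bytes is Vundef. */
val decode_int32(const memval in[4]) {
    val v = { V_INT, 0 };
    for (int k = 0; k < 4; k++) {
        if (in[k].kind != MV_BYTE) {
            v.kind = V_UNDEF;
            v.n = 0;
            return v;
        }
        v.n |= (uint32_t)in[k].byte << (8 * k);
    }
    return v;
}

/* The bswap example, replayed on the encoded bytes. */
val bswap_val(val x) {
    memval src[4], dst[4];
    encode_int32(x, src);
    for (int k = 0; k < 4; k++) dst[3 - k] = src[k];
    return decode_int32(dst);
}
```

Swapping an undefined value stays undefined here, because decoding refuses any byte that is not a concrete `MV_BYTE`.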
storebytes m b i B = Some m′
When something gets loaded from the right place with the right length:
loadbytes m′ b i |B| = Some B
When something gets loaded from the wrong place, it sees what was there before:
loadbytes m′ b′ i′ n′ = loadbytes m b′ i′ n′ if b′ ≠ b or i′ + n′ ≤ i or i + |B| ≤ i′
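Both properties can be observed on a plain byte array. The sketch below is a simplification with assumed details: one block of 16 offsets, raw bytes instead of memvals, no permissions, in-place updates, and a hypothetical `disjoint` helper for the side condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BLOCK_LEN 16   /* ASSUMPTION: one block with offsets 0..15 */

typedef struct { unsigned char data[BLOCK_LEN]; } block;

/* Writes n bytes at offset i; returns false (None) if they do not fit. */
bool storebytes(block *m, long i, const unsigned char *bytes, long n) {
    assert(m && bytes);
    if (i < 0 || n < 0 || i + n > BLOCK_LEN) return false;
    memcpy(m->data + i, bytes, (size_t)n);
    return true;
}

/* Reads n bytes at offset i into out; returns false (None) if out of range. */
bool loadbytes(const block *m, long i, long n, unsigned char *out) {
    assert(m && out);
    if (i < 0 || n < 0 || i + n > BLOCK_LEN) return false;
    memcpy(out, m->data + i, (size_t)n);
    return true;
}

/* Side condition within one block: [i2, i2 + n2) misses [i1, i1 + n1). */
bool disjoint(long i1, long n1, long i2, long n2) {
    return i2 + n2 <= i1 || i1 + n1 <= i2;
}
```

After storing four bytes at offset 4, reading four bytes at offset 4 returns them, and reading the disjoint range at offset 8 returns whatever was there before the store.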
What it can do:
Type compatibility and type casting
Invariance by memory transformations
Access in-memory data representation
Fine-grained access control
What it can’t do:
Nothing left from the version 1 list: access to the in-memory data representation and fine-grained access control have both moved up
    unsigned int bswap (unsigned int x) {
        union { unsigned int i; char c[4]; } src, dst;
        src.i = x;
        dst.c[3] = src.c[0];
        dst.c[2] = src.c[1];
        dst.c[1] = src.c[2];
        dst.c[0] = src.c[3];
        return dst.i;
    }
A store should fail on a const
A load should fail on something being written to
A free should fail on something already “free”
A free should fail on global variables
Because I, for one, would probably have slept through the whole thing
In the slides about “Invariance by memory transformations” there was a line that said “There’s more...”
The CompCert compiler does some weird sh*t to your program.
Local variables whose addresses are never taken will not be placed in memory
...until it becomes clear that they don’t fit into registers anymore.
Then they are “spilled” into memory.
Which requires all sorts of formalizations.
And I really don’t want to go into them anymore. I’m tired.