Extending the CompCert certified C compiler with instruction scheduling and control-flow integrity (CFI)


Extending CompCert. S. Boulmé (Verimag, Grenoble-INP), RISC-V week @ Paris’2019. Extending the CompCert certified C compiler with instruction scheduling and control-flow integrity (CFI). October 2019. Sylvain.Boulme@univ-grenoble-alpes.fr


SLIDE 1

Extending CompCert
S. Boulmé (Verimag, Grenoble-INP)
RISC-V week @ Paris’2019

Extending the CompCert certified C compiler with instruction scheduling and control-flow integrity (CFI)

October 2019
Sylvain.Boulme@univ-grenoble-alpes.fr
Verimag, Grenoble-INP

SLIDE 2

Issue : optimizing compiler for safety-critical software

Compilation bugs exist in most C compilers (GCC, LLVM, etc.), as attested by randomized differential testing: Eide-Regehr’08, Yang-et-al’11, Lidbury-et-al’15, ...
Tests of optimizing compilers cannot cover all corner cases!

Strong safety-critical requirements of

  • Avionics (DO-178), Nuclear (IEC-61513), Automotive (ISO-26262), Railway (IEC-62279)

are often established at the source level, with human review of the compiled code. ← intractable if the code is optimized

One solution: a formally proved compiler!

formal proof = computer-aided review of the compiler code w.r.t. its spec.
⇒ up-to-date & very sharp (formal) documentation of the compiler, which also helps “external developers” (like us at Verimag)

SLIDE 3

Overview of https://github.com/AbsInt/CompCert

  • Input: most of ISO C99 + a few extensions
  • Output: (32- and 64-bit) code for RISC-V, PowerPC, ARM, x86
  • Developed since 2005 by Leroy et al. at Inria
  • Commercial support since 2015 by AbsInt (German company)
  • Industrial uses in avionics (Airbus) & nuclear plants (MTU)

Unequaled level of trust for industrial-scale compilers: correctness proved within the Coq proof assistant.

Performance of generated code (for PowerPC and ARM): 2× faster than gcc -O0, 10% slower than gcc -O1, and 20% slower than gcc -O3.

Example: in MTU systems (emergency power for nuclear plants), 28% smaller WCET than with a previous unverified compiler.

SLIDE 4

Understanding the formal correctness of CompCert

Formally, correctness of compiled code is ensured modulo

  • correctness of C formal semantics in Coq
  • correctness of assembly formal semantics in Coq
  • absence of undefined behavior in the source program

Formal semantics ≃ relation between “programs” and “behaviors”, i.e. a (possibly non-deterministic) interpretation of programs

  • for C: formalization of the ISO C99 standard
  • for assembly: formalization/abstraction of the ISA

Source program assumed to be without undefined behavior:

    int x, t[10], y;
    ...
    if (...) {
      t[10] = 1; // undefined behavior: out of bounds
                 // the compiler could write in x or y,
                 // or prune the branch as dead code, ...

SLIDE 5

Informal view of CompCert formal correctness

Observable value = int, float, or address of a global variable
Trace = a sequence of external function calls (or volatile accesses), each of the form “f(v1, ..., vn) → v” where f is a name
Behavior = one of the four possible cases (of an execution):

  • an infinite trace (of a diverging execution)
  • a finite trace followed by an infinite “silent” loop
  • a finite trace followed by an integer exit code (terminating case)
  • a finite trace followed by an error (UNDEFINED-BEHAVIOR)

Semantics = maps each program to a set of behaviors.

Correctness of the compiler: for any source program S, if S has no UNDEFINED-BEHAVIOR, and if the compiler returns some assembly program C, then any behavior of C is also a behavior of S.

NB: under these conditions, C has no UNDEFINED-BEHAVIOR.
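The correctness statement of this slide can be written compactly as a behavior inclusion. The notation is introduced here only for illustration and does not come from the slides: Beh(P) is the set of behaviors of program P, Err is the set of behaviors that end in an error, and compile(S) = OK(C) means that the compiler succeeds on S and returns C.

```latex
\forall S\; C,\quad
\mathrm{compile}(S) = \mathrm{OK}(C)
\;\wedge\; \mathrm{Beh}(S) \cap \mathrm{Err} = \emptyset
\;\Longrightarrow\;
\mathrm{Beh}(C) \subseteq \mathrm{Beh}(S)
```

The NB follows directly: under the same hypotheses, Beh(C) ∩ Err ⊆ Beh(S) ∩ Err = ∅.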

SLIDE 6

Modular design of CompCert in Coq

Components independent/parametrized/specific w.r.t. the target

[Figure: chain of intermediate languages and compilation passes]

  • CompCert C → Clight: side-effects out of expressions
  • Clight → C#minor: type elimination, loop simplification
  • C#minor → Cminor: stack allocation of variables
  • Cminor → CminorSel: instruction selection
  • CminorSel → RTL: CFG construction
  • RTL → RTL: CFG optimizations
  • RTL → LTL: register allocation
  • LTL → LTL: branch tunneling
  • LTL → Linear: linearization of CFG
  • Linear → Mach: layout of stackframes
  • Mach → Asm: assembly code generation

And now, Verimag’s Mach → Asm for two targets

  1. The “K1c” VLIW core of Kalray: the 1st (scaling) certified compiler that optimizes ILP?
  2. A variant of RISC-V with encryption and CFI.

SLIDE 7

Instruction scheduling for CompCert/Kalray’s K1c

Joint work with C. Six (Kalray/Verimag) and D. Monniaux (CNRS/Verimag)

Kalray’s K1c = a 6-issue VLIW with a 7-stage pipeline, i.e. with instruction-level parallelism (ILP) in 2D: bundles of (up to) 6 instructions may run in parallel at each of the 7 pipeline stages.

It has a very predictable semantics: in-order & interlocked. This simplifies WCET estimation & compiler design!

Two main contributions of our CompCert backend

  1. an (abstract) formal semantics of VLIW assembly expressing parallel execution of instructions within bundles
  2. a certified instruction scheduler performing assembly optimization w.r.t. the 2D of ILP

Result: a speedup of more than 50% on the code generated by CompCert, coming around 10% slower than GCC -O2 (Kalray’s backend) & generally 20% faster than GCC -O1 (without scheduling).
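To make “parallel execution of instructions within bundles” concrete, here is a minimal C sketch of a toy machine. It is not the formal semantics of the CompCert backend: the `Instr` type, the add-only instruction set and the “all reads happen before all writes” rule are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NREGS 4
#define MAX_BUNDLE 6  /* the K1c issues up to 6 instructions per bundle */

/* Hypothetical toy instruction: regs[dst] := regs[src1] + regs[src2]. */
typedef struct { int dst, src1, src2; } Instr;

/* Sequential semantics: each instruction sees the writes of the previous ones. */
void run_sequential(int regs[NREGS], const Instr *code, size_t n) {
    for (size_t i = 0; i < n; i++)
        regs[code[i].dst] = regs[code[i].src1] + regs[code[i].src2];
}

/* Bundle semantics (assumed for this sketch): every instruction of the
 * bundle reads the register state as it was on bundle entry, and all
 * writes are committed afterwards. */
void run_bundle(int regs[NREGS], const Instr *bundle, size_t n) {
    int results[MAX_BUNDLE];
    assert(n <= MAX_BUNDLE);
    for (size_t i = 0; i < n; i++)          /* read phase */
        results[i] = regs[bundle[i].src1] + regs[bundle[i].src2];
    for (size_t i = 0; i < n; i++)          /* write phase */
        regs[bundle[i].dst] = results[i];
}
```

With r0 = 0, the pair “r1 := r2 + r0; r2 := r1 + r0” swaps r1 and r2 when run as one bundle, whereas run sequentially it copies r2 into both registers. A scheduler that packs instructions into bundles therefore has to respect such dependencies.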

SLIDE 8

Issue : CompCert and Control-Flow Integrity (CFI) ?

    status pay(float amount, id client, id vendor) {
      if (auth(client)) goto transaction;
      return ABORTED;
    transaction:
      /* perform the transaction */

CompCert’s formal correctness implies that the generated assembly cannot run code at transaction without being entered “normally” in function pay, under the two following conditions:

  ◮ no undefined behavior in the source (e.g. no buffer overflow)
  ◮ trustworthy runtime environment (e.g. no hardware attack)

These are very restrictive conditions w.r.t. practice!

SLIDE 9

Overview of CFI in CEA’s IntrinSec

Work of O. Savry and his team at CEA-LETI

CEA’s IntrinSec = a RISC-V variant (still under design)

  • with code/data encryption
  • with CF & data access-control

Control-Flow Integrity (in an adversarial context) is provided by access-control on both

  • the CF: ensuring that CF cannot “enter into functions” except at function entry + return-address (RA) from callees
  • the stack: ensuring that only “authorized instructions” can modify RA in the stack (e.g. no buffer overflows).

Actually, the processor aborts to prevent insecure behaviors: buffer overflows can modify RA on the stack, but then the processor aborts on the load into the RA register.
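The last point (an overflow may overwrite RA on the stack, but the later load into the RA register aborts) can be illustrated with a toy model in C. The slides do not describe how IntrinSec implements this check, so everything below is hypothetical: the per-word `authorized` bit and the function names `store_data`, `store_ra` and `load_ra` are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical model: each stack word carries one bit telling whether
 * it was last written by an "authorized instruction". */
typedef struct {
    uint32_t word[STACK_WORDS];
    bool     authorized[STACK_WORDS];
} Stack;

/* Ordinary store (e.g. a buffer write): always allowed, clears the bit. */
void store_data(Stack *s, size_t slot, uint32_t v) {
    assert(slot < STACK_WORDS);
    s->word[slot] = v;
    s->authorized[slot] = false;
}

/* Authorized store, as a function prologue would use to save RA. */
void store_ra(Stack *s, size_t slot, uint32_t ra) {
    assert(slot < STACK_WORDS);
    s->word[slot] = ra;
    s->authorized[slot] = true;
}

/* Load into the RA register. Returns false (modelling a processor abort)
 * and leaves *ra untouched if the slot was last written by an ordinary store. */
bool load_ra(const Stack *s, size_t slot, uint32_t *ra) {
    assert(slot < STACK_WORDS);
    if (!s->authorized[slot])
        return false;
    *ra = s->word[slot];
    return true;
}
```

In this model, a loop of `store_data` calls that runs past the end of a buffer and into the RA slot succeeds silently, but the next `load_ra` on that slot returns false instead of handing the corrupted address to the caller.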

SLIDE 10

CF access control for CompCert/CEA’s IntrinSec

Joint work with P. Torrini (Verimag) & hints from M.L. Potet (Verimag) and O. Savry (LETI)

Our contributions

  ◮ Extend CompCert’s RISC-V model with IntrinSec’s CF access-control instructions
  ◮ Make CompCert generate CF access-control instructions
  ◮ Formally prove the compiler correctness (work in progress)

Future work

  ◮ Support for data access-control
  ◮ Informal CFI properties of the platform
  ◮ Toward a formalization of some CFI properties?

Issue: are CompCert’s models too high-level for expressing attacks?

SLIDE 11

Conclusions

CompCert = a moderately-optimizing C compiler with an unprecedented level of trust in its correctness

“CompCert is the only compiler we have tested for which Csmith cannot find wrong-code errors. This is not for lack of trying : we have devoted about six CPU-years to the task. [. . .] developing compiler optimizations within a proof framework [. . .] has tangible benefits for compiler users.” Yang-et-al’11 (from randomized differential testing)

CompCert is ready to be included in chip codesign, but in parallel with a traditional compiler!

Cons: some features could still be hard to support in CompCert.
Pros: formal feedback on the ISA (semantics & compilation process).
⇒ Convergence with the RISC-V community on safety, security, embedded systems, etc.

SLIDE 12

Appendix (offline slides)

Topics

  • The Coq proof assistant
  • Trust in ELF binaries produced with CompCert
  • More details on the CompCert/Kalray’s K1c

SLIDE 13

The Coq proof assistant

A language to formalize mathematical theories (and their proofs) with a computer. Examples :

  • Four-color & Odd-order theorems by Gonthier-et-al.
  • Univalence theory by Voevodsky (Fields Medalist).

With a high level of confidence:

  • Logic reduced to a few powerful constructs
  • Proofs checked by a small verifiable kernel
  • Consistency-by-construction of most user theories (promotes definitions instead of axioms)

ACM Software System Award in 2013 for Coquand, Huet, Paulin-Mohring et al.

SLIDE 14

Formally proved programs in the Coq proof assistant

The Coq logic includes a functional programming language with pattern-matching on tree-like data-structures.

Example : inserting a key x in a balanced binary tree t

    Fixpoint add (x:key) (t:avltree) : avltree :=
      match t with
      | Leaf => Node 1 Leaf x Leaf
      | Node h l y r =>
        match Key.compare x y with
        | Lt => bal (add x l) y r
        | Eq => Node h l y r
        | Gt => bal l y (add x r)
        end
      end.

Extraction of Coq functions to OCaml + OCaml compilation to produce native code. ⇒ CompCert is programmed in both Coq and OCaml.

SLIDE 15

Trust in ELF binaries produced with CompCert

Trust in binaries requires additional verifications, at least:

  ◮ absence of undefined behavior in C code (e.g. with Astrée)
  ◮ correctness of assembling/linking (e.g. with Valex)

Qualification of MTU development chain for Nuclear safety, from Käster, Barrho et al. @ERTS’18

SLIDE 16

Highly-modular certified postpass scheduler in CompCert

using “untrusted-oracle / certified-verifier” architecture

Scheduling is computed by an untrusted oracle

    basic-block B → inequality system --(solver)--> solution → bundle-list lb

and dynamically verified (using symbolic evaluation of basic-blocks).

[Figure: architecture diagram. In Coq (trusted): the PostpassScheduling module, mapping an Asmblock program to an AsmVLIW program or an error, and the AbstractBasicBlock verifiers. In OCaml (untrusted): the scheduling oracle and hash-consing. The module sends each basic block B to the oracle, receives a bundle list lb, and passes (B, lb) to the verifiers, which answer OK/Error.]

The solver is:

  ◮ by default, a greedy list scheduler (fast & near optimal)
  ◮ or an ILP (integer linear programming) solver (optimal but very slow on some entries)
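The “untrusted-oracle / certified-verifier” split can be sketched in C on a toy instruction set. This is an illustration under stated assumptions, not the Coq/OCaml code of the backend. The `Instr` type, the add-only instructions, the read-before-write bundle semantics, the naive `greedy_oracle` (which packs without reordering, so it is simpler than a list scheduler) and the linear-search `intern` (standing in for real hash-consing) are all invented here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NREGS 4
#define MAX_TERMS 64
#define MAX_BUNDLE 6

/* Hypothetical toy instruction: regs[dst] := regs[src1] + regs[src2]. */
typedef struct { int dst, src1, src2; } Instr;

/* Symbolic term: initial value of register lhs (rhs == -1), or the sum
 * of the two earlier terms with ids lhs and rhs. */
typedef struct { int lhs, rhs; } Term;
typedef struct { Term terms[MAX_TERMS]; int count; } TermTable;

/* Interning (hash-consing done by linear search): structurally equal
 * terms get the same id, so term equality is integer equality. */
static int intern(TermTable *tt, int lhs, int rhs) {
    for (int i = 0; i < tt->count; i++)
        if (tt->terms[i].lhs == lhs && tt->terms[i].rhs == rhs) return i;
    assert(tt->count < MAX_TERMS);
    tt->terms[tt->count] = (Term){lhs, rhs};
    return tt->count++;
}

/* Symbolic execution of one bundle: all reads before all writes. */
static void sym_bundle(TermTable *tt, int state[NREGS], const Instr *b, size_t n) {
    int results[MAX_BUNDLE];
    assert(n <= MAX_BUNDLE);
    for (size_t i = 0; i < n; i++)
        results[i] = intern(tt, state[b[i].src1], state[b[i].src2]);
    for (size_t i = 0; i < n; i++)
        state[b[i].dst] = results[i];
}

/* Untrusted oracle: keeps the original order and starts a new bundle when
 * the next instruction reads or writes a register written earlier in the
 * current bundle. Fills sizes[] and returns the number of bundles. */
size_t greedy_oracle(const Instr *block, size_t n, size_t *sizes) {
    size_t nbundles = 0, start = 0;
    for (size_t i = 0; i < n; i++) {
        bool conflict = (i - start) >= MAX_BUNDLE;
        for (size_t j = start; j < i && !conflict; j++)
            conflict = block[j].dst == block[i].src1 ||
                       block[j].dst == block[i].src2 ||
                       block[j].dst == block[i].dst;
        if (conflict) { sizes[nbundles++] = i - start; start = i; }
    }
    if (n > start) sizes[nbundles++] = n - start;
    return nbundles;
}

/* Verifier: accepts the bundled schedule only if it yields the same
 * symbolic value in every register as the sequential block.
 * sizes[k] is the length of bundle k; the sizes must sum to the
 * number of instructions in sched. */
bool verify_schedule(const Instr *block, size_t n, const Instr *sched,
                     const size_t *sizes, size_t nbundles) {
    TermTable tt = { .count = 0 };
    int ref[NREGS], got[NREGS];
    for (int r = 0; r < NREGS; r++) ref[r] = got[r] = intern(&tt, r, -1);
    for (size_t i = 0; i < n; i++) sym_bundle(&tt, ref, &block[i], 1);
    for (size_t k = 0; k < nbundles; k++) {
        if (sizes[k] > MAX_BUNDLE) return false;
        sym_bundle(&tt, got, sched, sizes[k]);
        sched += sizes[k];
    }
    for (int r = 0; r < NREGS; r++)
        if (ref[r] != got[r]) return false;
    return true;
}
```

The point of the design is that only `verify_schedule` needs to be trusted. A buggy or overly aggressive oracle, for instance one that packs a dependent instruction into the same bundle as its producer, is rejected at compile time instead of producing wrong code.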

SLIDE 17

Compile-times (greedy list scheduler + its verifier)

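[Figure: log-log plot of compile time (“Time x1000 (s)”, from 10^-2 to 10^0) against the size of basic blocks (from 10^0 to 10^2), with one series for the verifier, one for the oracle, and a reference line of slope 1.]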

SLIDE 18

Speedup due to our scheduler in CompCert

[Figure: bar chart, one group of bars per benchmark, on a scale from 0% to 100% labeled “fastest”, comparing the configurations ccomp, ccomp nobundle, ccomp pack and ccomp noif. Benchmarks: binary_search fill, binary_search search, bitsliced-aes, bitsliced-tea, complex_mat, float_mat, float_matv2, glibc_qsort, heapsort, idea, ntt, quicksort, sha-256, lift, Heptagon-radiotrans, Lustrev4-heater, Lustrev6-convertible, xor_and_mat.]

SLIDE 19

CompCert vs GCC on the Kalray-K1c

[Figure: bar chart, one group of bars per benchmark, on a scale from 0% to 100% labeled “fastest”, comparing ccomp with gcc o3, gcc o2, gcc o1 and gcc o0. Benchmarks: binary_search fill, binary_search search, bitsliced-aes, bitsliced-tea, complex_mat, float_mat, float_matv2, glibc_qsort, heapsort, idea, ntt, quicksort, sha-256, lift, Heptagon-radiotrans, Lustrev4-heater, Lustrev6-convertible, xor_and_mat.]
