
A concrete memory model for CompCert: Frédéric Besson, Sandrine Blazy, Pierre Wilke



  1. A concrete memory model for CompCert
     Frédéric Besson, Sandrine Blazy, Pierre Wilke
     Rennes, France
     P. Wilke, A concrete memory model for CompCert, 1 / 28


  6. CompCert
     • real-world C to ASM compiler used in industry (commercialised by AbsInt)
     • proven correct in Coq: it does not introduce bugs!
     [Diagram: compilation chain C → Clight → Cminor → RTL → ASM, with a single memory model shared by all languages]
     Each language has a formal semantics, i.e. a mathematical meaning for programs.
     Proof of semantic preservation: for every source program S that has a defined semantics, if the compiler succeeds in generating a target program T, then T has the same behavior as S.

  7. Goal: Make the semantics of C more defined
     Why did C leave some behaviors undefined?
     • Portability
     • Performance
     Why do we want to make it more defined?
     • real-life programs use features that are undefined according to the C standard
     • the compilation theorem will be more useful, since it only covers programs with a defined semantics
     What kind of undefined behaviors do we aim at?
     • undefined pointer arithmetic, e.g. bitwise operators
     • use of uninitialised memory
     Our starting point: CompCert


  11. An example of a low-level C program in CompCert
        int main(){
          int *p = (int *) malloc(sizeof(int));
          *p = 42;
          int *q = p | 5;
          int *r = (q >> 3) << 3;
          return *r;
        }
      [Memory diagram: p holds (b, 0); block b holds 42; q and r are not filled in]
      Bitwise operators on pointers are undefined behavior!
      • CompCert [JAR’09], KCC [POPL’12], Krebbers [POPL’14], Norrish [PhD’98]: undefined behavior
      • Kang et al. [PLDI’15]: don’t model bitwise operators

  12. Contributions
      • Previous work [APLAS’14]: a memory model for low-level programs
      • This work:
        • integration of the memory model inside CompCert
        • correctness proofs of the memory model
        • correctness proofs of the transformations of the frontend (up to Cminor)

  13. Outline
      1 CompCert’s memory model
      2 New features of the memory model
      3 Consistency of the memory models
      4 CompCert proof: Overview
      5 Conclusion



  16. New features of the memory model
      Symbolic expressions
        val ::= i | (b, o)        (not expressive enough)
      We change the semantic domain to:
        expr ::= val | op₁ expr | expr op₂ expr
      Alignment constraints
      We need information about some bits of the concrete address of a pointer. The alloc primitive takes an extra parameter mask, such that:
        A(b) & mask = A(b)
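A minimal Python sketch of these two features, under stated assumptions: the class names `Int`, `Ptr`, `Unop`, `Binop` and the helpers `align_mask` and `respects_mask` are hypothetical illustrations, not CompCert's Coq definitions, and addresses are assumed to be 32 bits wide. It encodes the grammar of symbolic expressions as a small tree and checks the alignment condition A(b) & mask = A(b).

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical encoding of: expr ::= val | op1 expr | expr op2 expr

@dataclass(frozen=True)
class Int:            # val: a machine integer i
    value: int

@dataclass(frozen=True)
class Ptr:            # val: a pointer (b, o), block name plus offset
    block: str
    offset: int

@dataclass(frozen=True)
class Unop:           # op1 expr
    op: str
    arg: "Expr"

@dataclass(frozen=True)
class Binop:          # expr op2 expr
    op: str
    left: "Expr"
    right: "Expr"

Expr = Union[Int, Ptr, Unop, Binop]

WORD = 0xFFFFFFFF     # assumption: 32-bit addresses


def align_mask(alignment: int) -> int:
    """Mask whose low bits are cleared, for a power-of-two alignment."""
    return WORD & ~(alignment - 1)


def respects_mask(address: int, mask: int) -> bool:
    """The condition from the slide: A(b) & mask = A(b)."""
    return (address & mask) == address


# The symbolic value stored in q by `int *q = p | 5;`
q = Binop("|", Ptr("b", 0), Int(5))
mask8 = align_mask(8)
```

With `mask8`, an address such as 8 satisfies the condition while 12 does not, which is the kind of bit-level information the normalisation later relies on.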

  17. Interaction with the memory model
      What is the semantics of reading from memory: *p ?
      In CompCert, p is evaluated into a pointer (b, i), then we can use load(M, b, i).
      In our model, p is a symbolic expression. It needs to be transformed into a pointer so that we can use load.
        normalise : mem → expr → ⌊val⌋
      We need to modify the semantics to include calls to normalise at:
      • memory accesses (load and store)
      • conditional branches


  21. Back to the example
        int main(){
          int *p = (int *) malloc(sizeof(int));
          *p = 42;
          int *q = p | 5;
          int *r = (q >> 3) << 3;
          return *r;
        }
      [Memory diagram: p holds (b, 0); block b, labelled 8, holds 42; q holds (b, 0) | 5; r holds (((b, 0) | 5) ≫ 3) ≪ 3, which normalise maps to (b, 0)]


  23. Normalisation specification: concrete memories
      A concrete memory cm_i of an abstract memory m (written cm_i ⊢ m) must satisfy:
      • range: ]0; 55[
      • no overlap
      • alignment
      [Figure: an abstract memory m whose cells contain (b₂, 2) and 5, and six concrete memories cm₁ to cm₆ of m, drawn on an address axis from 0 to 56 in steps of 8]
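The three conditions can be sketched as a brute-force enumeration in Python. Everything concrete here is an assumption made for illustration: the function names, the two 8-byte blocks `b` and `b'` with 8-byte alignment, and the reading of the range condition as "every byte of every block lies strictly between 0 and `limit`".

```python
from itertools import combinations, product


def is_concrete_memory(cm, blocks, limit):
    """Check range, alignment and no-overlap for one candidate layout.

    cm:     dict block name -> base address
    blocks: dict block name -> (size in bytes, power-of-two alignment)
    """
    for name, (size, align) in blocks.items():
        base = cm[name]
        if not (0 < base and base + size - 1 < limit):    # range
            return False
        if base % align != 0:                             # alignment
            return False
    for x, y in combinations(blocks, 2):                  # no overlap
        x_end = cm[x] + blocks[x][0]
        y_end = cm[y] + blocks[y][0]
        if cm[x] < y_end and cm[y] < x_end:
            return False
    return True


def concrete_memories(blocks, limit):
    """Yield every layout of the blocks that meets the three conditions."""
    names = list(blocks)
    for bases in product(range(limit), repeat=len(names)):
        cm = dict(zip(names, bases))
        if is_concrete_memory(cm, blocks, limit):
            yield cm


# Hypothetical abstract memory: two 8-byte blocks, both 8-byte aligned.
blocks = {"b": (8, 8), "b'": (8, 8)}
layouts = list(concrete_memories(blocks, limit=55))
print(len(layouts), layouts[0])
```

Enumerating all layouts is only feasible for toy address spaces like this one; it is meant to make the quantification over concrete memories tangible, not to suggest how the Coq development reasons about them.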

  24. Normalisation: example 1
      e = (((b, 0) | 5) ≫ 3) ≪ 3
      [Figure: six concrete memories on an address axis from 0 to 56, with ⟦(b, o)⟧_cm₁ = 8, ⟦(b, o)⟧_cm₂ = 8, ⟦(b, o)⟧_cm₃ = 16, ⟦(b, o)⟧_cm₄ = 24, ⟦(b, o)⟧_cm₅ = 32, ⟦(b, o)⟧_cm₆ = 32]
      ⟦e⟧_cm₁ = (((cm₁(b) + 0) | 5) ≫ 3) ≪ 3
              = ((8 | 5) ≫ 3) ≪ 3
              = ((0b1000 | 5) ≫ 3) ≪ 3
              = (0b1101 ≫ 3) ≪ 3
              = 0b0001 ≪ 3
              = 0b1000 = 8 = cm₁(b)
      ∀ i, ⟦e⟧_cmᵢ = cmᵢ(b), hence e normalises into (b, 0).
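The computation for cm₁ can be replayed for all six concrete memories with a short Python sketch. The tuple encoding of expressions and the `evaluate` function are hypothetical, and 32-bit words are assumed; the six addresses of b are the ones read off the slide.

```python
MASK32 = 0xFFFFFFFF   # assumption: 32-bit machine words


def evaluate(e, cm):
    """Evaluate expression e in the concrete memory cm (block -> address).

    Expressions are tuples: ("int", i), ("ptr", b, o) or (op, left, right).
    """
    tag = e[0]
    if tag == "int":
        return e[1] & MASK32
    if tag == "ptr":                      # (b, o) denotes cm(b) + o
        return (cm[e[1]] + e[2]) & MASK32
    left, right = evaluate(e[1], cm), evaluate(e[2], cm)
    if tag == "|":
        return left | right
    if tag == ">>":
        return left >> right
    if tag == "<<":
        return (left << right) & MASK32
    raise ValueError(f"unsupported operator {tag!r}")


p = ("ptr", "b", 0)
e = ("<<", (">>", ("|", p, ("int", 5)), ("int", 3)), ("int", 3))

# Address of b in cm1 .. cm6, as given on the slide.
cms = [{"b": a} for a in (8, 8, 16, 24, 32, 32)]

# e and (b, 0) denote the same number in every concrete memory.
agrees = all(evaluate(e, cm) == evaluate(p, cm) for cm in cms)
print(agrees)
```

The equality depends on the alignment constraint: for a hypothetical misaligned address such as 12, `evaluate(e, {"b": 12})` yields 8 rather than 12, so e would no longer denote the same address as (b, 0).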

  25. Normalisation: example 2
      e = (b, 0) > (b′, 0)
      [Figure: the six concrete memories on an address axis from 0 to 56]
      • cm₁: true
      • cm₂: true
      • cm₃: true
      • cm₄: false
      • cm₅: false
      • cm₆: false
      There is no v such that ∀ i, ⟦e⟧_cmᵢ = ⟦v⟧_cmᵢ, hence e doesn’t normalise.
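A brute-force Python sketch of this specification, under stated assumptions: `normalise` here takes an explicit list of concrete memories instead of an abstract memory, the six layouts of b and b′ are invented so that the comparison is true in three and false in three, and only integers and pointers into the listed blocks are tried as candidate values. It is not the algorithm used in the Coq development.

```python
MASK32 = 0xFFFFFFFF   # assumption: 32-bit machine words


def evaluate(e, cm):
    """Evaluate ("int", i), ("ptr", b, o) or (">", left, right) in cm."""
    tag = e[0]
    if tag == "int":
        return e[1] & MASK32
    if tag == "ptr":
        return (cm[e[1]] + e[2]) & MASK32
    if tag == ">":                        # unsigned comparison, 1 or 0
        return int(evaluate(e[1], cm) > evaluate(e[2], cm))
    raise ValueError(f"unsupported operator {tag!r}")


def normalise(e, cms):
    """Return a value v with eval(e) == eval(v) in every cm, else None."""
    results = [evaluate(e, cm) for cm in cms]
    if len(set(results)) == 1:            # same integer everywhere
        return ("int", results[0])
    for b in cms[0]:                      # same offset from some block b
        offsets = {(r - cm[b]) & MASK32 for r, cm in zip(results, cms)}
        if len(offsets) == 1:
            return ("ptr", b, offsets.pop())
    return None                           # no single value fits


# Hypothetical layouts: b above b' in the first three, below in the rest.
cms = [
    {"b": 16, "b'": 8}, {"b": 24, "b'": 8}, {"b": 24, "b'": 16},
    {"b": 8, "b'": 16}, {"b": 8, "b'": 24}, {"b": 16, "b'": 24},
]

cross_block = (">", ("ptr", "b", 0), ("ptr", "b'", 0))
same_block = (">", ("ptr", "b", 4), ("ptr", "b", 0))

print(normalise(cross_block, cms))   # depends on the layout
print(normalise(same_block, cms))    # independent of the layout
```

The contrast between the two comparisons is the point: within one block the result is the same in every layout, so a value exists, whereas across blocks the result depends on where the allocator happened to place them.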

  26. CompCert with symbolic expressions
        expr ::= val | op₁ expr | expr op₂ expr
      [Diagram: memory blocks b₁, b₂ and b₃ whose cells hold values such as 0, (b₂, 2), 5 and 7 and now also symbolic expressions such as (b, o) | 5; below, the memory model shared by C, Clight, Cminor, RTL and ASM, each language marked with an S]


  28. How does our model compare to CompCert?
      [Figure: two panels, "Behaviors in CompCert" and "Behaviors with symbolic expressions"]
      We are an extension of CompCert.

  29. How does our model compare to CompCert?
      Formally,
        Lemma expr_add_ok : ∀ v₁ v₂ m v,
          sem_add v₁ v₂ m = ⌊v⌋ →
          ∃ e, sem_add_expr v₁ v₂ m = ⌊e⌋ ∧ normalise m e = v.
      If the addition of v₁ and v₂ succeeds in CompCert, then it should succeed in our model as well, and the expression we compute should normalise into the same value.
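The statement of the lemma can be spot-checked on a few values with a Python sketch. This is a test on samples, not a proof, and it rests on assumptions: `sem_add` and `sem_add_expr` below are simplified stand-ins that reuse the names from the slide (the memory argument m is replaced by an explicit list of hypothetical concrete memories), values are encoded as tuples, and words are 32 bits wide.

```python
MASK32 = 0xFFFFFFFF   # assumption: 32-bit machine words


def sem_add(v1, v2):
    """Value-level addition (assumed shape): int+int or pointer+int."""
    if v1[0] == "int" and v2[0] == "int":
        return ("int", (v1[1] + v2[1]) & MASK32)
    if v1[0] == "ptr" and v2[0] == "int":
        return ("ptr", v1[1], (v1[2] + v2[1]) & MASK32)
    if v1[0] == "int" and v2[0] == "ptr":
        return ("ptr", v2[1], (v2[2] + v1[1]) & MASK32)
    return None                           # pointer + pointer: undefined


def sem_add_expr(v1, v2):
    """Symbolic addition: always succeeds and builds an expression."""
    return ("+", v1, v2)


def evaluate(e, cm):
    if e[0] == "int":
        return e[1] & MASK32
    if e[0] == "ptr":
        return (cm[e[1]] + e[2]) & MASK32
    return (evaluate(e[1], cm) + evaluate(e[2], cm)) & MASK32   # "+"


def normalise(e, cms):
    """Brute-force search for a value equal to e in every cm."""
    results = [evaluate(e, cm) for cm in cms]
    if len(set(results)) == 1:
        return ("int", results[0])
    for b in cms[0]:
        offsets = {(r - cm[b]) & MASK32 for r, cm in zip(results, cms)}
        if len(offsets) == 1:
            return ("ptr", b, offsets.pop())
    return None


def add_ok(v1, v2, cms):
    """If sem_add succeeds with v, the symbolic sum normalises to v."""
    v = sem_add(v1, v2)
    return v is None or normalise(sem_add_expr(v1, v2), cms) == v


# Hypothetical concrete memories for two blocks b and b'.
cms = [{"b": 8, "b'": 16}, {"b": 16, "b'": 8}, {"b": 24, "b'": 40}]
samples = [("int", 3), ("int", 4), ("ptr", "b", 4), ("ptr", "b'", 0)]
print(all(add_ok(v1, v2, cms) for v1 in samples for v2 in samples))
```

The pointer-plus-pointer pairs pass vacuously, mirroring the premise of the lemma: it only constrains the cases where the original semantics is defined.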

  30. Discovery of bugs
      Two cases where our model disagrees with CompCert:
      • Bug in CompCert 2.4: pointer comparison to NULL (fixed in CompCert 2.5)
      • Bug in our model: incorrect handling of pointers one past the end
