affect all software miscompilation bug

  1. affect ALL software

  2. Miscompilation Bug

       int a, c, d, e = 1, f;
       int fn1 () {
         int h;
         for (; d < 1; d = e) {
           h = (f == 0) ? 0 : 1 % f;
           if (f < 1) c = 0;
           else if (h) break;
         }
       }
       int main () { fn1 (); return 0; }

       $ gcc -O0 test.c ; ./a.out
       $ gcc -O2 test.c ; ./a.out
       Floating point exception (core dumped)

     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61383

  3. Crashing Bug

       int a;
       struct S0 { int f0; int f1; int f2; };
       void fn1 () {
         int b = -1;
         struct S0 f[1];
         if (a) {
           f[0] = f[b];
         }
       }
       int main () { fn1 (); return 0; }

       $ clang -O0 test.c
       $ clang -O1 test.c
       clang: Assertion failed.
       clang: error: Aborted (core dumped)

     https://llvm.org/bugs/show_bug.cgi?id=18615

  4. • Generate valid test programs
         • No undefined behavior
     • Determine the semantics of test programs
         • No reference compilers

  5. Equivalence Modulo Inputs (EMI)* generates valid, "equivalent" programs from existing programs
     (*) V. Le, M. Afshari, and Z. Su. Compiler validation via equivalence modulo inputs. PLDI '14

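  6. [Diagram: program P and input I]

  7. [Diagram: running program P on input I yields output O and splits P's code into executed and unexecuted parts]

  8. [Diagram: EMI derives variants of P; on input I each variant yields output O]

  9. [Diagram: the EMI variants are equivalent w.r.t. I; each yields output O]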
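The idea of "equivalent w.r.t. I" can be illustrated with a minimal sketch in C. The functions `p_original`, `p_variant`, and `agree_on`, and the input value 5, are hypothetical and not taken from the slides; the variant is obtained by deleting a statement that input 5 never executes.

```c
#include <assert.h>

/* Hypothetical program P. On input I = 5 the branch body below is unexecuted. */
int p_original(int x) {
    int r = x * 2;
    if (x < 0) {
        r = -r;              /* never reached when x == 5 */
    }
    return r;
}

/* EMI variant of P: the statement that was unexecuted on I has been deleted. */
int p_variant(int x) {
    int r = x * 2;
    if (x < 0) {
        /* statement deleted by the mutation */
    }
    return r;
}

/* Returns 1 when the original and the variant produce the same output. */
int agree_on(int input) {
    return p_original(input) == p_variant(input);
}

/* The two programs must agree on I = 5; other inputs carry no such guarantee. */
void check_equivalence_modulo_input(void) {
    assert(agree_on(5));
}
```

In an actual EMI setup the two programs would be compiled by the compiler under test and run on I; differing outputs on I would then point to a compiler bug, whereas a difference on another input (such as -3 here) would not.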

  10. Naïve EMI Instantiation (*)
      • Randomly removes unexecuted code
      • Limitation
          • Limited number of variants
          • Limited control- and data-flow diversity
          • Random generation
      (*) V. Le, M. Afshari, and Z. Su. Compiler validation via equivalence modulo inputs. PLDI '14

  11. • Better mutation: deletion + injection
          • Generates unlimited and diverse variants
      • Guided generation: MCMC sampling
          • Exposes deep compiler bugs
      MCMC: Markov Chain Monte Carlo

  12. What to Inject?
      A stmt-extractor turns existing code into <context, statement> pairs.
      Context: conditions to apply a statement
          • Used variables, functions, types, goto labels
          • Other properties (e.g., the insertion location must be in a loop)

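  13. How to Inject?
      A <context, statement> pair <σ_s, s> can be injected at a program location with context σ when σ ⊨ σ_s, i.e., when the location satisfies the conditions the statement requires. The variant still maps input I to output O.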
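A minimal sketch of the check σ ⊨ σ_s, assuming a context records only a "must be in a loop" flag and the types of the variables a statement uses. The struct layouts and the names `stmt_context`, `location_context`, `context_satisfies`, and `can_inject_break` are hypothetical; the slides do not show how Athena represents contexts.

```c
#include <assert.h>
#include <string.h>

#define MAX_VARS 4

/* Hypothetical context of a database statement (sigma_s). */
struct stmt_context {
    int requires_loop;              /* 1 if the statement must sit inside a loop */
    int n_types;                    /* number of placeholder variables */
    const char *types[MAX_VARS];    /* required type of each placeholder */
};

/* Hypothetical context of an insertion location (sigma). */
struct location_context {
    int in_loop;                    /* 1 if the location is inside a loop */
    int n_vars;                     /* number of variables in scope */
    const char *var_types[MAX_VARS];
};

/* Returns 1 when the location offers everything the statement requires.
   Simplification: one in-scope variable may serve several placeholders. */
int context_satisfies(const struct location_context *loc,
                      const struct stmt_context *need) {
    if (need->requires_loop && !loc->in_loop)
        return 0;
    for (int i = 0; i < need->n_types; i++) {
        int found = 0;
        for (int j = 0; j < loc->n_vars; j++) {
            if (strcmp(need->types[i], loc->var_types[j]) == 0) {
                found = 1;
                break;
            }
        }
        if (!found)
            return 0;
    }
    return 1;
}

/* Example: may "if (i) break;" (needs a loop and one int) be injected at a
   location that has a single variable of the given type in scope? */
int can_inject_break(int in_loop, const char *var_type) {
    struct stmt_context need = { 1, 1, { "int" } };
    struct location_context loc = { in_loop, 1, { var_type } };
    return context_satisfies(&loc, &need);
}
```

A real implementation would also have to cover functions, goto labels, and renaming of the placeholders to concrete variables, which this sketch leaves out.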

  14. [Diagram: input I, candidate variants marked "?", output O]

  15. Optimization problem. Goal: generate more diverse variants

  16. Program Distance
      $\Delta(P, Q) = \alpha \cdot d(P_{Nodes}, Q_{Nodes}) + \beta \cdot d(P_{Edges}, Q_{Edges}) - \gamma \cdot |P - Q|$
      where $d(A, B) = 1 - \frac{|A \cap B|}{|A \cup B|}$ is the Jaccard distance
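The program distance can be sketched in C under a few assumptions that are not in the slides: node and edge sets are encoded as bitmasks (bit i set means element i is present), |P - Q| is read as the absolute difference of the two program sizes, and two empty sets are treated as identical. The names `jaccard_distance`, `program_distance`, and the `*_w` weight parameters are hypothetical.

```c
#include <assert.h>

/* Number of set bits, i.e. the cardinality of a bitmask-encoded set. */
static int count_bits(unsigned x) {
    int n = 0;
    while (x) {
        n += (int)(x & 1u);
        x >>= 1;
    }
    return n;
}

/* Jaccard distance d(A, B) = 1 - |A n B| / |A u B|. */
double jaccard_distance(unsigned a, unsigned b) {
    int uni = count_bits(a | b);
    if (uni == 0)
        return 0.0;          /* assumption: two empty sets have distance 0 */
    return 1.0 - (double)count_bits(a & b) / uni;
}

/* Delta(P, Q) = alpha * d(nodes) + beta * d(edges) - gamma * |size difference|. */
double program_distance(unsigned p_nodes, unsigned q_nodes,
                        unsigned p_edges, unsigned q_edges,
                        int p_size, int q_size,
                        double alpha_w, double beta_w, double gamma_w) {
    int size_diff = p_size > q_size ? p_size - q_size : q_size - p_size;
    return alpha_w * jaccard_distance(p_nodes, q_nodes)
         + beta_w * jaccard_distance(p_edges, q_edges)
         - gamma_w * size_diff;
}
```

For example, with disjoint node sets, identical edge sets, a size difference of 2, and weights 1, 1, and 0.25, the distance is 1 + 0 - 0.5 = 0.5. The subtracted size term penalizes variants that differ from the seed mainly by growing or shrinking.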

  17. Sampling High-value EMI Variants

  23. The GCC miscompilation bug from slide 2, revisited: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61383

  24. Athena turns the seed into the variant by deleting "c = 1;" and injecting the database statement "if (i) break;" with i bound to h.

      Seed:

       int a, c, d, e = 1, f;
       int fn1 () {
         int h;
         for (; d < 1; d = e) {
           h = (f == 0) ? 0 : 1 % f;
           if (f < 1) c = 0;
           else c = 1;
         }
       }
       int main () { fn1 (); return 0; }

      Variant (identical except for the else branch):

           if (f < 1) c = 0;
           else if (h) break;

      ==DB Entry==
       requires_loop
       i: int
       -----
       if (i) break;

  25. PRE marks (1 % f) as loop invariant:

       int a, c, d, e = 1, f;
       int fn1 () {
         int h;
         for (; d < 1; d = e) {
           h = (f == 0) ? 0 : 1 % f;    /* PRE: loop invariant */
           if (f < 1) c = 0;
           else if (h) break;
         }
       }
       int main () { fn1 (); return 0; }

      PRE: Partial Redundancy Elimination

  26. LIM hoists (1 % f) out of the loop, so it is evaluated even when f == 0:

       int a, c, d, e = 1, f;
       int fn1 () {
         int h;
         int g = 1 % f;                 /* LIM: hoist (1 % f) */
         for (; d < 1; d = e) {
           h = (f == 0) ? 0 : g;
           if (f < 1) c = 0;
           else if (h) break;
         }
       }
       int main () { fn1 (); return 0; }

       $ gcc -O0 test.c ; ./a.out
       $ gcc -O2 test.c ; ./a.out
       Floating point exception (core dumped)

      LIM: Loop Invariant Motion
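The hoist on this slide is unsafe because the guard (f == 0) stays inside the loop while the division moves out of it. A minimal sketch of a hoist that keeps the guard with the hoisted expression is shown below. The functions `guarded` and `hoisted_safely` are hypothetical illustrations and not the fix that went into GCC.

```c
#include <assert.h>

/* Original expression: 1 % f is evaluated only when f != 0. */
int guarded(int f) {
    return (f == 0) ? 0 : 1 % f;
}

/* The whole conditional is loop invariant, so it can be hoisted as a unit.
   The division by f then still happens only when f != 0. */
int hoisted_safely(int f, int iterations) {
    int g = (f == 0) ? 0 : 1 % f;   /* hoisted once, guard preserved */
    int h = 0;
    for (int i = 0; i < iterations; i++) {
        h = g;                      /* loop body reuses the hoisted value */
    }
    return h;
}

/* With f == 0 the safe hoist never divides, unlike "int g = 1 % f;". */
void check_zero_divisor(void) {
    assert(hoisted_safely(0, 3) == guarded(0));
}
```

With f == 0 the safe version returns 0 without dividing, whereas the unguarded `int g = 1 % f;` divides by zero before the loop is entered.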

  27. The Clang crashing bug from slide 3, revisited: https://llvm.org/bugs/show_bug.cgi?id=18615

  28. Athena turns the seed into the variant by deleting "f[0].f0 = b;" and injecting the database statement "g[0] = g[c];" with g bound to f and c bound to b.

      Seed:

       int a;
       struct S0 { int f0; int f1; int f2; };
       void fn1 () {
         int b = -1;
         struct S0 f[1];
         if (a) {
           f[0].f0 = b;
         }
       }
       int main () { fn1 (); return 0; }

      Variant (identical except for the statement inside the if):

         if (a) {
           f[0] = f[b];
         }

      =======DB Entry======
       g: struct ( int x int x int ) [1]
       c: int
       -----------------------
       g[0] = g[c];

  29. Assertion violation: in the variant, "f[0] = f[b];" uses a negative index (b = -1). https://llvm.org/bugs/show_bug.cgi?id=18615

  30. • Two machines running for 19 months
      • Seed programs: Csmith [1]
          • Hard to reduce real-world projects
      • Statement database: seed programs
          • Real-world code cannot be inserted into Csmith seeds effectively
      [1] X. Yang, Y. Chen, E. Eide, and J. Regehr. Finding and understanding bugs in C compilers. PLDI '11

  31. Results after 19 months
      • Total bugs: 69 fixed, 3 confirmed
      • Compilers: GCC 40, LLVM 32
      • Bug types: Wrong, Crash, Perf (chart values 5, 27, 32)

  32. • Developers fixed our bugs (69/72)
      • 17/40 GCC bugs are P1 (highest priority)
      • 3 GCC bugs linked to real-world projects
          • GCC
          • QtWebKit
          • glibc

  33. Run Athena and Orion in parallel on 15 bugs in 1 week

      | Bug ID | Affected Versions | Affected Opt Levels | Seed SLOC | Variant SLOC | Database Rows | Recovered Bugs | Generated Variants |
      |---|---|---|---|---|---|---|---|
      | gcc-59903 | 4.8, 4.9 | -O3 | 4,694 | 6,238 | 1,723 | 14 | 23,479 |
      | gcc-60116 | 4.8, 4.9 | -Os | 11,596 | 11,843 | 3,092 | 367 | 20,082 |
      | gcc-60382 | 4.8, 4.9 | -O3 | 6,151 | 21,903 | 1,989 | 19 | 21,267 |
      | gcc-61383 | 4.8, 4.9, 4.10 | -O2, -O3 | 3,298 | 3,567 | 1,272 | 106 | 32,981 |
      | gcc-61452 | 4.8, 4.9, 4.10, 5.0 | -O1, -Os | 3,308 | 3,474 | 885 | 0 | 49,158 |
      | gcc-61917 | 4.9, 4.10, 5.0 | -O3 | 11,820 | 11,226 | 3,066 | 2 | 32,562 |
      | gcc-64495 | 4.8, 4.9, 4.10, 5.0 | -O3 | 2,767 | 1,951 | 517 | 4 | 45,896 |
      | gcc-64663 | 4.6, 4.7, 4.8, 4.9, 4.10, 5.0 | -O1, -Os, -O2, -O3 | 11,118 | 12,160 | 2,875 | 0 | 26,626 |
      | llvm-20494 | 3.2, 3.3, 3.4, 3.5 | -O2, -O3 | 8,080 | 11,009 | 1,683 | 2,660 | 24,588 |
      | llvm-20680 | 3.5, 3.6 | -O3 | 6,250 | 7,584 | 1,753 | 22 | 23,438 |
      | llvm-21512 | 3.5, 3.6 | -O1, -Os, -O2, -O3 | 8,455 | 5,087 | 3,081 | 988 | 21,882 |
      | llvm-22086 | 3.5, 3.6 | -Os, -O2, -O3 | 5,220 | 8,495 | 1,711 | 0 | 29,279 |
      | llvm-22338 | 3.5, 3.6, 3.7 | -O2, -O3 | 2,923 | 7,197 | 1,302 | 13 | 19,469 |
      | llvm-22382 | 3.2, 3.3, 3.4, 3.5, 3.6, 3.7 | -Os, -O2, -O3 | 4,813 | 2,147 | 1,432 | 0 | 29,805 |
      | llvm-22704 | 3.6, 3.7 | -O1, -Os, -O2, -O3 | 3,684 | 23,250 | 981 | 12 | 28,740 |

  34. Baseline: coverage of 100 seeds (GCC 34.9%, LLVM 23.5%)
      [Bar chart: Coverage Improvements (%) for GCC and LLVM; x-axis: Orion & Athena Configurations (# variants: 10, 25, 50, 100)]

  35. [Diagram: the seed, Athena's space, and Orion's space]

  36. Questions?

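  37. Bug reports by status

      | Status | GCC | LLVM | TOTAL |
      |---|---|---|---|
      | Fixed | 39 | 30 | 69 |
      | Not-Yet-Fixed | 1 | 2 | 3 |
      | WorksForMe | 0 | 3 | 3 |
      | Duplicate | 3 | 4 | 7 |
      | Invalid | 1 | 0 | 1 |
      | TOTAL | 44 | 39 | 83 |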
