
Faster, Stronger C++ Analysis with the Clang Static Analyzer



  1. Faster, Stronger C++ Analysis with the Clang Static Analyzer • George Karpenkov, Apple • Artem Dergachev, Apple

  2. Agenda • Introduction to Clang Static Analyzer • Using coverage-based iteration order • Improved C++ constructor and destructor support


  4. Clang Static Analyzer Finds Bugs at Compile Time • Use-after-free bugs • Null pointer dereferences • Uses of uninitialized values • Memory leaks, etc…

  5. Analyzer Visualizes Paths • Inside IDE: Xcode, QtCreator, CodeCompass • From command line: generate HTML • $ scan-build make • http://clang-analyzer.llvm.org

  6. Analyzer Simulates Program Execution • Explores paths through the program • Uses symbols instead of concrete values • Generates reports on errors

  7. A Faster than Light Intro to the Analyzer
     Code:
     int foo(int a) {
       int x = 0;
       if (a != 0)
         x = 1;
       return 1/x;
     }
     Control Flow Graph: x = 0, then a branch on a != 0 (TRUE: x = 1; FALSE: skip), then return 1/x
     Exploded Graph: on the path a ≠ 0, x = 1 and return 1/x returns 1; on the path a = 0, x = 0 and return 1/x becomes return 1/0 💦 CRASH!
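The two paths through foo() can be reproduced with a toy enumeration. This is only a sketch of the idea, not how the analyzer is implemented: the `Path` struct and `exploreFoo` are hypothetical names, and the symbolic argument `a` is modeled as nothing more than the assumption "a != 0" or "a == 0".

```cpp
#include <cassert>
#include <string>
#include <vector>

// One explored path through foo(): the assumption made about the symbolic
// argument `a`, the resulting value of `x`, and whether `1 / x` would trap.
struct Path {
  std::string assumption;
  int x;
  bool divisionByZero;
};

// Toy model of the exploded graph for:
//   int foo(int a) { int x = 0; if (a != 0) x = 1; return 1 / x; }
// `a` is never given a concrete value; the branch simply forks the state.
std::vector<Path> exploreFoo() {
  std::vector<Path> paths;
  for (bool aIsNonZero : {true, false}) {
    int x = 0;
    if (aIsNonZero)
      x = 1;
    paths.push_back({aIsNonZero ? "a != 0" : "a == 0", x, x == 0});
  }
  assert(paths.size() == 2 && "one branch forks exactly two paths");
  return paths;
}
```

Only the path that assumes a == 0 is flagged, which matches the crashing branch of the exploded graph on the slide.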

  8. Agenda • Introduction to Clang Static Analyzer • Using coverage-based iteration order • Improved C++ constructor and destructor support

  9. Problem: Path is Too Long • XNU (Darwin Kernel): many paths over 400 steps • Bug can be found on the first iteration • Aim: provide shorter, more concise diagnostics

  10. Analyzer Uses Worklist to Generate Exploded Graph • Start: entry point • Successors: simulated execution of a statement • Allows different exploration strategies • Previously: DFS by default
     worklist = { start }
     while worklist:
       node = worklist.pop()
       successors = execute(node)
       for successor in successors:
         worklist.push(successor)
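The worklist loop on this slide might look roughly like the following in C++. This is a minimal sketch under stated assumptions: `Node` is a hypothetical stand-in for an exploded-graph node, `execute` is a caller-supplied callback rather than any real analyzer API, and the successor relation is assumed to be finite and acyclic so that the loop terminates. Popping from the back of the vector gives the depth-first order that was previously the default.

```cpp
#include <cassert>
#include <functional>
#include <vector>

using Node = int;  // hypothetical stand-in for an exploded-graph node

// Explores everything reachable from `start`. `execute` simulates one
// statement and returns the successor nodes. Taking the most recently
// pushed node first makes the exploration depth-first.
// Returns the nodes in the order they were executed.
std::vector<Node> exploreDFS(
    Node start, const std::function<std::vector<Node>(Node)> &execute) {
  assert(execute && "execute callback must be set");
  std::vector<Node> worklist{start};
  std::vector<Node> order;
  while (!worklist.empty()) {
    Node node = worklist.back();
    worklist.pop_back();
    order.push_back(node);
    for (Node successor : execute(node))
      worklist.push_back(successor);
  }
  return order;
}
```

For a small tree where node 0 has successors 1 and 2, and node 1 has successor 3, this visits 0, 2, 1, 3. Swapping the container for a queue or a priority queue changes the exploration strategy without touching the rest of the loop.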

  11. DFS Exploration Order Leads to Wasted Effort
     int main() {
       for (int i = 0; i < 2; ++i) {
         if (cond())
           continue;
         return 1/0; // 💦 crash
       }
     }
     [Figure: exploded graph. DFS follows the TRUE branch of cond() at i = 0, continues to i = 1, and reaches return 1/0 at i = 1 before EXIT.]

  12. DFS Exploration Order Leads to Wasted Effort
     (same code as the previous slide)
     [Figure: the same exploded graph, now also showing return 1/0 at i = 0 on the FALSE branch of the first cond().]

  13. Problem Often Mitigated by Analyzer Heuristics • Deduplication: if the same report is found multiple times, return the shortest path • Budget per source location: paths that visit a location more than 3 times get dropped • Budget per number of inlinings • … • In many unfortunate cases, the shortest path is not found at all

  14. Solution: Coverage-Based Iteration Order • Record the number of times the analyzer visits each location • Use a priority queue that prefers source locations the analyzer has visited fewer times so far • Finds bugs on the first iteration when possible
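A coverage-based worklist along these lines could be sketched as follows. The details are assumptions made for illustration, not the analyzer's actual implementation: `Node`, its `location` field, and `exploreByCoverage` are hypothetical names, the priority is taken to be the visit count of a node's location at the moment the node is queued, and ties are broken first-in, first-out.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <queue>
#include <vector>

struct Node {
  int id;        // hypothetical unique id of an exploded-graph node
  int location;  // hypothetical id of the source location it belongs to
};

// Explores from `start`, always taking the queued node whose source location
// had been visited the fewest times when the node was queued.
// Returns node ids in the order they were executed.
std::vector<int> exploreByCoverage(
    Node start,
    const std::function<std::vector<Node>(const Node &)> &execute) {
  assert(execute && "execute callback must be set");
  struct Entry {
    int visitsWhenQueued;
    unsigned long seq;  // insertion counter, breaks ties first-in first-out
    Node node;
  };
  auto lowerPriority = [](const Entry &a, const Entry &b) {
    if (a.visitsWhenQueued != b.visitsWhenQueued)
      return a.visitsWhenQueued > b.visitsWhenQueued;
    return a.seq > b.seq;
  };
  std::priority_queue<Entry, std::vector<Entry>, decltype(lowerPriority)>
      worklist(lowerPriority);
  std::map<int, int> visits;  // source location -> times executed so far
  unsigned long seq = 0;
  worklist.push({visits[start.location], seq++, start});

  std::vector<int> order;
  while (!worklist.empty()) {
    Entry entry = worklist.top();
    worklist.pop();
    ++visits[entry.node.location];
    order.push_back(entry.node.id);
    for (const Node &successor : execute(entry.node))
      worklist.push({visits[successor.location], seq++, successor});
  }
  return order;
}
```

When the two-iteration loop from the earlier slides is modeled as a graph, the crash node of the first iteration sits at a location that has never been visited, so it is taken before the analyzer re-enters the loop head for the second iteration.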

  15. Coverage-Based Iteration Order
     int main() {
       for (int i = 0; i < 2; ++i) {
         if (cond())
           continue;
         return 1/0; // 💦 crash
       }
     }
     [Figure: exploded graph. From cond() at i = 0, the FALSE branch reaches return 1/0 on the first iteration.]


  17. Results: 95th Percentile of Path Length • [Bar chart: 95th percentile of path length, before and after, for XNU, openSSL, postgres, Adium, and sqlite3; y-axis from 0 to 300]

  18. Results: Total Bug Reports • 16% increase in number of reports found • [Bar chart: number of reports, before and after, for XNU, openSSL, postgres, Adium, and sqlite3; y-axis from 0 to 1200]

  19. Agenda • Introduction to Clang Static Analyzer • Using coverage-based iteration order • Improved C++ constructor and destructor support

  20. Incomplete C++ Support Caused False Positives • Analyzer lost information on object construction • Analyzer lost track of objects before they were destroyed • Temporaries are hard!

  21. Constructor Call = Initialization Bookkeeping + Method Call

  22. Initialization Bookkeeping In C Is Easy
     typedef struct {...} Point;
     Point makePoint();
     Point P = makePoint();
     AST:
     DeclStmt
     `- VarDecl 'P' 'Point'
        `- CallExpr 'makePoint' 'Point'
     Step 1, CallExpr: call 'makePoint()' to evaluate contents of the structure
     Step 2, DeclStmt: put these contents into 'P'

  23. Initialization Bookkeeping In C++ Is More Complicated
     struct Point {
       ...
       Point();
     };
     Point P;
     AST:
     DeclStmt
     `- VarDecl 'P' 'Point'
        `- CXXConstructExpr 'Point()'
     Step 1, CXXConstructExpr: call constructor like a method on the object P
     Step 2, DeclStmt: learn about the existence of variable P

  24. Initialization Bookkeeping In C++ Is More Complicated
     (same code and AST as slide 23)
     Step 2, DeclStmt: learn about the existence of variable P
     Step 1, CXXConstructExpr: call constructor like a method on the object P

  25. Initialization Bookkeeping In C++ Is More Complicated
     (same code and AST as slide 23, with the steps reordered)
     Step 1, DeclStmt: learn about the existence of variable P
     Step 2, CXXConstructExpr: call constructor like a method on the object P

  26. Initialization Bookkeeping In C++ Is More Complicated • The constructor needs to know what object is being constructed • CXXConstructExpr doesn't tell us everything in advance
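The first bullet can be illustrated at the language level. This sketch shows only what the C++ program itself observes, not analyzer internals: `Point` mirrors the struct on the slides, while the `self` member and the function name are hypothetical additions for the demonstration.

```cpp
#include <cassert>

// Same shape as the Point on the slides, plus a hypothetical `self` member
// that records the address the constructor ran on.
struct Point {
  const Point *self;
  Point() : self(this) {}
};

// Declares a local Point and reports whether the constructor's `this`
// matched the variable's final address.
bool constructorSawVariableAddress() {
  Point P;
  assert(P.self != nullptr);
  return P.self == &P;
}
```

The constructor already runs on the storage of P, so a tool that simulates the constructor call has to know about P before it gets to the declaration statement that introduces P.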

  27. Initialization Bookkeeping In C++ Takes Many Forms
     Variables:
       Point P(1, 2, 3);
       Point P = Point(1, 2, 3);
       Point P = Point(1); // cast from 1
       Point P = 1; // implicit cast from 1
     Heap allocation:
       Point *P = new Point(1, 2, 3);
       Point *P = new Point[N + 1];
     Argument values:
       draw(Point(1, 2, 3));
       Point(1, 2, 3) - Point(4, 5, 6);
       void draw(Point P = Point(1, 2, 3));
       draw(); // construct P
     Temporaries:
       Point(1, 2, 3);
       const Point &P = Point(1, 2, 3);
       const int &x = Point(1, 2, 3).x;
       // determine in run-time
       const Point &P = lunarPhase() ? Point(1, 2, 3) : Point(3, 2, 1);
     Captured values:
       // copy to capture
       Point P;
       [P]{ return P; }();
     Constructor initializers:
       struct Vector {
         Point P;
         Vector() : P(1, 2, 3) {}
       };
       struct Vector {
         Point P = Point(1, 2, 3);
       };
     Return values:
       Point getPoint() {
         return Point(1, 2, 3); // RVO
       }
       Point getPoint() {
         Point P(1, 2, 3); // NRVO
         return P;
       }
     Aggregates and brace initializers:
       Point P{1, 2, 3};
       PointPair PP{Point(1, 2), Point(3, 4)};
       PointPairPair PPP{{{1, 2}, {3, 4}}, {{5, 6}, {7, 8}}};
       std::vector<Point> V{{1, 2, 3}};
     IT IS ONLY GETTING WORSE


  28. There is a common theme

  29. Need to track the constructed object’s address until the analyzer processes the statement that represents the object’s storage

  30. Solution: Construction Context • Augments CFG constructor call elements • Describes the construction site: • What object is constructed? • Who is responsible for destroying it? • Is it a temporary that requires materialization? • Is the constructor elidable?

  31. Solution: Construction Context • A construction syntax catalog • There are currently 15 classes • Easy to identify and to support

  32. Progress made…
     [Figure: the same catalog of construction forms as slide 27 (variables, heap allocation, argument values, temporaries, captured values, constructor initializers, return values, aggregates and brace initializers), with individual forms tagged BEFORE or NOW. The tag-to-form mapping did not survive extraction.]

