Static Analysis methods and tools – an industrial study

Adj Prof Pär Emanuelsson – Ericsson and LiU
Prof Ulf Nilsson – LiU


Outline

  • Why static analysis?
  • What is it? Underlying technology
  • An example
  • Some tools (Coverity, Klocwork, Polyspace, …)
  • Some case studies from Ericsson
  • Conclusions

Method used

Tool comparison based on:

  • White papers
  • Research reports from research groups behind tools
  • Interviews with Ericsson staff
  • Interviews with technical staff from tool vendors

What is SA and what can it be used for?

  • Definition:
    – Analysis that does not actually run the code
  • Our interest:
    – Finding defects (preventing run-time errors)
    – Finding security vulnerabilities
  • Other uses:
    – Code optimization (e.g. removing run-time checks in safe languages)
    – Metrics
    – Impact analysis


Pros and cons of static analysis

  • Pros
    – No test case design needed
    – No test oracle needed
    – May detect hard-to-find bugs
    – The analyzed program need not be complete
    – Stub writing is easier
  • Cons
    – Potentially large number of "false positives"
    – Does not relate to functional requirements
    – Takes programming competence to understand the reports


Comparison to other techniques

  • Compared to testing
    – No test case design needed
    – No test oracle needed
    – Can find defects that no amount of testing can
  • Compared to formal proofs (e.g. model checking)
    – More lightweight
    – SA is much easier to use
    – SA does not need formal requirements


Software defects and errors

  • Software defect: an anomaly in the code that might manifest itself as an error at run-time
  • Types of defects found by static analysis
    – Abrupt termination (e.g. division by zero)
    – Undefined behavior (e.g. array index out of bounds)
    – Performance degradation (e.g. memory leaks, dead code)
    – Security vulnerabilities (e.g. buffer overruns, tainted data)
  • Defects not (easily) found with static analysis
    – Functional incorrectness
    – Infinite loops / non-termination
    – Errors in the environment


Examples of checkers (C-code)

  • Null pointer dereference
  • Uninitialized data
  • Buffer/array overruns
  • Dead code/unused data
  • Bad return values
  • Return pointers to local data
  • Arithmetic operations with undefined result
  • Arithmetic over-/underflow
  • Parallel execution bugs
  • (Non-termination)
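To make these concrete, the fragment below (a constructed illustration, not from the study) packs several of the listed defect classes into a few lines; a checker would flag each commented statement:

    #include <stdlib.h>

    int *several_defects(int n) {
        int local = 0;
        int *p = NULL;
        int x;                          /* uninitialized data */

        if (n > 0)
            p = malloc(n * sizeof(int));
        p[0] = x;                       /* possible null pointer dereference (when n <= 0)
                                           and read of the uninitialized x */
        p[n] = 1;                       /* buffer overrun: valid indices are 0..n-1 */
        return &local;                  /* returns pointer to local (stack) data */
    }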

Security vulnerabilities

  • Unsafe system calls
  • Weak encryption
  • Access problems
  • Unsafe string operations
  • Buffer overruns
  • Race conditions (time-of-check to time-of-use, TOCTOU)
  • Command injections
  • Tainted (untrusted) data
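As an illustration of the time-of-check/time-of-use entry (a minimal POSIX sketch, not taken from the deck): the file may be replaced, e.g. by a symlink, in the window between the check and the use.

    #include <stdio.h>
    #include <unistd.h>

    void log_append(const char *path, const char *msg) {
        if (access(path, W_OK) == 0) {   /* time of check */
            FILE *f = fopen(path, "a");  /* time of use: attacker's window */
            if (f != NULL) {
                fprintf(f, "%s\n", msg);
                fclose(f);
            }
        }
    }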

Buffer overflow

    char dst[256];
    char *s = read_string();
    strcpy(dst, s);    /* overruns dst if read_string() returns more than 255 characters */
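A bounds-aware rewrite that a checker would accept could look as follows (a sketch, with <stdio.h>; read_string is assumed from the slide above):

    char dst[256];
    char *s = read_string();
    snprintf(dst, sizeof dst, "%s", s);   /* copies at most 255 bytes and always NUL-terminates */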


Imprecision of analyses

  • The defect classes checked for by static analysis are undecidable
  • Analyses are therefore necessarily imprecise
  • As a consequence
    – Code complained about may be correct (false positives, illustrated below)
    – Code not complained about may be defective (false negatives)
  • Classic approaches to static analysis (sound analyses) report all defects checked for (no false negatives), but sometimes produce large amounts of false positives
  • Most industrial tools try to eliminate false positives, but introduce false negatives as a consequence
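For example (an illustrative fragment, not from the study), an analysis that cannot see that the two conditions below are correlated reports a false positive on the read of x:

    int correlated(int flag) {
        int x;
        if (flag)
            x = 1;
        if (flag)
            return x;   /* safe: x is assigned on every path that reaches this read */
        return 0;
    }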


Imprecision vs analysis time

Precision depends heavily on analysis time

  • Flow-sensitive analysis
    – Takes program control flow into account
  • Context-sensitive analysis
    – Takes values of global variables and actual parameters of procedure calls into account (see the sketch after this list)
  • Path-sensitive analysis
    – Takes only valid execution paths into account
  • Value analysis
    – Value ranges
    – Value dependencies
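A small constructed example of what the sensitivity levels buy: a context-insensitive analysis merges the two calls below into one summary of half and loses the fact that only the second call leads to a division by zero.

    int half(int n) { return n / 2; }

    void caller(void) {
        int a = 100 / half(10);   /* half(10) == 5: fine */
        int b = 100 / half(1);    /* half(1) == 0: division by zero */
        (void)a; (void)b;
    }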


Example

    int fact(int n) {
    1)  int f = 1;
    2)  while (n > 0) {
    3)    f = f * n;
    4)    n = n - 1;
        }
    5)  return f;
    }

Control Flow Graph (CFG): 1: f = 1 → 2: n > 0 → (yes) 3: f = f * n → 4: n = n - 1 → back to 2; (no) 5: return f


Program states (configurations)

  • A program state is a mapping (function) from program variables to values. For example:

    σ1 = { n ↦ 1, f ↦ 0 }
    σ2 = { n ↦ 3, f ↦ 0 }
    σ3 = { n ↦ 5, f ↦ 0 }


Semantic equations

  • We associate a set xi of states with node i of the CFG (the set of states that can be observed upon reaching the node):

    x1 = { { n ↦ 1, f ↦ 0 }, { n ↦ 3, f ↦ 0 } }    % example input states
    x2 = { σ | σ' ∈ x1 & σ(n) = σ'(n) & σ(f) = 1 } ∪ { σ | σ' ∈ x4 & σ(n) = σ'(n) - 1 & σ(f) = σ'(f) }
    x3 = { σ | σ ∈ x2 & σ(n) > 0 }
    x4 = { σ | σ' ∈ x3 & σ(n) = σ'(n) & σ(f) = σ'(f) · σ'(n) }
    x5 = { σ | σ ∈ x2 & σ(n) ≤ 0 }
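In general (a standard least-fixpoint formulation, added here for orientation; the deck states the equations only for this example), there is one equation per CFG node, built from one transfer function per incoming edge:

    x_i = \bigcup_{j \to i} f_{j \to i}(x_j), \qquad
    \text{e.g.}\quad f_{1 \to 2}(X) = \{\, \sigma \mid \sigma' \in X,\ \sigma(n) = \sigma'(n),\ \sigma(f) = 1 \,\}

The least solution is obtained by iterating from x1 = … = x5 = ∅ until nothing changes, which is exactly what the example run on the next slide does.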


Example run

Initially x1 = x2 = x3 = x4 = x5 = ∅

  • x1 = {{n=1,f=0},{n=3,f=0}} given
  • x2 = {{n=1,f=1},{n=3,f=1}} f=1
  • x3 = {{n=1,f=1},{n=3,f=1}} n>0
  • x4 = {{n=1,f=1},{n=3,f=3}} f=f*n
  • x2 = {{n=0,f=1},{n=1,f=1},{n=2,f=3},{n=3,f=1}} f=1 applied to x1, n=n-1 applied to x4
  • x3 = {{n=1,f=1},{n=2,f=3},{n=3,f=1}} n>0
  • x4 = {{n=1,f=1},{n=2,f=6},{n=3,f=3}} f=f*n
  • x2 = {{n=0,f=1},{n=1,f=1},{n=1,f=6},{n=2,f=3},{n=3,f=1}}
  • x3 = {{n=1,f=1},{n=1,f=6},{n=2,f=3},{n=3,f=1}} n>0
  • x4 = {{n=1,f=1},{n=1,f=6},{n=2,f=6},{n=3,f=3}} f=f*n
  • x2 = {{n=0,f=1},{n=0,f=6},{n=1,f=1},{n=1,f=6},{n=2,f=3},{n=3,f=1}}
  • x3 = {{n=1,f=1},{n=1,f=6},{n=2,f=3},{n=3,f=1}} n>0
  • x5 = {{n=0,f=1},{n=0,f=6}} n<=0

Abstract descriptions of data

? = the set of all integers
− = the set of all negative integers
+ = the set of all positive integers
0 = the set { 0 }
∅ = the empty set (= unreachable)

These form a lattice: ∅ is below −, 0 and +, which are all below ?.
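A minimal sketch of this domain in C (my own encoding, not from the deck); lub is the least-upper-bound operation referred to by the abstract equations later:

    typedef enum { BOT, NEG, ZERO, POS, TOP } Sign;   /* BOT = empty, TOP = ? */

    /* least upper bound: the smallest description containing both a and b */
    Sign lub(Sign a, Sign b) {
        if (a == BOT) return b;
        if (b == BOT) return a;
        return (a == b) ? a : TOP;   /* distinct non-empty signs join to ? */
    }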


Abstract operations

Abstract multiplication over the domain (? = any integer, + = > 0, 0 = 0, − = < 0):

    ×  |  −   0   +   ?
    ---+----------------
    −  |  +   0   −   ?
    0  |  0   0   0   0
    +  |  −   0   +   ?
    ?  |  ?   0   ?   ?

(∅ × anything = ∅: unreachable stays unreachable.)


Abstract operations

Abstract subtraction over the same domain (left operand in the rows, right operand in the columns):

    −  |  −   0   +   ?
    ---+----------------
    −  |  ?   −   −   ?
    0  |  +   0   −   ?
    +  |  +   +   ?   ?
    ?  |  ?   ?   ?   ?

(∅ − anything = anything − ∅ = ∅.)
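Both tables can be encoded directly over the Sign enum from the earlier sketch (again my own encoding, with the table entries as comments):

    Sign amul(Sign a, Sign b) {                /* abstract multiplication */
        if (a == BOT || b == BOT) return BOT;
        if (a == ZERO || b == ZERO) return ZERO;   /* 0 × anything = 0 */
        if (a == TOP || b == TOP) return TOP;
        return (a == b) ? POS : NEG;               /* sign rule */
    }

    Sign asub(Sign a, Sign b) {                /* abstract subtraction */
        if (a == BOT || b == BOT) return BOT;
        if (b == ZERO) return a;                   /* a − 0 = a */
        if (a == ZERO) return (b == POS) ? NEG : (b == NEG) ? POS : TOP;
        if (a == POS && b == NEG) return POS;      /* + − − = + */
        if (a == NEG && b == POS) return NEG;      /* − − + = − */
        return TOP;                                /* + − +, − − −, or ? involved */
    }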


Abstract semantic equations

x1 = { n = +, f = ? }
x2 = { n = lub*(x1(n), x4(n) − +), f = lub*(+, x4(f)) }
x3 = { n = +, f = x2(f) }
x4 = { n = x3(n), f = x3(f) × x3(n) }
x5 = { n = ?, f = x2(f) }

(*) lub(A, B) is the smallest description that contains both A and B (a kind of set union); − and × are the abstract operations above.


Example abstract run

Initially x1 = x2 = x3 = x4 = x5 = { n = ∅, f = ∅ }

  • x1 = { n=(+),f= ? } given
  • x2 = { n=(+),f=(+) }
  • x3 = { n=(+),f=(+) }
  • x4 = { n=(+),f=(+) }
  • x2 = { n= ?,f=(+) }
  • x3 = { n=(+),f=(+) }
  • x5 = { n= ?,f=(+) }
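This run can be reproduced mechanically by iterating the abstract equations until they stabilize (a Kleene fixpoint iteration). A sketch reusing Sign, lub, amul and asub from the blocks above:

    #include <stdio.h>

    int main(void) {
        static const char *name[] = { "empty", "-", "0", "+", "?" };
        Sign n[6], f[6];
        for (int i = 1; i <= 5; i++) n[i] = f[i] = BOT;
        n[1] = POS; f[1] = TOP;                       /* x1 = { n = +, f = ? } */

        for (;;) {
            Sign pn = n[2], pf = f[2];
            n[2] = lub(n[1], asub(n[4], POS));        /* entry edges: f=1 from node 1, n=n-1 from node 4 */
            f[2] = lub(POS, f[4]);
            n[3] = POS;   f[3] = f[2];                /* branch n > 0 */
            n[4] = n[3];  f[4] = amul(f[3], n[3]);    /* f = f * n */
            n[5] = TOP;   f[5] = f[2];                /* branch n <= 0 */
            if (n[2] == pn && f[2] == pf) break;      /* x3..x5 are functions of x2 */
        }
        for (int i = 1; i <= 5; i++)
            printf("x%d = { n = %s, f = %s }\n", i, name[n[i]], name[f[i]]);
        return 0;
    }

Running this converges in three iterations and prints exactly the fixpoint shown above.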

SA techniques

  1. Pattern matching
  2. Control flow analysis
  3. Data flow analysis
  4. Value analysis
     – Intervals
     – Aliasing analysis
     – Variable dependencies
  5. Abstract interpretation

Examples of dataflow analysis

  • Reaching definitions (which definitions reach a point)
  • Liveness (variables that may be read before being redefined)
  • Definite assignment (variable is always assigned before read; see the miniatures after this list)
  • Available expressions (already computed expressions)
  • Constant propagation (replace variable with value)
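Two of these in miniature (constructed fragments): definite-assignment analysis rejects the first function, and constant propagation simplifies the second.

    int not_definitely_assigned(int flag) {
        int x;
        if (flag)
            x = 1;
        return x;          /* x is not assigned on the flag == 0 path */
    }

    int folded_by_constant_propagation(void) {
        int c = 42;
        return c * 2;      /* the analysis can replace this by 84 */
    }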

Aliasing

  • *x = 5
  • *y = 10
  • … = *x      (5 or 10? depends on whether x and y alias)

  • x[i] = 5
  • x[j] = 10
  • … = x[i]    (5, unless i == j)
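Concretely (a runnable illustration, not from the deck): when two pointers alias, the second write clobbers the first, so an analysis must know the points-to sets before it can trust the value of a read.

    #include <stdio.h>

    int main(void) {
        int a = 0;
        int *x = &a, *y = &a;   /* x and y alias */
        *x = 5;
        *y = 10;
        printf("%d\n", *x);     /* prints 10, not 5 */
        return 0;
    }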

Tool comparison

                   Coverity     Klocwork       Polyspace   Flexelint
  Language         C/C++/Java   C/C++/Java     C/C++/Ada   C/C++
  Program size     MLOC         MLOC           60 KLOC     MLOC
  Soundness        Unsound      Unsound        Sound       Unsound
  False positives  few          few            many        many
  Analysis         def, sec     def, sec, met  def         def
  Incrementality   yes          no             no          no

  (def = defect detection, sec = security, met = metrics)


Coverity Prevent

  • Company founded in 2002
  • Originates from Dawson Engler's research at Stanford
  • Well documented through research papers
  • Commonly viewed as the market-leading product
  • Good results from Homeland Security's audit project
  • Coverity Extend allows user-defined checks (Metal language)
  • Good explanations of faults
  • Good support for libraries
  • Incremental

Klocwork K7

  • Company founded by a development group at Nortel in 2001
  • Similar to Coverity (in the checkers provided)
  • Besides finding defects: refactoring, code metrics, architecture analysis
  • Easy to get started and use
  • Good explanations of faults
  • Good support for foreign libraries

Polyspace Verifier/Desktop

  • French company co-founded by students of Patrick Cousot in 1999. Acquired by MathWorks in 2007.
  • Claims to intercept 100% of the run-time errors checked for in C/C++/Ada programs.
  • Customers in the airline industry and the European space program (embedded software).
  • Very thorough – especially on arithmetic
  • Can be slow and produces many false positives
  • Documentation hard to read
  • Restricted support for security vulnerabilities and management of dynamic memory


Largest SA project? Audit of open source projects

  • Grant by Homeland Security in 2006
  • Coverity, Klocwork and others
  • More than 290 open source software projects analysed: Apache, FreeBSD, GTK, Linux, Mozilla, MySQL, PostgreSQL, and many more
  • More than 7 000 defects fixed during the first 18 months (50 000 up to now)

  • See http://scan.coverity.com/

Other SA tools

  • GrammaTech CodeSonar – similar to Coverity and Klocwork. Co-founders: Tom Reps and Tim Teitelbaum.
  • Parasoft C++test – performs some static analysis (checks 700 coding standard rules).
  • Purify – focuses on memory leaks, not defects in general. It is a dynamic tool and requires test cases.
  • PREfast and PREfix – Microsoft proprietary.
  • Astrée – academic tool by Patrick Cousot. Very thorough; works on C without recursion and dynamic memory.


Splint

  • Open source
  • C language
  • Based on Lint
  • Modified for security
  • Annotations added
  • Style warnings
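The annotations are stylized comments that Splint checks against the code. A small sketch of the flavor (illustrative only; lookup_name is a made-up function, and exact warning texts vary):

    #include <string.h>

    /*@null@*/ char *lookup_name(void);   /* annotated: may return NULL */

    void use(void) {
        char *s = lookup_name();
        size_t len = strlen(s);   /* Splint warns: possibly null s dereferenced */
        (void)len;
    }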

Telecom system

Available 99.999% of the time ("five nines")


Ericsson experiences 1 – Coverity and Flexelint

  • Mature product that had been in use for several years and was well tested
  • FlexeLint reported 1 200 000 errors and warnings, which could be reduced to 1 000 with a great deal of filtering work
  • Coverity found 40 defects
  • Had expected Coverity to find more defects, and more serious ones
  • Even if many of the defects found were not bugs that could cause a crash, they were certainly things that should be corrected


Ericsson experiences 2 - Coverity

  • 1.2 MLOC analyzed in 3 hours
  • Easy to install and use; no modifications to the existing development environment needed
  • Part of the code was previously analyzed with Flexelint
  • 1464 defects found:
    – 55% not real errors, but bad style
    – 2% false positives
    – 38% bugs, 1% severe
  • A considerable number of severe defects were found although the code is of PRA quality


Ericsson experiences 3 – Coverity and Klocwork (43 KLoC)

                            Klocwork  False pos.  Found by both  Coverity  False pos.
  Known memory leaks            -         -            -            -         -
  Null-pointer defects         15         2            2            4         -
  Found memory leaks           12         8            1            7         -
  Unutilized variables          -         -            -            2         -
  Freeing non-heap memory       3         -            -            -         -
  Buffer overruns               2         -            -            3         1
  Total                        32        10            3           16         1


Ericsson experiences 4 – Java. Coverity, Klocwork and CodePro

  • A Java product with known faults was analyzed.
  • A beta version of Coverity was used.
  • Large difference in warnings:
    – Coverity 92, Klocwork 658, CodePro 8000.
  • Coverity found many more faults and had far fewer false positives than Klocwork.
  • Users seem to prefer Klocwork anyway (with filtering: only 19 warnings in the topmost 4 severity levels).
  • CodePro is designed for interactive use.
  • The interactivity of CodePro is appreciated, but the possibility to save discovered defects is required.


Ericsson experiences summary

  • Easy to get going and use – no big changes in processes needed.
  • The tools discover many bugs that would not be found otherwise.
  • Analysis time is acceptable and comparable to build time.
  • Some users had expected the tools to find more defects, and more severe ones.
  • Some users were surprised that several bugs were found in applications that had been in use for a long time.
  • Many of the defects found would not cause a crash as-is, but after a small modification a serious crash could happen.
  • The tools often discover different defects, and often do not find known ones.
  • Handling of third-party libraries can make a big difference.
  • Tools should be used throughout development.
  • Flexelint can be successful if applied from project start.
  • Coverity and Klocwork are similar – but give very different results in some cases.


Conclusions

  • Good and useful tools
  • Find bugs with little effort
  • Some tools are mature
    – Can handle very large applications
    – Surprisingly few false positives
    – Easy to use
  • Unclear how many defects remain undiscovered

Literature

  • Mandatory
    – Emanuelsson, Nilsson: A Comparative Study of Industrial Static Analysis Tools
    – Example in lecture
    – Livshits, Lam: Finding Security Vulnerabilities in Java Applications with Static Analysis
  • Non-mandatory
    – Balakrishnan, …: WYSINWYX: What You See Is Not What You eXecute
    – Bessey, …: A Few Billion Lines of Code Later