SLIDE 1

A few billion lines of code later: static checking in the real world


Andy Chou, Ben Chelf, Seth Hallem, Scott McPeak, Bryan Fulton, Charles-Henri Gros, Ken Block, Anuj Goyal, Al Bessey, Chris Zak & many others (Coverity); Dawson Engler, Associate Professor, Stanford

Fun with Gaussian distributions:

Many stories, two basic plots.

[Two bell-curve plots; axes labeled “average” and “common”, as n → ∞]

Social vs Technical: “What part of NO! do you not understand?”

– No: you cannot touch the build.
– No: we will not change the source.
– No: this illegal code is not illegal.
– No: we will not understand your tool.
– No: we do not understand static analysis.

No!

Our tool [~2000]:

– The religion: max bugs, min false errors or manual work.
– Inter-procedural, context-sensitive, a bit path-sensitive.
– Aggressively unsound. Annotations viewed as evil.

One slide of context

– Worked reasonably well. Lots of bugs, papers, tenure.
– Company successful enough that there is a marketing dept. (Next: proof)

Linux fs/proc/inode.c, EDG frontend, lock checker: “missing unlock!”

    lock_kernel();
    if (!de->count) {
        printk("free!\n");
        return;
    }
    unlock_kernel();

Caveats

My (former) students run the company

– I am just a voyeur

Company is only tall for a midget

– Focus on things that will matter even more in larger settings.

This is just “a” way to do things, not THE way

– Just how we did things; not as claim that they are best

General inferences from one data point = dubious.

– Our needs roughly a lowest common denominator(*)
– (*): people building a tool for a single company need less: please speak up when your experience differs!

Our Mission

To improve software quality by automatically identifying and resolving critical defects and security vulnerabilities in your source code.

A short history of time [99-07]

Stanford Research Program → C analysis → C++ analysis → Security, Concurrency → Java Analysis → Enterprise Management → Satisfiability

SLIDE 2

Over 1 Billion Lines of Code

Coverity Trial Process

Test your code quality

– Analyze your largest code base
– One day set up, two hours for results presentation
– Test drive the product at your facility

Benefit to your team

– Post-trial report describing summary of findings
– Sample defects from your code base
– Fully functional defect resolution dashboard

Trial = a cornerstone verb of company.

“Does your thing work worth a damn on my code?”

– Ship “sales engineer” and sales guy to company
– Run over code; next day go over results
– If bugs good, they (may) buy. If suck…

First order requirements:

– Must *regularly* go in cold, touch nothing, shove 1-10MLOC through tool, get good results.
– Error reports must be clear since won’t understand code.
– Low false positives, good bugs since can’t cherry pick.

Some features:

– $0. Most people can sign for that. Cuts days of negotiation. Sales guy goes farther.
– Straight-technology sale. Often buyer = user.

Overview

Context. Now:

– A crucial myth.
– Some laws of static analysis.
– And how much both matter.

Then: The rest of the talk.

A naïve view

  • Initial market analysis:

– “We handle Linux, BSD, we just need a pretty box!”
– Obviously naïve.
– But not for the obvious reasons.

  • First law of checking: no check = no bug.

– Don’t check a system, a path, a property, then find no bugs.
– Two even more basic laws we’d never have guessed mattered.

Law of static analysis: cannot check code you do not see.

SLIDE 3

How to find all code?

  • Simple: intercept and rewrite build commands.

– In theory: see all compilation calls and all options, etc.

    make -w >& out
    replay.pl out    # replace ‘gcc’ w/ ‘prevent’

  • Worked fine for a few customers.

– Then: “make?”
– Kept plowing ahead.
– “Why do I have to re-install my OS from CD after I run your tool?”
– Good question…

The right solution

  • Kick off build and intercept all system calls

– Rewrite “<build command>” as “cov_build <build command>”
– Know exact location of compiler, version, options, environ.

  • In early 2000s more important than quality of analysis?

– Go into company cold, touch nothing, kick off, see all code.
– Big lever: 10x more code = 10x more bugs.
– Not bulletproof. Law: “Can’t find code w/o command prompt.”

  • A typical story of $ transmuting the trite to first order

– On windows: intercept = run compiler in debugger.
– So?
– Widely-used msoft compiler has a use-after-free bug.
– Works fine normally. Until run w/ debugger!
– Solution?

(Another) Law of static analysis: cannot check code you cannot parse

Myth: the C language exists.

Well, not really. The standard is not a compiler.

– The language people code in? Whatever strings their compiler accepts.
– Fed illegal code, your frontend will reject it.
– It’s *your* problem. Their compiler “certified” it.

Amplifiers:

– Embedded = weird.

Msoft: standard conformance = competitive disadvantage. C++ = language standard measured in kilos.

  • Basic LALR law:

– What can be parsed will be written. Promptly.
– The inverse of “the strong Whorfian hypothesis” is an empirical fact, given enough monkeys.

A sad movie that will gross exactly $0.

coreHdr.h has some illegal construct:

    unsigned x = 0xdead_beef;
    unsigned x @”text”;

File1.c … FileN.c (…entire system…) each do: #include “coreHdr.h”

yourTool: “Parse error: illegal use of …”

  • “Deep analysis?! Your tool is so weak it can’t even parse C!”

Some specific example stories.

coreHdr.h:

    int foo(int a, int a);       → “redefinition of parameter ‘a’”
    void x;                      → “storage size of ‘x’ is not known”
    typedef char int;            → “useless type name in empty declaration”
    unsigned x = 0xdead_beef;    → “invalid suffix ‘_beef’ on integer constant”
    unsigned x @”text”;          → “stray ‘@’ in program”
SLIDE 4

    asm foo() {
        mov eax, eab;
    }

    #pragma asm
    mov eax, eab
    #end_asm

    #pragma cplusplus on
    inline float  __msl_acos(float x)  { … }
    inline double __msl_acos(double x) { … }
    #pragma cplusplus off

Some specific example stories.

coreHdr.h:

    Int16 ErrSetJump(ErrJumpBuf buf) = { 0x4E40 + 15, 0xA085 };
    #pragma cplusplus off

    → “expected '=', ',', ';', 'asm' or…”
    → “conflicting types for __msl_acos”

Great moments in unsound hacks

Tool doesn’t handle (illegal) construct?

– Have reg-ex that runs before preprocessor to rip it out.
– Amazingly gross.
– Actually works.

    #pragma asm
    …
    #pragma end_asm

– Unsound = more bugs

    ppp_translate(“/#pragma asm/#if 0/”);
    ppp_translate(“/#pragma end_asm/#endif/”);

    #if 0
    …
    #endif

OK: so just how much does C not exist?

We use Edison Design Group (EDG) frontend

– Pretty much everyone uses. Been around since 1989.
– Aggressive support for gcc, microsoft, etc. (bug compat!)

Still: coverity by far the largest source of EDG bugs:

– 406 places where frontend hacked (“#ifdef COVERITY”)
– 266 “add compiler flag” calls
– Still need custom rewriter for many supported compilers:

    912  arm.cpp          629  bcc.cpp          334  cosmic.cpp
    1848 cw.cpp           673  diab.cpp         914  gnu.cpp
    1656 metrowerks.cpp   1294 microsoft.cpp    285  picc.cpp
    160  qnx.cpp          1861 renesa.cpp       384  st.cpp
    457  sun.cpp          294  sun_java_.cpp    756  xlc.cpp
    280  hpux.cpp         603  iccmsa.cpp       421  intel.cpp
    1425 keil.cpp

Completely unbelievable!

  • Incredibly banal. But if not done, can’t play.

– Takes more effort than can imagine.
– Full time team.
– This is their mission. Never finished.
– Certainly not in only 5 years.

  • Two examples from trial reports from within *72 hours* of making this slide:

  • *Never* would have guessed this is the first-order bound on how many bugs you find.

    #pragma packed 4
    struct foo {…};
    #pragma packed (4)
    struct foo {…};

    __creregister volatile int x;
    volatile int x;

Annoying amplifier: Can we get source?

*NO*!

– Despite NDAs.
– Even for parse errors.
– Even for preprocessed.
– Might just be because too small to sue?

Sales engineer has to type in from memory.

– And this works as well as you’d expect.
– Even worse for performance problems.
– Oh, and you get about 3 tries to fix a problem.

Bonus: add a TLA and things get worse.

– NSA = Can we see source? NO!
– FDA, FAA = frozen toolchain. Theirs. Yours.
– Banal, crucial: where to get a license for a 20+ year old compiler?

The end result

Heuristic: If you’ve heard of it, will wind up supporting it.

Forced support for many things haven’t heard of (or read obituary for).

– Tasking, Microtec, Metaware, Microchip C-18, Code Vision, National Instruments, Cosy, HiWare, Franklin Software, Watcom, Borland, Apogee, Cavium, Ceva, ImageCraft C compiler (favorite!)

A compiler development company / Photography company: "Specializing in Anime and SF/Fantasy Convention photography and other costuming photography. We can also do on-location photoshoots."

SLIDE 5

Overview

Context. The banal hand of reality:

– Law: Cannot check code you can’t find
– Law: Cannot check code you can’t parse
– Myth: C exists

Next:

– Do bugs matter?
– Do false positives matter?
– Do false negatives matter?

Academics meet reality. Reality wins.

– You fix all bugs, right? – The evils of non-determinism

Do bugs matter?

Shockingly common: clear, ugly crash error.

– “So?”
– “Isn’t that bad? What happens?”
– “Oh, will crash. We will get a call.” Shrug.

If developers don’t feel pain, they often don’t care.

– If QA cannot reproduce, then no blame.
– Our own engineers: If forced to run cprevent & fix bugs before checkin, what do you expect them to do?

But bugs matter right?

– Not if: Too many. Too hard. [More later]

The next step down: “That’s not a bug”

– Recognition requires understanding. – Cubicles are plentiful. Understanding, not so much.

“No, your tool is broken: that’s not a bug”

– “No, it’s a *loop*.”

        for (i = 1; i < 0; i++)
            …deadcode…

– “No, I meant to do that: they are next to each other”

        int a[2], b;
        memset(a, 0, 12);

– “No, that’s ok: there is no malloc() between”

        free(foo);
        foo->bar = …;

– “No, ANSI lets you write 1 past end of the array!”

        unsigned p[4];
        p[4] = 1;

– (“We’ll have to agree to disagree.” !!!!)

(Often) People don’t understand much.

Our initial naïve expectation: people who write code for money understand it. Instead:

– “To build, I just press this button…”
– “I’m just the security guy”
– “That bug is in 3rd party code”
– “Is it a leak? Author left years ago…”

People don’t even understand compilers.

– “Static” analysis? What is the performance overhead?
– Business card at customer site: “Static analyzer” (?!)
– Anything that finds bugs = testing.
– “Think of it as super compiler warnings”

Pop quiz

If user doesn’t understand error they:

– A) Read manual – B) Mark as a false positive

Social has a *major* impact on technical.

– User not same as tool builder.
– Uninformed. Inattentive. Cruel. Lazy.
– HUGE problem. Prevents getting many things out in world.

Give up on error classes that need too much sophistication.

– statistical inference,
– race conditions,
– heap tracking,
– globals.
– In some ways, checkers lag far behind our research ones.

How to handle cluelessness?

Can’t argue

– Stupidity works with modular & emotional arithmetic.

Instead: use normal distributions.

– Try to get a large meeting. (Schedule before lunch?)

More people in room = more likely someone in room that:

– Cares; is very smart; can diagnose error; has been burned by similar error; loses bonus for errors; …
– Is in another group!
– If layoffs happen: will be fired(!)
– These guys beat these guys.

SLIDE 6

What happens when can’t fix all the bugs?

Rough heuristic:

– < 1000 bugs? Fix them all. – >= 1000? “Baseline”

Tool improvement viewed as “bad”

– You are manager. Forall metrics X of badness you want:
– No manager gets a bonus for: X trending bad over time.

How to upgrade when more bugs != good?

Upgrade cycles:
– Never. Guaranteed “improvement.”
– Never before release (when could be most crucial)
– Never before a meeting (at least is rational).
– Upgrade. Roll back. (~ once per company.)
– Renew, but don’t upgrade. (Not cheap.)
– Once a year (most large customers). “Rebaseline”
– Upgrade only checkers where you fix all/most errors.

People really will complain when your tool gets better.

– V2.4: 2,400 initial errors. Fixed to get to 1,200.
– Upgrade to V3.5 = 2,600 errors.
– *MAD* For both reasons.

Do false positives matter? Yes. No. Maybe.

> 30% false rate = big problem.

– Ignore tool. Miss true errors amidst false.
– Low trust = complex bugs called false positives. Vicious cycle.
– Caveat: some users accept 70% (or more: security guys).
– Current deployment threshold = ~20%.
– Unfortunately: many cases of “high FP rate” are not an analysis problem.

Not all false positives equal:

– Initial N reports false? “Tool sucks” (N ~ 3?)
– *Crucial*: no embarrassing FPs.
– Stupid FP? Implies tool stupid. Not good for credibility.
– Social: don’t want to embarrass tool champions internally.
– Important: no failed merges.
– Mark FP once? Fine. Reappears & mark again? Email support.

Do false negatives matter?

Of course not! Invisible! Oops:

– Trial: intentionally put in bugs. “Why didn’t you find it?”
– Easiest sale: horribly burned by specific bug last week. You find it. If you don’t?
– Compare tool A to tool B. If tool A >> B but misses some bugs B finds, then they see A and B as sort-of equiv.
– Variation: find “enough” of certain type, then don’t need more.
– Upgrade checker: set of defects shifts slightly. “Dude, where is my bug?”
– They know in theory “not a verifier”. Different when they actually see you lose known errors. Rule: 5% jitter.

Currently: favor analysis hacks to remove FPs at cost of FNs

Jitter = bad. Non-determinism = bad.

Major difference from academia:
– People **really** want the same result from run to run.
– Even if they changed code base.
– Even if they upgraded tool.
– Their model = compiler warnings.
– Classic determinism: same input + same function = same result.
– Customer determinism: different input (modified code base) + different function (tool version) = same result.

Determinism requirement really sucks:
– Often tool changes or fixes have very unclear implications.
– Often randomization = elegant solution to scalability. Can’t do.

Caching: Biggest source of non-determinism

Our hack for handling exponential paths in code

– If checker reaches same statement in same state, assume it will produce same result and stop.

Works well for finding many bugs on large code bases.

– Not so well at finding the *same* bugs

True story:

– Version 2.0: follows true path, then false.
– Version 3.0: follows false path, then true.
– 20% fluctuation in errors. People went *insane*. Soln?

Bad: don’t analyze an interesting path b/c cache hit.

– The occasional *very* stupid false negative.
– Hurts trust in tool.
– Lost *huge* sale: found lots of bugs, just not this one:

    if (…)
        x = 0;
    for (…)
        switch (…)
            …
    w = y / x;

SLIDE 7

Myth: more analysis is always better

Does not always improve results, and can make them worse.

The best error:

– Easy to diagnose – True error

More analysis used, the worse it is for both:

– More analysis = the harder the error is to reason about, since the user has to manually emulate each analysis step.
– As the number of steps increases, so does the chance that one went wrong. No analysis = no mistake.

In practice:

– Demote errors based on how much analysis required.
– Revert to weaker analysis to cherry pick easy bugs.
– Give up on error classes that are too hard to diagnose.

DE41

No bug is too stupid to check for.

Someone, somewhere will do anything you can think of. Best recent example:

– From security patch for bug found by Coverity in X windows that lets almost any local user get root.
– Got on foxnews (website, not O’Reilly).
– So important marketing went to town:

DE43

Do you use X?

    if (getuid() != 0 && geteuid == 0) {
        ErrorF("only root");
        exit(1);
    }

Since without the parentheses, the code is simply checking to see if the geteuid function in libc was loaded somewhere other than address 0 (which is pretty much guaranteed to be true), it was reporting it was safe to allow risky options for all users, and thus a security hole was born.

  – Alan Coopersmith, Sun Developer Security Advisory

First exploit was published 5 hours after the hole was publicly reported.

One of the best stupid checks: Deadcode

Programmer generally intends to do useful work

– Flag code where all paths to it are impossible or it makes no sense. Often serious logic bug.

From trial at chemotherapy device company

– During results meeting: literally ran out to fix.
– Note: heavily redacted.

    enum Tube { TUBE0, TUBE1 };

    void PickAndMix(int i) {
        enum Tube tfirst, tlast;
        if (TUBE0 == i) {
            tfirst = TUBE0; tlast = TUBE1;
        } else if (TUBE0 == i) {    /* deadcode: same test twice */
            tfirst = TUBE1; tlast = TUBE0;
        }
        MixDrugs(tfirst, tlast);
    }

Overview

Context. The banal, vicious laws of reality and its cruel myths. What actually matters? Academics meet reality, good and hard.

– The evils of non-determinism

Business factoids an academic finds amusing.

SLIDE 8
Slide 37, DE41: A form of locality: screw up pretty promptly. Often going interprocedural doesn’t catch hugely more bugs, other than getting wrappers for different functions. If you don’t have ranking, you can make results horrible by washing out all the good bugs.

Dawson Engler, 4/18/2006

Slide 38, DE43: In some ways the more stupid the bug, the more serious it is: it’s hard to be extravagantly whacked out in a harmless way.

Dawson Engler, 9/20/2006

SLIDE 9

Technical can help social

Tool has simple message: “No touch, low false positives, good bugs”

– Can explain it to mom? Then can explain to almost all sales guys & customers.
– Complicated? Population that understands much smaller.
– This effect is not trivial.

Relationship therapy through tool “objectivity”

– UK company B outsources to India company A.
– B complains about A’s code quality. They fight.
– Decide to use Coverity as arbiter. Happy. (I still can’t believe this.)

Wide tool use = seismic change in the last ~4 years.

– People get it. “Static” no longer = “huh?” or “lint” (i.e., suck).
– Networking effects.
– Result: Much much much easier to sell tools now.

Some commercial experiences

Surprise: Sales guys are great

– Easy to evaluate. Modular.

Careful what you wish for: bad competitor tools

– Time to sale ~ max(time for all competitors to do trial).

Business as an inversion operator:

– Early on: did 15+ trials across huge company.
– Much time, effort, manpower.
– In the end lost a seven figure perpetual license deal.
– Good or bad?
– Company X bought license.
– Next week fired 110 people.
– Good or bad?

Some useful numbers

Already seen:

– 1000: number of bugs after which they baseline.
– 1.0: probability error labeled as FP if they don’t understand.
– -m: slope of bug trend line for manager to get bonus.
– 20: age of compiler if toolchain change requires recertification.

Code numbers:

– 12hr, 24hr: common upper bounds for analysis time.
– 10M: “large” code base.

Bugs:

– 10-30(?): number of bugs reported where all get fixed.
– 3: number of attempts you can make to fix a bug in your tool.
– 10: reduction in fix time if you assign blame for bugs.

People:

– 40: upper bound on opportunities / year sales guy can manage.
– $0: price of initial trial.
– 6% of revenue from X% of sales?

Laws of static bug finding

Vacuous tautologies that imply trouble

– Can’t find code, can’t check. – Can’t compile code, can’t check.

A nice, balancing empirical tautology

– If can find code
– AND checked system is big
– AND can compile (enough) of it
– THEN: will *always* find serious errors.

A nice special case:

– Check rule never checked? Always find bugs. Otherwise immediate kneejerk: what’s wrong with the checker???

How to get into a top (US) PhD program?

Stanford’s current algorithm up to ~2006:

– Top 1 or 2 at Tsinghua. – That’s it. (Japan, Korea even worse. India ~ same)

Why so stupid?

– Top schools want to know if you can do research.
– Grades, GREs and (often) letters have problems.
– Careful with answers to the “easy” application questions!
– Your most reliable method: good paper at top conference.

Thus: Post 2006:

– Top 1 or 2 at Tsinghua + ( MSR interns w/ papers OR Stanford Master’s) – Still, we’re probably missing someone. – Help!

DE44

Static vs dynamic bug finding

Static: precondition = compile (some) code.

– All paths + don’t need to run + easy diagnosis.
– Low incremental cost per line of code.
– Can get results in an afternoon.
– 10-100x more bugs.


Dynamic: precondition = compile all code + run

– What does code do? How to build? How to run? – Pros: on executed paths:

» Runs code, so can check implications.
» End-to-end check: all ways to cause crash.
» Reasonable coverage: surprised when crash.

Result:

– Static better at checking properties visible in source, dynamic better at properties implied by source.

SLIDE 10
Slide 47 DE44

  • Optimal number of Linux bugs to fix: some of the best trials led to passes because too effective.

Dawson Engler, 4/18/2006

SLIDE 11

Assertion: Soundness is often a distraction

Soundness: Find all bugs of type X.

– Not a bad thing. More bugs good. – BUT: can only do if you check weak properties.

What soundness really wants to be when it grows up:

– Total correctness: Find all bugs. – Most direct approximation: find as many bugs as possible.

Opportunity cost:

– Diminishing returns: initial analysis finds most bugs.
– Spend time on what gets the next biggest set of bugs.
– Easy experiment: bug counts for sound vs unsound tools.

Soundness violates end-to-end argument:

– “It generally does not make much sense to reduce the residual error rate of one system component (property) much below that of the others.”

Open Q: Do static tools really help?

[Three sketched plots of “bad behavior” vs. “bugs found”: the optimistic hope (bad behavior falls as bugs found rises), the null hypothesis (no relation), and an ugly possibility (bad behavior rises).]

– Danger: Opportunity cost.
– Danger: Deterministic canary bugs turned non-deterministic.

Open Q: how to get the bugs that matter?

Myth: all bugs matter and all will be fixed

– *FALSE* – Find 10 bugs, all get fixed. Find 10,000…

Reality

– Sites have many open bugs (observed by us & PREfix).
– Myth lives because state-of-art is so bad at bug finding.
– What users really want: the 5-10 that “really matter”.

General belief: bugs follow 90/10 distribution

– Out of 1000, 100 (10? or 1?) account for most pain. – Fixing 900+ waste of resources & may make things worse

How to find worst? No one has a good answer to this.

– Possibilities: promote bugs on executed paths or in code people care about, …

DE30

Scan’s One Year Anniversary

Website Relaunch on March 6th, 2007

History of Research & Growth of Coverity [2007: outdated]

[Timeline graphic, 1999–2007: Stanford Checker (1999–2003), then 1.0 release (C analysis), 2.0 release (C++ analysis), 2.3 release (Security, Concurrency), and 3.0 release (Java Analysis, Enterprise Management) across 2003–2006. Customers grow roughly 7 → 43 → 98 → 231; employees roughly 4 → 19 → 35 → 71. Milestones: customers standardize on Coverity; DHS Vulnerability Initiative contract awarded; Wall Street Journal Technology Innovation Award; 2000+ defects found in Linux.]

SLIDE 12

Slide 49, DE37: Soundness is what you do when you don’t have any better ideas. Once you come up with a new check, there are a million incrementalists that will make it sound if necessary.

Dawson Engler, 2/22/2005

Slide 51, DE30: Optimal number of Linux bugs to fix: some of the best trials led to passes because too effective.

Dawson Engler, 4/18/2006