
Many stories, two basic plots. Fun with Gaussian distributions



  1. A few billion lines of code later: static checking in the real world
     Andy Chou, Ben Chelf, Seth Hallem, Scott McPeak, Bryan Fulton, Charles-Henri Gros, Ken Block, Anuj Goyal, Al Bessey, Chris Zak & many others (Coverity)
     Dawson Engler, Associate Professor, Stanford

     Many stories, two basic plots.
     • Fun with Gaussian distributions: [figure labels: average, common, (n inf)]
     • Social vs Technical: “What part of NO! do you not understand?”
       – No: you cannot touch the build.
       – No: we will not change the source.
       – No: this illegal code is not illegal.
       – No: we will not understand your tool.
       – No: we do not understand static analysis.

     One slide of context
     • Our tool [~2000]:
       – The religion: Max bugs, min false errors or manual work.
       – Inter-procedural, context-sensitive, a bit path-sensitive
       – Aggressively unsound. Annotations viewed as evil.
       [Figure labels: EDG frontend; Lock checker; Linux fs/proc/inode.c; “missing unlock!”]
           lock_kernel();
           if (!de->count) {
               printk("free!\n");
               return;
           }
           unlock_kernel();
       – Worked reasonably well. Lots of bugs, papers, tenure.
       – Company successful enough that there is a marketing dept. (Next: proof)

     Caveats
     • My (former) students run the company
       – I am just a voyeur
     • Company is only tall for a midget
       – Focus on things that will matter even more in larger settings
     • This is just “a” way to do things, not THE way
       – Just how we did things; not as claim that they are best
     • General inferences from one data point = dubious.
       – Our needs roughly a lowest common denominator(*)
       – (*): people building tool for single company need less: please speak up when your experience differs!

     Our Mission
     To improve software quality by automatically identifying and resolving critical defects and security vulnerabilities in your source code

     A short history of time [99-07]
     [Figure labels: Stanford Research; C analysis; C++ analysis; Java Analysis; Security; Concurrency; Satisfiability; Enterprise Program Management]
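The missing-unlock report on the "One slide of context" slide can be illustrated with a toy checker. This is a minimal sketch, not the tool described in the talk: the `event` enum, the `check_path` function, and the idea of flattening a function into one event array per execution path are all hypothetical simplifications chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds for a toy path model (an assumption for this
 * sketch, not the representation used by the tool in the talk). */
enum event { EV_LOCK, EV_UNLOCK, EV_RETURN };

/* Walk one execution path and report the index of the first return that
 * is reached while the lock is still held, or -1 if the path is clean. */
int check_path(const enum event *path, size_t n)
{
    int held = 0;
    for (size_t i = 0; i < n; i++) {
        switch (path[i]) {
        case EV_LOCK:
            held = 1;
            break;
        case EV_UNLOCK:
            held = 0;
            break;
        case EV_RETURN:
            if (held)
                return (int)i;   /* returning with the lock held */
            break;
        }
    }
    return -1;
}

#ifdef LOCK_DEMO_MAIN
int main(void)
{
    /* Early-return path of the slide's snippet: lock_kernel(); return; */
    enum event early[] = { EV_LOCK, EV_RETURN };
    /* Fall-through path: lock_kernel(); unlock_kernel(); return; */
    enum event normal[] = { EV_LOCK, EV_UNLOCK, EV_RETURN };

    assert(check_path(early, 2) == 1);
    assert(check_path(normal, 3) == -1);
    return 0;
}
#endif
```

Compiling with `-DLOCK_DEMO_MAIN` runs the two paths of the slide's snippet: only the early-return path is flagged. A real checker has to derive such paths from parsed source, which is why the later slides about finding and parsing code matter so much.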

  2. Over 1 Billion Lines of Code

     Coverity Trial Process
     • Test your code quality
       – Analyze your largest code base
       – One day set up, two hours for results presentation
       – Test drive the product at your facility
     • Benefit to your team
       – Post trial report describing summary of findings
       – Sample defects from your code base
       – Fully functional defect resolution dashboard

     Trial = a cornerstone verb of company.
     • “Does your thing work worth a damn on my code?”
       – Ship “sales engineer” and sales guy to company
       – Run over code; next day go over results
       – If bugs good, they (may) buy. If suck…
     • First order requirements:
       – Must *regularly* go in cold, touch nothing, shove 1-10MLOC through tool, get good results.
       – Error reports must be clear since won’t understand code
       – Low false positives, good bugs since can’t cherry pick.
     • Some features:
       – $0. Most people can sign for that. Cuts days of negotiation. Sales guy goes farther.
       – Straight-technology sale. Often buyer=user.

     Overview
     • Context
     • Now:
       – A crucial myth.
       – Some laws of static analysis
       – And how much both matter.
     • Then: The rest of the talk.

     A naïve view
     • Initial market analysis:
       – “We handle Linux, BSD, we just need a pretty box!”
       – Obviously naïve.
       – But not for the obvious reasons.
     • First law of checking: no check = no bug.
       – Don’t check a system, a path, a property, then find no bugs.
       – Two even more basic laws we’d never have guessed mattered.

     Law of static analysis: cannot check code you do not see.

  3. How to find all code?
     • Simple: intercept and rewrite build commands.
           make -w >& out
           replay.pl out   # replace ‘gcc’ w/ ‘prevent’
       – In theory: see all compilation calls and all options etc.
     • Worked fine for a few customers.
       – Then: “make?”
       – Kept plowing ahead.
       – “Why do I have to re-install my OS from CD after I run your tool?”
       – Good question…

     The right solution
     • Kick off build and intercept all system calls
       – Rewrite “<build command>” as “cov_build <build command>”
       – Know exact location of compiler, version, options, environ.
     • In early 2000s more important than quality of analysis?
       – Go into company cold, touch nothing, kick off, see all code.
       – Big lever: 10x more code = 10x more bugs
       – Not bulletproof. Law: “Can’t find code w/o command prompt”
     • A typical story of $ transmuting the trite to first order
       – On windows: intercept = run compiler in debugger.
       – So?
       – Widely-used msoft compiler has a use-after-free bug
       – Works fine normally. Until run w/ debugger!
       – Solution?

     Myth: the C language exists.
     • Well, not really. The standard is not a compiler.
       – The language people code in?
       – Whatever strings their compiler accepts.
       – Fed illegal code, your frontend will reject it.
       – It’s *your* problem. Their compiler “certified” it.
     • Amplifiers:
       – Embedded = weird. Msoft: standard conformance = competitive disadvantage. C++ = language standard measured in kilos.
     • Basic LALR law:
       – What can be parsed will be written. Promptly.
       – The inverse of “the strong Whorfian hypothesis” is an empirical fact, given enough monkeys.

     (Another) Law of static analysis: cannot check code you cannot parse

     A sad movie that will gross exactly $0.
     [Figure: coreHdr.h contains some illegal construct; File1.c … FileN.c (…entire system…) each #include "coreHdr.h"; yourTool reports “Parse error: illegal use of …”]
     “Deep analysis?! Your tool is so weak it can’t even parse C!”

     Some specific example stories.
       Construct in coreHdr.h        Error reported
       int foo(int a, int a);        “redefinition of parameter: ‘a’”
       unsigned x @ "text";          “stray ‘@’ in program”
       typedef char int;             “useless type name in empty declaration”
       unsigned x = 0xdead_beef;     “invalid suffix ‘_beef’ on integer constant”
       void x;                       “storage size of ‘x’ is not known”
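The "intercept and rewrite build commands" step from the "How to find all code?" slide can be sketched as a per-line rewrite of a logged build. The slides do not show what `replay.pl` actually does, so this is only an assumed illustration: the function name `rewrite_command`, its return convention, and the rule "replace a leading `gcc` token with `prevent`" are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Rewrite one logged build command (hypothetical helper for this sketch).
 * If the first token is exactly "gcc", copy the command to `out` with
 * "prevent" substituted; otherwise copy it unchanged.
 * Returns 1 if rewritten, 0 if copied unchanged, -1 if `out` is too small. */
int rewrite_command(const char *cmd, char *out, size_t out_size)
{
    static const char from[] = "gcc";
    static const char to[] = "prevent";
    const size_t from_len = sizeof from - 1;

    /* Match "gcc" only as a whole leading token, not as a prefix. */
    int is_gcc = strncmp(cmd, from, from_len) == 0 &&
                 (cmd[from_len] == ' ' || cmd[from_len] == '\0');

    int n = is_gcc ? snprintf(out, out_size, "%s%s", to, cmd + from_len)
                   : snprintf(out, out_size, "%s", cmd);
    if (n < 0 || (size_t)n >= out_size)
        return -1;               /* output would have been truncated */
    return is_gcc;
}

#ifdef REWRITE_DEMO_MAIN
int main(void)
{
    char buf[128];

    assert(rewrite_command("gcc -O2 -c foo.c", buf, sizeof buf) == 1);
    assert(strcmp(buf, "prevent -O2 -c foo.c") == 0);

    /* A compiler invoked under another name or path slips through. */
    assert(rewrite_command("/opt/cc/bin/gcc -c foo.c", buf, sizeof buf) == 0);
    return 0;
}
#endif
```

The second case in the demo hints at why the slides call this approach naive: any compiler the rewrite does not recognize is code the checker never sees, which is the motivation for intercepting the build's system calls instead.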
