SLIDE 1

How risky is the software you use?

https://shmoo18.cyber-itl.org

Cyber Independent Testing Lab { Sarah Zatko , Tim Carstens , Patrick Stach , Parker Thompson , mudge } @ CITL

SLIDE 2

We are CITL

  • A non-profit organization based in USA
  • Founded by Sarah Zatko & mudge
  • Mission: to improve the state of software security by providing the public with accurate reporting on the security of popular software

  • Funding from the Ford Foundation
  • Partners with Consumer Reports (https://www.consumerreports.org) & The Digital Standard (https://thedigitalstandard.org)

SLIDE 3

SLIDE 4

Something like this, but for software security.

SLIDE 5

How do you do this for software security?

SLIDE 6

SLIDE 7

Scores & Histograms

Hardened Gentoo, Ubuntu 16 LTS, Samsung UN55KS9000, LG 55UH8500

SLIDE 8

Security Today: You can lead the pack by mastering the fundamentals.

                 Vizio    LG        Samsung     Ubuntu
                 P55-E1   49UJ7700  UN55KS9000  16.04
# binaries       504      1740      4243        4991
aslr             98%      67%       80%         100%
stack DEP        99%      99%*      99%*        99%
64 bit           0%       0%        0%          98%
RELRO            100%     4%        9%          96%
stack guards     68%      1%        57%         79%
fully fortified  7%       0%        6%          11%
partial fort     43%      1%        37%         42%
has good         3%       3%        25%         4%
has risky        68%      66%       67%         67%
has bad          28%      34%       23%         28%
has ick          3%       5%       5%          3%

SLIDE 9

Our goals

  1. Remain independent of vendor influence
  2. Automated, comparable, quantitative analysis
  3. Act as a user watchdog

  • Non-goal: find and disclose vulnerabilities
  • Non-goal: tell software vendors what to do
  • Non-goal: perform free security testing for vendors

SLIDE 10

Three big questions

  1. What works?
  2. How do you recognize when it’s being done?
  3. Who’s doing it?
SLIDE 11

The basic idea

SLIDE 12

Information Theory Perspective

  • Given a piece of software, we can ask:
  • 1. Overall, how secure is it?
  • 2. What are all of its vulnerabilities?
  • (1) appears to ask for less information than (2)
  • Our question: develop a heuristic which can efficiently answer (1), but not necessarily (2)

SLIDE 13

Step One: Static Measurements

  • Complexity
  • Functions called
  • Safety features

Years in the field give us a good starting point – look for the same things we’d look for when trying to pick a soft target to exploit. But this field doesn’t know enough about the impact and effectiveness of best practices.
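One of the “safety features” checks above can be sketched in a few lines. The helper below is illustrative only, not CITL’s actual tooling: it parses a 64-bit little-endian ELF header and its program headers to report three common app-armoring indicators (PIE for ASLR, a non-executable stack, and RELRO). All names here are invented for the example.

```python
import struct

ET_DYN = 3                 # ELF type for position-independent executables (ASLR-friendly)
PT_GNU_STACK = 0x6474E551  # program header describing stack permissions
PT_GNU_RELRO = 0x6474E552  # segment made read-only after relocation
PF_X = 0x1                 # "executable" permission flag

def elf_hardening(data: bytes) -> dict:
    """Return a few hardening indicators for a 64-bit little-endian ELF image."""
    if data[:4] != b"\x7fELF" or data[4] != 2:
        raise ValueError("not a 64-bit ELF")
    e_type, = struct.unpack_from("<H", data, 16)
    e_phoff, = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    # A missing PT_GNU_STACK historically implies an executable stack on Linux.
    nx_stack = False
    relro = False
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        p_type, p_flags = struct.unpack_from("<II", data, off)
        if p_type == PT_GNU_STACK:
            nx_stack = not (p_flags & PF_X)
        elif p_type == PT_GNU_RELRO:
            relro = True
    return {"pie": e_type == ET_DYN, "nx_stack": nx_stack, "relro": relro}
```

Real analyzers would also need to handle 32-bit and big-endian images, PE and Mach-O formats, and the full dynamic section (e.g. BIND_NOW for full RELRO).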

SLIDE 14

Early Promise

Browser          “Underground” Exploit Price
Microsoft Edge   $80,000
Google Chrome    $80,000
Apple Safari     $50,000
Mozilla Firefox  $30,000

SLIDE 15

Step 2: Fuzzing! Lots of it.

  • Fuzzing provides a testable, recognized way to roughly measure software’s “security”
  • The more robust software is when fuzzed, the less likely it is to be exploitable
  • If we could fuzz everything, we wouldn’t even necessarily need the heuristics
  • But we can’t, so…
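The fuzz-and-count-unique-crashes loop described above can be sketched as a toy: mutate a seed input, feed it to a deliberately fragile stand-in for the software under test, and deduplicate crashes by fault signature. Everything here (the target, the signature scheme) is a minimal illustration, not a real fuzzer like AFL.

```python
import random

def toy_parser(data: bytes) -> int:
    """Deliberately fragile stand-in for the software under test."""
    total = 0
    if data[:1] == b"A":
        total += data[10]               # IndexError on short inputs
    if data[:1] == b"B":
        total += 1 // (data[-1] - 65)   # ZeroDivisionError when last byte is 'A'
    return total

def fuzz(seed: bytes, iterations: int, rng: random.Random) -> set:
    """Mutate the seed, run the target, collect unique crash signatures."""
    crashes = set()
    for _ in range(iterations):
        buf = bytearray(seed)
        # flip a handful of random bytes per iteration
        for _ in range(rng.randint(1, 3)):
            buf[rng.randrange(len(buf))] = rng.randrange(256)
        try:
            toy_parser(bytes(buf))
        except Exception as exc:
            # dedupe by exception type + line number of the faulting frame
            tb = exc.__traceback__
            while tb.tb_next:
                tb = tb.tb_next
            crashes.add((type(exc).__name__, tb.tb_lineno))
    return crashes
```

The size of the resulting crash set after h iterations is exactly the “< k unique crashes after h units of fuzzing” quantity the next slides reason about.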
SLIDE 16

Step 3: Profit! Bayes! (1/3)

  • For some software s, we know that we can’t compute P(s is secure)
  • As a surrogate, we can compute probabilities of different fuzzing outcomes, like:

P_{h,k} = P(h units of fuzzing against s yields < k unique crashes)

SLIDE 17

Step 3: Profit! Bayes! (2/3)

  • Fuzzing is expensive, so we “go Bayesian”
  • Let M be an observable property of software
  • Examples: is compatible with RELRO, has “low complexity,” etc.
  • For random s in S, consider the conditional probabilities

P_{h,k}(M) = P(h fuzzing on s yields < k unique crashes | M is true of s)

  • What we want: which M have P_{h,k}(M) > 0.5 for large log(h)/k? Which indicators M can be used to predict fuzzing performance?
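The conditional probability on this slide reduces to simple counting once you have fuzzing records. The sketch below estimates P_{h,k}(M) from a list of (indicator present, unique crash count) observations; the records are made-up numbers purely for illustration.

```python
def p_hk_given_m(records, k):
    """Estimate P(fewer than k unique crashes | indicator M present).

    records: list of (m_present: bool, unique_crashes: int) pairs,
    each from a fixed budget of h fuzzing units on one piece of software.
    """
    with_m = [crashes for m, crashes in records if m]
    if not with_m:
        return None  # indicator never observed, probability undefined
    return sum(c < k for c in with_m) / len(with_m)

# Hypothetical observations: software with indicator M tends to crash less.
records = [
    (True, 0), (True, 1), (True, 5), (True, 2),   # M present
    (False, 9), (False, 4), (False, 12),          # M absent
]
```

Comparing this estimate against the unconditional rate over all records shows whether M actually shifts the odds, i.e. whether it is a useful predictor of fuzzing performance.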

SLIDE 18

Step 3: Profit! Bayes! (3/3)

Indicators might not be causal, and that’s OK:

  • It could be that M’s presence literally prevents crashes
  • But it could also be that M is mostly only found in software written by teams who ship reliable software

  • If you’re looking for security, what difference does it make?
SLIDE 19

Indicator Minerals

Want to find:

  • Diamond (US Geological Survey)

Look for:

  • Garnet (Moha112100 @ Wikipedia)
  • Diopside (Rob Lavinsky)
  • Chromite (Weinrich Minerals, Inc.)

SLIDE 20

Step 4: Reports

While we work on gathering data and developing our model, we’re also:

  • Developing reports
  • Building relationships with partner organizations like Consumer Reports
  • Looking for security orgs to share data with
SLIDE 21

The Progression of CITL Tech

Tooling: Static (Prototype) → Static (Extensible); AFL → CITL-fuzz → New fuzzer
Milestones: Today → First data → First reports → Final model & reports

SLIDE 22

Applied Static Analysis

  • Lots of architectures: x86-*, ARM-*, MIPS-*
  • Lots of operating systems: Windows, Linux, OS X
  • Lots of binary formats: PE, ELF, MachO
  • Each with its own app-armoring features
  • Lots of versions of each of the above!
SLIDE 23

OS Comparisons

  • Windows lags in stack guards, but has good usage of CFI
  • Linux does more source fortification than OSX
  • Windows has the best function hygiene
  • Linux’s function hygiene is slightly worse than OSX’s
                 Ubuntu   Windows  OSX
                 16.04    10       10.13.1
64 bit           97%      66%      77%
aslr             100%     99%      100%
dep              99%      98%      100%
stack_guards     79%      40%      73%
fully fortified  11%      —        2%
partial fort     42%      —        33%
cfi              —        92%      —
good             4%       19%      29%
risky            67%      30%      60%
bad              28%      3%       24%
ick              3%       0%       2%

SLIDE 24

Linux Browsers – Ubuntu 16.04

  • Scores are all very close; Firefox wins by a nose in static analysis
  • Chrome’s sandbox isn’t factored into the score yet
  • All have inconsistent function hygiene
  • Opera takes a hit for lack of RELRO
  • Chrome lags behind in fortification use

                       Chrome        Firefox  Opera
version                63.0.3239.13  57.0.4   50.0.2762.4
64bit                  100%          100%     100%
aslr                   100%          100%     100%
dep                    100%          100%     100%
relro                  86%           100%     11%
stack_guards           86%           87%      100%
partial fortification  29%           70%      56%
functions
  good                 12%           4%       22%
  risky                86%           91%      100%
  bad                  62%           61%      89%
scores
  5th %                35            64       43
  50th %               58            78       48
  95th %               71            86       65

SLIDE 25

OSX Browsers

  • Firefox and Opera had all binaries 64 bit with ASLR and stack DEP
  • Firefox also made most use of stack guards and fortification
  • Chrome is the only one to enable the heap protection flag
  • Safari isn’t using source fortification much
  • Scores are very close, all near the 95th percentile for High Sierra (71)
  • Same general outcome as on Linux
                       Chrome        Firefox  Opera         Safari
version                63.0.3239.13  57.0.4   50.0.2762.45  11.0.1
count                  9             19       8             25
64bit                  89%           100%     100%          88%
aslr                   89%           100%     100%          100%
dep                    100%          100%     100%          100%
heap                   11%           0%       0%            0%
stack_guards           78%           95%      88%           68%
partial fortification  33%           47%      38%           4%
good                   33%           37%      25%           8%
risky                  89%           95%      100%          44%
bad                    44%           68%      38%           8%
scores
  5th %                33            43       38            24
  50th %               51            56       51            51
  95th %               63            71       63            64

SLIDE 26

Windows 10 Browsers

  • Scores are very close, but Edge wins by a hair
  • 95th percentile is 64 for Win 10
  • Chrome has more 32 bit binaries than the others
  • Edge is the only one with 100% CFI
  • Chrome and Opera do better on stack guards
  • Firefox takes a hit because it excels in neither, and has more risky functions
              Chrome     Edge      Firefox  Opera
version       63.0.3239  41.16299  57.0.4   50.0.2762
count         31         7         31       16
64bit         62%        100%      94%      100%
dep           100%       100%      100%     100%
aslr          100%       100%      100%     100%
cfi           13%        100%      13%      38%
stack guards  94%        57%       61%      94%
functions
  good        0%         0%        3%       0%
  risky       9%         0%        16%      0%
  bad         9%         0%        0%       0%
scores
  5th %       23         44        7.5      44
  50th %      44         64        44       44
  95th %      64         64        44       64

SLIDE 27

OSX Time Progression

  • Looked at four versions from 10.10.5 through 10.13.1
  • 7.7% increase in percent of binaries that are 64 bit
  • 2% increase in use of stack guards, good functions
  • Heap protection decrease correlates with ASLR increase?
  • High Sierra shows significant decrease in # of binaries (~400 fewer)

                 OSX      OSX      OSX      OSX      total
                 10.10.5  10.11.6  10.12.6  10.13.1  change
# binaries       6449     6456     7017     6622
64bit            69%      71%      73%      77%      +8
aslr             99%      99%      100%*    100%*    +1
heap             5%       5%       4%       4%       -1
stack_guards     71%      71%      72%      73%      +2
good functions   27%      27%      27%      29%      +2
risky functions  62%      62%      60%      60%      -2
bad functions    25%      25%      24%      24%      -1

SLIDE 28

Safari Time Progression

  • New binaries introduced in High Sierra generally decreased performance
  • Overall increases in 64bit and stack guards, but not consistently
  • Function hygiene got a bit worse, especially in High Sierra
  • Partial source fortification introduced in HS

Safari total in OSX    10.10.5  10.11.6  10.12.6  10.13.1  change
# binaries             9        13       22       25
64bit                  83%      92%      86%      88%      +5*
stack_guards           67%      69%      73%      68%      +1
partial fortification  0%       0%       0%       4%       +4
good functions         17%      15%      9%       8%       -9
risky                  50%      36%      38%      44%      -6*
bad                    0%       8%       5%       8%       +8

SLIDE 29

Mining Useful Spectre Gadgets

  • Focus on BTB poisoning aka Variant 2 widgets
  • Use DFA to locate this pattern:
  • Op reg1,[base (+index)]
  • Base or Index either attacker controlled or useful data
  • … (anything that doesn’t destroy data in reg1)
  • Op [base (+index)],reg2 or Op reg2,[base (+index)]
  • Where base or index are reg1
  • Tl;dr: load, load or store
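The load-then-dependent-load-or-store scan above can be sketched over a toy instruction representation. Real tooling (as on the slide) would run dataflow analysis over disassembled machine code; the three-tuple IR, register names, and example program below are all invented for illustration.

```python
# Toy IR: each instruction is (op, dest_reg, mem_base_reg), where
# mem_base_reg is the register used in a memory operand, or None.

def find_gadgets(insns, tainted_regs):
    """Yield (i, j) index pairs forming a candidate Variant 2 gadget."""
    gadgets = []
    for i, (op, dest, base) in enumerate(insns):
        # first load: reg1 <- [attacker-influenced base (+ index)]
        if op != "load" or base not in tainted_regs:
            continue
        reg1 = dest
        for j in range(i + 1, len(insns)):
            op2, dest2, base2 = insns[j]
            # second memory access whose address depends on reg1
            if op2 in ("load", "store") and base2 == reg1:
                gadgets.append((i, j))
                break
            # anything that clobbers reg1 kills the pattern
            if dest2 == reg1:
                break
    return gadgets

program = [
    ("load", "rax", "rdi"),   # rax <- [rdi], rdi attacker-controlled
    ("add",  "rbx", None),    # unrelated, leaves rax intact
    ("load", "rcx", "rax"),   # rcx <- [rax]: second, dependent load
]
```

Running `find_gadgets(program, {"rdi"})` flags the pair of loads at indices 0 and 2; a full implementation would also track index registers, partial-register writes, and taint flowing through arithmetic.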
SLIDE 30

Mining Useful Spectre Gadgets

SLIDE 31

CITL: Impact

  • We’ve been reporting bugs
  • Firefox on OSX was missing ASLR (they fixed it quickly!)
  • Several patches & bugs submitted to LLVM & Qemu
  • We’ve inspired others
  • Big shout-out to the Fedora Red Team
  • We’ve partnered to cover broader domains
  • Consumer Reports (https://www.consumerreports.org)
  • The Digital Standard (https://thedigitalstandard.org)

SLIDE 32

CITL: Today and Tomorrow

  • We are building the tooling necessary to compute the surrogate security scores at scale
  • In the meantime, our static analyzers are already making surprising discoveries: see our recent talks at DEFCON/Blackhat
  • Advice to software vendors: make sure your software employs every exploit mitigation our TAB has ever heard of!

SLIDE 33

https://shmoo18.cyber-itl.org

Cyber Independent Testing Lab

{ Sarah Zatko , Tim Carstens , Parker Thompson , Patrick Stach , mudge } @ CITL