SLIDE 1

CHESS

Computers and Humans Exploring Software Security

  • Mr. Dustin Fraze

4/19/2018

Approved for public release; distribution is unlimited.

SLIDE 2

Develop computer-human systems to rapidly discover all classes of vulnerability in complex software

CHESS

SLIDE 3

Limits of Current Approaches

Approach | Vulnerability Discovery Speed | Vulnerability Discovery Accuracy | Representative Software Complexity
Human | Low | Low | Web Browser
Computer | High | Low | Small Test Corpora
Computer-Human Experiments [1, 2] | High | Moderate | Small Test Corpora
CHESS | High | High | Web Browser

[1] Muntean et al., “Automated Detection of Information Flow Vulnerabilities in UML State Charts and C Code”, http://ieeexplore.ieee.org/document/7322134/, 2015
[2] Shoshitaishvili et al., “Rise of the HaCRS: Augmenting Autonomous Cyber Reasoning Systems with Human Assistance”, https://arxiv.org/abs/1708.02749, 2017

SLIDE 4

Today’s Approach to Vulnerability Discovery

Inputs: Source Code, Binary
Human: Expert Hackers (1,000+ FTE hrs, 1,000,000+ complexity)
Legend: Automation, Human

Vulnerabilities

  • Resource Mgmt Errors
  • Data/Code Injection
  • Data Misuse
  • Logic Errors
  • Path Traversal
  • Cryptographic Issues
  • Access Control Errors
  • Memory Corruption
  • Arithmetic Errors
  • Information Disclosure
  • Input Validation
  • Authentication Issues

Sources

  • Ablon, Lily, “Zero Days, Thousands of Nights”, https://www.rand.org/pubs/research_reports/RR1751.html, 2017
  • Muntean et al., “Automated Detection of Information Flow Vulnerabilities in UML State Charts and C Code”, http://ieeexplore.ieee.org/document/7322134/, 2015
  • Shoshitaishvili et al., “Rise of the HaCRS: Augmenting Autonomous Cyber Reasoning Systems with Human Assistance”, https://arxiv.org/abs/1708.02749, 2017

SLIDE 5

Vulnerability Discovery with CGC

Input: Binary (1,000+ complexity)
Cyber Reasoning System: Fuzzing, Symbolic Execution, Static Analysis, SAT/SMT Solvers
Legend: Automation, Human

Vulnerabilities

  • Memory Corruption
  • Arithmetic Errors
  • Information Disclosure

SLIDE 6

Experimental Vulnerability Discovery with Novice Hackers

Input: Source Code (1,000+ complexity)
Components: UML Generation, UML Analysis
Human: Novice Hackers (< 1 FTE hrs)
Vulnerability Discovery Accuracy: 0% → 94%
Legend: Automation, Human

Vulnerabilities

  • Cryptographic Issues
  • Access Control Errors

UML: Unified Modeling Language

SLIDE 7

Experimental Vulnerability Discovery with Non-Experts

Inputs: Source Code, Binary
Components: Cyber Reasoning System (Fuzzing, Symbolic Execution, Static Analysis, SAT/SMT Solvers), UML Generation, UML Analysis
Novice Hackers: < 1 FTE hrs, 1,000+ complexity; Vulnerability Discovery Accuracy 0% → 94%
Non-Hackers: 335 FTE hrs, 1,000+ complexity; Vulnerability Discovery Accuracy 42% → 66%
Legend: Automation, Human

Vulnerabilities

  • Cryptographic Issues
  • Access Control Errors
  • Memory Corruption
  • Arithmetic Errors
  • Information Disclosure

UML: Unified Modeling Language

SLIDE 8

Collaborative Vulnerability Discovery with CHESS

Technical areas

  • TA1 Human Collaboration
  • TA2 Vulnerability Discovery
  • TA3 Voice of the Offense
  • TA4 Control Team
  • TA5 Integration, Test and Evaluation

[Diagram components: Source Code; Binary; Cyber Reasoning System; Info Gap Detector; Vulnerability Detector; Context Processor; Representation Generator; Representation For Humans; Annotated Representation; Expert Hackers; Novice Hackers; Non-Hackers; Expert Hackers (TA4 Control Team); Proof of Vulnerability. Legend: Automation, Human]

Vulnerabilities

  • Resource Mgmt Errors
  • Data/Code Injection
  • Data Misuse
  • Logic Errors
  • Path Traversal
  • Cryptographic Issues
  • Access Control Errors
  • Memory Corruption
  • Arithmetic Errors
  • Information Disclosure
  • Input Validation
  • Authentication Issues

SLIDE 9

TA1 Human Collaboration

Challenge: Identify and generate representations that communicate information gaps to humans

Possible approaches:

  • UML Diagrams (Class, Activity, etc.)
  • Control Flow Graphs
  • Hilbert Curves for Cyclic Activity

Challenge: Capture and process the insights humans generate by reasoning over the representations

Possible approaches:

  • Annotation/Label Sets
  • Instrumented Program Interaction
  • Human Mental Model Analysis

1. Process identified information gaps into human-understandable representations
2. Summarize and minimize software artifact data
3. Interact with human teammates using generated representations
4. Capture contextual insights from human
5. Process human feedback into machine-ingestible formats
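As a rough illustration of steps 4 and 5, the sketch below shows one way an annotation/label set might be captured and turned into a machine-ingestible format. The slides do not specify any format: the `Annotation` record, its field names, the `LABEL_SET` values, and the choice of JSON are all assumptions made for this example.

```python
import json
from dataclasses import dataclass, asdict


# Hypothetical record for one human insight. The field names are
# illustrative assumptions, not part of the CHESS program description.
@dataclass
class Annotation:
    artifact: str    # software artifact the human looked at, e.g. a file
    region: str      # part of the representation that was annotated
    label: str       # label chosen from the agreed label set
    annotator: str   # anonymized collaborator identifier
    note: str = ""   # optional free-text insight


# Hypothetical label set agreed between the system and its collaborators.
LABEL_SET = {"trusted-input", "untrusted-input", "security-check", "unclear"}


def to_machine_format(annotations):
    """Validate labels, then serialize the annotations as a JSON string."""
    for a in annotations:
        if a.label not in LABEL_SET:
            raise ValueError(f"unknown label: {a.label!r}")
    return json.dumps([asdict(a) for a in annotations], sort_keys=True)


if __name__ == "__main__":
    feedback = [
        Annotation("parser.c", "parse_header", "untrusted-input", "collab-01"),
    ]
    print(to_machine_format(feedback))
```

Restricting labels to a fixed set keeps non-expert feedback easy to validate before an automated component consumes it; free-text notes are carried along but not interpreted.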

[Diagram components: Context Processor; Representation Generator; Representation For Humans; Annotated Representation; Expert Hackers; Novice Hackers; Non-Hackers]

SLIDE 10

TA1 Human Collaboration

Strong Proposals will:

  • Reduce the cognitive load and effort required by human collaborators
  • Explore new representations and methods of human-computer interaction for capturing human insights
  • Empower non-expert collaborators (novice hackers, non-hackers)
  • Scale from single computer-human collaboration to N:N team collaboration
  • Address any relevant HSR issues (data collection, data anonymization, test subject recruitment, etc.)

Strong Proposals will NOT:

  • Involve invasive medical technology
  • Only improve performance of expert hackers

SLIDE 11

TA2 Vulnerability Discovery

Challenge: Identify information required to discover classes of vulnerabilities not addressed by automation

Possible approaches:

  • Type Usage
  • Semantic Metadata
  • Complexity Inference

Challenge: Extend CRS technology to scale up and reason over new and existing representations

Possible approaches:

  • Compilation Instrumentation
  • Type Chain Analysis

Challenge: Develop new vulnerability detection techniques to leverage human-provided insights

Possible approaches:

  • Object/Data Type Classification
  • Function Call Context
  • Semantic Concreteness/Clustering

1. Analyze source code and related software artifacts for potential vulnerabilities
2. Identify regions of uncertainty and other obstacles to automated analysis in source code and related software artifacts
3. Identify vulnerabilities in target categories
4. Generate Proofs of Vulnerability (PoV) and patches

[Diagram components: Source Code; Binary; Cyber Reasoning System; Info Gap Detector; Vulnerability Detector; PoV]

SLIDE 12

TA2 Vulnerability Discovery

Strong Proposals will:

  • Identify vulnerability discovery techniques that may benefit from human collaborator insights
  • Address vulnerability classes in a thorough and scalable manner
  • Generate patches that address underlying vulnerabilities completely and specifically
  • Scale from single computer-human collaboration to N:N team collaboration

Strong Proposals will NOT:

  • Identify vulnerabilities inserted in challenge sets via diffing
  • Focus only on memory corruption and arithmetic errors
  • Rely primarily on fuzzing for vulnerability discovery

SLIDE 13

TA3 Voice of the Offense

1. Develop challenge problems with vulnerabilities across all required classes and scaling from 10K to 1M+ complexity
2. Develop a source code patch for each challenge problem vulnerability
3. Develop a binary patch for each challenge problem vulnerability
4. Create a proof of vulnerability (PoV) specification for each vulnerability class
5. Develop a PoV for each challenge problem vulnerability

Challenge: Develop challenge problems scaling to 1M+ complexity

Possible approaches:

  • Large-scale Automated Vulnerability Addition (LAVA)

Challenge: Ensure challenge problems are representative of required vulnerability classes

Possible approaches:

  • Vulnerability test corpora (Juliet, CGC, OSS-FUZZ, etc.)
  • Public n-day databases

[Diagram: Vulnerability Injection. Labels: Source Code; Binary; Vulnerable; Patched; PoV Spec.; PoV]
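To make the vulnerable/patched/PoV triple concrete, here is a toy sketch for one of the listed classes, path traversal. It is not a CHESS challenge problem and does not follow any actual PoV specification: the `BASE` directory, the function names, and the crafted input are all hypothetical, chosen only to show how a PoV can succeed against the vulnerable version and fail against the patched one.

```python
import posixpath

BASE = "/srv/files"  # hypothetical document root for the toy service


def escapes_base(path):
    """True if a normalized path lies outside BASE."""
    return path != BASE and not path.startswith(BASE + "/")


def resolve_vulnerable(user_path):
    """Vulnerable version: joins and normalizes without checking the result."""
    return posixpath.normpath(posixpath.join(BASE, user_path))


def resolve_patched(user_path):
    """Patched version: rejects any path that normalizes to outside BASE."""
    full = posixpath.normpath(posixpath.join(BASE, user_path))
    if escapes_base(full):
        raise PermissionError("path escapes base directory")
    return full


def pov(resolve):
    """Toy proof of vulnerability: True if a crafted input escapes BASE."""
    try:
        return escapes_base(resolve("../../etc/passwd"))
    except PermissionError:
        return False


if __name__ == "__main__":
    print("vulnerable:", pov(resolve_vulnerable))
    print("patched:   ", pov(resolve_patched))
```

Running the same `pov` against both versions mirrors the pairing in the task list: each vulnerability comes with a patch, and the PoV is what distinguishes the two.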

SLIDE 14

TA3 Voice of the Offense

Strong Proposals will:

  • Ensure challenge set coverage of all vulnerability classes
  • Scale challenge sets to be representative of large, complex codebases

Strong Proposals will NOT:

  • Allow challenge set vulnerabilities to impact production software
  • Search for 0-day vulnerabilities in production software

SLIDE 15

TA4 Control Team

  • Create an expert hacker performance baseline against TA3 challenge problems
  • Ensure CHESS R&D teams are aware of state-of-the-art techniques in software reverse engineering and exploitation

Tasks

1. Leverage state-of-the-art tools to find vulnerabilities in source code and binary challenge problems developed by TA3
2. Develop a PoV for each vulnerability discovered in the challenge problems according to the provided PoV specification
3. Collect feedback during evaluations for post-evaluation review by the Symbiosis TA
4. Identify divergent and/or conflicting evaluation performance between the Control Team and the CHESS system

[Diagram components: Evaluator; PoV Spec.; Binary; Source Code; Control Team; PoV; Symbiosis TA]

SLIDE 16

TA4 Control Team

Strong Proposals will:

  • Demonstrate expertise in the state of the art in vulnerability discovery
  • Address both source-assisted and binary vulnerability discovery

Strong Proposals will NOT:

  • Identify vulnerabilities inserted in challenge sets via diffing

SLIDE 17

TA5 Integration, Test and Evaluation

  • Integrate technology and techniques from TA1 and TA2 into a single platform for evaluation and transition
  • Design and execute tests to measure CHESS system performance against TA3 challenge problems

Tasks

1. Integrate components from TA1 and TA2 into a single working platform
2. Promote collaboration between performers
3. Evaluate integrated CHESS system performance against TA3 challenge problems
4. Recruit human collaborators for evaluations
5. Demonstrate and transition CHESS technology to identified industry and government partners

SLIDE 18

TA5 Integration, Test and Evaluation

Strong Proposals will:

  • Integrate CHESS system components in a continuous and collaborative manner
  • Develop instrumented testbed environments for evaluations
  • Promote collaboration between all CHESS performers
  • Address any relevant HSR issues (data collection, data anonymization, test subject recruitment, etc.)

Strong Proposals will NOT:

  • Allow challenge set vulnerabilities to impact production software

SLIDE 19

CHESS Metrics

Phase | Phase 1 | Phase 2 | Phase 3
Duration | 18 months | 12 months | 12 months
Vulnerability Discovery Speed | As fast as control | 10x faster than control | 100x faster than control
Vulnerability Discovery Accuracy with Source Code | 70% | 85% | 99%
Vulnerability Discovery Accuracy without Source Code | 50% | 75% | 99%
Software Complexity | Messaging App (10K) | PDF Parser (150K) | Web Browser (1M)
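The per-phase targets can be read as thresholds that a measured result either meets or misses. The sketch below encodes the table values and checks a result against them. How speed and accuracy are actually measured is not stated in the slides, so the sketch assumes speed is the ratio of control-team hours to system hours and accuracy is a fraction between 0 and 1; `PhaseTarget` and `meets_targets` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseTarget:
    months: int        # phase duration
    speedup: float     # required speed relative to the control team
    acc_source: float  # accuracy target with source code
    acc_binary: float  # accuracy target without source code


# Values taken from the metrics table; "as fast as control" is encoded as 1x.
TARGETS = {
    1: PhaseTarget(18, 1.0, 0.70, 0.50),
    2: PhaseTarget(12, 10.0, 0.85, 0.75),
    3: PhaseTarget(12, 100.0, 0.99, 0.99),
}


def meets_targets(phase, control_hours, system_hours, acc_source, acc_binary):
    """Assumed check: speedup is control time divided by system time."""
    t = TARGETS[phase]
    speedup = control_hours / system_hours
    return (
        speedup >= t.speedup
        and acc_source >= t.acc_source
        and acc_binary >= t.acc_binary
    )


if __name__ == "__main__":
    total_months = sum(t.months for t in TARGETS.values())
    print("program length in months:", total_months)
    # Hypothetical Phase 2 result: 100 control hours versus 8 system hours.
    print(meets_targets(2, 100.0, 8.0, 0.90, 0.80))
```

Under these assumptions, a Phase 2 system that needs 8 hours where the control team needs 100 is 12.5x faster and clears the 10x bar, while one that needs 20 hours (5x) does not.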

SLIDE 20

CHESS Schedule

Phase 1 (18 months): Messaging App
Phase 2 (12 months): PDF Parser
Phase 3 (12 months): Web Browser

[Timeline figure. Rows: TA2: Vulnerability Discovery; TA1: Human Collaboration; TA3: Voice of the Offense; TA4: Control Team; TA5: Integration, Test and Evaluation. Milestone markers: Hackathons; Demonstrations; Evaluations. Activities shown: Initial context extraction; Integration framework development; Engagement strategy research and development; Challenge problem development; Source code vulnerability discovery; Initial workflow decomposition; Context extraction scaling and refinement; Workflow decomposition scaling and refinement; Integration framework scaling and refinement; Challenge problem scaling research; Binary vulnerability discovery]

SLIDE 21

www.darpa.mil
