CREST Workshop Rick Schantz, Partha Pal, Aaron Paulos, Joe Loyall, - - PowerPoint PPT Presentation





slide-1
SLIDE 1

Multiple Views on Multiplicity Computing: Opportunities Viewed through a Cyber-Security Lens

CREST Workshop

Rick Schantz, Partha Pal, Aaron Paulos, Joe Loyall, Kurt Rohloff Distributed Systems Technology Group

March 23, 2012

slide-2
SLIDE 2

1982: R&D Computing Landscape


Multiplicity emerging …

slide-3
SLIDE 3

1982: Heterogeneity, Specialization Among Plenty (or so it seemed at the time)


slide-4
SLIDE 4

1990s Integrated Adaptive System Concept

System-wide QoS

QoS layers shown: Application or Domain-specific QoS; Common Middleware Services QoS; Distribution Middleware QoS; Operating System QoS; Network QoS

[Figure: nodes running an ACE/TAO RT ORB over an operating system with IntServ/RSVP; each pairs a Contract with a QoS Adaptive Control element.]

slide-5
SLIDE 5

Dynamic Quality of Service is a Key Aspect of Mission Critical Distributed Systems

  • Capture QoS aspects of mission requirements
  • Effectively utilize available resources for mission effectiveness
  • Manage the resources that could become bottlenecks
  • Mediate conflicting demands for resources
  • Dynamically reallocate as conditions change

QoS management for distributed systems strives to provide a predictable high level of mission effectiveness and user satisfaction within available resources.

[Figure: axes labeled Utility and Resources. Callouts: gracefully handle degraded and hostile situations; effectively utilize resources.]

slide-6
SLIDE 6

Allocating Resources According to Utility

  • How to determine mission utility?
  • Each mission has multiple sets of tasks called application strings.
    – Take weighted sum of string utilities
    – Weighting for relative importance of strings
  • String utility
  • Quality of Service factors:
    – Timeliness
    – Availability
    – Quality
    – Throughput

Mission utility: $UA_{m_i} = \sum_{j=1}^{N_i} w_{s_j} \, UA_{s_j}$

String utility: $UA_{s_j} = F(T, a, q, Th)$

[Figure: utility hierarchy. System Utility sits above several Mission Utilities, which sit above several String Utilities.]

  • Maximize end-user value!
  • Dynamically adjust resource allocation.
  • Continuous end-to-end improvement.
  • Robust to variations in system behavior.
  • Maximize utility across deployed missions.
  • Gracefully handle resource failures.
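
The weighted-sum idea on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the slide does not define the function F that maps timeliness, availability, quality, and throughput to a string utility, so the product used below is an assumption, as are the names `AppString`, `string_utility`, and `mission_utility` and the convention that each factor is scored in [0, 1].

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AppString:
    """One application string with its weight and QoS factor scores."""
    weight: float        # relative importance of the string
    timeliness: float    # each factor assumed to be scored in [0, 1]
    availability: float
    quality: float
    throughput: float


def string_utility(s: AppString) -> float:
    # Hypothetical F: the slide leaves F unspecified; a product is assumed here.
    return s.timeliness * s.availability * s.quality * s.throughput


def mission_utility(strings: Sequence[AppString]) -> float:
    # Weighted sum of string utilities, as described on the slide.
    return sum(s.weight * string_utility(s) for s in strings)


mission = [
    AppString(weight=0.75, timeliness=1.0, availability=1.0, quality=1.0, throughput=1.0),
    AppString(weight=0.25, timeliness=0.5, availability=1.0, quality=1.0, throughput=1.0),
]
print(mission_utility(mission))
```

A resource manager could compare this value across candidate allocations and keep the allocation with the highest mission utility.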

slide-7
SLIDE 7


Multi-Layered End-to-End QoS Management

End-to-end QoS management must

– Manage all the resources that can affect QoS, i.e., anything that could be a bottleneck at any time during the operation of the system (e.g., CPU, bandwidth, memory, power, sensors, …)
– Shape the data and processing to fit the available resources and the mission needs
  • What can be delivered/processed
  • What is important to deliver/process
– Include capturing mission requirements, monitoring resource usage, controlling resource knobs, and runtime reallocation/adaptation

[Figure: an information supplier and an information consumer connected through the network.]

Control and Monitor CPU Processing

– CPU reservation or CPU priority and scheduling
– Have versions that work with CPU broker, RT CORBA, RTARM

Control and Monitor Network Bandwidth

– Set DiffServ code points (per ORB, component server, thread, stream, or message)
– Work with DSCP directly or with higher-level bandwidth brokers
– Priority-based (DiffServ) or reservation-based (RSVP)
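
Working "with DSCP directly" can be illustrated at the socket level. The sketch below is an assumption about how an application might mark its own traffic, not the middleware described on the slide: it uses the standard `IP_TOS` socket option, which is honored on Linux and most Unix systems but may be ignored elsewhere, and the helper names `dscp_to_tos` and `mark_socket` are hypothetical.

```python
import socket

EF_DSCP = 46  # Expedited Forwarding code point (RFC 3246)


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP value in the upper bits of the IP TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2  # the low two bits are reserved for ECN


def mark_socket(sock: socket.socket, dscp: int) -> int:
    """Ask the OS to mark packets sent on this socket with the given DSCP."""
    tos = dscp_to_tos(dscp)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
    return tos


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        print(mark_socket(s, EF_DSCP))
```

Marking per socket corresponds roughly to the per-stream granularity on the slide; per-message marking would need the option to be reset between sends.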

Dynamic QoS realized by

  • Assembly of QoS components
  • Paths through QoS components
  • Parameterization of QoS components
  • Adaptive algorithms in QoS components

Coordinated QoS Management

Shape and Monitor Data and Application Behavior

– Shape the data to fit the resources and the requirements
– Insert using components, objects, wrappers, aspect weaving, or interceptors
– Library that includes scaling, compression, fragmentation, tiling, pacing, cropping, format change

System resource managers allocate available resources based on mission requirements, participants, roles, and priorities.

Local resource managers decide how best to utilize the resource allocation to meet mission requirements.
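
Two of the shaping operations named above, compression and fragmentation, can be sketched with the standard library. This is an illustrative assumption about what such a library call might look like; the function names `shape_payload` and `reassemble` and the fragment-size parameter are hypothetical and do not come from the slides.

```python
import zlib
from typing import List


def shape_payload(data: bytes, max_fragment: int, compress: bool = True) -> List[bytes]:
    """Optionally compress the payload, then split it into bounded fragments."""
    if max_fragment <= 0:
        raise ValueError("max_fragment must be positive")
    body = zlib.compress(data) if compress else data
    return [body[i:i + max_fragment] for i in range(0, len(body), max_fragment)]


def reassemble(fragments: List[bytes], compressed: bool = True) -> bytes:
    """Inverse of shape_payload: join the fragments and undo compression."""
    body = b"".join(fragments)
    return zlib.decompress(body) if compressed else body


payload = b"sensor-reading;" * 500
fragments = shape_payload(payload, max_fragment=64)
restored = reassemble(fragments)
```

A local resource manager could choose `max_fragment` and whether to compress based on the bandwidth allocation it was given.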

slide-8
SLIDE 8

2000s Multi-Layered QoS Management for Service-Oriented Distributed Information Systems

[Figure: component diagram. QoS Administration and a Mission Management QoS Display feed the Information Services QoS Manager (ISQM), which draws on a Pluggable Policy Store (authentication token; orchestration instance policy) and passes QoSPolicyContext, PreferenceContext, and policy actions. A Task Manager (LQM Service) holds task queues and a thread pool: insert task, extract task, get thread to assign to task. A Bandwidth Manager issues bandwidth allocations from parsed policy values. The Submission Mgr, Dissem. Mgr (dissemination queues), and Filter Mgr each work with an LQM Service. A Client Monitoring Service applies rate-limiting control to clients. Tasks include Broker, Read Info, Filter, Query, and Archive. Information instances and QoS context travel via the Information Channel and the Xlayer; components report status information and metrics.]

QoS Administration: mission-level QoS policies

  • Roles, importance, deadlines, user prefs.

Aggregate QoS Management: QoS management across multiple users

  • Fairness, resource allocations, importance

Local QoS Management: enforce QoS policies at local decision points

  • Priorities of operations and information
  • Resource access and process/info shaping

QoS Mechanisms: QoS enforcement mechanisms

  • Differentiated service
  • Thread and queue control
  • Rate control, compression, filtering, replacement
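
The "insert task / extract task" interface and the idea of prioritizing operations at a local decision point can be sketched as a small priority queue. This is a simplified stand-in, not the LQM Service itself: the class name `TaskQueue`, the integer priorities, and the first-in-first-out tie-breaking are all assumptions.

```python
import heapq
import itertools
from typing import Any, List, Tuple


class TaskQueue:
    """Priority queue of operation tasks; higher priority is extracted first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, Any]] = []
        self._sequence = itertools.count()  # preserves FIFO order among equal priorities

    def insert(self, priority: int, operation: Any) -> None:
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-priority, next(self._sequence), operation))

    def extract(self) -> Any:
        _, _, operation = heapq.heappop(self._heap)
        return operation

    def __len__(self) -> int:
        return len(self._heap)


queue = TaskQueue()
queue.insert(1, "archive")
queue.insert(5, "broker")
queue.insert(5, "query")
order = [queue.extract() for _ in range(len(queue))]
print(order)
```

In a fuller version, a worker taken from the thread pool would call `extract` and run the returned operation.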

slide-9
SLIDE 9

From Protection to Auto-Adaptive to Survivable and Self-Regenerative Systems

No system is perfectly secure, only adequately secured with respect to the perceived threat.

1st Generation: Protection

  • Goal: prevent intrusions
  • Techniques: access control and physical security, cryptography, trusted computing base
  • But intrusions will occur

2nd Generation: Detection

  • Goal: detect intrusions, limit damage
  • Techniques: firewalls, intrusion detection systems, boundary controllers, virtual private networks (VPNs), PKI
  • But some attacks will succeed

3rd Generation: Intrusion Tolerance and Survivability

  • Goal: tolerate attacks
  • Techniques: redundancy, diversity, deception, wrappers, proof-carrying code, proactive secret sharing
  • Also shown: intrusion tolerance, big board view of attacks, real-time situation awareness and response, graceful degradation, hardened operating system

slide-10
SLIDE 10

Survivability and Intrusion Tolerance

Premise

  • The number and sophistication of cyber attacks is increasing; some of these attacks will succeed

Philosophy

  • Operate through attacks by using a layered defense-in-depth concept
  • Accept some degradation
  • Protect (C, I, A) of most valuable assets (information, services, …)
  • Move faster than the intruder

Approach

  • “Defense Enabling” distributed applications
  • Survivability architecture

[Figure labels: Protect, Detect, React, Attacks.]

  • Exploring beyond degradation: regain, recoup, regroup, and even improve
  • Semi-automated: the survivability architecture captures a lot of low-level (and sometimes uncertain and incomplete) information and utilizes advanced reasoning and machine learning

slide-11
SLIDE 11

Slowly Advancing from Defending to Tolerance to Survivability toward Regeneration

[Figure: timeline, 1997 to 2012, of projects sponsored by DARPA, AFRL, and DHS/HSARPA, with red team assessments marked.]

Projects shown:

  • AQuA
  • OIT
  • QuOIN
  • APOD: Applications that Participate In Their Own Defense
  • ITUA: Intrusion Tolerance Through Unpredictable Adaptation
  • DPASA: Designing Adaptation And Protection into a Survivability Architecture
  • CSISM: Cognitive Support for Intelligent Survivability Mgmt
  • APS: Advanced Protected Services

Focus areas: defense enabling; autonomic defense; unpredictability; Byzantine FT; survivability architectures and IMSes; cognitive survivability management

Themes: Adaptive Distributed Object Middleware; Survivable and Secure Systems; Self-Regenerative Survivable Systems; Survivable SOA-based Systems

slide-12
SLIDE 12

Achievements So Far (2009)

Military (USAF) Joint Battlespace Infosphere (JBI) information management system exemplar made survivable and subjected to sustained attacks over several weeks by multiple independent red teams

Results

  • The system survived 75% of attacks
  • Of those that succeeded:
    – Average time to failure was 45 minutes, vs. immediately in the unprotected system
    – Minimum of 10 minutes to failure
    – Required combinations of attacks
  • Adaptive defenses added 5-20% overhead to call latency

Challenge: Develop an automated mechanism that would interpret the reports and decide the effective course of action

CSISM Approach: 3-level decision making (reactive, deliberate, and learned); use theorem proving and coherence to reason about accusatory and evidentiary information contained in reported events

Results

  • Possible to minimize expert involvement
  • Reasoning about accusatory and evidentiary information with respect to encoded knowledge
  • Made the correct decision in ~75% of cases in red team exercises
  • Compute intensive
  • Integrating learned responses online needs additional research

slide-13
SLIDE 13

Elements of Cyber-Defensive Ideas

  • Common threads that run through our intrusion tolerance and survivability work:
    – Adaptation for security
      • Like in nature, services migrate and change behavior, structure, and configuration in order to survive
    – Unpredictability
      • Changing and taking unexpected actions yields advantages
    – Intelligent behavior
      • As in higher-order life forms, cognitive capabilities are introduced to survivable systems for interpreting reported events and making decisions
    – Evolution
      • Learning to improve defenses over time

slide-14
SLIDE 14

Slide courtesy Dr. Howard Shrobe, DARPA

2010 DARPA CRASH PROGRAM

slide-15
SLIDE 15

Slide courtesy Dr. Howard Shrobe, DARPA

slide-16
SLIDE 16

Advanced Adaptive Applications (A3) Key Objectives

  • An execution environment supporting innately and adaptively resilient applications
    – The protected application is harder to attack, harder to make unavailable, and harder to hit again with a previously successful attack
    – Isolation from other computation, dedicated to the survival of the protected application
    – Reusable, cost-effective defense near the application and part of a defense-in-depth strategy

Demonstrate application-centric adaptation for survival: make the “application” survivable and resilient against novel attacks

slide-17
SLIDE 17

The A3 Vision: Integration of 3 Concepts

Containerization to isolate application execution. Mediated channels enable the defense to observe and control the application’s interaction with devices on its own terms.

  • 1. Crumple Zone enforces application-specific preventive adaptation on the container’s interaction through mediated channels
  • 2. Replay with Modification on top of mediated containers to facilitate immunity-focused adaptation
  • 3. Advanced State Management for containerized applications to enable various forms of restarts (recovery-focused adaptation)

[Figure: applications running over OS and host hardware layers, with application state split into precious state and disposable state.]

slide-18
SLIDE 18

What is a hard problem: Novel Attacks

  • Behavior invariants (e.g., a deployer-provided constraint such as “this web service should never make an outbound connection”) or something more drastic (e.g., a segfault) indicate that something went wrong
    – But the real attack likely happened in the past
    – The attacker has been successfully executing his tasks
    – And until now, we had no clue
  • How do we deal with the aftermath of such attacks?

[Figure: timeline. The attacker's objective was achieved at tX, but we did not realize it until tZ, when an undesired condition (e.g., A is corrupt when f(x,y,z) = true) was observed by the CZ policies. One response is rollback and restart, but to which past state? RwM experimentation works toward immunity.]
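
The "to which past state?" question can be made concrete with a small sketch. It assumes, purely for illustration, that checkpoints are identified by timestamps and that the defender may or may not have an estimate of tX; the function name `rollback_candidates` and the newest-first search order are assumptions, not something stated on the slide.

```python
from bisect import bisect_left
from typing import List, Optional, Sequence


def rollback_candidates(
    checkpoints: Sequence[float],
    t_z: float,
    t_x_estimate: Optional[float] = None,
) -> List[float]:
    """Return checkpoint times worth trying, newest first.

    Only checkpoints taken strictly before the cutoff are considered. The
    cutoff is the detection time t_z, or an earlier estimate of the actual
    compromise time t_x when one is available.
    """
    ordered = sorted(checkpoints)
    cutoff = t_z if t_x_estimate is None else min(t_x_estimate, t_z)
    index = bisect_left(ordered, cutoff)  # number of checkpoints before the cutoff
    return ordered[:index][::-1]


checkpoints = [10.0, 20.0, 30.0, 40.0]
print(rollback_candidates(checkpoints, t_z=35.0))
print(rollback_candidates(checkpoints, t_z=35.0, t_x_estimate=20.0))
```

Without an estimate of tX, every checkpoint before tZ remains a candidate, which is why replay experiments that narrow down tX are valuable.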

slide-19
SLIDE 19

Crumple Zone: VM-based Realization

  • Each container is essentially a DomU VM
  • Channels are pathways from the application to devices (disk, UI, network)
  • Crumple Zones (CZs) are VMs interposed on basic channels
  • Crumple Zones, which enforce policies on mediated channels, are built on specialized guest VMs like VM-2 and VM-3
    – VM-2 acts as the backend to VM-1 and the frontend to the DDVM for block devices
    – VM-3 acts as the logical intermediate hop between VM-1 and the DDVM
  • Only the Xen hypervisor and Dom0 are treated as the TCB
  • A3 Conglomerate: the collection of VMs dedicated to the defense of a protected application

[Figure: Xen host with the application VM (APPVM, Guest VM-1, running the App on a Guest OS), a storage crumple zone VM (ST CZVM) and a network crumple zone VM (NW CZVM), each with Policy & Control, plus the DDVM and an Ethernet bridge. Arrows show network interaction, storage interaction, and Xen interrupts and signaling.]
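
The policy-enforcement role of a crumple zone can be sketched without any virtualization. The example below is a plain-Python stand-in and not the Xen-based realization described on this slide: `ConnectionRequest`, `MediatedChannel`, and `no_outbound` are hypothetical names, and the single policy encodes the "never make an outbound connection" constraint used as an example in this deck.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class ConnectionRequest:
    direction: str  # "inbound" or "outbound"
    host: str
    port: int


Policy = Callable[[ConnectionRequest], bool]


def no_outbound(request: ConnectionRequest) -> bool:
    """Deployer-provided invariant: the service never connects outward."""
    return request.direction != "outbound"


@dataclass
class MediatedChannel:
    """Forwards a request only if every policy accepts it; records violations."""
    policies: List[Policy]
    violations: List[ConnectionRequest] = field(default_factory=list)

    def forward(self, request: ConnectionRequest) -> bool:
        if all(policy(request) for policy in self.policies):
            return True
        self.violations.append(request)
        return False


channel = MediatedChannel(policies=[no_outbound])
allowed = channel.forward(ConnectionRequest("inbound", "10.0.0.5", 443))
blocked = channel.forward(ConnectionRequest("outbound", "203.0.113.9", 4444))
```

The recorded violations are the kind of observation that could trigger later diagnosis.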

slide-20
SLIDE 20

Replay With Modification: Motivation

  • In a clean-slate resilient and survivable host system context, it should be possible to
    – Reproduce the application’s past execution
      • With different levels of fidelity and control in a repeatable manner
    – Explore alternate execution history
      • Alternate line leading to an immune conglomerate
      • Exploration of multiple lines unveiling details of a novel attack faster
  • RwM is A3’s contribution to address novel attacks
    – If an immune conglomerate is found, then that attack is ineffective
    – Provides an infrastructure, as well as the collection of recorded information and supporting tools, for analysts and cyber defenders to analyze a zero-day attack and develop a countermeasure
  • 2 levels of replay: deterministic VM replay and application-level replay
  • Claim: the synergistic combination is helpful in experiment-based failure diagnosis and patch identification
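
The core loop of replay with modification (record inputs, replay them against modified variants, keep the variants on which the attack no longer works) can be sketched at toy scale. This is an application-level illustration only; the handlers, the buffer-length invariant, and the names `replay` and `find_immune_variants` are all invented for the example and say nothing about how A3 implements replay.

```python
from typing import Callable, Dict, List, Optional, Sequence

State = Dict[str, str]
Handler = Callable[[State, str], None]


def replay(handler: Handler, recorded: Sequence[str],
           invariant: Callable[[State], bool]) -> Optional[int]:
    """Replay recorded inputs; return the index where the invariant breaks, or None."""
    state: State = {}
    for index, item in enumerate(recorded):
        handler(state, item)
        if not invariant(state):
            return index
    return None


def find_immune_variants(variants: Dict[str, Handler], recorded: Sequence[str],
                         invariant: Callable[[State], bool]) -> List[str]:
    """Names of variants that survive the whole recorded history."""
    return [name for name, handler in variants.items()
            if replay(handler, recorded, invariant) is None]


def original(state: State, item: str) -> None:
    state["buffer"] = item  # accepts anything


def with_length_check(state: State, item: str) -> None:
    if len(item) <= 8:      # modified variant: drop oversized input
        state["buffer"] = item


def buffer_ok(state: State) -> bool:
    return len(state.get("buffer", "")) <= 8


recorded_inputs = ["hi", "A" * 32, "ok"]
variants = {"original": original, "length_check": with_length_check}
print(replay(original, recorded_inputs, buffer_ok))
print(find_immune_variants(variants, recorded_inputs, buffer_ok))
```

The index returned by `replay` also points at the input that triggers the failure, which is the diagnosis half of the claim on this slide.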

slide-21
SLIDE 21

Multi-Compiler Variants: Utilizing A Diversity Generator

This is what is happening inside the diversity generator

[Figure: inside the diversity generator. Inputs: source (SRC), a specified aspect, a spec, and a set of tools and gadgets. Stages: compilation with transforms (a set of transforms, each with its own purpose), compilation with aspects, binary rewriting, and a configuration generator.]

Outputs shown:

  • Multiple semantically equivalent object code variants with different vulnerability profiles
  • Object code variants with new defensive behavior (e.g., add a new filter in Apache/PHP)
  • Object code variants with added checks and reporting
  • Modifications to CZ policies (JVM, SEL, IPTables, file system, block storage), e.g., permission, quota, collect statistic

Aspects: EXECUTE and TRANSFORM; CZ INSPECT

slide-22
SLIDE 22

Multiplicity?

Traditional

  • CPU
  • OS
  • Memory
  • Network connection

[Figure: a key application running on an OS over processor and memory.]

Now/Emerging

  • Multiple cores, with powerful CPUs
  • Powerful “feature rich” OS
  • Mega memory
  • High-bandwidth, always-on network connectivity

[Figure: key applications sharing the OS, processor, and memory with many dubious applications.]

Cyber security becomes an obvious context
slide-23
SLIDE 23

Multiplicity?

  • Record and replay, experiment-based diagnosis, patching and recovery!
  • Use a diversity generator to create polymorphic components that exhibit different vulnerability profiles
  • Suddenly resources may not be that bountiful!

[Figure: processor and memory running a host OS (hypervisor) with Dom0 and many guest OSes. Applications sit behind crumple zones. An Experiment Controller and a Diversity Generator are connected to the guests by arrows labeled "spawn" and "use".]

slide-24
SLIDE 24

Multiplicity?

  • But wait: clouds are gathering steam!
  • Recorded information, replay experiments, diversity generation, and experiment-based diagnosis and patching can all potentially be done in the cloud!
  • But have we come full circle? Do we really trust the cloud with our critical data and computation?

[Figure: the same hypervisor host as on the previous slide (Dom0, guest OSes, applications behind crumple zones), now with the Experiment Controller, Diversity Generator, and additional crumple-zone/application guests placed in the cloud; arrows labeled "spawn" and "use".]

slide-25
SLIDE 25

Fully Homomorphic Computing: Computing Directly on Encrypted Information

slide-26
SLIDE 26

Noise in Ciphertexts

  • Ciphertexts are a combination of noise, the public key, and a message.
  • The public key is a combination of noise and the secret key.
  • EvalMult operations “multiply” the noise in the ciphertext.
  • Decryption operations strip away the noise.

Huge Amounts of Data and Computation Beget Special Purpose Solutions
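
Noise growth is easiest to see in a toy scheme. The sketch below swaps in a symmetric, integer-based construction in the style of the DGHV "somewhat homomorphic" scheme, purely because it fits in a few lines; it is not the lattice-based, public-key scheme these slides refer to, and the parameters are far too small to be secure. The function names and the fixed values of `p`, `r`, and `q` are assumptions chosen so the arithmetic can be followed by hand.

```python
def encrypt(bit: int, p: int, r: int, q: int) -> int:
    """Ciphertext = message bit + even noise 2*r + a multiple of the secret key p."""
    return bit + 2 * r + p * q


def noise(c: int, p: int) -> int:
    """Centered residue of c modulo p: the message bit plus the accumulated noise."""
    n = c % p
    return n - p if n > p // 2 else n


def decrypt(c: int, p: int) -> int:
    """Strip the key multiple and the even noise; correct only while |noise| < p/2."""
    return noise(c, p) % 2


def eval_add(c1: int, c2: int) -> int:
    return c1 + c2  # noise terms add (XOR on the bits)


def eval_mult(c1: int, c2: int) -> int:
    return c1 * c2  # noise terms multiply (AND on the bits)


P = 1001  # toy secret key (odd)
a = encrypt(1, P, r=5, q=12345)  # noise term 1 + 2*5 = 11
b = encrypt(1, P, r=4, q=54321)  # noise term 1 + 2*4 = 9

product = eval_mult(a, b)        # noise 11 * 9 = 99, still below P/2
too_deep = eval_mult(product, a) # noise 99 * 11 = 1089 exceeds P/2 and wraps around

print(decrypt(eval_add(a, b), P), decrypt(product, P), decrypt(too_deep, P))
```

The last value decrypts incorrectly (1 AND 1 AND 1 should be 1), which is the failure that a noise-resetting Recrypt operation is meant to prevent.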

slide-27
SLIDE 27

Computation Flow On Untrusted Host

  • Data is encrypted with an FHE scheme; the untrusted host supports running of the program on encrypted data
  • Source program, then translation of the source program to a circuit representation, then calls to FHE operations
  • FHE operations: Encrypt, EvalAdd, EvalMult, Recrypt
  • Selection of calls to FPGA, CPU, or GPU implementation for FHE evaluations:
    – FPGA-based lattice crypto primitives: selection of SIPHER FPGA circuits for lattice-based primitives
    – CPU-based primitives: selection of SIPHER CPU libraries for lattice-based primitives
    – GPU-based primitives: selection of SIPHER GPU libraries for lattice-based primitives
  • FHE operations filter down to the appropriate FPGA, CPU, or GPU implementation based on available resources.

[Figure annotations: high-level languages, middleware abstraction layers, and low-level implementation, each marked with complexity and speed indicators.]
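
The "filter down to the appropriate implementation" step amounts to a dispatch table. The sketch below shows one way it might look; the preference order (FPGA, then GPU, then CPU), the `select_backend` name, and the stub implementations are assumptions for illustration and are not taken from the SIPHER libraries.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical preference order; the slides only say the choice depends on available resources.
PREFERENCE = ("fpga", "gpu", "cpu")

Implementation = Callable[..., str]


def select_backend(
    operation: str,
    implementations: Dict[str, Dict[str, Implementation]],
    available: Iterable[str],
) -> Tuple[str, Implementation]:
    """Pick the most preferred available backend that implements the operation."""
    usable = set(available)
    for backend in PREFERENCE:
        impl = implementations.get(backend, {}).get(operation)
        if backend in usable and impl is not None:
            return backend, impl
    raise LookupError(f"no available backend implements {operation!r}")


# Stub primitives standing in for real FPGA/GPU/CPU code.
implementations = {
    "fpga": {"EvalMult": lambda *args: "EvalMult on FPGA"},
    "gpu": {"EvalMult": lambda *args: "EvalMult on GPU", "EvalAdd": lambda *args: "EvalAdd on GPU"},
    "cpu": {"EvalMult": lambda *args: "EvalMult on CPU", "EvalAdd": lambda *args: "EvalAdd on CPU",
            "Recrypt": lambda *args: "Recrypt on CPU"},
}

backend, evaluate = select_backend("EvalAdd", implementations, available=["fpga", "cpu"])
print(backend, evaluate())
```

Here EvalAdd falls through to the CPU because the FPGA stub set lacks it and no GPU is listed as available.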

slide-28
SLIDE 28

Asymmetric Operation Location Considerations

slide-29
SLIDE 29

Whew!

  • The Big Bang (of Higher Performance Networked Diversity) Continues to Inflate
  • Lots of Bottom Up Momentum Building across a number of planes to use that advancing Multiplicity
  • Needs coupling with more Top Down concept-of-operation/theory weaving
  • And Plenty More to Do to Keep Us Busy for a Long Time