SLIDE 1

Optimal Defense Policies for Partially Observable Spreading Processes on Bayesian Attack Graphs

Erik Miehling, Mohammad Rasouli, Demosthenis Teneketzis
University of Michigan - Ann Arbor

Second ACM Workshop on Moving Target Defense (MTD 2015)
Denver, Colorado — October 12-15, 2015

SLIDE 2

Motivation

❖ Factors from information security (the CIA triad):
  ❖ Confidentiality (C) — ensuring data does not get into the wrong hands
  ❖ Integrity (I) — maintaining the accuracy/trustworthiness of information
  ❖ Availability (A) — ensuring that data is always available to trusted users
❖ Factors from MTD:
  ❖ Reactiveness — should respond to observed changes in the system condition
  ❖ Predictiveness — forecast, and prepare for, where the system will be in the future
❖ Modern systems are becoming increasingly connected to improve operational efficiency and flexibility.
❖ This convenience comes at the cost of introducing many vulnerabilities.

SLIDE 3

The Conflict Environment

❖ We consider a dynamic setting where a network is continually being subjected to attacks with the objective of compromising some target resources through exploits.
❖ The defender can control services in the network to prevent the attacker from reaching the target resources.
❖ Philosophy: an automated approach to defending the network.
❖ Defense actions are observation-driven.

SLIDE 4

Contribution

❖ Contribution: We introduce a formal model which allows us to study dependencies between vulnerabilities in a system and to define (and compute) optimal defense policies.
❖ Aspects of our defense model:
  ❖ Progressive attacks — recent exploits build upon previous exploits, progressively degrading the system
  ❖ Dynamic defense — the defender chooses the best action based on new information
  ❖ Partial knowledge — the defender is uncertain about the security of the network at any given time
❖ The resulting defense is both reactive and predictive.

SLIDE 5

Attack Graphs

❖ It is insufficient to look at single vulnerabilities when protecting a network; attackers combine vulnerabilities to penetrate the network.
❖ Attack graphs model how multiple vulnerabilities can be combined and exploited by an attacker.
❖ They explicitly take into account the paths that the attacker can take to reach the critical exploitation.
❖ One can use tools such as CAULDRON* to generate attack graphs.

*Jajodia et al. 2010-11

SLIDE 6

Bayesian Attack Graphs

❖ Bayesian attack graph: G = {N, θ, E, P}
❖ Nodes N represent attributes:
  ❖ leaf nodes N_L ⊆ N
  ❖ critical (root) nodes N_C ⊆ N_R ⊆ N
❖ Types θ classify the non-leaf attributes:
  ❖ AND attributes N∧ ⊆ N \ N_L
  ❖ OR attributes N∨ ⊆ N \ N_L
❖ Edges E = {(i, j)}, i, j ∈ N, represent exploits.
❖ Probabilities P = (α_ij)_(i,j)∈E are the exploit probabilities.

❖ Example (the 20-node graph shown on the slide):
  N_L = {1, 5, 7, 8, 11, 12, 16, 17, 20}
  N_C = {9, 14} ⊆ N_R = {2, 9, 14, 18}
  N∧ = {2, 3, 6, 9, 10, 13, 18, 19}
  N∨ = {4, 14, 15}
  E = {(1, 2), (1, 3), . . . , (20, 19)}
  P = {α_1,2, α_1,3, . . . , α_20,19}
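The graph tuple G = {N, θ, E, P} can be held in a small data structure. The following is a minimal sketch, not the paper's implementation; the class and field names are illustrative, and the 4-node toy graph is invented for the example.

```python
from dataclasses import dataclass

@dataclass
class BayesianAttackGraph:
    nodes: set       # N: attributes
    and_nodes: set   # N_and: AND attributes
    or_nodes: set    # N_or: OR attributes
    leaf_nodes: set  # N_L: leaf attributes
    alpha: dict      # P: exploit probabilities alpha[(i, j)], keys double as E

    def predecessors(self, j):
        # Direct predecessor set D_bar_j of attribute j, read off the edge set.
        return {i for (i, k) in self.alpha if k == j}

# Toy graph (hypothetical): leaves 1 and 2 feed AND node 3, which feeds OR node 4.
g = BayesianAttackGraph(
    nodes={1, 2, 3, 4},
    and_nodes={3},
    or_nodes={4},
    leaf_nodes={1, 2},
    alpha={(1, 3): 0.6, (2, 3): 0.8, (3, 4): 0.5},
)
```

Storing the probabilities keyed by edge keeps E and P in one place, mirroring P = (α_ij)_(i,j)∈E.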

SLIDE 7

Spreading Process

❖ The attacker's behavior is assumed to follow a probabilistic spreading process.
❖ Each attribute (node) i ∈ N can be in one of two states: X^i_t = 0 (disabled) or X^i_t = 1 (enabled).
❖ α_i ∈ [0, 1]: probability that the exploit will be discovered/taken by the attacker (public knowledge).
❖ Contagion seed and spread: at each time t,
  1. each leaf attribute is enabled with probability α_i;
  2. the contagion spreads according to predecessor rules.

(Figure: attributes i, j, k with exploits into attribute l, probabilities α_il, α_jl, α_kl; at time t = τ, X^i is enabled at τ = 1 and X^l is still disabled at τ = 0.)

SLIDE 8

Spreading Process — Predecessor Rules

❖ The type of the attribute dictates the nature of the spreading process.
❖ For AND attributes, e.g. node l with set of direct predecessors D̄_l:

P(X^l_{t+1} = 1 | X^l_t = 0, X_t) = ∏_{p∈D̄_l} α_pl  if ⋀_{p∈D̄_l} X^p_t = 1,  and 0 otherwise.

SLIDE 9

Spreading Process — Predecessor Rules

❖ The type of the attribute dictates the nature of the spreading process.
❖ For AND attributes, e.g. node l with set of direct predecessors D̄_l:

P(X^l_{t+1} = 1 | X^l_t = 0, X_t) = ∏_{p∈D̄_l} α_pl  if ⋀_{p∈D̄_l} X^p_t = 1,  and 0 otherwise.

❖ For OR attributes, e.g. node k with set of direct predecessors D̄_k:

P(X^k_{t+1} = 1 | X^k_t = 0, X_t) = 1 − ∏_{p∈D̄_k} (1 − α_pk)  if ⋁_{p∈D̄_k} X^p_t = 1,  and 0 otherwise.
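One contagion step under these two predecessor rules can be sketched as follows. This is an illustrative reimplementation, not the paper's code; the names `x` (attribute states), `alpha` (exploit probabilities), `preds` (direct predecessor sets D̄), and `and_nodes` are assumptions of the sketch.

```python
import random

def spread_step(x, alpha, preds, and_nodes, rng=random.random):
    """One synchronous spread step: AND nodes fire w.p. prod(alpha[p,l]) once
    ALL predecessors are enabled; OR nodes fire w.p. 1 - prod(1 - alpha[p,k])
    once ANY predecessor is enabled (matching the slide's rules)."""
    nxt = dict(x)
    for node, ps in preds.items():
        if x[node] == 1 or not ps:   # already enabled, or a leaf: nothing to do
            continue
        if node in and_nodes:
            if all(x[p] == 1 for p in ps):
                prob = 1.0
                for p in ps:
                    prob *= alpha[(p, node)]
                if rng() < prob:
                    nxt[node] = 1
        else:  # OR attribute
            if any(x[p] == 1 for p in ps):
                prob = 1.0
                for p in ps:
                    prob *= 1 - alpha[(p, node)]
                if rng() < 1 - prob:
                    nxt[node] = 1
    return nxt
```

Writing the step against the state at time t (not `nxt`) keeps the update synchronous, so the contagion advances at most one graph layer per time step, as in the model.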

SLIDE 10

Defender's Observations

❖ The defender only partially observes the spreading process.
❖ Rationale: the defender may not know the full capability of the attacker at any given time.
❖ The defender thus observes Y_t ∈ {0, 1}^N, the subset of enabled nodes that have been detected at each time-step (each node is either disabled, enabled & undetected, or enabled & detected).
❖ Probability of detection: β_i = P(Y^i_t = 1 | X^i_t = 1).
❖ Assumption: no false positives can occur, i.e. P(Y^i_t = 1 | X^i_t = 0) = 0.
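The observation channel above can be sampled in a couple of lines. A minimal sketch, assuming per-node states `x` and detection probabilities `beta` as dicts (illustrative names, not from the paper):

```python
import random

def observe(x, beta, rng=random.random):
    """Sample Y_t: an enabled attribute i is detected w.p. beta[i];
    a disabled attribute is never flagged (no false positives)."""
    return {i: int(x[i] == 1 and rng() < beta[i]) for i in x}
```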

SLIDE 11

Defender's Countermeasures

❖ The defender uses network services as countermeasures.
❖ The existence of exploits depends on services, for example:
  ❖ Secure Shell (SSH)
  ❖ File Transfer Protocol (FTP)
  ❖ Port scanning
  ❖ etc.
❖ The defender can thus temporarily block or disable these services to stop the attacker from progressing through the network.

SLIDE 12

Defender's Countermeasures

❖ Suppose there is a set of M services {u_1, . . . , u_M}.
❖ Taking action u_m corresponds to disabling service m.
❖ u_m disables a subset W_um of the nodes: X^i = 0, i ∈ W_um.
❖ Action at time t: u_t ∈ U = ℘({u_1, . . . , u_M}).
❖ Example (figure): W_u1 = {1}.

SLIDE 13

Defender's Countermeasures

❖ As before, action u_m disables the subset W_um of the nodes: X^i = 0, i ∈ W_um.
❖ Example (figure): W_u2 = {5, 17}, i.e. taking action u_2 disables attributes 5 and 17.

SLIDE 14

Defender's Countermeasures

❖ Example (figure): W_u3 = {13}, i.e. taking action u_3 disables attribute 13.

SLIDE 15

Defender's Countermeasures

❖ Action at time t: u_t ∈ U = ℘({u_1, . . . , u_M}), i.e. any subset of services may be disabled simultaneously.
❖ (Figure: the 20-node graph with services u_1, . . . , u_6 each covering a subset of the nodes.)
❖ Assumption: all leaf nodes are covered by at least one service.
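The power-set action space and the effect of an action on the state can be sketched as follows; `power_set`, `apply_action`, and the coverage map `W` are illustrative names for this sketch, not the paper's implementation.

```python
from itertools import chain, combinations

def power_set(services):
    """U = the power set of {u_1, ..., u_M}, as a list of sets of services."""
    s = list(services)
    return [set(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

def apply_action(x, action, W):
    """Set X^i = 0 for every node i covered by a service in the chosen action;
    W[u] is the node subset W_u disabled by service u."""
    nxt = dict(x)
    for u in action:
        for i in W[u]:
            nxt[i] = 0
    return nxt
```

Note that |U| = 2^M, which is one reason exact solution only scales to small examples (see the conclusion slide).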

SLIDE 16

Cost Function

❖ Cost of taking action u ∈ U in state x ∈ X: C(x, u).
❖ Confidentiality & Integrity factor: state x is less costly than state x̂:
  C(x, ·) < C(x̂, ·)
❖ Availability factor: an action û that has a higher negative impact on availability than another action u should satisfy:
  C(·, u) < C(·, û)

SLIDE 17

Monotone States

❖ Assumption: the only feasible states are monotone.
❖ Simplifying assumption: the attack graph only contains AND nodes.
❖ (Figures: six-node example graphs contrasting monotone states under a general topology with monotone states under an AND topology.)

SLIDE 18

Defender's Information States

❖ Define the history up to time t as H_t = (π_0, U_1, Y_1, U_2, Y_2, . . . , U_{t−1}, Y_t).
❖ We capture H_t by an information state over X = {x^1, x^2, . . . , x^K}, the space of monotone states:
  Π^i_t = P(X_t = x^i | H_t),  Π_t = (Π^1_t, . . . , Π^K_t) ∈ Δ(X).
❖ The information state obeys the update rule T : Δ(X) × Y × U → Δ(X):
  π_{t+1} = T(π_t, y_{t+1}, u_t).
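The update T is a standard Bayes filter over the finite (monotone) state space. A generic sketch, where `trans` and `obs_lik` are illustrative stand-ins for the model's transition and observation kernels (the paper's own T is decomposed into four steps on the final slide):

```python
def belief_update(pi, y, u, states, trans, obs_lik):
    """One predict-correct step: pi_{t+1}(x^j) is proportional to
    P(y | x^j) * sum_i pi_t(x^i) * P(x^j | x^i, u)."""
    post = {}
    for xj in states:
        pred = sum(pi[xi] * trans(xi, u, xj) for xi in states)  # predict
        post[xj] = obs_lik(y, xj) * pred                        # correct
    z = sum(post.values())
    return {x: p / z for x, p in post.items()} if z > 0 else pi
```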

SLIDE 19

Defender's Optimization Problem

❖ Choose a control policy g : Δ(X) → U, g ∈ G, that solves

min_{g∈G} E^g { Σ_{t=0}^∞ ρ^t C(Π_t, U_t) | Π_0 = π_0 }

subject to  U_t = g(Π_t),  Π_{t+1} = T(Π_t, Y_{t+1}, U_t).

❖ The above is a Partially Observable Markov Decision Process (POMDP).

SLIDE 20

Dynamic Programming Solution

❖ The optimal policy g* achieves the minimum expected discounted total cost.
❖ The Bellman equation for the corresponding value function V* is:

V*(π) = min_{u∈U} { C(π, u) + ρ Σ_{y∈Y} P^{π,u}_y V*(T(π, y, u)) }  for all π ∈ Δ(X).

❖ The dimensionality of the summation term is greatly reduced due to the monotonicity assumption.
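One Bellman backup of this equation over a finite set of belief points can be sketched as follows (a generic point-based sketch, not the paper's solver; `C`, `P_y`, and `T` are illustrative callables for the cost, observation probability P^{π,u}_y, and belief update):

```python
def bellman_backup(V, beliefs, actions, obs, C, P_y, T, rho):
    """newV(pi) = min_u { C(pi,u) + rho * sum_y P(y|pi,u) * V(T(pi,y,u)) },
    with V given as a dict over (hashable) belief points."""
    newV = {}
    for pi in beliefs:
        newV[pi] = min(
            C(pi, u) + rho * sum(P_y(pi, u, y) * V[T(pi, y, u)] for y in obs)
            for u in actions
        )
    return newV
```

Iterating this backup to (approximate) convergence is value iteration on the belief MDP; exact solvers such as pomdp-solve instead maintain the value function as a set of α-vectors.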

SLIDE 21

Dynamic Programming Solution

V*(π) = min_{u∈U} { C(π, u) + ρ Σ_{y∈Y} P^{π,u}_y V*(T(π, y, u)) }  for all π ∈ Δ(X).

❖ Reactive: captures the belief of the current system state, based on previous states, actions, and observations.
❖ Predictive: the continuation cost incorporates the dynamics, taking into account the effect of future possible attacks and defense policies on the specification of the current defense decision.

SLIDE 22

Example

Attributes:

  • 1. Vulnerability in WebDAV on machine 1
  • 2. User access on machine 1
  • 3. Heap corruption via SSH on machine 1
  • 4. Root access on machine 1
  • 5. Buffer overflow on machine 2
  • 6. Root access on machine 2
  • 7. Portscan on machine 2
  • 8. Network topology leakage from

machine 2

  • 9. Buffer overflow on machine 3
  • 10. Root access on machine 3
  • 11. Buffer overflow on machine 4
  • 12. Root access on machine 4

(Figure: 12-node attack graph over the attributes above, with exploit probabilities α_1,2, α_2,3, α_3,4, α_4,9, α_5,6, α_6,7, α_7,8, α_8,9, α_8,11, α_9,10, α_10,11, α_11,12.)

SLIDE 23

Example - Countermeasures

❖ Attributes: as on the previous slide.

Countermeasures:
  u_1: block WebDAV service
  u_2: disconnect machine 2

(Figure: the 12-node attack graph with u_1 covering the WebDAV attribute and u_2 covering machine 2's attributes.)

SLIDE 24

Example

❖ The resulting POMDP contains:
  ❖ 29 states/observations
  ❖ 4 actions
❖ Solved using pomdp-solve*.
❖ We show three different information states with different corresponding optimal actions.
❖ (Figure: three bar plots (a), (b), (c) of the information states π^(1)_τ, π^(2)_τ, π^(3)_τ over the 29 states, with optimal actions u* = u_1, u* = u_2, and u* = ∅, respectively.)
❖ The optimal policy is intuitive: disable the service that corresponds to the attribute that has a high probability of being enabled.

*Cassandra 2003-15 (online)

SLIDE 25

Conclusion and Future Work

❖ Summary:
  ❖ Introduced a formal model (built upon Bayesian attack graphs) for defending a network
  ❖ Formulated the problem as a POMDP
  ❖ Solved the POMDP for a small example
❖ Future Work:
  ❖ Scaling the problem — exact POMDP solvers are only capable of handling small examples, and realistic attack graphs are big! Complexity analysis?
  ❖ Structural results — directed acyclic graphs give rise to a natural partial order; can we use this to show threshold properties of the optimal policy?

SLIDE 26

Thank You!

Questions?


SLIDE 27

Funding Acknowledgments

❖ Special thanks to the following funding sources:
  ❖ NSF — Foundations Of Resilient CybEr-physical Systems (FORCES), Grant: CNS-1238962
  ❖ ARO MURI — Adversarial and Uncertain Reasoning for Adaptive Cyber Defense: Building the Scientific Foundations, Grant: W911NF-13-1-0421

SLIDE 28

Information State Update

❖ Decompose the information state update into four steps:
  π_t+ = T_1(π_t, u_t)       (effect of action)
  π_t++ = T_2(π_t+)          (effect of attack)
  π_t+++ = T_3(π_t++)        (effect of spread)
  π_t+1 = T_4(π_t+++, y_t+1) (effect of observation)
❖ The update can then be written as the composition

π_{t+1} = T(π_t, y_{t+1}, u_t) = T_4(T_3(T_2(T_1(π_t, u_t))), y_{t+1}).
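The four-step composition can be sketched generically: given any implementations of T_1 through T_4, the full update T is just their composition in the stated order. The tagging lambdas in the test are illustrative only.

```python
def make_update(T1, T2, T3, T4):
    """Build T(pi, y, u) = T4(T3(T2(T1(pi, u))), y): action, then attack,
    then spread, then observation."""
    def T(pi, y, u):
        return T4(T3(T2(T1(pi, u))), y)
    return T
```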