Malware Halting 1. Malware 2. Software diversity Part I: Method - - PowerPoint PPT Presentation




Malware Halting

Part I: Method Development

Kjell Jørgen Hole Simula@UiB

Last updated 16.05.17

Overview

  • 1. Malware
  • 2. Software diversity
  • 3. Computer “immunization”
  • 4. Epidemiological model
  • 5. Malware halting analysis
  • 6. Malware halting method

2

Malware defined

Malware—malicious software used to

  • disrupt computer operations
  • gather sensitive information, or
  • gain access to private systems

3

Examples of malware: worms, viruses, rootkits, spyware, keyloggers, adware, Trojan horses, backdoors, dialers, bots, and ransomware

4


Infectious malware

We’ll concentrate on infectious malware:

  • Viruses—need user intervention to spread
  • Worms—spread automatically

5

Spreading mechanisms (1)

Random scanning selects target IP addresses at random (all nodes are neighbors)

  • used by Code Red and Slammer worms


 Localized scanning selects most hosts in the “local” address space

  • used by Code Red II and Nimda worms

6

Spreading mechanisms (2)

Topological scanning relies on information contained in infected hosts to locate new targets

  • the information may include (BGP) routing tables, email addresses, a list of peers, and Uniform Resource Locators (URLs)
  • used by the Morris worm

7

Spreading mechanisms (3)

Hitlist consists of potentially vulnerable machines that are gathered beforehand and targeted first when the worm is released

  • the flash worm gathered all vulnerable machines into its hitlist

8


Software diversity

We consider systems of networked computing devices, such as computers, smartphones, and tablets
 Each device downloads software from application stores utilizing compilers with “diversity engines”

9

Software monoculture

(today’s situation)

10

[Figure: an identical binary for all users; all users are susceptible to an identical exploit from the attacker]

Diversity engine

11

[Figure: the software developer creates software and delivers it to the app store; the diversity engine within the app store creates variants; subsequent downloaders receive functionally identical but internally different versions of the same software]

Software polyculture

(the future?)

12

[Figure: different variants for different users; a single exploit from the attacker no longer affects all users identically; the cost to the attacker rises dramatically]


Immunization (1)

Software hardening, or immunization, consists of

  • removal of non-essential software programs
  • secure configuration of remaining programs
  • constant patching, and
  • use of intrusion-detection systems, firewalls, intrusion-prevention systems, anti-malware programs, and spyware blockers

13

Immunization (2)

  • In extreme cases, trained personnel have to take a device off-line to wipe its memory before installing new software

14

Pragmatic approach

Despite the protection provided by computer “immunization,” it is nearly impossible to keep every device free of malware at all times
A more realistic goal is to provide a form of “community immunity,” where most devices are protected against malware because there is little opportunity for new outbreaks to spread

15

Combine diversity and immunization

While community immunity usually entails immunization of nearly all entities in a monoculture, we’ll combine software diversity with the immunization of a small fraction of the computers to halt malware spreading

16


Epidemiological model

We model viruses and worms as infectious diseases spreading over networks with varying software diversity

17

Infected monoculture

18

[Figure: a single sick node infects all other nodes; node size is proportional to the number of adjacent edges; the network is fragile]

Hub defined

Hub—network node with many more adjacent edges than the average number of edges per node

  • see large nodes in previous figure
  • the number of adjacent edges is often referred to as the ‘degree’

www.kjhole.com

Diversity

Nodes of L types have different colors
The (software) diversity is equal to the number of colors L

20


Immunization

A white immunized node never gets infected or transmits an infection

21

Immunized polyculture

22

[Figure: L = 2 node types, L = 2 malware types, and eight immunized hubs; the malware types only spread to three nodes; the network is robust]

Network model

Simple connected graph defines malware spreading pattern

  • N nodes
  • L≥1 node types
  • one malware type per node type
  • discrete time t = 0,1,2, ...
  • S infected seeds per node type at time t = 0

23

[Figure (a): example network with L = 3 node types and S = 1 seed per node type; the seeds are marked]

24


Malware spreading

A sick node infects all its neighbors during a single time step t
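The spreading rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the network model's "one malware type per node type" is read as meaning that a sick node only infects neighbors of its own type, and the names `spread`, `adjacency`, and `node_type` are hypothetical, not taken from the slides.

```python
def spread(adjacency, node_type, seeds):
    """Return the set of infected nodes once the spreading has stopped.

    adjacency: dict mapping each node to the set of its neighbors
    node_type: dict mapping each node to its type (1..L)
    seeds:     nodes that are infected at time t = 0
    """
    infected = set(seeds)
    frontier = set(seeds)
    while frontier:  # one loop iteration corresponds to one time step
        newly_infected = set()
        for sick in frontier:
            for neighbor in adjacency[sick]:
                # Assumption: malware only infects nodes of the same type.
                same_type = node_type[neighbor] == node_type[sick]
                if same_type and neighbor not in infected:
                    newly_infected.add(neighbor)
        infected |= newly_infected
        frontier = newly_infected
    return infected


# Path graph 0-1-2-3 where node 2 has a different type than the others.
adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
polyculture = {0: 1, 1: 1, 2: 2, 3: 1}
monoculture = {0: 1, 1: 1, 2: 1, 3: 1}

infected_poly = spread(adjacency, polyculture, seeds=[0])
infected_mono = spread(adjacency, monoculture, seeds=[0])
```

In the monoculture the single seed reaches every node, while in the polyculture the node of a different type blocks the path.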

25

Types of spreading patterns

Homogeneous network: all nodes have degrees k approximately equal to the average degree ⟨k⟩
Inhomogeneous network: a small fraction of nodes, the hubs, have degrees k much larger than the average degree ⟨k⟩

26

Malware halting analysis

To halt malware on networks with several million nodes, we first determine (A) the desired distribution of node types, (B) a lower bound on the needed diversity, and (C) the trade-off between diversity and immunization

27

(A) Node type distribution

Let r_l be the probability that an arbitrary node is of type l = 1, 2, …, L
The entropy −∑_l r_l log r_l measures the uncertainty of a node’s assigned type
It has maximum value log L when all r_l = 1/L
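As a quick numerical check of the entropy statement, the sketch below compares a uniform and a skewed distribution over four node types. The function name `type_entropy` and the example probabilities are made up for illustration.

```python
import math


def type_entropy(probabilities):
    """Entropy -sum(r_l * log(r_l)) of a node-type distribution."""
    # Terms with r_l = 0 contribute nothing and are skipped.
    return -sum(r * math.log(r) for r in probabilities if r > 0)


uniform = [1 / 4] * 4          # all r_l = 1/L with L = 4
skewed = [0.7, 0.1, 0.1, 0.1]  # one node type dominates

uniform_entropy = type_entropy(uniform)
skewed_entropy = type_entropy(skewed)
```

The uniform distribution reaches the maximum log L, and any skew lowers the entropy.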

28


Maximize entropy (1)

When the entropy is maximized, the best spreading strategy for each malware type is to select new nodes at random
The probability that a spreading mechanism chooses a node of the wrong type is 1 − 1/L
As L increases, this probability increases and the speed of the malware spreading decreases

29

Maximize entropy (2)

If there is less uncertainty about the distribution of vulnerable nodes, e.g. a few node types occur more often than the other node types in a network, then the entropy is smaller and malware writers can create very efficient topology-aware spreading mechanisms

30

Observation 1

Skewed distributions of node types should be avoided because they facilitate rapid malware spreading

31

(B) Needed diversity

32

Example: MMS malware exploits a smartphone’s address book to spread to new phones with the same OS


MMS malware spreading

33

[Figure legend: phones on email list; phone not on email list; infected phone]

Wang’s network model

Based on location and calling data from 6.2 million mobile subscribers
 Market share determines whether devices with the same OS form a giant component in the calling graph

34

What is a component?

A single-type component is a subset of nodes with the same type such that

  • there is a path between any pair of nodes in the set, and
  • it is not possible to add another node of the same type to the set while preserving this property
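The definition can be illustrated with a small depth-first search that only follows edges between nodes of the same type. This is a sketch with hypothetical names (`single_type_components`, `adjacency`, `node_type`), and it assumes that the connecting path must stay within the set.

```python
def single_type_components(adjacency, node_type):
    """Return the single-type components of a graph as a list of sets."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        component = {start}
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            for neighbor in adjacency[node]:
                # Only grow the component through nodes of the start node's type.
                if neighbor not in seen and node_type[neighbor] == node_type[start]:
                    seen.add(neighbor)
                    component.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


# Path graph 0-1-2-3 with node types 1, 1, 2, 1.
adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
node_type = {0: 1, 1: 1, 2: 2, 3: 1}
components = single_type_components(adjacency, node_type)
```

Node 3 has the same type as nodes 0 and 1 but forms its own component, because its only link to them passes through a node of another type.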

35

Giant component

A giant component of same-type nodes has size proportional to N
 If a giant component contains a seed, then nearly all nodes in the network will be infected

36


[Figure, panels B and C: OS1 has 75% market share and a giant component of 80%; OS2 has 25% market share and a giant component of 6%; the remaining nodes form small connected components and single nodes]

37

Wang’s network model

38

[Figure (Wang et al.): size of the giant component G_m as a function of market share m for MMS malware; there is no giant component below the critical market share m_c and a finite giant component above it]

Roughly 45% of all phones in US were smartphones in March 2011

39

Android market share

Android’s share of the total mobile phone market was 0.45 × 0.35 ≈ 0.16 (16%)
About 62% of the Android users utilized Android Gingerbread 2.3.x

  • Gingerbread’s market share was 0.16 × 0.62 ≈ 0.10 (10%)

40


41

[Figure (Wang et al.): giant component size G_m versus market share m for MMS malware, with m_c marked and the annotation “Gingerbread was here”]

Observation 2

A malware epidemic can only occur when a network contains a giant component of nodes with the same type

42

Diversity needed to avoid giant component

  • degree k: a node’s number of adjacent edges
  • average degree ⟨k⟩ = (1/N)·∑_i k_i
  • average-square degree ⟨k²⟩ = (1/N)·∑_i (k_i)²

43

L ≥ ⌈⟨k²⟩ ∕ (2·⟨k⟩)⌉
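The bound is easy to evaluate from a list of node degrees. The sketch below is one possible way to do it; the function name `diversity_lower_bound` and the two example degree lists are illustrative assumptions. Because the factor 1/N appears in both averages, it cancels, so the code works with integer sums and avoids floating-point rounding.

```python
def diversity_lower_bound(degrees):
    """Return ceil(<k^2> / (2 * <k>)) for a list of node degrees."""
    sum_of_degrees = sum(degrees)
    sum_of_squares = sum(k * k for k in degrees)
    # Integer ceiling division: ceil(a / b) == -(-a // b) for positive b.
    return -(-sum_of_squares // (2 * sum_of_degrees))


# Homogeneous example: ten nodes, all with degree 4.
regular_bound = diversity_lower_bound([4] * 10)

# Inhomogeneous example: one hub with degree 9 and nine nodes with degree 1.
star_bound = diversity_lower_bound([9] + [1] * 9)
```

Even in this tiny example, the single hub raises the bound from 2 to 3 although the average degree of the star is lower.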

(C) Diversity vs immunization

The right-hand side of the bound is large when a network contains nodes with square degrees (k_i)² much bigger than the corresponding degrees k_i
If the node degrees are known, then we can reduce the lower bound by immunizing the nodes with the largest degrees k_i

44


Observation 3

We can immunize a small fraction of all nodes (the hubs) in an inhomogeneous network to reduce the need for diversity

  • because it is expensive to immunize computers, immunization is of limited use in large networks with many hubs

45

Halting method

(based on observations)

The method must handle spreading patterns with

  • unknown and changing topology
  • at least one million nodes
  • unreliable node communication

46

Approach

Since the topology is unknown and communication is unreliable, it is difficult to modify the network structure or ask nodes to change their types based on the types of the neighboring nodes

It is more promising to use a simple method that

  • is robust to varying topologies
  • scales to very large networks

47

Malware halting method

(first version)

  • 1. If practicable, immunize enough large-degree nodes in a network to create a homogeneous subnet when the immunized nodes and their adjacent edges are removed
  • 2. Ensure that the node diversity of the homogeneous subnet is large enough to halt multiple simultaneous malware outbreaks
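A rough sketch of step 1 for a network with known degrees follows. It rests on an assumption that is not in the slides: the subnet is treated as "homogeneous enough" once the diversity bound L ≥ ⌈⟨k²⟩ ∕ (2·⟨k⟩)⌉ of the remaining nodes drops to the diversity that is available. The names `immunize_hubs` and `available_diversity` are hypothetical.

```python
def diversity_lower_bound(degrees):
    """Return ceil(<k^2> / (2 * <k>)) using exact integer arithmetic."""
    return -(-sum(k * k for k in degrees) // (2 * sum(degrees)))


def immunize_hubs(adjacency, available_diversity):
    """Remove largest-degree nodes until the bound fits the available diversity.

    Returns the list of immunized nodes and the remaining subnet.
    """
    remaining = {node: set(neighbors) for node, neighbors in adjacency.items()}
    immunized = []
    while True:
        degrees = [len(neighbors) for neighbors in remaining.values()]
        # Stop when no edges are left or the bound is low enough.
        if sum(degrees) == 0 or diversity_lower_bound(degrees) <= available_diversity:
            return immunized, remaining
        hub = max(remaining, key=lambda node: len(remaining[node]))
        # Removing the hub also removes its adjacent edges.
        for neighbor in remaining.pop(hub):
            remaining[neighbor].discard(hub)
        immunized.append(hub)


# Hub 0 is linked to nodes 1..9, which also form a ring among themselves.
adjacency = {0: set(range(1, 10))}
for node in range(1, 10):
    previous_node = 9 if node == 1 else node - 1
    next_node = 1 if node == 9 else node + 1
    adjacency[node] = {0, previous_node, next_node}

immunized, subnet = immunize_hubs(adjacency, available_diversity=1)
```

Immunizing the single hub leaves a ring in which every node has degree 2, so the bound for the remaining subnet is 1.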

48


49

Video

Malware Halting

Part II: Simulations and Analyses

Kjell Jørgen Hole Simula@UiB

Overview

  • 1. Malware halting in proximity networks
  • 2. Halting in the Enron email network
  • 3. Halting in Barabási and Albert networks
  • 4. Halting in dense IP networks
  • 5. How to immunize unknown hubs
  • 6. Slowing down ‘advanced persistent threats’

51

Proximity networks

52

Phones within Bluetooth range (~10m) Phone out of Bluetooth range Infected phone

slide-14
SLIDE 14

Proximity network model

  • 1. We generate a proximity network with average node degree ⟨k⟩ by first placing N nodes uniformly at random on a square
  • 2. An edge is then added between a randomly chosen node and its closest neighbor in Euclidean distance

53

Network generation

  • 3. More edges are added the same way until the network has the desired average degree ⟨k⟩
  • self-loops and multiple edges between nodes are not allowed
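The three generation steps can be sketched as follows. Two details are assumptions made here: "closest neighbor" is read as the closest node that is not yet connected to the chosen node, and the target is reached when the number of edges equals ⌈⟨k⟩·N/2⌉. The function name `proximity_network` is hypothetical.

```python
import math
import random


def proximity_network(n, target_mean_degree, seed=0):
    """Generate a proximity network on the unit square.

    Returns the node positions and an adjacency dict (node -> set of neighbors).
    """
    if target_mean_degree > n - 1:
        raise ValueError("average degree cannot exceed n - 1 in a simple graph")
    rng = random.Random(seed)
    position = [(rng.random(), rng.random()) for _ in range(n)]
    adjacency = {node: set() for node in range(n)}
    target_edges = math.ceil(target_mean_degree * n / 2)
    edges = 0
    while edges < target_edges:
        node = rng.randrange(n)
        # Exclude the node itself and existing neighbors (no self-loops or multi-edges).
        candidates = [other for other in range(n)
                      if other != node and other not in adjacency[node]]
        if not candidates:
            continue
        nearest = min(candidates,
                      key=lambda other: math.dist(position[node], position[other]))
        adjacency[node].add(nearest)
        adjacency[nearest].add(node)
        edges += 1
    return position, adjacency


position, adjacency = proximity_network(n=50, target_mean_degree=4)
mean_degree = sum(len(neighbors) for neighbors in adjacency.values()) / 50
```

The quadratic search for the nearest candidate is fine for small examples; a spatial index would be needed for networks of the size used in the simulations.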

54

N=300 ⟨k⟩=5 L=4 S=10

55

Proximity network simulations

Average node degree ⟨k⟩                      5      6      7      8
Minimum needed node types (diversity) L      3      4      4      5
Fraction of infected nodes                   3.4%   3.6%   4.6%   4.8%

N=5000, each fraction averaged over 500 networks with random distribution of node types and seeds, S=10

56


NetLogo proximity model

57

Demo

Enron email network

Sparse and inhomogeneous email network
Nodes represent email addresses belonging to former Enron employees
N = 36 692, ⟨k⟩ = 10.02 (183 831 edges)
The largest hub has degree 1 383

58

Diversity vs immunization

The lower bound on the required diversity is L ≥ 71 before hub immunization
After immunizing the 612 nodes with degrees larger than 0.05% of the total number of edges, the lower bound reduces to L ≥ 9

59

L ≥ ⌈⟨k²⟩ ∕ (2·⟨k⟩)⌉

Enron network simulations

Diversity L                   9      10     11     12
Fraction of infected nodes    5.2%   3.5%   2.4%   1.8%

N=36 692, each fraction averaged over 500 networks with random distribution of node types and seeds, S=10

60


BA networks

The Barabási and Albert (BA) model grows an inhomogeneous scale-free network

61

Example BA network

62

N=40 L=3 S=1

NetLogo BA 3D model

63

Demo

Dense IP networks

Assume that L types of random scanning malware spread over a complete network with N nodes of degree k = N − 1

  • there are L node types and N/L uniformly distributed nodes per type

64


Dense network analysis

Let there be one seed (S = 1) per node type
Each seed has edges to the other N/L − 1 nodes of the same type
Together, the N/L single-type nodes form a star graph with the seed in the center

65

Star graph

66

Seed in the center will infect all other nodes

Need more node types than malware types

Since the seed will always infect all the peripheral nodes in the star graph, it does not help to increase the number of node types L as long as there is one seed per node type
The only way diversity can halt multi-malware outbreaks in dense networks is to use many more node types than there are malware types

67

Needed diversity in dense IP network (1)

If there are M malware types, then M·N/L nodes will be infected
Hence, the diversity L needs to be proportional to N and the number of malware types M must be much smaller than N to prevent a large infection

68


Needed diversity in dense IP networks (2)

The previous observation is in accordance with the diversity bound, which is equal to L ≥ (N − 1)/2 for k = N − 1
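For completeness, the bound for the complete network follows directly from the degree definitions, since every node has degree k_i = N − 1. The derivation below is a worked restatement that keeps the ceiling from the general bound:

```latex
\langle k \rangle = \frac{1}{N}\sum_{i=1}^{N} k_i = N-1, \qquad
\langle k^2 \rangle = \frac{1}{N}\sum_{i=1}^{N} k_i^2 = (N-1)^2, \qquad
L \ge \left\lceil \frac{\langle k^2 \rangle}{2\langle k \rangle} \right\rceil
  = \left\lceil \frac{(N-1)^2}{2(N-1)} \right\rceil
  = \left\lceil \frac{N-1}{2} \right\rceil .
```

The required diversity therefore grows linearly with N, in line with the requirement that L be proportional to N.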

69

Advantage of diversity in dense networks (1)

Software diversity cannot stop malware spreading in dense random scanning networks, but it can slow the spreading

  • the likelihood of selecting a node of the wrong type, 1 − 1/L, goes to 1 as the diversity L increases

70

Advantage of diversity in dense networks (2)

If the malware spreading is slow, then it is possible for a (cloud-based) anti-malware solution to tailor the defense to a particular outbreak

71

Unknown hubs

Acquaintance immunization immunizes unknown hubs on a monoculture

72


Acquaintance immunization

(monoculture version)

Choose a set of nodes uniformly at random and immunize one arbitrary neighbor per node

  • while the original set of nodes is unlikely to contain the relatively few hubs in an inhomogeneous network, the randomly selected neighbors are much more likely to be hubs, since very many edges are adjacent to high-degree nodes
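The monoculture version can be sketched as below. The sketch assumes a connected graph, so every sampled node has at least one neighbor; the function name `acquaintance_immunization` and the star-graph example are illustrative only.

```python
import random


def acquaintance_immunization(adjacency, sample_size, seed=0):
    """Immunize one arbitrary neighbor of each randomly chosen node."""
    rng = random.Random(seed)
    # Assumption: isolated nodes are skipped because they have no neighbor to immunize.
    nodes_with_neighbors = [node for node in adjacency if adjacency[node]]
    chosen = rng.sample(nodes_with_neighbors, sample_size)
    immunized = set()
    for node in chosen:
        # sorted() only makes the random choice reproducible for a given seed.
        immunized.add(rng.choice(sorted(adjacency[node])))
    return immunized


# Star graph: hub 0 is connected to 20 peripheral nodes.
adjacency = {0: set(range(1, 21))}
for leaf in range(1, 21):
    adjacency[leaf] = {0}

immunized = acquaintance_immunization(adjacency, sample_size=5)
```

At most one of the five sampled nodes can be the hub itself, so at least four samples are peripheral nodes whose only neighbor is the hub. The hub is immunized without its degree ever being looked up.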

73 74

Acquaintance immunization

(polyculture version)

  • 1. Assume N_l = N/L nodes per type
  • 2. For some fraction 0 < f < 1, choose a set of f·N_l nodes of type l uniformly at random such that each node has at least one neighbor of the same type, l = 1, 2, ..., L
  • 3. Immunize one randomly selected neighbor of type l per node in the set of f·N_l nodes

75

Observation

When the set of all immunized neighbors (up to f·N = ∑_l f·N_l nodes) is large enough, it will contain most of the hubs

76


77

Immunization example

Immunized nodes and remaining susceptible hubs

APT defined

Advanced Persistent Threat (APT): targeted effort to obtain or change information by means that are

  • difficult to discover,
  • difficult to remove, and
  • difficult to attribute

78

APT examples

Examples of APTs are state-sponsored attacks on foreign commercial and governmental enterprises to steal industrial and military secrets
The attacks are often initiated by well-timed, socially engineered (spear-phishing) emails delivering trojans to individuals with access to sensitive information

79

Why malicious email?

Malicious email is leveraged because most enterprises allow email to enter their networks
Persistent attackers frequently exploit OS or application vulnerabilities in the targeted systems

80


Attack description

An attacker first develops a payload to exploit one or more vulnerabilities
Next, an automated tool such as a PDF or a Microsoft document delivers the payload to a few users of a system
The payload installs a backdoor or provides remote system access, allowing the attacker to establish a presence inside the trusted system boundary

81

Diversity slows down attacks

If a user and an attacker download the same program from an application store that generates diverse executable files, then the two downloaded files share a common vulnerability with probability 1/L

82

Diversity slows down attacks

When the diversity L is large, the probability of a common vulnerability is small and attackers can no longer reliably analyze their own downloaded program files to exploit vulnerabilities in users’ program files

83

Reverse engineering in a monoculture is easy

84

[Figure: a security patch or replacement software makes the software safe, while unpatched software remains vulnerable; the attacker can compute an exploit by comparing the unpatched and patched software versions; users who haven't yet applied the patch are put at risk by the release of the patch]


Situation in polyculture

85

It is necessary to create software patches tailored to the different binary versions of the same program
An attacker cannot reverse engineer software patches by comparing a particular patch to the corresponding code on a user’s computer because the patch and code are unknown to the attacker

Reverse engineering in a polyculture is hard

86

[Figure: “ID of my variant”; “custom patch for this version only”; a custom patch is worthless unless you have the exact variant it relates to]

Malware Halting

Part III: From Fragility to Antifragility

Kjell Jørgen Hole Simula@UiB

Overview

  • 1. Definition of fragility, robustness, and antifragility
  • 2. Antifragility to malware spreading
  • 3. Network model
  • 4. NetLogo demo

88


89

A property of a complex adaptive system is fragile if it is easily damaged by internal or external perturbations, robust if it can withstand perturbations (up to a point), and antifragile if it learns from incidents how to become increasingly robust over time

Fragile, robust, and antifragile systems

90

Antifragile

91

Please mishandle

Fragile Robust Antifragile

92


Fragile monoculture

93

A system is fragile to infectious malware when the malware spreads over the network

[Figure label: initially infected node]

Robust polyculture

94

A system is robust to infectious malware when there is only limited spreading

Antifragility to malware

A system is antifragile when it “learns” from previous malware outbreaks how to reduce the spreading of future outbreaks

  • the previously studied epidemiological model is not antifragile because it does not learn

95

Software diversity + imperfect malware detection = antifragility to malware spreading

96


[Figure: the software developer creates software and delivers it to the app store; the diversity engine within the app store creates variants; subsequent downloaders receive functionally identical but internally different versions of the same software]

Software diversity

Application stores, e.g. Google Play and iOS App Store, can utilize compilers with “diversity engines”

97

Devices can repeatedly download executables to create time-varying software diversity

Malware detection

Behavior-based methods increase the detection rate compared to signature-based methods, but the detection is not perfect

98

Malware detection allows a system to partially learn when new software should be downloaded to a device

New epidemiological model

  • Simple connected graph with N nodes and maximum L node types
  • Discrete time t = 0, 1, ...
  • D = D(t) node types, 1 ≤ D(t) ≤ L, at time step t

99

Software downloads

  • All nodes change type with probability p at each time t to model automated software downloads
  • each of the L possible node types is selected with probability 1/L

100


Malware outbreaks

  • A single susceptible node is infected with probability q at each time t
  • the node is selected uniformly at random
  • A sick node will infect all its neighbors of the same type

101

Malware detection

  • Infected nodes change type with probability r during a time step to model malware detection
  • detection is followed by immediate download of new software, i.e., change of node type
  • any infected node becomes susceptible when it changes type
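One time step of the new model, with its three probabilities p, q, and r, might be sketched as follows. The slides do not specify the order of the events within a step, so the order used here (detection, downloads, spreading, new outbreak) is an assumption, as is the choice to clear an infection on every download even when the redrawn type happens to be the same. The function name `time_step` and the example graph are hypothetical.

```python
import random


def time_step(adjacency, node_type, infected, L, p, q, r, rng):
    """Advance the model by one time step; node_type and infected are updated in place."""
    # Malware detection: each infected node downloads new software with probability r.
    for node in list(infected):
        if rng.random() < r:
            node_type[node] = rng.randint(1, L)
            infected.discard(node)
    # Automated downloads: every node changes type with probability p.
    for node in adjacency:
        if rng.random() < p:
            node_type[node] = rng.randint(1, L)
            infected.discard(node)
    # Spreading: each sick node infects all its neighbors of the same type.
    newly_infected = {neighbor
                      for sick in infected
                      for neighbor in adjacency[sick]
                      if node_type[neighbor] == node_type[sick]}
    infected.update(newly_infected)
    # New outbreak: with probability q, one random susceptible node becomes infected.
    susceptible = [node for node in adjacency if node not in infected]
    if susceptible and rng.random() < q:
        infected.add(rng.choice(susceptible))


# Path graph 0-1-2 in a monoculture with node 0 infected.
adjacency = {0: {1}, 1: {0, 2}, 2: {1}}
rng = random.Random(1)

# Without downloads, outbreaks, or detection the infection advances one hop.
types_a, infected_a = {0: 1, 1: 1, 2: 1}, {0}
time_step(adjacency, types_a, infected_a, L=3, p=0.0, q=0.0, r=0.0, rng=rng)

# With perfect detection (r = 1) the infected node is cleaned before it can spread.
types_b, infected_b = {0: 1, 1: 1, 2: 1}, {0}
time_step(adjacency, types_b, infected_b, L=3, p=0.0, q=0.0, r=1.0, rng=rng)
```

Repeating `time_step` with 0 < r < 1 and p > 0 gives a time-varying polyculture with imperfect detection.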

102

Malware halting method

(second version)

103

  • 1. From fragile monoculture to robust polyculture
  • using time-varying software diversity
  • 2. From robust to antifragile polyculture
  • using imperfect malware detection

NetLogo model (1)

104

[Figure annotations: spike due to monoculture; malware detection on; change in spreading mechanism; immunization of hubs]


NetLogo model (2)

105

[Figure labels: polyculture (four times); monoculture; time-varying polyculture]

NetLogo model (3)

106

[Figure labels: monoculture; polyculture; imperfect hub immunization]

Demo

107

Fragility vs antifragility

108

Fragile monoculture                       Antifragile polyculture
Large spreading of malware                Nearly no malware spreading
Needs continuous repair                   Self-repairing (up to a point)
Must immunize nearly all devices          Immunization of hubs
No adaptation to changes in spreading     Adapts to changes


Further study

  • 1. Watch Nassim Taleb and Daniel Kahneman

discuss antifragility

  • www.youtube.com/watch?v=MMBclvY_EMA
  • 2. Watch Taleb give a talk at Stanford
  • www.youtube.com/watch?v=c6sX5MSdLag
  • 3. Find more talks on the web

109

References (1)

  • M. Franz, E Unibus Pluram: Massive-Scale Software Diversity as a Defense Mechanism, Proc. New Security Paradigms Workshop 2010 (NSPW 2010), Concord, Massachusetts, USA, Sept. 21–23, 2010, pp. 7–16
  • P. Wang, M. C. González, C. A. Hidalgo, A.-L. Barabási, Understanding the Spreading Patterns of Mobile Phone Viruses, Science, vol. 324, 22 May 2009, pp. 1072–1076

110

References (2)

  • K. J. Hole, Diversity Reduces the Impact of Malware, IEEE Security & Privacy Magazine, vol. 13, no. 3, 2015, pp. 48–54
  • K. J. Hole, Towards Anti-fragility to Malware Spreading, IEEE Security & Privacy Magazine, vol. 13, no. 4, 2015, pp. 40–46


111

References (3)

  • N. N. Taleb, Antifragile: Things That Gain from Disorder, Random House, 2012
  • D. Bilar, Known Knowns, Known Unknowns and Unknown Unknowns: Anti-virus issues, malicious software and Internet attacks for non-technical audiences, journals.sas.ac.uk/deeslr/article/view/1880

112


References (4)

  • R. Cohen, S. Havlin, and D. Ben-Avraham, Efficient immunization strategies for computer networks and populations, Physical Review Letters, vol. 91, no. 24, Article ID 247901, 2003


113