

SLIDE 1

Empirical Studies in Cybersecurity: Some Challenges

Michel Cukier

SLIDE 2

Adding Science to Cybersecurity

  • Empirical studies are needed to add science to cybersecurity

  • Challenges:

– Security metrics are lacking
– Security data are not publicly available

SLIDE 3

Availability of Security Data

  • The few available datasets have issues (e.g., MIT LL 98/99)

  • NSF helped initiate collaborations, but none succeeded (2001)

  • NSF workshop on the lack of available data (2010)

  • DHS PREDICT dataset:

– Context is missing
– More datasets will be added over time

SLIDE 4

The End?

SLIDE 5

A Rare Collaboration

  • Unique relationship with

– G. Sneeringer, Director of Security, and his security team at the Office of Information Technology

  • Access to security-related data collected on the UMD network

  • Development of testbeds for monitoring attackers

Enables unique empirical studies

SLIDE 6

Incident Data

  • Incidents:

– Confirmed compromised computers
– More than 12,000 records since June 2001

  • Models:

– Software reliability growth models, time series, epidemiological models (see the sketch after this slide's bullets)

  • Questions:

– # incidents: relevant metric?
– Impact of time (age, duration)?
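
A minimal sketch of fitting one of the listed model families to such incident data, assuming a Goel-Okumoto software reliability growth model and synthetic cumulative counts; the data, names, and parameters below are illustrative, not the UMD records or the study's actual fit.

```python
# Minimal sketch: fit a Goel-Okumoto SRGM m(t) = a * (1 - exp(-b t))
# to cumulative incident counts. The counts below are synthetic placeholders.
import numpy as np
from scipy.optimize import curve_fit

def goel_okumoto(t, a, b):
    """Expected cumulative number of incidents by time t."""
    return a * (1.0 - np.exp(-b * t))

# Synthetic example: weeks since monitoring began and cumulative incidents.
weeks = np.arange(1, 53)
cumulative_incidents = 400 * (1 - np.exp(-0.05 * weeks)) + np.random.normal(0, 5, weeks.size)

params, _ = curve_fit(goel_okumoto, weeks, cumulative_incidents, p0=(500, 0.01))
a_hat, b_hat = params
print(f"Estimated total incidents a = {a_hat:.1f}, detection rate b = {b_hat:.3f}")
```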

SLIDE 7

Intrusion Prevention System (IPS) Data

  • Intrusion Prevention System (IPS) alerts:

– IPSs located at the border and inside the UMD network
– More than 7 million events since September 2006

  • Models:

– Identify outliers, define metrics containing some memory (see the sketch below)

  • In-house validation
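
A minimal sketch of what a "metric containing some memory" could look like, assuming an exponentially weighted moving average (EWMA) over daily alert counts with an illustrative threshold; this is not the specific metric applied to the UMD IPS data.

```python
# Minimal sketch: flag days whose IPS alert count strongly exceeds an EWMA baseline.
# The EWMA choice, alpha, and threshold are illustrative assumptions.
from typing import List, Tuple

def ewma_outliers(daily_counts: List[float], alpha: float = 0.2,
                  threshold: float = 3.0) -> List[Tuple[int, float]]:
    """Return (day index, count) pairs whose count exceeds the EWMA baseline by a factor."""
    outliers = []
    ewma = daily_counts[0]
    for day, count in enumerate(daily_counts[1:], start=1):
        if count > threshold * max(ewma, 1.0):     # compare against the remembered baseline
            outliers.append((day, count))
        ewma = alpha * count + (1 - alpha) * ewma  # update the memory
    return outliers

print(ewma_outliers([120, 130, 110, 900, 125, 140, 118]))  # -> [(3, 900)]
```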
SLIDE 8

Network Flows

  • Network flows:

– 130,000 IP addresses monitored (two class B networks belonging to UMD)

  • Tool:

– Goal: increase network visibility
– Nfsight (available on SourceForge)

  • In-house validation
  • Next goal:

– An efficient flow-based IDS

SLIDE 9

Backend Algorithm

Algorithm:

  • Receive a batch of 5 minutes of flows
  • Pair up unidirectional flows using {src/dst IP/port and protocol}
  • Run heuristics and calculate probabilities for each end point to host a service

  • Output end point results and bidirectional flows

Request flow: 2009-07-30 09:34:56.321 TCP 10.0.0.1:2455 → 10.1.2.3:80
Reply flow:   2009-07-30 09:34:56.322 TCP 10.1.2.3:80 → 10.0.0.1:2455
Bi-flow:      2009-07-30 09:34:56.321 TCP 10.0.0.1:2455 → 10.1.2.3:80

[Diagram: 10.0.0.1 is identified as the client connecting to tcp/80 on 10.1.2.3, which is identified as the server hosting tcp/80]
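
A minimal sketch of the pairing step in the algorithm above, assuming simplified flow records matched on the {src/dst IP, src/dst port, protocol} tuple; Nfsight's actual data structures and 5-minute batching differ.

```python
# Minimal sketch: pair unidirectional flows into bi-flows by matching mirrored 5-tuples.
from collections import namedtuple

Flow = namedtuple("Flow", "ts proto src_ip src_port dst_ip dst_port")

def pair_flows(flows):
    """Pair request/reply flows whose 5-tuples mirror each other."""
    pending = {}   # key: 5-tuple of a flow waiting for its reverse direction
    biflows = []
    for f in flows:
        reverse_key = (f.proto, f.dst_ip, f.dst_port, f.src_ip, f.src_port)
        if reverse_key in pending:
            first = pending.pop(reverse_key)
            # The earlier flow is taken as the request direction of the bi-flow.
            biflows.append(first if first.ts <= f.ts else f)
        else:
            pending[(f.proto, f.src_ip, f.src_port, f.dst_ip, f.dst_port)] = f
    return biflows, list(pending.values())   # paired bi-flows + unmatched unidirectional flows

request = Flow("2009-07-30 09:34:56.321", "TCP", "10.0.0.1", 2455, "10.1.2.3", 80)
reply   = Flow("2009-07-30 09:34:56.322", "TCP", "10.1.2.3", 80, "10.0.0.1", 2455)
print(pair_flows([request, reply]))
```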

SLIDE 10

Heuristics

Heuristic ID   Features and Formula Used                    Output Values
Heuristic 0    Timestamp of request < Timestamp of reply    [0, …]
Heuristic 1    Src port > Dst port                          {0, 0.5, 1}
Heuristic 2    Src port > 1024 > Dst port                   {0, 0.5, 1}
Heuristic 3    Port in /etc/services                        {0, 0.5, 1}
Heuristic 4    # ports related                              [0, …]
Heuristic 5    # IP related                                 [0, …]
Heuristic 6    # tuples related                             [0, …]

Categories: Timing (Heuristic 0), Port numbers (Heuristics 1–3), Fan in/out relationships (Heuristics 4–6)
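
A minimal sketch of the port-number heuristics (Heuristics 1–3) applied to a bi-flow, assuming a small stand-in for /etc/services and a plain average to combine the three votes into a server score; how Nfsight actually weights and combines the heuristic outputs is not shown on this slide.

```python
# Minimal sketch: score how likely the *destination* endpoint of a bi-flow hosts a service.
WELL_KNOWN_PORTS = {21, 22, 25, 53, 80, 443}   # stand-in for /etc/services

def vote(dst_looks_like_server: bool, src_looks_like_server: bool) -> float:
    """1.0 if only the destination looks like a server, 0.0 if only the source does, 0.5 if ambiguous."""
    if dst_looks_like_server == src_looks_like_server:
        return 0.5
    return 1.0 if dst_looks_like_server else 0.0

def server_score(src_port: int, dst_port: int) -> float:
    h1 = vote(dst_port < src_port, src_port < dst_port)                    # Heuristic 1: lower port
    h2 = vote(dst_port < 1024 <= src_port, src_port < 1024 <= dst_port)    # Heuristic 2: privileged vs. ephemeral
    h3 = vote(dst_port in WELL_KNOWN_PORTS, src_port in WELL_KNOWN_PORTS)  # Heuristic 3: known service port
    return (h1 + h2 + h3) / 3

print(server_score(2455, 80))   # bi-flow 10.0.0.1:2455 -> 10.1.2.3:80 scores 1.0
```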

SLIDE 11

Front-end

SLIDE 12

Case Study: Scanning Activity

SLIDE 13

Case Study: Worm Outbreak

SLIDE 14

Case Study: Distributed Attacks

SLIDE 15

Honeypot (HP) Data

  • Honeypot data:

– Malicious activity collected on more than 1,200 HPs (low and high interaction)
– Low interaction HPs deployed at UIUC, AT&T, PJM, France and Morocco
– High interaction HPs for study of attacks/attackers

SLIDE 16

Details of Experiment

  • Easy access to honeypots through entry point: SSH
  • Multiple honeypots per attacker for an extended period of time: one month
  • Configure honeypots given to one attacker with increasing network limitations: some ports blocked
  • Collect data such as network traffic, keystrokes entered and rogue software downloaded

SLIDE 17

Configuration Details

  • The network gateway has two network interfaces:

– One in front of the Internet, configured with 40 public IP addresses from the University of Maryland
– One configured with a private IP address

  • OpenSSH was modified to reject SSH attempts on its public IP addresses until the 150th try
  • Up to 40 honeypots can exist in parallel
  • Attackers can deploy up to 3 honeypots
  • Honeypots:

– HP1: no network limitation
– HP2: main IRC port blocked (port 6667)
– HP3: every port blocked except HTTP, HTTPS, FTP, DNS, and SSH
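
A minimal sketch of the gateway policy described above (per-source-IP rejection until the 150th SSH attempt, at most three honeypots per attacker with increasing limitations); the bookkeeping and the moment at which a new honeypot is handed out are simplified assumptions, not the modified OpenSSH code.

```python
# Minimal sketch: reject SSH attempts from a source IP until its 150th try,
# then assign honeypots in order of increasing network limitation (up to 3).
from collections import defaultdict

ATTEMPT_THRESHOLD = 150
HONEYPOT_SEQUENCE = ["HP1 (no limitation)",
                     "HP2 (IRC port 6667 blocked)",
                     "HP3 (only HTTP, HTTPS, FTP, DNS, SSH allowed)"]

attempts = defaultdict(int)    # source IP -> number of SSH attempts observed
assigned = defaultdict(list)   # source IP -> honeypots already handed out

def handle_ssh_attempt(src_ip: str) -> str:
    """Reject until the 150th attempt, then hand out up to three honeypots."""
    attempts[src_ip] += 1
    if attempts[src_ip] < ATTEMPT_THRESHOLD:
        return "reject"
    if len(assigned[src_ip]) < len(HONEYPOT_SEQUENCE):
        hp = HONEYPOT_SEQUENCE[len(assigned[src_ip])]
        assigned[src_ip].append(hp)
        return f"accept -> {hp}"
    return "accept -> existing honeypot"

# The 150th attempt from the same source is the first one accepted.
results = [handle_ssh_attempt("203.0.113.7") for _ in range(150)]
print(results[148], "|", results[149])   # reject | accept -> HP1 (no limitation)
```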

SLIDE 18

Test-bed Architecture

SLIDE 19

Attacker Identification

  • Attacker IP address
  • Attacker AS number (identifies network on the Internet)

  • Attacker actions:

– Rogue software origin
– Way of performing specific actions
– Files accessed

  • Comparison of keystroke profiles
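
A minimal sketch of one way keystroke profiles could be compared, assuming each session is reduced to a set of commands and scored with Jaccard similarity; the comparison method actually used in the study is not given on this slide, so this is an illustrative stand-in.

```python
# Minimal sketch: compare two sessions' keystroke profiles as command sets.
def jaccard(profile_a: set, profile_b: set) -> float:
    """Similarity in [0, 1] between two sets of commands."""
    if not profile_a and not profile_b:
        return 1.0
    return len(profile_a & profile_b) / len(profile_a | profile_b)

session_1 = {"wget", "tar", "ps aux", "passwd", "history -c"}
session_2 = {"wget", "tar", "w", "passwd", "history -c"}
print(f"Keystroke profile similarity: {jaccard(session_1, session_2):.2f}")  # 0.67
```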
SLIDE 20

Attacker Skills

  • Analyst assesses attacker skill
  • Preferred approach: easier to reproduce
  • Criteria based on:

– Is the attacker careful about not being seen?
– Does the attacker check the target environment?
– How familiar is the attacker with the rogue software?
– Is the attacker protecting the compromised target?

SLIDE 21

Attacker Skills (Cont.)

Criterion                  Assessment
Hide                       Ratio of # sessions where attacker hid
Restore deleted files      Ratio of # sessions where deleted files were restored
Check presence             Ratio of # sessions where presence was checked
Delete downloaded file     0 if downloaded file is not deleted, 1 otherwise
Check system               0 if system has never been checked, 1 otherwise
Edit configuration file    0 if configuration file has never been edited, 1 otherwise
Change system              0 if system has never been modified, 1 otherwise
Change password            0 if password has never been changed, 1 otherwise
Create new user            0 if no new user has been created, 1 otherwise
Rogue software adequacy    0 if less than half of the installed rogue software is adequate, 1 otherwise
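
A minimal sketch of turning the ten criteria above into a per-attacker score, assuming the ratio and 0/1 values are simply summed; the slides do not state how the criteria are combined into the skill levels shown later, so this weighting is purely illustrative, as are the sample values.

```python
# Minimal sketch: combine the ten criterion values (each in [0, 1]) into one score.
def skill_score(criteria: dict) -> float:
    """Sum of the ten criterion values; ratios and 0/1 indicators both lie in [0, 1]."""
    return sum(criteria.values())

attacker = {
    "hide": 0.6,                     # ratio of sessions where the attacker hid
    "restore_deleted_files": 0.0,
    "check_presence": 1.0,
    "delete_downloaded_file": 1.0,
    "check_system": 1.0,
    "edit_configuration_file": 0.0,
    "change_system": 1.0,
    "change_password": 1.0,
    "create_new_user": 0.0,
    "rogue_software_adequacy": 1.0,
}
print(f"Skill score: {skill_score(attacker):.1f} / 10")   # 6.6 / 10
```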

SLIDE 22

Overall Results

  • Experiment run from May 17th, 2010 to November 5th, 2010

Honeypot   # sessions   # non-empty sessions
All        312          211 (68%)
HP1        160          110 (69%)
HP2        105          74 (70%)
HP3        47           27 (57%)

SLIDE 23

Who Launched the Attacks?

Top countries (brute force): China (34), USA (27), Korea (8), Italy (7)
Top countries (compromise): Romania (75), Lebanon (32), USA (24), UK (16)

[Panels: Based on IP Address; Based on AS Number]

SLIDE 24

Analysis as a Function of Attacker Skill

  • Results:

– 95% check presence or system
– 79% delete downloaded file
– 77% change the password
– 15% create a new user

  • There might be a link between attackers' actions and their skills

[Chart: All honeypots. Percentage of attackers per criterion ID (1–10): 59%, 21%, 95%, 79%, 95%, 46%, 49%, 77%, 15%, 56%]

SLIDE 25

Analysis as a Function of Attacker Skill (Cont.)

[Charts: (a) Create new user. Percentage of attackers per skill level: 50%, 33%, 17%; average skill level = 7.7. (b) Hide. Percentage of attackers per skill level: 4%, 4%, 13%, 39%, 13%, 22%, 4%; average skill level = 6.3]

SLIDE 26

Analysis as a Function of Attacker Skill (Cont.)

[Charts: (c) Check presence. Percentage of attackers per skill level: 8%, 8%, 8%, 24%, 27%, 8%, 14%, 3%; average skill level = 6.0. (d) Password change. Percentage of attackers per skill level: 3%, 3%, 7%, 23%, 30%, 13%, 17%, 3%; average skill level = 5.5]

SLIDE 27

Why Was the Attack Launched?

  • For the 60 deployed honeypots, 9 (15%) were targeted by more than one attacker
  • 7 honeypots were targeted by 2 different attackers, one honeypot by 3 different attackers and 1 honeypot by 5 different attackers
  • Raises the important issue of how access is shared and why
  • Even though 77% of the attackers changed the password, 15% did share access with at least 1 other attacker

Average number of attackers per honeypot type:

Honeypot type   Average number of attackers
HP1             1.18
HP2             1.44
HP3             1.00
All             1.20

SLIDE 28

Challenges

  • Generalization?

– Replication (same method)
– Reproduction (different method)
– Re-analysis of data

  • Issues:

– Need collaborations for replication
– Need to develop a new method for reproduction
– Re-analysis might not be possible

SLIDE 29

The End?

SLIDE 30

Theories from Social Sciences to Add Science to Cybersecurity

  • For the last year:

– Focus on criminological theories
– Collaboration with David Maimon and his research team

  • Consider various criminological theories
  • Identify theories that need to be adapted to cybersecurity

SLIDE 31

New Use of IPS Alerts

  • Application to Routine Activity Theory (RAT):

– Crime is normal and depends on the opportunities available
– If a target is not protected enough, and if the reward is worth it, crime will happen

  • Alerts = Attack attempts (blocked by IPS)
  • Results:

– Number of alerts is linked to daily activity
– Origin of attack is linked to user origin
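
A minimal sketch of one way the "daily activity" link could be checked, assuming IPS alert timestamps are bucketed by hour of day and compared between business hours and off-hours; the timestamps and the hour split are illustrative, not the analysis actually performed on the UMD alerts.

```python
# Minimal sketch: bucket alert timestamps by hour of day and compare
# business hours (08:00-18:00) with off-hours. Timestamps are synthetic.
from datetime import datetime
from collections import Counter

alerts = [
    "2012-03-05 09:14:02", "2012-03-05 10:41:55", "2012-03-05 14:02:10",
    "2012-03-05 15:37:44", "2012-03-05 02:13:08", "2012-03-05 23:50:31",
]

hour_counts = Counter(datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").hour for ts in alerts)
business = sum(c for h, c in hour_counts.items() if 8 <= h < 18)
off_hours = sum(hour_counts.values()) - business
print(f"Alerts 08:00-18:00: {business}, off-hours: {off_hours}")  # 4 vs 2
```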

SLIDE 32

Use of Honeypot Data

  • Describe attacker/attack:

– Network data
– Attacker keystrokes

  • Empirical study:

– Effect of warnings
– Various HP configurations (CPU, memory, disk space)

SLIDE 33

Issues

  • Mismatch between what criminological theories need and what HP data contain
  • Need statistically significant results (e.g., 6 months, over 120 HPs/week deployed, about 2,900 HPs, 3,700 sessions)
  • Experiments need to be deployed over a long period of time: attacks/attackers might evolve

SLIDE 34

Some Good News

  • Empirical studies are solid scientific work
  • Developed approaches can be applied at other locations
  • Results do not need to be identical (e.g., crime varies between cities)

SLIDE 35

The End!

SLIDE 36

Acknowledgments

  • Gerry Sneeringer and the OIT Security team
  • Bertrand Sobesto and Ed Condon
  • Robin Berthier (UIUC) and Danielle Chrun
  • David Maimon and his research team
  • Maryland Cybersecurity Center (MC2)
SLIDE 37

More Information

Michel Cukier
Email: mcukier@umd.edu
Phone: 301 314 2804
URL: http://www.enre.umd.edu/faculty/cukier.htm