Security Data Science Joshua Neil Security Analytics Leader - - PowerPoint PPT Presentation



SLIDE 1

Security Data Science

Joshua Neil
Security Analytics Leader
Advanced Security Center, EY

Powered by

SLIDE 2

Agenda

A. Background, LANL and the EY/LANL Alliance
B. Traditional Security and Operational Security Data Science
C. Advanced Analytics: PathScan, CodeVision, Credential Analytics

SLIDE 3

Los Alamos National Lab (LANL)

Cyber Security Experience

► National Security Laboratory
  ► Focused on security science to protect the nation
► Long history of networking
  ► First connected to Arpanet in 1983
► Long history of cyber security
  ► First attack in 1983
  ► Over 15 years of data collected
  ► Many nation state and criminal attacks
► Long history of cyber R&D
  ► For defense of LANL’s network and US DOD networks
► Strong analytics program

Recent Top Innovations

Impenetrable encryption prevents data from cyber terrorism

National Multipronged HIV vaccine shows promise in monkeys

Tree death worldwide linked to warming climate

ChemCam inspects Mars: can it support life?

Roadrunner firsts pave way for more powerful supercomputing

Space probes predict hazards to protect spacecraft

Portable laser tool to thwart nuclear smugglers

RAPTOR telescope witnesses black hole birth

Liquid-scanning technology boosts airport security

Improved biofuel methods: may be greener, cheaper yet powerful

SLIDE 4

History of our work

From Government Research to Commercial Application

2000: LANL data science for national security
2007: Begin enterprise security analytics research
2011: PathScan and CodeVision go live on LANL’s network
2012: Other USG deployments
2013: RFP for proposals to commercialize
2014: EY wins rights, team moves to EY from LANL
2015: Commercial deployment and further development

SLIDE 5

Agenda

A. Background, LANL and the EY/LANL Alliance
B. Traditional Security and Operational Security Data Science
C. Advanced Analytics: PathScan, CodeVision, Credential Analytics

SLIDE 6

Signatures and Rules

Traditional Cyber Security is not Statistical

► Rules and signatures
  ► Network IDS Example: If {bytes leaving network > X MB}, set off alarm
    ► If X is too large, easy to get around
    ► If X is too small, affects usability of the network
  ► Antivirus Example: If {executable contains known malware string}, set off alarm
    ► Requires having seen the malware before!
    ► Very easy for attackers to make new 0-day malware
► These brittle approaches are the reason we have so many breaches today
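The brittleness of both rule types can be shown in a few lines. This is a minimal sketch only: the 100 MB threshold, the transfer sizes, and the byte string `EVIL_PAYLOAD` are hypothetical values chosen for illustration, not figures from the slides.

```python
def bytes_rule(bytes_out_mb, threshold_mb):
    """Network IDS style rule: alarm when one transfer exceeds X MB."""
    return bytes_out_mb > threshold_mb


def signature_rule(executable, known_strings):
    """Antivirus style rule: alarm when a known malware string is present."""
    return any(sig in executable for sig in known_strings)


# Hypothetical attacker: 900 MB leaves the network as ten 90 MB transfers,
# each one below the assumed threshold of X = 100 MB.
transfers_mb = [90.0] * 10
alarms = [bytes_rule(t, threshold_mb=100.0) for t in transfers_mb]

# Hypothetical signature list and two binaries: the second differs from
# the first by one byte, so the signature no longer matches.
signatures = [b"EVIL_PAYLOAD"]
seen_before = b"\x4d\x5a...EVIL_PAYLOAD..."
slightly_changed = b"\x4d\x5a...EVIL_PAYL0AD..."

print(any(alarms))                                   # no transfer trips the rule
print(signature_rule(seen_before, signatures))       # known sample is caught
print(signature_rule(slightly_changed, signatures))  # trivial variant is missed
```

Neither rule uses any history of what is normal for the network or the host, which is the gap the statistical approaches in the rest of the deck aim to fill.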

SLIDE 7

Cyber Security Incident Response

[Diagram labels: Playbooks / Use Cases (Recon, Phishing, Exfil); Detect, Hunt, Respond; New Patterns; Incidents Closed; PIR; New Rules; Investigations; Analysts & Hunters]

Threat Management / Threat Intelligence Platform

[Diagram labels: Visual Analysis; External Assessment of Potential Attackers; Cyber Reconnaissance by Fire; Active Defence Analysts; Threat Management Analysts; Threat Intelligence Collection; Threat Intelligence Analysis; Kill Chain Mapping; Risk Assessment of Critical Assets; Continuous Monitoring; Anomaly Analysis; Countermeasure Deployment; Red Team exercises]

SLIDE 8

Data Science

[Diagram labels: Platform Support; Support IR teams; Maintain Infrastructure; Statistical Hunting; Maintain Visualisation Layer; Infrastructure Support]

NextGen Cyber Analytics Platform

[Diagram labels: Visual Analysis; Operate Technology Environment; Support Operational Cyber Data Scientists; Big Data Platform; Integrate New Data Sources; Research Cyber Data Scientists; Deploy new models; Develop new models; Integrate with CSIRT]

Cyber Security Incident Response

[Diagram labels: Playbooks / Use Cases (Recon, Phishing, Exfil); Detect, Hunt, Respond; New Patterns; Incidents Closed; PIR; New Rules; Investigations; Analysts & Hunters]

Threat Management / Threat Intelligence Platform

[Diagram labels: Visual Analysis; External Assessment of Potential Attackers; Cyber Reconnaissance by Fire; Active Defence Analysts; Threat Management Analysts; Threat Intelligence Collection; Threat Intelligence Analysis; Kill Chain Mapping; Risk Assessment of Critical Assets; Continuous Monitoring; Anomaly Analysis; Countermeasure Deployment; Red Team exercises]

SLIDE 9

Data Science for Operational Security

Security Analytics Concepts

Operational Agility through Data Science

[Diagram labels: Statistical Hunting; New Model Development; Cyber Analytics Platform]

Statistical Hunting

► Lack of Compromise
  ► Hunt for unknown unknowns
  ► Ask questions not currently asked by existing tools
► During Compromise
  ► Analytics to explore around known compromised hosts
  ► Use changes in behavior to flesh out the attack extent

[Figure legend: Edge Anomaly Level, High to Low]

SLIDE 10

Cyber Security Analytics Platform

Collaboration for a Cyber Security Platform

► Multi-TB data lake of ALL ENTERPRISE DATA (or as much as we can get!)
► Data scientists and hunters interacting with the data in an agile manner
► Industry standard analytics stack (Hadoop, Spark, HDFS, etc.)
► Continuous monitoring and alerting capability, agile deployment of new analytics
► Rule matching system (Boolean), custom rule creation capability

EY Security Analytics Platform

[Diagram: kill chain stages Threat Intelligence, Initial Attack, Establish Foothold, Enable Persistence, Enterprise Recon, Move Laterally, Escalate Privilege, Gather & Encrypt Data, Exfiltrate. Analytics shown against the stages: CodeVision/Endpoint Analytics, PathScan, PathScan, CodeVision/EndPoint Analytics, EndPoint Analytics. Base layer: Analytics Platform]

SLIDE 11

SLIDE 12

Comprehensive visibility

Statistical scoring along the Kill Chain

Attack (Kill) Chain Progression: Background Research, Initial Attack, Establish Foothold, Enable Persistence, Enterprise Recon, Move Laterally, Escalate Privilege, Gather & Encrypt Data, Steal Data

Probability that email is malicious (p1)
Probability that programs or services are malicious (p2)
Probability that communication with attacker exists (p3)
Probability that reconnaissance behavior exists (p4)
Probability that traversal behavior exists (p5)
Probability that privilege escalation behavior exists (p6)
Probability that staging behavior exists (p7)
Probability that exfiltration behavior exists (p8)

The overall probability of the above attack existing is a statistical combination of the individual anomaly scores.
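The slide does not say which statistical combination is used. As one possible illustration, the sketch below assumes the stage scores p1 to p8 are independent probabilities and that a complete attack requires every stage, so the overall probability is their product. Both assumptions, the function name `chain_probability`, and the example values are hypothetical, not taken from the deck.

```python
import math


def chain_probability(stage_probs):
    """Probability that every stage is present, ASSUMING independent stages.

    The product is computed as a sum of logs so that many small
    probabilities do not underflow.
    """
    if any(not 0.0 <= p <= 1.0 for p in stage_probs):
        raise ValueError("each stage score must be a probability in [0, 1]")
    if any(p == 0.0 for p in stage_probs):
        return 0.0
    return math.exp(sum(math.log(p) for p in stage_probs))


# Hypothetical scores for p1..p8 on one host.
scores = [0.9, 0.8, 0.95, 0.7, 0.85, 0.6, 0.75, 0.9]
overall = chain_probability(scores)
print(overall)
```

Under this assumption a single stage with a score of zero rules out the whole chain, which may be too strict when some stages are simply unobserved; a production system would likely handle missing evidence differently.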

SLIDE 13

Anything you can measure from a network

Data sources for Security Analytics

Typical attack lifecycle

Phases: Intelligence gathering, Initial exploitation, Command and control, Privilege escalation, Data exfiltration

Stages: Background research, Initial attack, Establish foothold, Enable persistence, Enterprise recon, Move laterally, Escalate privilege, Gather and encrypt data, Steal data

Data sources: Windows Event logs; Netflow and DNS; VPN; Active Directory; Web; Endpoint IDS (Carbon Black)

SLIDE 14

Endpoint Data

Endpoint data types that can contribute to attack chain analytics

Process
► Execution start timestamp
► Name and full path
► Author and version
► Digital signature
► Hash
► Size
► Parent process PID
► Command-line arguments
► User and privilege level
► DLL imports
► File and registry access
► Mutex access
► System and library calls
► Executable

Network
► Flow data
► Associated user and process
► Hostname
► IP address
► MAC address
► Listening ports
► Packet capture
► DNS queries
► Packet-level data

Operating System
► Windows event logs
► WQL/WMI events
► PowerShell events
► Authentication events
► Session tracking events
► Browsing history
► Mounted devices
► Driver events
► Application crash data
► Prefetch contents and updates
► Antivirus and similar alerts
► Account and group modification

PathScan Pilot Engagement – Caterpillar

SLIDE 15

Agenda

A. Background, LANL and the EY/LANL Alliance
B. Traditional Security and Operational Security Data Science
C. Advanced Analytics: PathScan, CodeVision, Credential Analytics

SLIDE 16

Advanced Analytics

Specific examples

PathScan and CodeVision provide capabilities to fill significant gaps in the kill chain

PathScan

► PathScan is an advanced network analysis tool
► Developed by the Dept. of Energy at LANL
► Operational since 2011 at LANL
► Deployed for several EY clients and at other USG sites
► Identifies traversal behavior, a key part of the kill chain
► Has many successful detections of APT in USG networks

CodeVision

► An enterprise malware analysis service in the cloud
► Operationalizes 0-day malware detection
► Advanced, patent-pending machine learning techniques
► Operational at LANL since 2010
► Multi-view machine learning techniques for 0-day detection
► Custom signatures, proprietary analytics, sandboxes, antivirus

PathScan: Lateral, Recon and Staging
CodeVision: 0-day malware
EndPoint Analytics: in development

SLIDE 17

PathScan

Traversal Scoring

► Initial compromise gives access to other credentials stored on the machine
  ► At least 10% phishing click rates
► Attacker “passes the hash”, using the credentials to remotely authenticate to and access additional machines
► Process repeats

SLIDE 18

PathScan

Modeling edges in a communications graph

► An edge: communication between two computers
► We build a model for every edge in the network
  ► A function that encodes past behavior and can score current events
► We ask the question:
  ► What is the probability that the current communication is normal?
  ► If it is low, we have an anomaly (but not necessarily an attack!)

User Laptop File server
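The slides do not give PathScan's actual edge model. To make the idea of "a function that encodes past behavior and can score current events" concrete, the sketch below assumes each edge is modeled by a Poisson distribution over connection counts per time window, with the rate estimated from past windows. The class `EdgeModel`, the `floor_rate` for edges never seen before, and the host names are all hypothetical.

```python
import math
from collections import defaultdict


def poisson_tail(k, rate):
    """P(X >= k) for X ~ Poisson(rate)."""
    if k <= 0:
        return 1.0
    term = math.exp(-rate)  # P(X = 0)
    cdf = 0.0
    for i in range(k):      # accumulate P(X = 0) .. P(X = k - 1)
        cdf += term
        term *= rate / (i + 1)
    return max(0.0, 1.0 - cdf)


class EdgeModel:
    """One assumed Poisson model per (source, destination) edge."""

    def __init__(self, floor_rate=0.01):
        self.history = defaultdict(list)  # edge -> counts in past windows
        self.floor_rate = floor_rate      # assumed rate for unseen edges

    def observe(self, src, dst, count):
        self.history[(src, dst)].append(count)

    def p_value(self, src, dst, count):
        """Probability of seeing at least `count` connections if normal."""
        past = self.history.get((src, dst), [])
        rate = sum(past) / len(past) if past else 0.0
        return poisson_tail(count, max(rate, self.floor_rate))


model = EdgeModel()
for past_count in [5, 6, 4, 5]:
    model.observe("laptop", "fileserver", past_count)

usual = model.p_value("laptop", "fileserver", 5)         # routine traffic
unusual = model.p_value("laptop", "domaincontroller", 3)  # edge never seen
print(usual, unusual)
```

A low p-value marks the edge as anomalous, not as an attack, which matches the caveat on the slide.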

SLIDE 19

PathScan

Evaluate paths

► Instead of alarming on each edge
  ► We require more evidence
► By detecting anomalies in a path
  ► We have more attack evidence
  ► We are forming a question consistent with lateral movement

Anomalies along a network path

[Diagram: nodes A, B, C, D joined by edge 1, edge 2, edge 3]
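One way to turn per-edge anomaly scores into a path score is sketched below. The slides do not state PathScan's test statistic, so this example substitutes Fisher's method for combining p-values, which assumes the edge p-values are independent. The function names, the path length of three edges, and the example p-values are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic -2 * sum(ln p) is chi-square with 2k degrees of freedom;
    for even degrees of freedom its tail has the closed form
    exp(-s) * sum_{i<k} s**i / i!  with  s = -sum(ln p).
    """
    k = len(p_values)
    s = -sum(math.log(max(p, 1e-300)) for p in p_values)
    term, total = 1.0, 0.0
    for i in range(k):
        total += term
        term *= s / (i + 1)
    return min(1.0, math.exp(-s) * total)


def score_paths(edge_p, length=3):
    """Score every simple directed path with `length` edges.

    edge_p maps (src, dst) to that edge's p-value. Returns a list of
    (combined p-value, path) tuples, most anomalous first.
    """
    successors = {}
    for src, dst in edge_p:
        successors.setdefault(src, []).append(dst)
    scored = []

    def extend(path):
        if len(path) == length + 1:
            ps = [edge_p[(a, b)] for a, b in zip(path, path[1:])]
            scored.append((fisher_combined_p(ps), path))
            return
        for nxt in successors.get(path[-1], []):
            if nxt not in path:  # keep paths simple (no repeated hosts)
                extend(path + (nxt,))

    for start in successors:
        extend((start,))
    return sorted(scored)


# Hypothetical edge p-values: three mildly unusual hops and one normal edge.
edge_p = {("A", "B"): 0.01, ("B", "C"): 0.02, ("C", "D"): 0.01, ("A", "D"): 0.9}
ranked = score_paths(edge_p)
print(ranked[0])
```

Three moderately unusual edges in a row give a much smaller combined p-value than any single edge, which is the extra evidence the slide asks for before alarming.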

SLIDE 20

PathScan

Recon and Staging Detection

► Looking for anomalies in a node’s communication with neighboring nodes
► Changes in a node’s behavior may be indicative of certain types of malicious activity:
  ► Insider threat
    ► Snowden created anomalous stars
  ► Reconnaissance
  ► Command and Control

Anomalous Behavior
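A star is a node together with the edges to its neighbors. As a rough illustration of star anomalies, the sketch below counts how many destinations a host contacts in the current window that it never contacted in its history. The fixed cutoff `min_new`, the function names, and the host names are hypothetical; the slides do not describe how PathScan scores stars, and a statistical model of the count would fit the deck's argument better than a fixed cutoff.

```python
from collections import defaultdict


def new_neighbor_counts(history_edges, current_edges):
    """Count destinations each source contacts now but never before."""
    known = defaultdict(set)
    for src, dst in history_edges:
        known[src].add(dst)
    fresh = defaultdict(set)
    for src, dst in current_edges:
        if dst not in known[src]:
            fresh[src].add(dst)
    return {src: len(dsts) for src, dsts in fresh.items()}


def anomalous_stars(history_edges, current_edges, min_new=3):
    """Hosts whose star gained at least `min_new` new neighbors."""
    counts = new_neighbor_counts(history_edges, current_edges)
    return sorted(node for node, n in counts.items() if n >= min_new)


# Hypothetical traffic: ws1 suddenly talks to three servers it never used.
history = [("ws1", "fs1"), ("ws2", "fs1")]
current = [
    ("ws1", "fs1"), ("ws1", "db1"), ("ws1", "db2"), ("ws1", "hr1"),
    ("ws2", "fs1"),
]
print(new_neighbor_counts(history, current))
print(anomalous_stars(history, current))
```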

SLIDE 21

PathScan

Many Operational Attack Detections

[Figure legend: Edge Anomaly Level, High to Low]

SLIDE 22

CodeVision

Multiple Views Enable Accurate Detection

0-day Detection Using Behavioral Analysis

► Binary received by the CodeVision platform
► Initially undetected by 9 best-in-class antivirus products
► Immediately detected by multi-view behavioral analysis
► Confirmed as a true positive 5 days later by antivirus
► CodeVision prevented this 0-day breach

Anti Virus is not Enough

► Adversaries use malware throughout the attack lifecycle
► No single product or method can detect all threats
► Signature-based methods provide insufficient protection
► Antivirus products prune old signatures to improve performance

CodeVision Approach

► Leverage multi-view machine learning for accurate malware classification and zero-day detection
► Provide detailed analysis reports and actionable alerts
► Integrate with a variety of data feeds and SIEM tools
► Deploy a flexible, modular and scalable architecture
► Combine and automate best-in-class malware detection
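CodeVision's models are described only as multi-view and patent-pending, so no detail of them is reproduced here. As a generic illustration of why several views help, the sketch below fuses per-view malware scores with a weighted mean (late fusion). The view names, scores, weights, and the 0.5 alert threshold are all hypothetical.

```python
def fuse_views(view_scores, weights=None):
    """Weighted mean of per-view malware probabilities (late fusion)."""
    if not view_scores:
        raise ValueError("at least one view is required")
    if weights is None:
        weights = {name: 1.0 for name in view_scores}
    total_weight = sum(weights[name] for name in view_scores)
    weighted = sum(score * weights[name] for name, score in view_scores.items())
    return weighted / total_weight


# Hypothetical 0-day sample: signatures see nothing, behavior looks bad.
sample = {"antivirus": 0.0, "static": 0.4, "behavioral": 0.9}

# Assumed weights that trust the behavioral view most.
assumed_weights = {"antivirus": 1.0, "static": 1.0, "behavioral": 3.0}

fused = fuse_views(sample, assumed_weights)
antivirus_only_alert = sample["antivirus"] >= 0.5
fused_alert = fused >= 0.5
print(fused, antivirus_only_alert, fused_alert)
```

In this made-up case the antivirus view alone raises no alert while the fused score does, mirroring the slide's point that no single method detects all threats.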

SLIDE 23

Credential monitoring

Model-based credential behavior analytics

LANL/EY Academic papers

Kent, Alexander D., Lorie M. Liebrock, and Joshua C. Neil. "Authentication graphs: Analyzing user behavior within an enterprise network." Computers & Security 48 (2015): 150-166.

Kent, Alexander D., and Lorie M. Liebrock. "Differentiating user authentication graphs." Security and Privacy Workshops (SPW), 2013 IEEE. IEEE, 2013.

Hagberg, Aric, et al. "Connected Components and Credential Hopping in Authentication Graphs." Signal-Image Technology and Internet-Based Systems (SITIS), 2014 Tenth International Conference on. IEEE, 2014.

LANL has invested years of research into credential monitoring

Analytics to find deviations from normal credential/authentication behavior that indicate misuse

Capable of insider and external threat detection

Capable of pass-the-hash identification

Can focus on privileged users or the entire population

Behavioral analytics, not static signatures, to find deviations from norm
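A small sketch of graph-based credential analytics follows. It assumes an authentication graph is kept per user, with computers as nodes and one directed edge for each observed source-to-destination authentication, and it flags authentications over edges missing from the user's baseline. This is an illustration only, not the method of the papers listed above; the event format, function names, and host names are hypothetical.

```python
from collections import defaultdict


def build_auth_graphs(events):
    """Map each user to the set of (source, destination) edges they used."""
    graphs = defaultdict(set)
    for user, src, dst in events:
        graphs[user].add((src, dst))
    return graphs


def graph_size(edges):
    """Return (number of computers, number of edges) for one user's graph."""
    nodes = {node for edge in edges for node in edge}
    return len(nodes), len(edges)


def unseen_edges(baseline, events):
    """Authentications that use an edge absent from the user's baseline."""
    return [
        (user, src, dst)
        for user, src, dst in events
        if (src, dst) not in baseline.get(user, set())
    ]


# Hypothetical baseline and new events for one credential.
baseline_events = [("alice", "ws1", "mail"), ("alice", "ws1", "fs1")]
baseline = build_auth_graphs(baseline_events)

today = [
    ("alice", "ws1", "mail"),
    ("alice", "fs1", "dc1"),  # credential hops from a server to a new host
]
print(graph_size(baseline["alice"]))
print(unseen_edges(baseline, today))
```

A credential that starts authenticating from machines it has never originated from is the kind of deviation that pass-the-hash activity would be expected to produce.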

SLIDE 24

Conclusions

► Modern enterprise security needs your help
► Traditional rule-based approaches don’t work
► Data science can find the unknown unknowns
► The data is vast, complicated, and full of noise
► A comprehensive approach, aligned to the kill chain, is called for

www.ey.com/losalamos
http://csr.lanl.gov/

Questions? Joshua.Neil@ey.com