Security Data Science
Joshua Neil Security Analytics Leader Advanced Security Center EY
Agenda
A Background, LANL and the EY/LANL Alliance
B Traditional Security and Operational Security Data Science
Advanced Analytics:
Page 2
Page 3
►National Security Laboratory
► Focused on security science to protect the nation
►Long history of networking
► First connected to ARPANET in 1983
►Long history of cyber security
► First attack in 1983
► Over 15 years of data collected
► Many nation-state and criminal attacks
►Long history of cyber R&D
►For defense of LANL’s network and US DOD networks
►Strong analytics program
Cyber Security Experience
Recent Top Innovations
► Impenetrable encryption prevents data from cyber terrorism
► National Multipronged HIV vaccine shows promise in monkeys
► Tree death worldwide linked to warming climate
► ChemCam inspects Mars: can it support life?
► Roadrunner firsts pave way for more powerful supercomputing
► Space probes predict hazards to protect spacecraft
► Portable laser tool to thwart nuclear smugglers
► RAPTOR telescope witnesses black hole birth
► Liquid-scanning technology boosts airport security
► Improved biofuel methods: may be greener, cheaper yet powerful
Page 4
2000: LANL data science for national security
2007: Begin enterprise security analytics research
2011: PathScan and CodeVision go live on LANL’s network
2012: Other USG deployments
2013: RFP to commercialize
2014: EY wins rights, team moves to EY from LANL
2015: Commercial deployment and further development
Page 5
Page 6
► Network IDS Example: If {bytes leaving network > X MB}, set off
► Antivirus Example: If {executable contains known malware string},
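The two rule examples above can be sketched as simple Boolean checks. This is a minimal illustration, not any product's rule engine: the threshold value, the signature byte strings, and the function names are all hypothetical.

```python
# Hypothetical threshold standing in for "X MB" on the slide.
EXFIL_THRESHOLD_MB = 500.0

# Hypothetical byte strings standing in for known malware signatures.
KNOWN_MALWARE_STRINGS = [b"EVIL_PAYLOAD_V1", b"DROPPER_STAGE_2"]


def network_ids_rule(bytes_leaving_network, threshold_mb=EXFIL_THRESHOLD_MB):
    """Fire when outbound volume (in bytes) exceeds the threshold (in MB)."""
    return bytes_leaving_network / 1_000_000 > threshold_mb


def antivirus_rule(executable, signatures=KNOWN_MALWARE_STRINGS):
    """Fire when the executable's bytes contain any known signature string."""
    return any(sig in executable for sig in signatures)
```

Both rules are exact matches against a fixed threshold or string, so traffic just under the threshold or a binary with a slightly altered string passes unnoticed.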
Page 7
Playbooks / Use Cases / Recon; Playbooks / Use Cases / Phishing; Playbooks / Use Cases / Exfil; Detect; Hunt; Respond; New Patterns; Incidents Closed; PIR; New Rules; Investigations; Analysts & Hunters
Cyber Security Incident Response
Visual Analysis; External Assessment of Potential Attackers; Cyber Reconnaissance by Fire; Active Defence Analysts; Threat Management Analysts
Threat Management / Threat Intelligence Platform
Threat Intelligence Collection; Threat Intelligence Analysis; Kill Chain Mapping; Risk Assessment of Critical Assets; Continuous Monitoring; Anomaly Analysis; Countermeasure Deployment; Red Team exercises
Page 8
Platform Support
Data Science
Support IR teams; Maintain Infrastructure; Statistical Hunting; Maintain Visualisation Layer; Infrastructure Support
NextGen Cyber Analytics Platform
Visual Analysis; Operate Technology Environment; Support; Operational Cyber Data Scientists; Big Data Platform; Integrate New Data Sources; Research Cyber Data Scientists; Deploy new models; Develop new models; Integrate with CSIRT
Cyber Security Incident Response and Threat Management / Threat Intelligence Platform: same elements as on Page 7
Page 9
Operational Agility through Data Science
Statistical Hunting; New Model Development; Cyber Analytics Platform
►Lack of Compromise
► Hunt for unknown unknowns
► Ask questions not currently asked by existing tools
►During Compromise
► Analytics to explore around known compromised hosts
► Use changes in behavior to flesh out the attack extent
Statistical Hunting
Page 10
► Multi-TB data lake of ALL ENTERPRISE DATA (or as much as we can get!)
► Data scientists and hunters interacting with the data in an agile manner
► Industry standard analytics stack (Hadoop, Spark, HDFS, etc.)
► Continuous monitoring and alerting capability, agile deployment of new analytics
► Rule matching system (Boolean), custom rule creation capability
Kill chain stages: Threat Intelligence; Initial Attack; Establish Foothold; Enable Persistence; Enterprise Recon; Move Laterally; Escalate Privilege; Gather & Encrypt Data; Exfiltrate
Tools mapped to stages: CodeVision/Endpoint Analytics; PathScan; PathScan; CodeVision/EndPoint Analytics; EndPoint Analytics
Analytics Platform
Page 11
Page 12
Attack (Kill) Chain Progression
Background Research → Initial Attack → Establish Foothold → Enable Persistence → Enterprise Recon → Move Laterally → Escalate Privilege → Gather & Encrypt Data → Steal Data
► Probability that email is malicious (p1)
► Probability that programs or services are malicious (p2)
► Probability that communication with attacker exists (p3)
► Probability that reconnaissance behavior exists (p4)
► Probability that traversal behavior exists (p5)
► Probability that privilege escalation behavior exists (p6)
► Probability that staging behavior exists (p7)
► Probability that exfiltration behavior exists (p8)
The overall probability of the above attack existing is a statistical combination of the individual anomaly scores.
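The slide does not say which statistical combination is used. As a hedged illustration only, the sketch below shows two simple combinations of p1 to p8 under the assumption that the stages are independent (an assumption made here, not stated in the source): the product, which is the probability that every stage is present, and the noisy-OR, which is the probability that at least one stage is present. The function names and example values are hypothetical.

```python
import math


def prob_all_stages(stage_probs):
    """Probability that every stage is present, assuming independence."""
    return math.prod(stage_probs)


def prob_any_stage(stage_probs):
    """Probability that at least one stage is present (noisy-OR), assuming independence."""
    return 1.0 - math.prod(1.0 - p for p in stage_probs)


# Hypothetical values for p1..p8, in kill-chain order.
stage_probs = [0.9, 0.8, 0.7, 0.6, 0.9, 0.5, 0.4, 0.3]
print("all stages present:", prob_all_stages(stage_probs))
print("at least one stage:", prob_any_stage(stage_probs))
```

The product is strict (one low score pulls the whole chain down), while the noisy-OR is permissive; a production system would likely sit somewhere between the two and account for dependence between stages.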
Page 13
Typical attack lifecycle
Phases: Intelligence gathering; Initial exploitation; Command and control; Privilege escalation; Data exfiltration
Steps: Background research; Initial attack; Establish foothold; Enable persistence; Enterprise recon; Move laterally; Escalate privilege; Gather and encrypt data; Steal data
Windows Event logs; Netflow and DNS; VPN; Active Directory; Web; Endpoint IDS (Carbon Black)
Page 14
Process: Execution start timestamp; Name and full path; Author and version; Digital signature; Hash; Size; Parent process PID; Command-line arguments; User and privilege level; DLL imports; File and registry access; Mutex access; System and library calls
Network: Flow data; Associated user and process; Hostname; IP address; MAC address; Listening ports; Packet capture; DNS queries; Packet-level data
Operating System: Windows event logs; WQL/WMI events; Powershell events; Authentication events; Session tracking events; Browsing history; Mounted devices; Driver events; Application crash data; Prefetch contents and updates; Antivirus and similar alerts; Account and group modification; Executable
PathScan Pilot Engagement – Caterpillar
Page 15
Page 16
PathScan and CodeVision provide capabilities to fill significant gaps in the kill chain
PathScan
► PathScan is an advanced network analysis tool
► Developed by the Dept. of Energy at LANL
► Operational since 2011 at LANL
► Deployed for several EY clients and other USG sites
► Identifies traversal behavior – key part of the kill chain
► Has many successful detections of APT in USG networks
CodeVision
► An enterprise malware analysis service in the cloud
► Operationalizes 0-day malware detection
► Advanced, patent-pending, machine learning techniques
► Operational at LANL since 2010
► Multi-view, machine learning techniques for 0-day detection
► Custom signatures, proprietary analytics, sandboxes, antivirus
PathScan: Lateral, Recon and Staging
CodeVision: 0-day malware
EndPoint Analytics in development
Page 17
► At least 10% phishing click rates
Page 18
► A function that encodes past behavior and can score current events
► What is the probability that the current communication is normal?
► If it is low, we have an anomaly (but not necessarily an attack!)
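A minimal sketch of such a function, assuming (for illustration only) that the events on one communication edge are counted per time window and modeled as Poisson. The source does not specify the model; `fit_rate`, `poisson_upper_tail`, the example counts, and the alert threshold are hypothetical.

```python
import math


def fit_rate(past_counts):
    """Encode past behavior as the mean number of events per time window."""
    return sum(past_counts) / len(past_counts)


def poisson_upper_tail(k, rate):
    """P(X >= k) for X ~ Poisson(rate): how likely is a count this large if behavior is normal?"""
    if k <= 0:
        return 1.0
    # P(X >= k) = 1 - sum over i < k of exp(-rate) * rate**i / i!
    cdf = sum(math.exp(-rate) * rate**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


# Hypothetical history: connections per hour on one edge.
rate = fit_rate([2, 3, 1, 2, 2])

ALERT_THRESHOLD = 1e-3  # illustrative cut-off, not from the source
for current in (2, 15):
    p = poisson_upper_tail(current, rate)
    label = "anomaly" if p < ALERT_THRESHOLD else "normal"
    print(current, p, label)
```

A low probability only marks the window as unusual; as the slide notes, an anomaly is not necessarily an attack and still needs investigation.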
Page 19
edge 1 edge 2 edge 3
Page 20
► Insider threat
► Reconnaissance
► Command and Control
Anomalous Behavior
Page 21
Page 22
► Binary received by the CodeVision platform
► Initially undetected by 9 best-in-class antivirus products
► Immediately detected by multi-view behavioral analysis
► Confirmed as a true positive 5 days later by antivirus
► CodeVision prevented this 0-day breach
Multiple Views Enable Accurate Detection
► Adversaries use malware throughout the attack lifecycle
► No single product or method can detect all threats
► Signature-based methods provide insufficient protection
► Antivirus products prune old signatures to improve performance
Antivirus Is Not Enough
CodeVision Approach
►Leverage multi-view machine learning for accurate malware classification and zero-day detection
►Provide detailed analysis reports and actionable alerts
►Integrate with a variety of data feeds and SIEM tools
►Deploy a flexible, modular and scalable architecture
►Combine and automate best-in-class malware detection
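The idea of combining several views of one binary can be sketched as simple late fusion: each view produces a maliciousness score in [0, 1] and the scores are averaged. This is an illustrative assumption, not CodeVision's actual method, which the source does not describe; the view names, weights, scores, and threshold are hypothetical.

```python
def fuse_views(view_scores, weights=None):
    """Weighted average of per-view maliciousness scores, each in [0, 1]."""
    if weights is None:
        weights = {name: 1.0 for name in view_scores}
    total = sum(weights[name] for name in view_scores)
    return sum(view_scores[name] * weights[name] for name in view_scores) / total


# Hypothetical scores for one binary: signatures see nothing, behavior looks bad.
scores = {"antivirus": 0.0, "static": 0.7, "sandbox": 0.95}
fused = fuse_views(scores)
verdict = fused >= 0.5  # 0.5 is an illustrative threshold
print(fused, verdict)
```

In this made-up case the antivirus view alone would miss the binary, while the fused score still crosses the threshold because the other views disagree with it.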
Page 23
► LANL/EY Academic papers
► Kent, Alexander D., Lorie M. Liebrock, and Joshua C. Neil. "Authentication graphs: Analyzing user behavior within an enterprise network." Computers & Security 48 (2015): 150-166.
► Kent, Alexander D., and Lorie M. Liebrock. "Differentiating user authentication graphs." Security and Privacy Workshops (SPW), 2013 IEEE. IEEE, 2013.
► Hagberg, Aric, et al. "Connected Components and Credential Hopping in Authentication Graphs." Signal-Image Technology and Internet-Based Systems (SITIS), 2014 Tenth International Conference on. IEEE, 2014.
► LANL has invested years of research into credential monitoring
► Analytics to find deviations from normal credential/authentication indicating misuse
► Capable of insider and external threat detection
► Capable of pass-the-hash identification
► Can focus on privileged users or the entire population
► Behavioral analytics, not static signatures, to find deviations from norm
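One simple way to picture "deviation from normal authentication" is a per-user baseline of hosts, with any authentication to a host outside that baseline flagged for review. The sketch below is a toy illustration under that assumption; it is not the method of the cited papers, and the function names and sample events are hypothetical.

```python
from collections import defaultdict


def build_baseline(auth_events):
    """Map each user to the set of hosts they authenticated to historically."""
    baseline = defaultdict(set)
    for user, host in auth_events:
        baseline[user].add(host)
    return baseline


def new_edges(baseline, current_events):
    """Return (user, host) authentications never seen in the baseline."""
    return [(u, h) for u, h in current_events if h not in baseline.get(u, set())]


# Hypothetical (user, host) authentication events.
history = [("alice", "ws1"), ("alice", "mail"), ("bob", "ws2")]
today = [("alice", "ws1"), ("alice", "dc01"), ("carol", "ws3")]

print(new_edges(build_baseline(history), today))
```

Here alice reaching a domain controller for the first time and the previously unseen user carol are both surfaced; restricting `history` and `today` to privileged accounts narrows the same check to that population.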
Page 24
► Traditional rule-based approaches don’t work