System Security: From Discovery to Innovation



slide-1
SLIDE 1

System Security: From Discovery to Innovation

XiaoFeng Wang James H. Rudy Professor Indiana University at Bloomington xw7@Indiana.edu http://www.informatics.indiana.edu/xw7/

slide-2
SLIDE 2

System Security Research

  • Inherently Interdisciplinary and Multi-dimensional
slide-3
SLIDE 3

Follow the Tech Trends


slide-4
SLIDE 4

System Security Research

  • Inherently Interdisciplinary and Multi-dimensional
  • Discovery-driven, utility-centric
slide-5
SLIDE 5

Sources for Security Innovations

  • Software security
  • E.g., memory attacks, jumps into libraries => ASLR
  • Mobile security
  • E.g., malware infection => app sandbox + app store vetting
  • OS security
  • E.g., OS-level attacks => TEE (such as Intel’s SGX)
  • Network security
  • E.g., DDoS attacks => SYN cookies, combined detection and blocking (e.g., AWS Shield)
  • Browser security
  • E.g., cross-origin attacks (such as XSS) => Chrome’s site isolation
  • Data privacy
  • E.g., inference attacks => differential privacy, as integrated in iOS
  • Others:
  • Side channels on mobile systems => closing procfs on Android; data perturbation on iOS
  • Credential attacks => multi-factor authentication

slide-6
SLIDE 6

Destructive Research

Security research needs wreckers!!!

  • Find the cracks: wreck “secure” systems
  • Fix the cracks: build a better system
  • Understand the cracks and the fundamentals

slide-7
SLIDE 7

How to Innovate in Security Research

  • Follow the technical tide
  • Current: Mobile security
  • Emerging: IoT/CPS security
  • Future: ML Security, Genome Privacy
  • Understanding new technologies
  • Finding weaknesses
  • Finding utilities and constraints
  • Asking big questions
  • Fundamental causes of the problem?
  • How to do better (under the constraints)?
slide-8
SLIDE 8

Examples: Destructive Research on Mobile and IoT Security

CCS’13, Oakland’15, NDSS’14 and ’15, CCS’17

slide-9
SLIDE 9

  • NO bugs in apps
  • NO implementation flaws in the system

What can a zero-permission app still learn?

slide-10
SLIDE 10
slide-11
SLIDE 11

Android Public Resource

Adversary Model Usability Goals

  • Linux kernel
  • Application framework
  • Public APIs (audio usage, CPU usage, running application list)
  • Public files (procfs, sysfs)

slide-12
SLIDE 12

Finding Your Location

  • A zero-permission app monitors /proc/net/arp
  • The app delivers the BSSID through the browser
  • An adversary-controlled web server receives it
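The first step above can be sketched in a few lines. This is a minimal illustration, not the authors' tool: it assumes the standard six-column layout of the Linux ARP table and assumes, as the slide implies, that the hardware address listed for the wireless interface stands in for the BSSID. The function name, the `wlan0` default, and the sample table are hypothetical.

```python
def gateway_mac_from_arp(arp_text, device="wlan0"):
    """Return the first complete MAC address listed for `device`, or None.

    `arp_text` is the textual content of an ARP table in the usual
    /proc/net/arp layout: a header line followed by rows of
    IP address, HW type, Flags, HW address, Mask, Device.
    """
    lines = arp_text.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore malformed rows
        mac, dev = fields[3], fields[5]
        # An all-zero address marks an incomplete entry.
        if dev == device and mac != "00:00:00:00:00:00":
            return mac.lower()
    return None


# Hypothetical sample; a real app would read the file's content instead.
SAMPLE_ARP = """\
IP address       HW type     Flags       HW address            Mask     Device
192.168.1.1      0x1         0x2         00:1A:2B:3C:4D:5E     *        wlan0
"""

print(gateway_mac_from_arp(SAMPLE_ARP))
```

No permission is requested anywhere in this sketch, which is the point of the slide: the file is a public resource.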

slide-13
SLIDE 13

Why is BSSID Sensitive?

BSSID => GPS location, looked up in a BSSID-to-GPS dataset

slide-14
SLIDE 14

Coverage

slide-15
SLIDE 15

Evaluation

slide-16
SLIDE 16

Another Example: Identity Inference

  • Per-app mobile data usage: yet another piece of public data

Tweet Download 580-720B 541-544B

slide-17
SLIDE 17

Attack

Timestamp1, Timestamp2, Timestamp3, Timestamp4, Timestamp5

  • People who tweeted at Timestamp1±60s
  • People who tweeted at Timestamp2±60s
  • People who tweeted at Timestamp3±60s
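One way to read this slide is as a set intersection: each observed timestamp yields a set of accounts that tweeted within ±60s, and only the victim should appear in every set. The sketch below assumes that reading; the function name, the log format, and the sample data are hypothetical and not taken from the talk.

```python
def candidate_accounts(tweet_log, observed_times, window=60):
    """Intersect the sets of accounts that tweeted near each observed time.

    tweet_log: list of (account, unix_timestamp) pairs from public tweets.
    observed_times: timestamps at which the monitored app's data usage
    showed a tweet-sized increment.
    """
    remaining = None
    for t in observed_times:
        nearby = {acct for acct, ts in tweet_log if abs(ts - t) <= window}
        remaining = nearby if remaining is None else remaining & nearby
    return remaining if remaining is not None else set()


# Hypothetical public tweet log.
TWEETS = [
    ("alice", 1000), ("bob", 1010), ("carol", 1500),
    ("alice", 5000), ("carol", 5030),
    ("alice", 9000), ("dave", 9050),
]

print(candidate_accounts(TWEETS, [1005, 5010, 9020]))
```

Each extra timestamp shrinks the candidate set, which is why a handful of observations can be enough to single out one account.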

slide-18
SLIDE 18

Identity Recovery

Manual analysis of approx. 4000 Twitter accounts

  • First and last name: 79%
  • Location: 32%
  • Bio: 21%

slide-19
SLIDE 19

Why Identity is Important

slide-20
SLIDE 20

Other Findings

  • Your health/financial information
  • Mobile data usage of Yahoo! Finance and WebMD
  • Your driving routes
  • Monitor the speaker status (on or off) when running Navigator
  • Stealthiness
  • Monitor running apps
  • Send data through browser when LCD is off
slide-21
SLIDE 21

Our Solution

  • A new policy enforcement framework
  • Each app can specify the permissions for disclosing its mobile data usage
  • Four settings: NO_Access, Rounding, Aggregation and NO_Protection
  • Enforced by Android framework
  • Rounding: round the usage to the multiple of a fixed size (e.g., 256B)
  • Aggregation: release the total usage every hour, day or week
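The Rounding and Aggregation settings can be illustrated with a short sketch. It is not the framework's code: the function names are hypothetical, and rounding up (rather than to the nearest multiple) is an assumption, since the slide only says "the multiple of a fixed size".

```python
def round_usage(nbytes, unit=256):
    """Round a byte count up to the next multiple of `unit` (assumed policy)."""
    return -(-nbytes // unit) * unit  # ceiling division, then scale back


def aggregate_usage(samples, period=3600):
    """Sum usage per reporting period.

    samples: iterable of (timestamp_seconds, nbytes).
    Returns a dict mapping each period's start time to its total bytes.
    """
    totals = {}
    for ts, nbytes in samples:
        start = ts - ts % period
        totals[start] = totals.get(start, 0) + nbytes
    return totals


# The byte ranges quoted earlier in the deck (580-720B and 541-544B)
# all collapse to the same reported value under 256B rounding.
print([round_usage(n) for n in (580, 720, 541, 544)])
print(aggregate_usage([(10, 580), (20, 541), (3700, 700)]))
```

Under this assumed policy, an observer reading the rounded counter can no longer tell those increments apart, and hourly aggregation hides the timing of individual events.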
slide-22
SLIDE 22

App Guardian

Demo: http://sit.soic.indiana.edu/en/2015/09/11/app-guardian-oarland/
App: https://play.google.com/store/apps/details?id=edu.iub.seclab.appguardian

slide-23
SLIDE 23

IoT Devices

  • What you know
  • What is new
slide-24
SLIDE 24

Sensitive Data

  • Those medical devices are in FDA Class II
  • The same class as X-ray machines, infusion pumps, …
  • The data they collect are highly sensitive
  • But can Android protect them?
slide-25
SLIDE 25
slide-26
SLIDE 26

What Goes Wrong here?

  • Android is not designed to protect its external devices
  • No device-app authentication ⇒ misbinding threat
slide-27
SLIDE 27

Our Solution: SEACAT

[Architecture diagram labels: Policy Manager, Policy Manager Service, Policy Module (DAC, MAC), AVC, Fast Resource-Type Cache, BT stack]

slide-28
SLIDE 28

Security by Construction: What is the problem and How to make it work

slide-29
SLIDE 29

What We Learned

slide-30
SLIDE 30

What Needs to Be Done

  • Communication
  • Find out whether expected protection has been provided by the system
  • Challenges: limited documentation, default assumptions, etc.
  • Evolution
  • Individualize policy settings for apps with different protection demands
  • How to make this happen is a million-dollar question
slide-31
SLIDE 31

A Step Further: Automate Security Analysis

  • Security requirements, utility constraints?
  • Attacker’s resources, information?
  • Vulnerability discovery in complicated systems?
slide-32
SLIDE 32

Towards Data-Driven, Intelligent Security

  • Automatic understanding of the system
  • Knowledge discovery from documents
  • Automatic building of system model
  • Automatic determination of security requirements
  • Automatic analysis of the adversary
  • Cyber threat intelligence gathering and analysis
  • Intelligent vulnerability discovery
  • Knowledge-driven system analysis
slide-33
SLIDE 33

A Baby Step: Semantics-based Fuzzing

slide-34
SLIDE 34

Toward Automated Vulnerability Discovery

  • First an easier problem: can we recover a known vulnerability automatically?
  • Why important?
  • Patching delay => attack window
  • Security implications of public bug information
slide-35
SLIDE 35

Why Hard?

  • Complicated bugs cannot be patched by adding a check
  • A whole chunk of code is replaced
  • Difficult to formalize how the patch works
  • Limitations of symbolic execution and constraint solving
  • Path explosion
  • Limited formula-solving capability

slide-36
SLIDE 36

How About Auxiliary Information?

  • Various sources of vulnerability information
  • How do experienced attackers benefit from auxiliary information?
  • Question: is it possible to automate this process?
slide-37
SLIDE 37

Semantics-Driven Fuzzing

  • Basic idea:
  • Target program: Linux kernel 4.0+
  • Information sources: CVE reports, Linux git logs
  • Results: 16 vulnerability types beyond input validation, 18 successful exploits, 2 unknown vulnerabilities

[Diagram labels: Retrieve, Guide, SemFuzz, Exploits]

slide-38
SLIDE 38

Guidance for CVE-2017-6347

slide-39
SLIDE 39

Workflow

Stage 1 Stage 2

slide-40
SLIDE 40

Retrieving Critical Variables

Type              Name
int               offset
struct sk_buff    skb
…

Type              Name
struct sock       sk
unsigned int      len
…

[Figure labels: Symbol Table, Parse Tree]

slide-41
SLIDE 41

Retrieving System Calls

  • Identifying system call names is insufficient
  • Building a knowledge base
  • Goal: keywords in descriptions ==> system calls and parameter values
  • Source: Linux Programmer's Manual (LPM)
  • Result: 1082 LPM pages, 373 system calls, 2000+ keywords

Match syscall names ==> syscall: socket, sendto
Other keywords: MSG_MORE, loopback, UDP

slide-42
SLIDE 42

MSG_MORE ==> sendto(flags = MSG_MORE)
loopback ==> sendto(dest_addr = {INADDR_LOOPBACK})
UDP ==> socket(socket_type = SOCK_DGRAM)

r0 = socket(AF_INET, SOCK_DGRAM, 0)
sendto(r0, ..., MSG_MORE, {INADDR_LOOPBACK}, …)
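The keyword-to-parameter mapping on this slide can be mimicked with a small lookup table. The sketch below is only an illustration of the idea, not SemFuzz's implementation: the three entries are the ones shown on the slide, while the table layout and the function name are hypothetical.

```python
# keyword -> (system call, parameter name, parameter value); entries from the slide.
KNOWLEDGE_BASE = {
    "MSG_MORE": ("sendto", "flags", "MSG_MORE"),
    "loopback": ("sendto", "dest_addr", "{INADDR_LOOPBACK}"),
    "UDP": ("socket", "socket_type", "SOCK_DGRAM"),
}


def constraints_from_keywords(keywords):
    """Group the parameter constraints implied by `keywords` by system call.

    Keywords missing from the knowledge base are skipped.
    """
    constraints = {}
    for kw in keywords:
        if kw in KNOWLEDGE_BASE:
            syscall, param, value = KNOWLEDGE_BASE[kw]
            constraints.setdefault(syscall, {})[param] = value
    return constraints


# Keywords as they might be extracted from a vulnerability description.
print(constraints_from_keywords(["MSG_MORE", "loopback", "UDP"]))
```

A fuzzer seeded with these constraints starts from calls such as `socket(AF_INET, SOCK_DGRAM, 0)` instead of exploring the whole parameter space blindly.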

slide-43
SLIDE 43

Effectiveness of Semantics-based Fuzzing

  • Result
  • 16% (18/112) trigger the target vulnerability
  • 49% (46/94) reach the vulnerable functions
  • 20% (19/94) reach the patched basic blocks
  • Zero-day vulnerability
  • found when fuzzing CVE-2016-4794
  • new vulnerability appears around the known flaws
  • reported and confirmed
  • Undisclosed vulnerability
  • found when fuzzing CVE-2016-3841
  • similar problems inside equivalent components
  • patched before we reported, but no reports disclosed
slide-44
SLIDE 44

Performance

  • Trigger vulnerability
  • count: 18 (SemFuzz) vs. 7 (Syzkaller)
  • time: 13.2h (SemFuzz) vs. 33.9h (Syzkaller)
  • Reach vulnerable functions
  • count: 18 (SemFuzz) vs. 14 (Syzkaller)
  • time: 1.8h (SemFuzz) vs. 5.2h (Syzkaller)
slide-45
SLIDE 45

Future of System Security Research

slide-46
SLIDE 46

Where Technologies Go, Opportunities Follow

  • Machine Learning and Security
  • Adversarial learning => secure ML
  • Inference attacks on ML models => privacy-preserving ML
  • Security in Smart Things and CPS
  • Smart-home/smart-city security
  • Industrial control security
  • Smart grid security
  • Biomedical Data Privacy
  • Genomic data privacy (www.humangenomeprivacy.org)
  • Other Omics privacy
  • Others (e.g., blockchain)
slide-47
SLIDE 47

Riding the New Tech Wave

  • Data-centric, Intelligent Security
  • NLP-enhanced protection (e.g., CTI gathering, analysis)
  • AI (ML/reasoning) based protection (e.g., Intelligent CTF)
  • Hardware enhanced protection
  • Scalable TEE-based protection
slide-48
SLIDE 48

Moving Forward

  • Learn
  • Understand it, analyze it and crack it
  • Think
  • Ask BIG questions, seek deep insights
  • Do
  • Protect what needs to be protected
  • Build what will be used
slide-49
SLIDE 49
slide-50
SLIDE 50

Data-Centric Intelligent Security

slide-51
SLIDE 51
slide-52
SLIDE 52
slide-53
SLIDE 53

System Security Lab

  • One of the leading security research groups
  • Ranked 5th in the past 17 years (http://s3.eurecom.fr/~balzarot/notes/top4/index.html)
  • Awards:
  • Award for outstanding research on privacy enhancing technologies (the PET Award)
  • Best practical paper award at the IEEE Symposium on Security and Privacy
  • Winner (3rd place), National Security Innovation Competition
  • Best paper award in applied cybersecurity research at CSAW (twice)
  • Funding: around 9 million in federal grants (NSF, NIH, ARO, etc.)
  • Prominent systems:
  • SecUP: installed over 220,000 times across 163 countries
  • MassVet: evaluated by over 90 organizations in North America, Asia, Europe, Australia and Africa
  • Impacts:
  • Led to changes in SSO and payment development, the OS X keychain redesign, iOS scheme improvements, enhancements to Android upgrades, and contributions to the mitigation of side channels on Android
  • iDASH genome privacy challenge: bringing security techniques to the biomedical community
  • News coverage: CNN, MSNBC, Forbes, the NY Times, etc. (thousands of times)
slide-54
SLIDE 54
slide-55
SLIDE 55
slide-56
SLIDE 56
slide-57
SLIDE 57

iDASH Secure Genome Analysis Competition

slide-58
SLIDE 58
slide-59
SLIDE 59

INSPiRES Collaboration Lab

  • Open, Inclusive and Collaborative
  • Including those whom I advised and who are still collaborating with me
  • Indiana University
  • 10 PhD students now
  • 2 postdocs and 1 visitor
  • Chinese Academy of Sciences
  • 1 faculty, 4 PhD students, 8 MS students
  • Chinese University of Hong Kong
  • 1 faculty, 6 PhD students
  • Other universities
  • 3 PhD students from UIUC, 1 from Georgia Tech, 1 from CMU, 1 from Tsinghua and 1 from Peking University
  • Industry collaborators (Samsung, EMC, Google, etc.)
slide-60
SLIDE 60

Our Research

  • System Security New Directions
  • Security analysis of emerging computing systems
  • Mobile and cloud security
  • IoT security
  • TEE (SGX, TrustZone)
  • Semantics-based intelligent code analysis
  • Cognitive Security and Cyber Crimes
  • Automatic Cyber Threat Intelligence collection and analysis
  • Automatic threat responses
  • Measurement and investigation of underground business
  • Data Privacy, Intelligent Security Protection
  • Human genome privacy (iDASH Secure Genome Analysis Competition)
  • Learning and security
slide-61
SLIDE 61

STUDENT RECRUITMENT

WE ARE LOOKING FOR INNOVATORS, NOT NATIVE SPEAKERS

  • GRE CAN BE WAIVED
  • TOEFL around 90 (University Requirement)
  • For Undergraduates
  • Are you TOP in your class? Are you Passionate about Research? Do you Want to Innovate? Do you Want to Publish?
  • If so, WE CAN ADMIT YOU AT THE END OF YOUR THIRD YEAR
  • For MS Students
  • Can you think? Can you execute your idea? Are you doing research?
  • DO YOU WANT TO CHANGE THE WORLD?
slide-62
SLIDE 62

Visitors

  • For PhD students:
  • Do you have an idea but do not know how to make it HAPPEN?
  • Do you want to know HOW to do world-class system security research?
  • For Junior Faculty:
  • You are welcome to visit my group and forge collaborations
slide-63
SLIDE 63

TALK TO ME
xw7@indiana.edu
www.informatics.indiana.edu/xw7

slide-64
SLIDE 64

SmartAuth: User-Centered Authorization for the Internet of Things

slide-65
SLIDE 65

Smart-home apps improve quality of life

slide-66
SLIDE 66

Smart-home apps improve quality of life, but are risky

slide-67
SLIDE 67

Users have limited information about what is going on

[Diagram labels: User; Install apps; Functionalities explained to the user]

slide-68
SLIDE 68

Users have limited information about what is going on

[Diagram labels: User; Install apps; Deploy apps; Send notifications; Smarthome Cloud; Smarthome Hub; Functionalities explained to the user; Operations that the app actually performs]

slide-69
SLIDE 69

Can we notify users about the most important information?

For behaviors related to functionality, we don’t have to. We should notify them about unexpected behaviors. This app doesn’t need to control the lock!

slide-70
SLIDE 70
Challenges

  • Security and privacy implications depend on context
  • Same sensor in bedroom vs. outside has very different implications
  • Behaviors in code cannot be mapped directly to high-level functionality in description
  • Need to support cross-device scenarios

slide-71
SLIDE 71

Previous solutions will not work

Solution             Context-aware   Automatic   Usable   Security
Manifest Permission  No              Yes         No       No
Prompt Permission    Yes             No          No       No
SmartAuth            Yes             Yes         Yes      Yes

slide-72
SLIDE 72

Redesign the authorization system

Goals: Security and Privacy: Share minimum data and capabilities for desired functionality IoT specific: Cross-device, context-based, automatic control Usability: Assist user to make well-informed decisions, minimize user burdens Performance: Lightweight and compatible

slide-73
SLIDE 73

SmartAuth overview

  • Program Analyzer
  • Content Inspector (NLP, e.g., [2])
  • Consistency Checker
  • Authorization Creator
  • Policy Enforcer

Artifacts: App Source Code, App Description, Security Policy

[2] The Stanford Parser: A statistical parser

slide-74
SLIDE 74

An example – program analyzer

section("Bathroom humidity sensor") {
    input "bathroom", "capability.relativeHumidityMeasurement", title: "Which humidity sensor?"
}

if (shower.value.toInteger() > relHum) {
    coffee.on()
}

slide-75
SLIDE 75

An example – NLP and behavior correlation

Description analysis:

  • Entity: coffee machine, shower
  • Condition: taking a shower
  • Action: turn on the coffee machine

Program analysis:

  • Entity: switch, humidity sensor, lock
  • Context clue: bathroom for the humidity sensor, coffee for the switch
  • Condition: humidity reading > threshold
  • Action: turn on the switch; unlock the door

In both analyses, the condition triggers the action.
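The correlation step in this example can be sketched as a comparison of (condition, action) pairs. This is a toy illustration under stated assumptions, not SmartAuth's algorithm: the alias table standing in for the context clues, and all function names, are hypothetical.

```python
# Hypothetical context clues linking code-level terms to description-level terms.
ALIASES = {
    "humidity reading > threshold": "taking a shower",
    "turn on the switch": "turn on the coffee machine",
}


def normalize(term):
    """Lower-case a term and map it to its description-level alias, if any."""
    key = term.strip().lower()
    return ALIASES.get(key, key)


def unexpected_behaviors(described, implemented):
    """Return implemented (condition, action) pairs the description does not cover."""
    expected = {(normalize(c), normalize(a)) for c, a in described}
    return [
        (c, a) for c, a in implemented
        if (normalize(c), normalize(a)) not in expected
    ]


described = [("Taking a shower", "Turn on the coffee machine")]
implemented = [
    ("Humidity reading > threshold", "Turn on the switch"),
    ("Humidity reading > threshold", "Unlock the door"),
]

print(unexpected_behaviors(described, implemented))
```

Only the door-unlocking behavior survives the comparison, matching the deck's earlier point that users should be notified about unexpected behaviors rather than about everything.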

slide-76
SLIDE 76

Interface generation

  • Match users’ mental models
  • Less burden for users
  • Alert users about unexpected behaviors
  • Survey users’ perspectives of installing smart-home apps
  • Iterative design with pilot studies

slide-77
SLIDE 77

Enforcement

slide-78
SLIDE 78
Effectiveness

  • 3.9% false positives in policy identification, no false negatives
  • Does not undermine app functionality
  • Performance:

slide-79
SLIDE 79

Users make better decisions with SmartAuth

slide-80
SLIDE 80

Darken Behind Me

slide-81
SLIDE 81
Takeaways

  • Goal: Bridge the semantic gap between what a user sees (app descriptions) and what an app’s code actually does
  • NLP to understand descriptions and code annotations
  • Program analysis to understand code
  • Match insights from NLP to program analysis
  • Users much more likely to choose safer apps with SmartAuth
  • Working with Samsung for deployment