System Security: From Discovery to Innovation
XiaoFeng Wang
James H. Rudy Professor, Indiana University at Bloomington
xw7@Indiana.edu
http://www.informatics.indiana.edu/xw7/
System Security Research
- Inherently Interdisciplinary and Multi-dimensional
Follow the Tech Trends
System Security Research
- Inherently Interdisciplinary and Multi-dimensional
- Discovery-driven, utility centric
Sources for Security Innovations
- Software security
- E.g., memory attack, jump to libraries => ASLR
- Mobile security
- Malware infection => app sandbox + app store vetting
- OS security
- E.g., OS-level attacks => TEE (such as Intel’s SGX)
- Network security
- E.g., DDoS attacks => SYN cookies, combined detection and blocking (e.g., AWS Shield)
- Browser security
- E.g., Cross-origin attacks (such as XSS) => Chrome’s site isolation
- Data privacy
- Inference attack => Differential privacy, as integrated in iOS
- Others:
- Side channels on mobile systems => closing Procfs on Android; data perturbation on iOS
- Credential attacks => multi-factor authentication
…
Destructive Research
Security research needs wreckers!
- Find the cracks
- Wreck “secure” systems
- Fix the cracks
- Build a better system
- Understand the cracks and their fundamental causes
How to Innovate in Security Research
- Follow the technical tide
- Current: Mobile security
- Emerging: IoT/CPS security
- Future: ML Security, Genome Privacy
- Understanding new technologies
- Finding weaknesses
- Finding utilities and constraints
- Asking big questions
- Fundamental causes of the problem?
- How to do better (under the constraints)?
Examples: Destructive Research on Mobile and IoT Security
CCS’13, Oakland’15, NDSS’14, NDSS’15, CCS’17
- NO bugs in apps
- NO implementation flaws in system
- What can a zero-permission app still learn?
Android Public Resource
Adversary Model Usability Goals
- Linux kernel: public files (procfs, sysfs)
- Application framework: public APIs (audio usage, CPU usage, running application list)
Finding Your Location
- Zero-permission app monitoring /proc/net/arp
- Deliver BSSID through browser to the adversary-controlled web server
Why is BSSID Sensitive?
- A BSSID-to-GPS dataset maps each BSSID to GPS coordinates
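The location inference can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' tool: it parses text in the standard Linux /proc/net/arp column layout and looks the hardware address up in a BSSID-to-GPS table. The sample ARP contents, the coordinates, and the names `hw_addresses` and `locate` are all made up for illustration, and it assumes that the hardware address listed for the Wi-Fi interface matches the access point's BSSID.

```python
# Sample text in the standard /proc/net/arp column layout (values are made up).
SAMPLE_ARP = """\
IP address       HW type     Flags       HW address            Mask     Device
192.168.1.1      0x1         0x2         00:11:22:33:44:55     *        wlan0
"""

# Hypothetical BSSID-to-GPS dataset; a real one would be far larger.
BSSID_TO_GPS = {"00:11:22:33:44:55": (39.1653, -86.5264)}


def hw_addresses(arp_text, device="wlan0"):
    """Return the hardware (MAC) addresses listed for one network device."""
    macs = []
    for line in arp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) == 6 and fields[5] == device:
            macs.append(fields[3].lower())
    return macs


def locate(arp_text):
    """Map the first known hardware address to coordinates, or None."""
    for mac in hw_addresses(arp_text):
        if mac in BSSID_TO_GPS:
            return BSSID_TO_GPS[mac]
    return None


print(locate(SAMPLE_ARP))
```

No permission is involved at any step: the input is a world-readable file and the lookup happens off the device.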
Coverage
Evaluation
Another Example: Identity Inference
- Per-app mobile data usage: yet another piece of public data
Tweet: 580-720B; Download: 541-544B
Attack
- Observed tweet events: Timestamp1, Timestamp2, Timestamp3, Timestamp4, Timestamp5
- For each timestamp, collect the people who tweeted at that timestamp ±60s (shown for Timestamp1, Timestamp2 and Timestamp3)
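One plausible reading of this attack is a set intersection: each observed timestamp yields a set of accounts that tweeted within ±60s, and intersecting those sets narrows the candidates. The sketch below assumes the public tweets are already available as (user, timestamp) pairs; the data and the function names are hypothetical.

```python
def users_near(tweets, timestamp, window=60):
    """Users who posted at least one tweet within +/- window seconds."""
    return {user for user, t in tweets if abs(t - timestamp) <= window}


def candidate_identities(tweets, observed_timestamps, window=60):
    """Intersect the per-timestamp user sets to narrow down the identity."""
    candidates = None
    for ts in observed_timestamps:
        near = users_near(tweets, ts, window)
        candidates = near if candidates is None else candidates & near
    return candidates or set()


# Made-up public timeline: (user, posting time in seconds).
tweets = [
    ("alice", 100), ("bob", 110),
    ("alice", 1000), ("carol", 1010),
    ("alice", 5000), ("bob", 5030),
]
print(candidate_identities(tweets, [100, 1000, 5000]))
```

Each additional observed timestamp can only shrink the candidate set, which is why a handful of observations may be enough.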
Identity Recovery
Manual analysis of approx. 4000 Twitter accounts
- First and last name: 79%
- Location: 32%
- Bio: 21%
Why Identity is Important
Other Findings
- Your health/financial information
- Mobile data usage of Yahoo! Finance and WebMD
- Your driving routes
- Monitor the speaker status (on or off) when running Navigator
- Stealthiness
- Monitor running apps
- Send data through browser when LCD is off
Our Solution
- A new policy enforcement framework
- Each app can specify the permissions for disclosing its mobile data usage
- Four settings: NO_Access, Rounding, Aggregation and NO_Protection
- Enforced by Android framework
- Rounding: round the usage to a multiple of a fixed size (e.g., 256B)
- Aggregation: release the total usage every hour, day or week
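The two middle settings can be illustrated with a short sketch. It is not the framework's code: the function names are hypothetical, rounding to the nearest multiple is an assumption (the slide only says "round"), and aggregation is modeled as summing per-event byte counts into fixed-length periods.

```python
def round_usage(usage_bytes, unit=256):
    """Round a usage value to the nearest multiple of `unit` bytes (assumed mode)."""
    return (usage_bytes + unit // 2) // unit * unit


def aggregate_usage(samples, period_s=3600):
    """Sum (timestamp_s, bytes) samples into totals keyed by period index."""
    totals = {}
    for ts, nbytes in samples:
        period = ts // period_s
        totals[period] = totals.get(period, 0) + nbytes
    return totals


# With a 256B unit, a 541-byte and a 600-byte transfer both report 512.
print(round_usage(541), round_usage(600))
# Hourly aggregation hides the size and timing of individual events.
print(aggregate_usage([(10, 600), (20, 541), (3700, 700)]))
```

Both settings coarsen the fine-grained increments an observer would otherwise see, while still letting an app expose approximate usage.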
App Guardian
Demo: http://sit.soic.indiana.edu/en/2015/09/11/app-guardian-oarland/
App: https://play.google.com/store/apps/details?id=edu.iub.seclab.appguardian
IoT Devices
- What you know
- What is new
Sensitive Data
- These medical devices are in FDA-approved Category II
- In the same category as X-ray machines, infusion pumps, …
- The data they collect are highly sensitive
- But can Android protect them?
What Goes Wrong here?
- Android is not designed to protect its external devices
- No device-app authentication ⇒ misbinding threat
Our Solution: SEACAT
Architecture (figure): Policy Manager, DAC Policy Manager Service, Policy Module (DAC, MAC), AVC, Fast Resource-Type Cache, BT stack
Security by Construction: What is the problem and How to make it work
What We Learned
What needs to be done
- Communication
- Find out whether expected protection has been provided by the system
- Challenges: limited documentation, default assumptions, etc.
- Evolution
- Individualize policy settings for apps with different protection demands
- How to make this happen is a million-dollar question
A Step Further: Automate Security Analysis
- Security requirements, utility constraints?
- Attacker’s resources, information?
- Vulnerability discovery in complicated systems?
Towards Data-Driven, Intelligent Security
- Automatic understanding of the system
- Knowledge discovery from documents
- Automatic building of system model
- Automatic determination of security requirements
- Automatic analysis of the adversary
- Cyber threat intelligence gathering and analysis
- Intelligent vulnerability discovery
- Knowledge-driven system analysis
A Baby Step: Semantics-based Fuzzing
Toward Automated Vulnerability Discovery
- First, an easier problem: can we recover a known vulnerability automatically?
- Why important?
- Patching delay => Attack Window
- Security Implications of Public Bug Information
Why Hard?
- Complicated bugs cannot be patched by adding a check
- Limitations of symbolic execution and constraint solving
- path explosion
- limited formula solving capability
- Whole chunk of code is replaced
- Difficult to formalize how the patch works
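Path explosion can be made concrete with a small count. The sketch below is only an illustration, not a symbolic executor: it enumerates every true/false outcome for a number of independent branches, which is the set of paths an exhaustive explorer would have to consider. The function name is hypothetical.

```python
from itertools import product


def count_paths(num_branches):
    """Enumerate every outcome combination of independent two-way branches."""
    return sum(1 for _ in product((False, True), repeat=num_branches))


for n in (1, 5, 10, 20):
    print(n, count_paths(n))
```

The count doubles with each added branch, so even modest kernel functions can exceed what exhaustive exploration handles.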
How About Auxiliary Information?
- Various sources of vulnerability information
- How do experienced attackers benefit from auxiliary information?
- Question: is it possible to automate this process?
Semantics-Driven Fuzzing
- Basic idea:
- Target program: Linux kernel 4.0+
- Information sources: CVE reports, Linux git logs
- Results: 16 vulnerability types beyond input validation, 18 successful exploits, 2 unknown vulnerabilities
- Pipeline: retrieve guidance => guide SemFuzz => exploits
Guidance for CVE-2017-6347
Workflow
Stage 1 Stage 2
Retrieving Critical Variables
Type and name pairs (from the symbol table and parse tree):
- int offset
- struct sk_buff skb
- …
- struct sock sk
- unsigned int len
- …
Retrieving System Calls
- Identifying system call names is insufficient
- Building a knowledge base
- goal: keywords in descriptions ==> system call and parameter values
- source: Linux Programmer Manual (LPM)
- result: 1082 LPM pages, 373 system calls, 2000+ keywords
Example: description keywords MSG_MORE, loopback, UDP
- match syscall name ==> syscall: socket, sendto
- MSG_MORE ==> sendto(flags = MSG_MORE)
- loopback ==> sendto(dest_addr = {INADDR_LOOPBACK})
- UDP ==> socket(socket_type = SOCK_DGRAM)
- Resulting calls: r0 = socket(AF_INET, SOCK_DGRAM, 0), then sendto(r0, ..., MSG_MORE, {INADDR_LOOPBACK}, …)
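The keyword lookup can be sketched as a dictionary from description keywords to (system call, parameter, value) hints. This is a hand-written toy, not SemFuzz's knowledge base: the three entries mirror the example on the slide, while the description string, the function name and the case-insensitive substring matching are assumptions made for illustration.

```python
# Toy knowledge base: keyword -> (system call, parameter, value).
KNOWLEDGE_BASE = {
    "MSG_MORE": ("sendto", "flags", "MSG_MORE"),
    "loopback": ("sendto", "dest_addr", "INADDR_LOOPBACK"),
    "UDP": ("socket", "socket_type", "SOCK_DGRAM"),
}


def syscall_hints(description):
    """Collect per-syscall parameter hints for keywords found in the text."""
    text = description.lower()
    hints = {}
    for keyword, (syscall, param, value) in KNOWLEDGE_BASE.items():
        if keyword.lower() in text:  # naive substring match
            hints.setdefault(syscall, {})[param] = value
    return hints


# Made-up description text in the style of a vulnerability report.
description = "crash when sending UDP data with MSG_MORE over the loopback interface"
print(syscall_hints(description))
```

A fuzzer could use such hints to seed the first inputs, for example a SOCK_DGRAM socket followed by sendto with MSG_MORE, rather than starting from random system calls.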
Effectiveness of Semantics-based Fuzzing
- Result
- 16% (18/112) trigger the target vulnerability
- 49% (46/94) reach the vulnerable functions
- 20% (19/94) reach the patched basic blocks
- Zero-day vulnerability
- found when fuzzing CVE-2016-4794
- new vulnerability appears around the known flaws
- reported and confirmed
- Undisclosed vulnerability
- found when fuzzing CVE-2016-3841
- similar problems inside equivalent components
- patched before we reported, but no reports disclosed
Performance
- Trigger vulnerability
- count: 18 (SemFuzz) vs. 7 (Syzkaller)
- time: 13.2h (SemFuzz) vs. 33.9h (Syzkaller)
- Reach vulnerable functions
- count: 18 (SemFuzz) vs. 14 (Syzkaller)
- time: 1.8h (SemFuzz) vs. 5.2h (Syzkaller)
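The reported times can be turned into ratios with a quick calculation. The numbers below are copied from the slide; the slide does not say whether the times are averages or totals, so the sketch only compares them as given, and the names are hypothetical.

```python
# (count, hours) per tool, as reported on the slide.
RESULTS = {
    "trigger vulnerability": {"SemFuzz": (18, 13.2), "Syzkaller": (7, 33.9)},
    "reach vulnerable functions": {"SemFuzz": (18, 1.8), "Syzkaller": (14, 5.2)},
}


def time_ratio(task):
    """Syzkaller time divided by SemFuzz time for one task."""
    return RESULTS[task]["Syzkaller"][1] / RESULTS[task]["SemFuzz"][1]


for task in RESULTS:
    print(f"{task}: {time_ratio(task):.2f}x")
```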
Future of System Security Research
Where Technologies Go, Opportunities Follow
- Machine Learning and Security
- Adversarial learning => secure ML
- Inference attacks on ML models => privacy-preserving ML
- Security in Smart Things and CPS
- Smart-home/smart-city security
- Industrial control security
- Smart grid security
- Biomedical Data Privacy
- Genomic data privacy (www.humangenomeprivacy.org)
- Other Omics privacy
- Others (e.g., blockchain)
Riding the New Tech Wave
- Data-centric, Intelligent Security
- NLP-enhanced protection (e.g., CTI gathering, analysis)
- AI (ML/reasoning) based protection (e.g., Intelligent CTF)
- Hardware enhanced protection
- Scalable TEE-based protection
Moving Forward
- Learn
- Understand it, analyze it and crack it
- Think
- Ask BIG questions, seek deep insights
- Do
- Protect what needs to be protected
- Build what will be used
Data-Centric Intelligent Security
System Security Lab
- One of the leading security research groups
- Ranked 5th in the past 17 years (http://s3.eurecom.fr/~balzarot/notes/top4/index.html)
- Awards:
- Award for outstanding research on privacy enhancing technologies (the PET Award)
- Best practical paper award at the IEEE Symposium on Security and Privacy
- Winner (3rd place), National Security Innovation Competition
- Best paper award in applied cybersecurity research, CSAW (twice)
- Funding: around 9 million in federal grants (NSF, NIH, ARO, etc.)
- Prominent systems:
- SecUP: installed over 220,000 times across 163 countries
- MassVet: evaluated by over 90 organizations in North America, Asia, Europe, Australia and Africa
- Impacts:
- Led to changes in SSO and payment development, the OS X keychain redesign, iOS scheme improvements, enhancements to Android upgrades, and the mitigation of side channels on Android
- iDASH genome privacy challenge: bringing security techniques to the biomedical community
- News coverage: CNN, MSNBC, Forbes, The New York Times, etc. (thousands of times)
iDASH Secure Genome Analysis Competition
INSPiRES Collaboration Lab
- Open, Inclusive and Collaborative
- Including those whom I advised and who are still collaborating with me
- Indiana University
- 10 PhD students now
- 2 Postdocs and 1 visitor
- Chinese Academy of Sciences
- 1 faculty, 4 PhD students, 8 MS students
- Chinese University of Hong Kong
- 1 faculty, 6 PhD students
- Other Universities
- 3 PhD students from UIUC, 1 from Georgia Tech, 1 from CMU, 1 from Tsinghua and 1 from Peking University
- Industry collaborators (Samsung, EMC, Google, etc.)
Our Research
- System Security New Directions
- Security analysis of emerging computing systems
- Mobile and cloud security
- IoT security
- TEE (SGX, TrustZone)
- Semantics-based intelligent code analysis
- Cognitive Security and Cyber Crimes
- Automatic Cyber Threat Intelligence collection and analysis
- Automatic threat responses
- Measurement and investigation of underground business
- Data Privacy, Intelligent Security Protection
- Human genome privacy (iDASH Secure Genome Analysis Competition)
- Learning and security
STUDENT RECRUITMENT
WE ARE LOOKING FOR INNOVATORS, NOT NATIVE SPEAKERS
- GRE CAN BE WAIVED
- TOEFL around 90 (University Requirement)
- For Undergraduates
- Are you TOP in your class? Are you Passionate about Research? Do you Want to Innovate? Do you Want to Publish?
- If so, WE CAN ADMIT YOU AT THE END OF YOUR THIRD YEAR
- For MS Students
- Can you think? Can you execute your idea? Are you doing research?
- DO YOU WANT TO CHANGE THE WORLD?
Visitors
- For PhD students:
- Do you have an idea but do not know how to make it HAPPEN?
- Do you want to know HOW to do world-class system security research?
- For Junior Faculty:
- You are welcome to visit my group and forge collaboration
TALK TO ME xw7@indiana.edu www.informatics.indiana.edu/xw7
SmartAuth: User-Centered Authorization for the Internet of Things
Smart-home apps improve quality of life, but are risky
Users have limited information about what is going on
- Parties: user, smart-home cloud, smart-home hub
- Steps: install apps, deploy apps, send notifications
- Functionalities explained to the user vs. operations that the app indeed performs
Can we notify users about the most important information?
For behaviors related to functionality, we don’t have to. We should notify them about unexpected behaviors. This app doesn’t need to control the lock!
Challenges
- Security and privacy implications depend on context
- Same sensor in bedroom vs. outside has very different implications
- Behaviors in code cannot be mapped directly to high-level functionality in description
- Need to support cross-device scenarios
Previous solutions will not work
Solution             Context-aware   Automatic   Usable   Security
Manifest Permission  No              Yes         No       No
Prompt Permission    Yes             No          No       No
SmartAuth            Yes             Yes         Yes      Yes
Redesign the authorization system
Goals:
- Security and privacy: share minimum data and capabilities for desired functionality
- IoT specific: cross-device, context-based, automatic control
- Usability: assist users in making well-informed decisions, minimize user burden
- Performance: lightweight and compatible
SmartAuth overview
- Components: Program Analyzer, Content Inspector (NLP, e.g., [2]), Consistency Checker, Authorization Creator, Policy Enforcer
- Artifacts: app source code, app description, security policy
[2] The Stanford Parser: A statistical parser
An example – program analyzer
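// Excerpt: preference section that declares the humidity sensor input
section("Bathroom humidity sensor") {
    input "bathroom", "capability.relativeHumidityMeasurement", title: "Which humidity sensor?"
}
// Excerpt: turn the coffee machine on when the reading exceeds the threshold
if (shower.value.toInteger() > relHum) {
    coffee.on()
}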
An example – NLP and behavior correlation
Description analysis
- Entity: coffee machine, shower
- Condition: taking a shower
- Action: turn on the coffee machine
- The condition triggers the action
Program analysis
- Entity: switch, humidity sensor, lock
- Context clue: "Bathroom" for the humidity sensor, "Coffee" for the switch
- Condition: humidity reading > threshold
- Action: turn on the switch, unlock the door
- The condition triggers the actions
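The correlation step can be sketched as a comparison of two behavior sets. The code below is a simplified illustration rather than SmartAuth's implementation: context clues map code-level devices to the entities named in the description, and any code behavior without a matching described behavior is flagged. All names and data structures are hypothetical; the example values come from the slide.

```python
# (device type, context clue) in code -> entity named in the description.
CONTEXT_CLUES = {
    ("humidity sensor", "bathroom"): "shower",
    ("switch", "coffee"): "coffee machine",
}

# Behavior promised by the description: (trigger entity, target entity, action).
DESCRIBED = {("shower", "coffee machine", "on")}

# Behaviors from code: (trigger device, clue, target device, clue, action).
IMPLEMENTED = [
    ("humidity sensor", "bathroom", "switch", "coffee", "on"),
    ("humidity sensor", "bathroom", "lock", None, "unlock"),
]


def resolve(device, clue):
    """Translate a code-level device to a description-level entity if possible."""
    return CONTEXT_CLUES.get((device, clue), device)


def unexpected_behaviors(described, implemented):
    """Return code behaviors that have no counterpart in the description."""
    flagged = []
    for trig_dev, trig_clue, tgt_dev, tgt_clue, action in implemented:
        behavior = (resolve(trig_dev, trig_clue), resolve(tgt_dev, tgt_clue), action)
        if behavior not in described:
            flagged.append(behavior)
    return flagged


print(unexpected_behaviors(DESCRIBED, IMPLEMENTED))
```

Here the coffee-machine behavior matches the description, while unlocking the door does not and would be surfaced to the user.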
Interface generation
- Match users’ mental models
- Less burden for users
- Alert users about unexpected behaviors
- Survey users’ perspectives on installing smart-home apps
- Iterative design with pilot studies
Enforcement
- 3.9% false positives in policy identification, no false negatives
- Does not undermine app functionalities
- Performance:
Effectiveness
Users make better decisions with SmartAuth
Darken Behind Me
- Goal: Bridge the semantic gap between what a user sees (app descriptions) and what an app’s code actually does
- NLP to understand descriptions and code annotations
- Program analysis to understand code
- Match insights from NLP to program analysis
- Users are much more likely to choose safer apps with SmartAuth
- Working with Samsung for deployment