UT DALLAS UT DALLAS
Erik Jonsson School of Engineering & Computer Science FEARLESS engineering
Sedic: Privacy-Aware Data Intensive Computing on Hybrid Clouds
- K. Zhang, X. Zhou, Y. Chen, X. Wang, Y. Ruan
Motivation
demand
– Amazon EC2, EMR, etc.
HIPAA
– Prohibitively expensive
– Hard to scale
Motivation
– Split computations
– Send computations over non-sensitive info to public cloud
– Send computations over sensitive info
– Designed for a single cloud
– Unaware of data with multiple security levels
– Manual splitting of processing required
Public Private Hybrid
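The idea of splitting a computation by data sensitivity can be illustrated with a minimal sketch. This is not Sedic's implementation: the record format (a text plus a boolean sensitivity label), the function names, and the word-count Map function are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical labeled record: (text, is_sensitive).
Record = Tuple[str, bool]
KeyValue = Tuple[str, int]


def split_by_label(records: List[Record]) -> Tuple[List[str], List[str]]:
    """Separate records into public and sensitive lists using their labels."""
    public = [text for text, sensitive in records if not sensitive]
    private = [text for text, sensitive in records if sensitive]
    return public, private


def word_count_map(text: str) -> List[KeyValue]:
    """An example Map function: emit (word, 1) for each word."""
    return [(word, 1) for word in text.split()]


def run_hybrid_map(records: List[Record]) -> Tuple[List[KeyValue], List[KeyValue]]:
    """Run the same Map function on both sides.

    The first result stands in for work done on the public cloud,
    the second for work that stays on the private cloud.
    """
    public, private = split_by_label(records)
    public_out = [kv for text in public for kv in word_count_map(text)]
    private_out = [kv for text in private for kv in word_count_map(text)]
    return public_out, private_out
```

In this sketch the same Map function runs unchanged on both partitions; only the placement of the input differs, so no manual splitting of the processing logic is needed.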
Sedic – Objectives
– Only public data is given to a commercial cloud
– Move as much computation to the public cloud as possible while respecting a user’s privacy
– Preserve MapReduce scalability while keeping a low privacy protection overhead
– Since it is expensive
– Preserve end-user’s MapReduce experience
Sedic – Design Overview
Sedic – Design
Sedic – Data Labeling and Replication
[Figure: Data Labeling (sensitive data identified and labeled) and Data Replication]
Sedic – Map Task Management
Sedic – Reduction Planning
– Very large inter-cloud communication
inter-cloud data transfer
– Scheduler stops assigning Map tasks to public clouds once the limit is reached
– Constrains the amount of public cloud computation
– Leverage the associative and commutative properties of fold loops in Reduce
clouds
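To see why associativity and commutativity help, consider a minimal sketch in which each cloud folds its own Map output locally and only the small per-key partial results cross the cloud boundary. The function names and the use of addition as the fold operation are assumptions for illustration, not Sedic's actual transformation.

```python
import operator
from collections import defaultdict
from functools import reduce
from typing import Callable, Dict, List, Tuple

KeyValue = Tuple[str, int]
FoldOp = Callable[[int, int], int]


def combine(pairs: List[KeyValue], op: FoldOp = operator.add) -> Dict[str, int]:
    """Fold all values of each key on one cloud (a local partial reduce)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce(op, values) for key, values in groups.items()}


def merge_partials(
    public_partial: Dict[str, int],
    private_partial: Dict[str, int],
    op: FoldOp = operator.add,
) -> Dict[str, int]:
    """Finish the reduction on the private cloud from per-cloud partials.

    This equals a single global fold only if op is associative and
    commutative, which is the property the slide refers to.
    """
    merged = dict(private_partial)
    for key, value in public_partial.items():
        merged[key] = op(merged[key], value) if key in merged else value
    return merged
```

Under these assumptions, the data sent between clouds is one value per key per cloud rather than every intermediate pair, which is the reduction in inter-cloud transfer that the bullets above aim for.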
Sedic – Automatic Reducer Analysis and Transformation
Conclusions
sensitive data
that allow public clouds to process data