Differentially Private Multi-Dimensional Time Series Release for Traffic Monitoring

DBSec'13. Liyue Fan, Li Xiong, Vaidy Sunderam. Department of Math & Computer Science, Emory University. 9/4/2013.

  1. DBSec'13: Differentially Private Multi-Dimensional Time Series Release for Traffic Monitoring. Liyue Fan, Li Xiong, Vaidy Sunderam, Department of Math & Computer Science, Emory University

  2. Outline (9/4/2013, DBSec'13: Privacy Preserving Traffic Monitoring)
     • Traffic Monitoring
     • User Privacy
     • Challenges
     • Proposed Solutions
       • Temporal Estimation
       • Spatial Estimation
     • Empirical Evaluation

  3. Monitoring Traffic
     • Congestion / trending places / everyday life
     • How many cars are there? Where are they?
     (Figures: Monital Metropol, Brazil; Google Traffic View)

  4. Traffic Monitoring
     • Real-time GPS data → traffic histogram
     • At any timestamp: real-time user locations are aggregated into a 2D histogram

  5. User Privacy
     • User privacy should be protected when releasing their data!
     • Real-time location data is sensitive
       • pleaserobme.com
     • GPS traces are identifying
       • "We study fifteen months of human mobility data for one and a half million individuals and find that human mobility traces are highly unique. … in a dataset where the location of an individual is specified hourly, and with a spatial resolution equal to that given by the carrier's antennas, four spatio-temporal points are enough to uniquely identify 95% of the individuals."
         De Montjoye, Yves-Alexandre, Cesar A. Hidalgo, Michel Verleysen, and Vincent D. Blondel. "Unique in the Crowd: The Privacy Bounds of Human Mobility." Scientific Reports 3 (2013)

  6. Differentially Private Data Sharing

  7. Differential Privacy (in a nutshell)
     • Rigorous definition
     • Doesn't stipulate the prior knowledge of the attacker
     • Upon seeing the published data, an attacker should gain little knowledge about any specific individual.
     • α-Differential Privacy [BLR08]
     • α is the privacy budget: smaller α values (α < 1) indicate a stronger privacy guarantee

  8. Static α-Differential Privacy
     • Laplace perturbation for a dataset $D$ and query $f$: $A(D) = f(D) + \mathrm{Lap}(\Delta f/\alpha)^d$
     • Strong privacy → high perturbation noise
     • Global sensitivity: $\Delta f = \max_{D, D'} \lVert f(D) - f(D') \rVert_1$
     • Laplace perturbation example with $\Delta f = 1$, where $\tilde{c}_j = c_j + \mathrm{Lap}(1/\alpha)$:
       $f(D)$: $c_1$: 2, $c_2$: 1, $c_3$: 3, $c_4$: 4
       $A(D)$: $c_1$: 1, $c_2$: 0, $c_3$: 5, $c_4$: 3
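The Laplace perturbation on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `laplace_noise` and `laplace_perturb` are hypothetical, and the noise is drawn as the difference of two exponential samples, which follows a Laplace distribution with the same scale.

```python
import random

def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def laplace_perturb(hist, alpha, sensitivity=1.0, rng=None):
    """Return a copy of a 2D histogram with Laplace(sensitivity / alpha) noise per cell."""
    rng = rng or random.Random()
    scale = sensitivity / alpha
    return [[count + laplace_noise(scale, rng) for count in row] for row in hist]

# f(D) from the slide example: cells c1, c2 in the first row, c3, c4 in the second.
true_hist = [[2, 1], [3, 4]]
noisy_hist = laplace_perturb(true_hist, alpha=1.0, rng=random.Random(0))
```

A smaller `alpha` gives a larger noise scale, which is the "strong privacy → high perturbation noise" trade-off stated on the slide.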

  9. Composability of Differential Privacy
     • Sequential Composition [McSherry10]
       • Let $A_k$ each provide $\alpha_k$-differential privacy. A sequence of $A_k(D)$ over dataset $D$ provides $\sum_k \alpha_k$-differential privacy.
     • Timestamp $k = 0, \dots, T-1$
       • $f_k(D)$: 2D cell histogram at time $k$
       • $A_k(D)$: released 2D histogram that satisfies $\frac{\alpha}{T}$-DP
     • $A_0(D), \dots, A_{T-1}(D)$ satisfies $\alpha$-DP

  10. Baseline Solution: LPA
      • Laplace Perturbation Algorithm
      • For each timestamp $k$: release $A_k(D) = f_k(D) + \mathrm{Lap}(T/\alpha)^d$
      • High perturbation noise for long time series, i.e. when $T$ is large
      • Low utility output since data is sparse
      • Relative error example: with $f(D)$ = ($c_1$: 2, $c_2$: 1, $c_3$: 3, $c_4$: 4) and $A(D)$ = ($c_1$: 1, $c_2$: 0, $c_3$: 5, $c_4$: 3), the relative error is 50% for $c_1$ and 100% for $c_2$
      • Fact: location data is VERY sparse.
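A rough sketch of the LPA baseline follows, assuming sensitivity 1 per timestamp and an even split of the budget α over T timestamps, as on the slides. The names `lpa_noise_scale` and `lpa_release` are hypothetical, and the input series is made up for illustration.

```python
import random

def lpa_noise_scale(T, alpha):
    """Per-timestamp Laplace scale: sensitivity 1 divided by the per-step budget alpha / T."""
    return T / alpha

def lpa_release(series, alpha, rng=None):
    """Perturb every cell of every 2D histogram in the series with Laplace(T / alpha) noise."""
    rng = rng or random.Random()
    scale = lpa_noise_scale(len(series), alpha)

    def lap():
        # Difference of two exponentials is Laplace-distributed with the same scale.
        return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

    return [[[count + lap() for count in row] for row in hist] for hist in series]

# Hypothetical series of three identical 2x2 histograms.
series = [[[1, 0], [5, 3]] for _ in range(3)]
released = lpa_release(series, alpha=1.0, rng=random.Random(0))
```

The expected absolute value of Laplace noise equals its scale, so with T = 100 and α = 1 the typical noise per cell is around 100. That is far larger than the single-digit counts in the slide examples, which is why the baseline gives low utility on sparse data.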

  11. Two Proposed Solutions
      • Temporal Estimation for each cell: utilize a time series model and posterior estimation to reduce perturbation error.
      • Spatial Estimation within each partition: group similar cells together to overcome data sparsity.
      (Figure: cells $c_1$, $c_2$, $c_3$, $c_4$ and an example 4×4 cell histogram)
        1 1 0 0
        1 2 1 0
        2 3 4 4
        3 3 6 10

  12. Framework
      • Domain knowledge: known Sparse or Dense label for each cell.
      • Pipeline: Raw Series → Laplace Perturbation → Estimation → Differentially Private Series, with Modeling/Partitioning informed by the domain knowledge
      • Doesn't incur extra differential privacy cost

  13. Temporal Estimation
      • For each cell, its count series $\{x_k\}$, $k = 0, \dots, T-1$
        • e.g. {3, 3, 4, 5, 4, 3, 2, …}
      • Process Model: $x_{k+1} = x_k + \omega$, $\omega \sim \mathcal{N}(0, Q)$
        • $Q$: small value for Sparse cells; large value for Dense cells.
      • Measurement Model: $z_k = x_k + \nu$, $\nu \sim \mathrm{Lap}(T/\alpha)$
      • Goal: given $z_k$ and the above models, estimate $x_k$.

  14. Temporal Estimation (cont.)
      • Estimation algorithm based on the Kalman filter
        • O(1) computation per timestamp
      • Gaussian approximation: $\nu \sim \mathcal{N}(0, R)$, $R \propto T^2/\alpha^2$
      • Model-based prediction
      • Posterior estimate/output: linearly combine prediction and measurement
      Fan and Xiong CIKM'12, TKDE'13
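The prediction and correction steps can be sketched as a scalar Kalman filter over one cell's perturbed series. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function name `kalman_filter`, the initial values `x0` and `P0`, the sample series, and the choice of `Q` are hypothetical. Setting `R` to the Laplace variance 2(T/α)² is one option consistent with R ∝ T²/α².

```python
def kalman_filter(z, Q, R, x0=0.0, P0=1.0):
    """Scalar Kalman filter for x_{k+1} = x_k + w, z_k = x_k + v with w ~ N(0, Q), v ~ N(0, R)."""
    x_post, P_post = x0, P0
    estimates = []
    for z_k in z:
        # Prediction: the process model keeps the state and grows its uncertainty by Q.
        x_prior = x_post
        P_prior = P_post + Q
        # Correction: linearly combine the prediction and the measurement via the gain K.
        K = P_prior / (P_prior + R)
        x_post = x_prior + K * (z_k - x_prior)
        P_post = (1.0 - K) * P_prior
        estimates.append(x_post)
    return estimates

T, alpha = 100, 1.0
R = 2.0 * (T / alpha) ** 2                        # assumed: variance of Laplace(T / alpha)
noisy_series = [8.0, -1.0, 6.0, 2.0, 5.0]         # hypothetical perturbed counts for one cell
smoothed = kalman_filter(noisy_series, Q=1.0, R=R, x0=noisy_series[0])
```

Each timestamp needs a constant number of arithmetic operations, which matches the O(1) per-timestamp cost noted on the slide.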

  15. Temporal Estimation Example
      • For cell $c$, at time $k$:
        • Suppose $x_k = 4$
        • Prediction $\hat{x}_k^-$, e.g. 2
        • Measurement/Laplace-perturbed value $z_k$, e.g. 8
        • Posterior estimation $\hat{x}_k$, e.g. 3
      • Impact of perturbation noise is reduced by taking into account the process model and prediction!
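The example values are consistent with a Kalman-style correction step, in which the posterior equals the prediction plus a gain times the difference between measurement and prediction. The worked equation below assumes that standard form (the gain symbol $K_k$ does not appear on the slide) and solves for the gain implied by the numbers:

```latex
\hat{x}_k = \hat{x}_k^- + K_k\,(z_k - \hat{x}_k^-)
\quad\Longrightarrow\quad
3 = 2 + K_k\,(8 - 2)
\quad\Longrightarrow\quad
K_k = \tfrac{1}{6}
```

With the true count $x_k = 4$, the absolute error drops from $|8 - 4| = 4$ for the raw perturbed value to $|3 - 4| = 1$ for the posterior estimate.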

  16. Spatial Estimation
      • Goal: group cells to overcome data sparsity.
      • First partition the space until each partition contains Sparse or Dense cells only
        • Top-down algorithm based on QuadTree
        • Data independence and efficiency
      • For each timestamp $k$:
        • $f'_k(D)$: partition counts, with $\Delta f' = 1$
        • $A'_k(D) = f'_k(D) + \mathrm{Lap}(T/\alpha)^{d'}$
        • Release $\hat{f}_k(D)$ estimated from $A'_k(D)$
      • Each cell is visited O(1) times at each timestamp.
      (Figure: 4×4 grid of Sparse (S) and Dense (D) cell labels)
        S S S S
        S S S S
        S S S S
        S S D D
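A top-down quadtree split over the Sparse/Dense labels might look like the sketch below. It assumes a square grid whose side is a power of two, and the function name `quadtree_partition` and the `(row, col, size)` tuple format are hypothetical. Only the public labels are inspected, never the counts, which is what makes the partitioning data independent.

```python
def quadtree_partition(labels, row=0, col=0, size=None):
    """Split a square label grid into quadrants until each block holds a single label.

    Returns a list of (row, col, size) blocks; assumes the side length is a power of two.
    """
    if size is None:
        size = len(labels)
    seen = {labels[r][c] for r in range(row, row + size) for c in range(col, col + size)}
    if len(seen) == 1 or size == 1:
        return [(row, col, size)]
    half = size // 2
    blocks = []
    for dr in (0, half):
        for dc in (0, half):
            blocks += quadtree_partition(labels, row + dr, col + dc, half)
    return blocks

# Label grid from the slide: all Sparse except two Dense cells in the last row.
labels = [list("SSSS"), list("SSSS"), list("SSSS"), list("SSDD")]
partitions = quadtree_partition(labels)
```

On the slide's label grid this produces three 2×2 Sparse blocks and four single-cell blocks in the bottom-right quadrant, seven partitions in total.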

  17. Spatial Estimation Example
      • At time $k$: perturbation noise is evenly distributed to every cell within the partition.
      Original cell histogram $f_k(D)$:
        1 1 0 0
        1 2 1 0
        2 3 4 4
        3 3 6 10
      Partition histogram $f'_k(D)$: top-left quadrant 5; top-right quadrant 1; bottom-left quadrant 11; bottom-right cells 4, 4 / 6, 10
      Laplace-perturbed histogram $A'_k(D)$: top-left quadrant 6; top-right quadrant 0; bottom-left quadrant 12; bottom-right cells 5, 3 / 6, 11
      Estimated cell histogram $\hat{f}_k(D)$:
        1 1 0 0
        1 1 0 0
        3 3 5 3
        3 3 6 11
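The example can be retraced with a short sketch that sums cells into partitions and then spreads each perturbed partition count evenly over its cells. The function names are hypothetical, the partition list is hard-coded as `(row, col, size)` blocks, and the perturbed counts are copied from the slide instead of being sampled.

```python
def partition_counts(hist, partitions):
    """Sum the cell counts inside each (row, col, size) block."""
    return [
        sum(hist[r][c] for r in range(r0, r0 + s) for c in range(c0, c0 + s))
        for (r0, c0, s) in partitions
    ]

def estimate_cells(noisy_counts, partitions, n):
    """Spread each (perturbed) partition count evenly over the cells of its block."""
    est = [[0.0] * n for _ in range(n)]
    for (r0, c0, s), total in zip(partitions, noisy_counts):
        share = total / (s * s)
        for r in range(r0, r0 + s):
            for c in range(c0, c0 + s):
                est[r][c] = share
    return est

hist = [[1, 1, 0, 0], [1, 2, 1, 0], [2, 3, 4, 4], [3, 3, 6, 10]]
partitions = [(0, 0, 2), (0, 2, 2), (2, 0, 2), (2, 2, 1), (2, 3, 1), (3, 2, 1), (3, 3, 1)]
counts = partition_counts(hist, partitions)       # partition histogram
perturbed = [6, 0, 12, 5, 3, 6, 11]               # perturbed values taken from the slide
estimated = estimate_cells(perturbed, partitions, n=4)
```

Even division gives 6 / 4 = 1.5 for each top-left cell, where the slide shows 1, presumably a rounded display. All other cells match the slide's estimated histogram.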

  18. Evaluation: Data
      • Generated moving objects on a road network
        • City of Oldenburg, Germany
        • 500K objects at the beginning
        • 25K new objects at every timestamp
        • Total time: 100 timestamps
      • Two-dimensional 1024 by 1024 grid over the city map
        • Each cell represents 400 m²
      • Record object locations at cell resolution
      • 95% of cells are labeled Sparse!
      http://iapg.jade-hs.de/personen/brinkhoff/generator/

  19. Temporal Estimation
      (Figure: cell count over time for one cell, comparing the original series (orig) with the Laplace-perturbed and Kalman-filtered series)

  20. Spatial Partitions
      (Figures: Oldenburg Road Network; Partitions by QuadTree)

  21. Evaluation: Utility vs. Privacy
      • Utility of each cell: Average Relative Error of released series
      • For each α value, median utility among each class is plotted
      DFT: Rastogi and Nath, SIGMOD'10
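The utility metric could be computed along the following lines. The slide does not give the exact formula, so this sketch makes two assumptions: relative error is taken per timestamp and averaged over the series, and a floor `delta` on the denominator (a hypothetical parameter) avoids division by zero for empty cells. The per-cell series are made up for illustration.

```python
from statistics import median

def average_relative_error(true_series, released_series, delta=1.0):
    """Mean of |released - true| / max(true, delta) over all timestamps of one cell."""
    errors = [abs(r - x) / max(x, delta) for x, r in zip(true_series, released_series)]
    return sum(errors) / len(errors)

# Hypothetical (true, released) series for three cells of the same class.
cells = [
    ([2, 2], [1, 1]),
    ([1, 1], [0, 0]),
    ([4, 4], [3, 3]),
]
per_cell_error = [average_relative_error(t, r) for t, r in cells]
class_median = median(per_cell_error)
```

Reporting the median per class (Sparse or Dense) keeps a few extreme cells from dominating the summary.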

  22. Evaluation: Range Queries
      • How many objects are in the area of $m$ by $m$ cells at every timestamp?
      • For each $m$, 100 areas are randomly selected and evaluated.
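A range query of this kind reduces to summing an m-by-m block of the histogram at each timestamp. The sketch below is illustrative only: the function names are hypothetical, uniform sampling of the top-left corner is an assumption about how areas are "randomly selected", and the two histograms are the original and estimated grids from the spatial estimation example.

```python
import random

def range_query(hist, top, left, m):
    """Count the objects inside the m-by-m block whose top-left cell is (top, left)."""
    return sum(hist[r][c] for r in range(top, top + m) for c in range(left, left + m))

def random_areas(n, m, count, rng=None):
    """Pick `count` top-left corners so that every m-by-m block fits in an n-by-n grid."""
    rng = rng or random.Random()
    return [(rng.randrange(n - m + 1), rng.randrange(n - m + 1)) for _ in range(count)]

true_hist = [[1, 1, 0, 0], [1, 2, 1, 0], [2, 3, 4, 4], [3, 3, 6, 10]]
est_hist = [[1, 1, 0, 0], [1, 1, 0, 0], [3, 3, 5, 3], [3, 3, 6, 11]]

true_answer = range_query(true_hist, 2, 2, 2)     # bottom-right 2x2 block
est_answer = range_query(est_hist, 2, 2, 2)
areas = random_areas(n=4, m=2, count=100, rng=random.Random(0))
```

For the bottom-right block the true answer is 24 and the estimated answer is 25, so the query error here is 1 object.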

  23. Evaluation: Runtime
      • Overall runtime is plotted in milliseconds.

  24. Conclusion
      • Difficult when time series is long and data is sparse!
      • Domain knowledge can be used for temporal modeling as well as spatial partitioning.
      • Output utility is improved with the same privacy guarantee.
      • We don't observe extra time cost with our solutions.
      • Ongoing work:
        • Utilize rich information in spatio-temporal data.
        • Model learning and parameter learning.
      • Contact: liyue.fan@emory.edu
      • AIMS Group: www.mathcs.emory.edu/aims

  25. Q&A
