Sequence Data
- Continuous Aggregates
- Distance-based sampling
- Transformation-based
- Model-based filtering and sampling
- Frequent sequential patterns
CS573 Data Privacy and Security
Differential Privacy: Sequence Data
Li Xiong
      t1    t2    t3
a    100    90   100
b     20    50    20
c     20    10    20
Haoran Li, Li Xiong, Xiaoqian Jiang, Jinfei Liu. Differentially Private Histogram Publication for Dynamic Datasets: An Adaptive Sampling Approach. CIKM 2015
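The idea of distance-based sampling on the a/b/c histogram above can be sketched as follows. This is a simplified illustration, not the algorithm from the cited paper: a new noisy histogram is released only when the noisy L1 distance to the last release exceeds a threshold, otherwise the last release is repeated. The names `release_stream`, `eps_dist`, `eps_pub` and `threshold`, the sensitivity of 1, and the per-timestamp budgets are all assumptions, and budget accounting across timestamps is left out.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. Exp(1) draws is Laplace(0, 1); rescale it.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def l1_distance(h1, h2):
    return sum(abs(x - y) for x, y in zip(h1, h2))


def release_stream(histograms, threshold, eps_dist, eps_pub, rng):
    """Return one released histogram per timestamp (illustrative only)."""
    released = []
    last = None
    for hist in histograms:
        if last is None:
            sample = True  # always publish the first timestamp
        else:
            # Noisy distance test against the last release (assumed sensitivity 1).
            noisy_dist = l1_distance(hist, last) + laplace(1.0 / eps_dist, rng)
            sample = noisy_dist > threshold
        if sample:
            # Publish a fresh Laplace-perturbed histogram (assumed sensitivity 1).
            last = [count + laplace(1.0 / eps_pub, rng) for count in hist]
        released.append(list(last))
    return released


# Columns t1, t2, t3 of the table; each histogram lists the counts of a, b, c.
stream = [[100, 20, 20], [90, 50, 10], [100, 20, 20]]
for t, hist in enumerate(release_stream(stream, 30.0, 0.5, 0.5, random.Random(0)), 1):
    print(f"t{t}:", [round(v, 1) for v in hist])
```

Skipping timestamps whose histogram barely changed is what saves budget in a scheme of this kind; how the threshold and the budgets are chosen is outside this sketch.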
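Particle filtering over the perturbed values: the state distribution is approximated by the weighted sum
$\sum_{i=1}^{N} \pi_k^i \, \delta(x_k - x_k^i)$,
where $\{x_k^i, \pi_k^i\}_{1}^{N}$ is a set of weighted samples/particles.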
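A set of weighted samples/particles can be propagated with a generic bootstrap particle filter step, sketched below. The random-walk process model, the Laplace-shaped likelihood for the perturbed observation, and the names `particle_filter_step`, `process_std` and `obs_scale` are assumptions made for illustration, not the model from the slides.

```python
import math
import random


def particle_filter_step(particles, weights, observation, rng,
                         process_std=1.0, obs_scale=1.0):
    """One predict / update / resample step over weighted particles."""
    n = len(particles)
    # Predict: move every particle with Gaussian process noise (assumed model).
    predicted = [x + rng.gauss(0.0, process_std) for x in particles]
    # Update: reweight by an (assumed) Laplace likelihood of the observation.
    raw = [w * math.exp(-abs(observation - x) / obs_scale)
           for w, x in zip(weights, predicted)]
    total = sum(raw)
    if total == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        new_weights = [1.0 / n] * n
    else:
        new_weights = [w / total for w in raw]
    # Posterior estimate: weighted mean of the particles.
    estimate = sum(w * x for w, x in zip(new_weights, predicted))
    # Resample in proportion to the weights, then reset to uniform weights.
    resampled = rng.choices(predicted, weights=new_weights, k=n)
    return resampled, [1.0 / n] * n, estimate


rng = random.Random(7)
particles = [95 + 0.05 * i for i in range(200)]
weights = [1.0 / len(particles)] * len(particles)
particles, weights, estimate = particle_filter_step(particles, weights, 100.0, rng)
print(round(estimate, 2))
```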
$(\Delta - \xi) / \xi$
Liyue Fan, Li Xiong, Vaidy Sunderam. Differentially Private Multi-Dimensional Time-Series Release for Traffic Monitoring. DBSec, 2013 (best student paper award)
S Xu, S Su, X Cheng, Z Li, L Xiong. Differentially Private Frequent Sequence Mining via Sampling-based Candidate Pruning. ICDE 2015
Database D
  ID    Record
  100   a→c→d
  200   b→c→d
  300   a→b→c→e→d
  400   d→b
  500   a→d→c→d

C1: cand 1-seqs (Scan D)
  Sequence   {a}   {b}   {c}   {d}   {e}
  Sup.        3     3     4     4     1

F1: freq 1-seqs
  Sequence   {a}   {b}   {c}   {d}
  Sup.        3     3     4     4

C2: cand 2-seqs (Scan D)
  Sequence                          Sup.
  {a→a} {a→b} {a→c} {a→d}           1 3 3
  {b→a} {b→b} {b→c} {b→d}           2 2 1
  {c→a} {c→b} {c→c} {c→d}           4
  {d→a} {d→b} {d→c} {d→d}           1 1

F2: freq 2-seqs
  Sequence   {a→c}   {a→d}   {c→d}
  Sup.         3       3       4
C3: cand 3-seqs (Scan D)
  Sequence   {a→c→d}

F3: freq 3-seqs
  Sequence   {a→c→d}
  Sup.         3
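The level-wise process on Database D (count candidates with one scan, keep the frequent ones, join them into longer candidates) can be sketched as below. This is a minimal illustration assuming a support threshold of 3 and a simple suffix/prefix join; the function names are hypothetical. Supports are recomputed from the five records as listed, so an individual count may differ from the slide's table.

```python
def is_subsequence(pattern, record):
    # True if the items of pattern occur in record in order (gaps allowed).
    it = iter(record)
    return all(item in it for item in pattern)


def support(pattern, db):
    return sum(is_subsequence(pattern, record) for record in db)


def next_candidates(frequent):
    # Join s1 and s2 when s1 without its first item equals s2 without its last.
    return sorted({s1 + s2[-1:]
                   for s1 in frequent for s2 in frequent
                   if s1[1:] == s2[:-1]})


def mine(db, min_sup):
    """Return a list of {sequence: support} dicts, one per level."""
    items = sorted({item for record in db for item in record})
    candidates = [(item,) for item in items]
    levels = []
    while candidates:
        counted = {c: support(c, db) for c in candidates}  # one scan of D
        frequent = {c: s for c, s in counted.items() if s >= min_sup}
        if not frequent:
            break
        levels.append(frequent)
        candidates = next_candidates(list(frequent))
    return levels


D = [tuple("acd"), tuple("bcd"), tuple("abced"), tuple("db"), tuple("adcd")]
for k, level in enumerate(mine(D, min_sup=3), 1):
    print(f"F{k}:", {"→".join(seq): sup for seq, sup in level.items()})
```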
Database D
  ID    Record
  100   a→c→d
  200   b→c→d
  300   a→b→c→e→d
  400   d→b
  500   a→d→c→d

C1: cand 1-seqs (Scan D), noise drawn from Lap(|C1| / ε1)
  Sequence   {a}   {b}   {c}   {d}   {e}
  Sup.        3     3     4     4     1
  noise: 0.2  0.4  0.8

F1: freq 1-seqs
  Sequence     {a}   {c}   {d}
  Noisy Sup.   3.2   4.4   3.5

C2: cand 2-seqs (Scan D), noise drawn from Lap(|C2| / ε2)
  Sequence                  Sup.
  {a→a} {a→c} {a→d}         3 3
  {c→a} {c→c} {c→d}         4
  {d→a} {d→c} {d→d}         1
  noise: 0.2  0.3  0.2  0.8  0.2  0.3  2.1

F2: freq 2-seqs
  Sequence     {a→c}   {a→d}   {c→d}   {d→c}
  Noisy Sup.    3.3     3.2     4.2     3.1

C3: cand 3-seqs (Scan D), noise drawn from Lap(|C3| / ε3)
  Sequence   {a→c→d}   {a→d→c}
  Sup.          3         1
  noise: 0.3

F3: freq 3-seqs
  Sequence     {a→c→d}
  Noisy Sup.      3
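The noise scale Lap(|Ck| / εk) follows from sensitivity: adding or removing one record changes the support of each candidate by at most 1, so the |Ck| counts released at level k have L1 sensitivity at most |Ck|. A minimal sketch of one noisy level is shown below; the function names, the threshold, and the per-level budget `eps` are assumptions made for illustration.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. Exp(1) draws is Laplace(0, 1); rescale it.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def is_subsequence(pattern, record):
    it = iter(record)
    return all(item in it for item in pattern)


def support(pattern, db):
    return sum(is_subsequence(pattern, record) for record in db)


def laplace_scale(num_candidates, eps):
    # Scale of Lap(|Ck| / eps_k) for a level with num_candidates candidates.
    return num_candidates / eps


def noisy_frequent(candidates, db, eps, threshold, rng):
    """Perturb every candidate support, then keep those above the threshold."""
    scale = laplace_scale(len(candidates), eps)
    noisy = {c: support(c, db) + laplace(scale, rng) for c in candidates}
    return {c: s for c, s in noisy.items() if s >= threshold}


D = [tuple("acd"), tuple("bcd"), tuple("abced"), tuple("db"), tuple("adcd")]
C1 = [(x,) for x in "abcde"]
print(noisy_frequent(C1, D, eps=1.0, threshold=3.0, rng=random.Random(1)))
```

Because the scale grows with the number of candidates, a large candidate set drowns the true supports in noise, which is the motivation for pruning candidates before this step.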
Original Database → Partition → 1st sample database, 2nd sample database, ……, mth sample database
kth sample database
Original Database → Compute noisy support (Laplace Mechanism)
kth Sample Database → Transformed Sample Database → Local noisy support (Laplace Mechanism)
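The sampling-based pipeline (partition the original database into sample databases, then use the kth sample to prune the level-k candidates by a local noisy support) can be sketched roughly as follows. This is only an outline under stated assumptions: the random round-robin partition, the names `partition` and `prune_with_sample`, the local threshold, and the noise scale are all hypothetical, and the transformation of the sample database is not modelled.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. Exp(1) draws is Laplace(0, 1); rescale it.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def is_subsequence(pattern, record):
    it = iter(record)
    return all(item in it for item in pattern)


def support(pattern, db):
    return sum(is_subsequence(pattern, record) for record in db)


def partition(db, m, rng):
    """Split db into m disjoint sample databases after a random shuffle."""
    shuffled = list(db)
    rng.shuffle(shuffled)
    return [shuffled[i::m] for i in range(m)]


def prune_with_sample(candidates, sample, local_threshold, noise_scale, rng):
    """Keep candidates whose local noisy support in the sample is high enough."""
    kept = []
    for candidate in candidates:
        local = support(candidate, sample) + laplace(noise_scale, rng)
        if local >= local_threshold:
            kept.append(candidate)
    return kept


D = [tuple("acd"), tuple("bcd"), tuple("abced"), tuple("db"), tuple("adcd")]
rng = random.Random(5)
samples = partition(D, m=2, rng=rng)
C1 = [(x,) for x in "abcde"]
# Level 1 uses the 1st sample database; the threshold is scaled down by hand here.
print(prune_with_sample(C1, samples[0], local_threshold=1.5, noise_scale=0.5, rng=rng))
```

The surviving candidates would then have their noisy support computed on the original database, with a Laplace scale that depends on the smaller pruned set.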
[Figures: F-score and RE on the MSNBC, BIBLE, and House_Power datasets]
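The two reported metrics can be computed roughly as below. The slides only name them, so the definitions here are assumptions: F-score as the harmonic mean of precision and recall between the released and the true frequent sequences, and RE read as the median relative error of the released supports.

```python
from statistics import median


def f_score(released, truth):
    """Harmonic mean of precision and recall over sets of sequences."""
    released, truth = set(released), set(truth)
    hits = len(released & truth)
    if hits == 0:
        return 0.0
    precision = hits / len(released)
    recall = hits / len(truth)
    return 2 * precision * recall / (precision + recall)


def median_relative_error(noisy_sup, true_sup):
    """Median of |noisy - true| / true over sequences present in both dicts."""
    errors = [abs(noisy_sup[s] - true_sup[s]) / true_sup[s]
              for s in noisy_sup if s in true_sup and true_sup[s] > 0]
    return median(errors) if errors else 0.0


released = {"a→c": 3.3, "a→d": 3.2, "c→d": 4.2, "d→c": 3.1}
truth = {"a→c": 3, "a→d": 3, "c→d": 4}
print(f_score(released, truth), median_relative_error(released, truth))
```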