Privacy Preserving Distributed ID3 Algorithm
Nan Meng
University of Hong Kong u3003637@connect.hku.hk
April 29, 2016
Nan Meng (Imaging Systems Laboratory) Two-party Jointly Decision Tree April 29, 2016 1 / 29
Overview
∗ Introduction
∗ Problem Definition
∗ Solution
∗ Result
∗ Conclusion
Figure: Lindell’s definition
Figure: Agrawal’s definition
dataset, and is typically used in data mining.
minimum
Table: Play Golf Dataset
Outlook   Temp  Humidity  Windy  Play Golf
Rainy     Hot   High      FALSE  No
Rainy     Hot   High      TRUE   No
Overcast  Hot   High      FALSE  Yes
Sunny     Mild  High      FALSE  Yes
Rainy     Mild  Normal    TRUE   Yes
Overcast  Cool  Normal    TRUE   Yes
Rainy     Mild  High      FALSE  No
Rainy     Cool  Normal    FALSE  Yes
Sunny     Mild  Normal    FALSE  Yes
⇒ Alice
⇒ Bob
Table: Alice
Outlook   Temp  Humidity  Windy  Play Golf
Rainy     Hot   High      FALSE  No
Rainy     Hot   High      TRUE   No
Overcast  Hot   High      FALSE  Yes
Sunny     Mild  High      FALSE  Yes
Rainy     Mild  Normal    TRUE   Yes
Table: Bob
Outlook   Temp  Humidity  Windy  Play Golf
Overcast  Cool  Normal    TRUE   Yes
Rainy     Mild  High      FALSE  No
Rainy     Cool  Normal    FALSE  Yes
Sunny     Mild  Normal    FALSE  Yes
define the problem. For example, compute the entropy of Rainy (the records with Outlook = Rainy).
Alice: 3 Rainy records, 2 No, 1 Yes
Bob: 2 Rainy records, 1 No, 1 Yes
\[
\mathrm{Entropy}(\mathrm{Rainy}) = -\frac{2+1}{3+2}\log_2\left(\frac{2+1}{3+2}\right) - \frac{1+1}{3+2}\log_2\left(\frac{1+1}{3+2}\right) = -\frac{3}{5}\log_2\left(\frac{3}{5}\right) - \frac{2}{5}\log_2\left(\frac{2}{5}\right)
\]
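As a plaintext check of the Rainy arithmetic, the sketch below counts each party's Rainy records and evaluates the pooled entropy. It combines the counts in the clear, so it only shows the target value and offers no privacy. The function names `local_counts` and `pooled_entropy` are illustrative and do not come from the slides.

```python
from math import log2

# Rows are (Outlook, Temp, Humidity, Windy, Play Golf), copied from the two tables.
alice = [
    ("Rainy", "Hot", "High", False, "No"),
    ("Rainy", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),
    ("Sunny", "Mild", "High", False, "Yes"),
    ("Rainy", "Mild", "Normal", True, "Yes"),
]
bob = [
    ("Overcast", "Cool", "Normal", True, "Yes"),
    ("Rainy", "Mild", "High", False, "No"),
    ("Rainy", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Mild", "Normal", False, "Yes"),
]


def local_counts(rows, outlook):
    """Return (records, no_count) for one party's rows with the given Outlook."""
    subset = [r for r in rows if r[0] == outlook]
    return len(subset), sum(1 for r in subset if r[4] == "No")


def pooled_entropy(a, x, b, y):
    """Two-class entropy of the pooled subset: a + b records, x + y of them 'No'."""
    total = a + b
    entropy = 0.0
    for count in (x + y, total - (x + y)):
        if count:  # a class with zero records contributes nothing
            p = count / total
            entropy -= p * log2(p)
    return entropy


a, x = local_counts(alice, "Rainy")   # Alice's local counts
b, y = local_counts(bob, "Rainy")     # Bob's local counts
rainy_entropy = pooled_entropy(a, x, b, y)
```

Only the four local counts enter the final formula, which is why the rest of the deck concentrates on combining those counts without exposing them.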
First term of Entropy(Rainy):
\[
-\frac{2+1}{3+2}\log_2\left(\frac{2+1}{3+2}\right)
\]
Ratio of No records among the Rainy records:
\[
\frac{2+1}{3+2}
\]
Alice: a records, x No, a − x Yes
Bob: b records, y No, b − y Yes
\[
\frac{x+y}{a+b}
\]
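For a two-class label such as Play Golf, the ratio (x+y)/(a+b) is the only quantity the entropy needs. As a short worked form, with r introduced here purely as shorthand for that ratio:

```latex
\[
r = \frac{x+y}{a+b}, \qquad
\mathrm{Entropy} = -\,r \log_2 r \;-\; (1-r)\log_2(1-r)
\]
\[
\text{Rainy: } a=3,\ x=2,\ b=2,\ y=1 \;\Rightarrow\; r=\tfrac{3}{5},\quad
\mathrm{Entropy} = -\tfrac{3}{5}\log_2\tfrac{3}{5} - \tfrac{2}{5}\log_2\tfrac{2}{5} \approx 0.971
\]
```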
Goal: compute (x+y)/(a+b) without revealing a, x, b, y.
Alice: a records, x No ⇒ Enc(a), Enc(x)
Bob: b records, y No ⇒ Enc(b), Enc(y)
Output: (x+y)/(a+b)
Enc(·) – Encryption Algorithm
PPWAP
Enc(m).
Enc(m1 + m2) = Enc(m1) · Enc(m2).
Enc(m2).
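The additive property can be tried out with a toy implementation. The sketch below assumes a Paillier-style scheme, which is what the key-generation notes at the end of the deck appear to describe; the choice g = n + 1, the tiny primes 17 and 19, and the function names are assumptions made for illustration only. Because encryption is randomized, the identity is checked after decryption rather than on the raw ciphertexts.

```python
import math
import random


def keygen(p, q):
    """Toy Paillier key pair from two small primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; this shortcut assumes g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Enc(m) = (n + 1)^m * r^n mod n^2 with fresh randomness r."""
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2


def decrypt(n, sk, c):
    """Recover m from c using L(t) = (t - 1) / n."""
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n


n, sk = keygen(17, 19)
m1, m2 = 3, 2
product = encrypt(n, m1) * encrypt(n, m2) % (n * n)  # Enc(m1) · Enc(m2)
recovered = decrypt(n, sk, product)                  # decrypts to m1 + m2
```

The three-argument `pow` with exponent -1 needs Python 3.8 or later. Plaintexts must stay below n, so sums that exceed n wrap around.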
Privacy Preserving Weighted Average Protocol
Alice: Enc(a), Enc(x)
Bob: Enc(b), Enc(y)
Output: (x+y)/(a+b)
1. Alice
Encryption(a, PK) : Enc(a)
Encryption(x, PK) : Enc(x)
2. Bob
Receives Enc(a), Enc(x)
Picks a random integer z
Computes Enc(a)^z, Enc(x)^z
Enc(a)^z = Enc(a) · … · Enc(a) = Enc(a + a + … + a) = Enc(za)
Bob obtains Enc(za), Enc(zx).
3. Bob
Encryption(b, PK) : Enc(b) ⇒ Enc(zb)
Encryption(y, PK) : Enc(y) ⇒ Enc(zy)
Enc(za + zb) = Enc(za) · Enc(zb)
Enc(zx + zy) = Enc(zx) · Enc(zy)
Sends Enc(za + zb), Enc(zx + zy) to Alice
Alice
Decryption(Enc(za + zb), SK) : za + zb
Decryption(Enc(zx + zy), SK) : zx + zy
(zx + zy)/(za + zb) = (x + y)/(a + b)
⇒ Output: (x + y)/(a + b)
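The three steps can be traced end to end in a single process. This is a minimal sketch under stated assumptions, not the author's implementation: it assumes a Paillier-style scheme with g = n + 1, that Alice owns the key pair, toy primes 1009 and 1013, and a small range for z so that z(a + b) stays below n. The names `keygen`, `enc`, `dec`, and `ppwap` are hypothetical.

```python
import math
import random
from fractions import Fraction


def keygen(p, q):
    """Toy Paillier keys: public n (with g = n + 1), secret (lambda, mu)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    return n, (lam, pow(lam, -1, n))


def enc(n, m):
    """Randomized encryption: (n + 1)^m * r^n mod n^2."""
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2


def dec(n, sk, c):
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n


def ppwap(a, x, b, y, p=1009, q=1013):
    """Return (x + y) / (a + b); requires a + b > 0."""
    # Step 1 (Alice): generate keys, encrypt her counts, send them to Bob.
    n, sk = keygen(p, q)
    n2 = n * n
    enc_a, enc_x = enc(n, a), enc(n, x)

    # Step 2 (Bob): pick a random z and scale: Enc(a)^z = Enc(za).
    z = random.randrange(1, 100)
    enc_za, enc_zx = pow(enc_a, z, n2), pow(enc_x, z, n2)

    # Step 3 (Bob): encrypt and scale his own counts, then add homomorphically.
    enc_zb, enc_zy = pow(enc(n, b), z, n2), pow(enc(n, y), z, n2)
    enc_den = enc_za * enc_zb % n2   # Enc(za + zb)
    enc_num = enc_zx * enc_zy % n2   # Enc(zx + zy)

    # Step 3 (Alice): decrypt both sums; z cancels in the ratio.
    den, num = dec(n, sk, enc_den), dec(n, sk, enc_num)
    return Fraction(num, den)


ratio = ppwap(a=3, x=2, b=2, y=1)   # counts from the Rainy example
```

`Fraction` reduces the decrypted pair automatically, so the random factor z never appears in the returned value.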
Figure: Two-party Jointly Decision Tree Algorithm.
∗ Dataset size
∗ Key length of the encryption algorithm
∗ Number of parties
Figure: Welcome Graphical User Interface.
Figure: Result of single-party ID3 algorithm on tic-tac-toe2 dataset.
∗ PPWAP can be extended to the multi-party setting, which supports a multi-party distributed ID3 algorithm.
∗ The scheme becomes safer but also more complex.
Φ(n) is (p − 1)(q − 1).
gcd(L(g^λ mod n^2), n) = 1, where L(t) = (t − 1)/n and λ(n) = lcm(p − 1, q − 1).
composed of (p, q, λ).
ciphertext c is given by:
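The key-generation fragments above match the Paillier cryptosystem, so the sketch below assumes that scheme. The encryption formula c = g^m · r^n mod n^2 is the standard Paillier one and is supplied here as an assumption, since the slide's own formula did not survive in the text. The primes 17 and 19 and all function names are illustrative.

```python
import math
import random


def L(t, n):
    """L(t) = (t - 1) / n, as defined in the notes above."""
    return (t - 1) // n


def keygen(p, q):
    """Return public key (n, g) and private key (p, q, lambda)."""
    n = p * q
    n2 = n * n
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    while True:
        g = random.randrange(2, n2)
        # Accept g only if it is invertible mod n^2 and
        # gcd(L(g^lambda mod n^2), n) = 1.
        if math.gcd(g, n) == 1 and math.gcd(L(pow(g, lam, n2), n), n) == 1:
            return (n, g), (p, q, lam)


def encrypt(pk, m):
    """Standard Paillier encryption (assumed): c = g^m * r^n mod n^2."""
    n, g = pk
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return pow(g, m, n2) * pow(r, n, n2) % n2


def decrypt(pk, sk, c):
    """m = L(c^lambda mod n^2) * mu mod n, with mu = L(g^lambda mod n^2)^(-1) mod n."""
    n, g = pk
    _, _, lam = sk
    n2 = n * n
    mu = pow(L(pow(g, lam, n2), n), -1, n)  # exists because of the gcd condition
    return L(pow(c, lam, n2), n) * mu % n


pk, sk = keygen(17, 19)
roundtrip = [decrypt(pk, sk, encrypt(pk, m)) for m in (0, 1, 42, 322)]
```

The gcd condition checked in `keygen` is exactly what guarantees that the inverse used in `decrypt` exists. Real deployments use primes of at least 1024 bits each and a vetted library rather than code like this.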