Secure Joins with MapReduce


slide-1
SLIDE 1

Secure Joins with MapReduce

Xavier Bultel¹, Radu Ciucanu², Matthieu Giraud³, Pascal Lafourcade³, Lihua Ye⁴

¹ IRISA, Université de Rennes 1, France
² INSA, Université Orléans, France
³ LIMOS, Université Clermont Auvergne, France
⁴ Harbin Institute of Technology, China

Foundations & Practice of Security – November 13, 2018

slide-2
SLIDE 2

Joins

Name   City
Alice  Montreal
Bob    London
Cesar  Tokyo

⋈

Name   Disease
Alice  Diabetes
Bob    Flu
Bob    Cancer

=

Name   City      Disease
Alice  Montreal  Diabetes
Bob    London    Flu
Bob    London    Cancer
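The join on this slide can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' code: it assumes relations are lists of dictionaries, and the names `natural_join`, `cities` and `diseases` are hypothetical.

```python
def natural_join(left, right):
    """Natural join of two lists of dicts on every attribute they share."""
    if not left or not right:
        return []
    shared = [a for a in left[0] if a in right[0]]
    joined = []
    for l_row in left:
        for r_row in right:
            # Keep the pair only when all shared attributes agree.
            if all(l_row[a] == r_row[a] for a in shared):
                joined.append({**l_row, **r_row})
    return joined


cities = [
    {"Name": "Alice", "City": "Montreal"},
    {"Name": "Bob", "City": "London"},
    {"Name": "Cesar", "City": "Tokyo"},
]
diseases = [
    {"Name": "Alice", "Disease": "Diabetes"},
    {"Name": "Bob", "Disease": "Flu"},
    {"Name": "Bob", "Disease": "Cancer"},
]

result = natural_join(cities, diseases)
for row in result:
    print(row["Name"], row["City"], row["Disease"])
```

Cesar has no tuple in the second relation, so he does not appear in the result, while Bob appears twice.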

slide-5
SLIDE 5

Cascade Joins

R1 =

Name   City
Alice  Montreal
Bob    London
Cesar  Tokyo

R2 =

Name   Disease
Alice  Diabetes
Bob    Flu
Bob    Cancer

R3 =

Disease   Specialist
Cancer    Hopkins
Diabetes  Jude

1. R1 ⋈ R2 =

Name   City      Disease
Alice  Montreal  Diabetes
Bob    London    Flu
Bob    London    Cancer

2. (R1 ⋈ R2) ⋈ R3 =

Name   City      Disease   Specialist
Alice  Montreal  Diabetes  Jude
Bob    London    Cancer    Hopkins
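A cascade is a left-to-right fold of pairwise joins, so n relations take n − 1 join steps. The sketch below is an assumption-laden illustration in plain Python (no MapReduce involved); `natural_join`, `steps` and `rounds` are hypothetical names.

```python
from itertools import accumulate


def natural_join(left, right):
    """Natural join of two lists of dicts on their shared attributes."""
    shared = set(left[0]) & set(right[0]) if left and right else set()
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in shared)]


r1 = [{"Name": "Alice", "City": "Montreal"},
      {"Name": "Bob", "City": "London"},
      {"Name": "Cesar", "City": "Tokyo"}]
r2 = [{"Name": "Alice", "Disease": "Diabetes"},
      {"Name": "Bob", "Disease": "Flu"},
      {"Name": "Bob", "Disease": "Cancer"}]
r3 = [{"Disease": "Cancer", "Specialist": "Hopkins"},
      {"Disease": "Diabetes", "Specialist": "Jude"}]

# steps[0] is R1, steps[1] is R1 ⋈ R2, steps[2] is (R1 ⋈ R2) ⋈ R3.
steps = list(accumulate([r1, r2, r3], natural_join))
rounds = len(steps) - 1
final = steps[-1]

print(rounds)
for row in final:
    print(row["Name"], row["City"], row["Disease"], row["Specialist"])
```

The tuple (Bob, London, Flu) survives the first step but is dropped in the second, because no specialist is listed for Flu.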

slide-6
SLIDE 6

Hypercube Joins

Relation R1:
  • t1 = (Alice, Montreal)
  • t2 = (Bob, London)
  • t3 = (Eve, Tokyo)

Relation R2:
  • t4 = (Alice, Diabetes)
  • t5 = (Bob, Flu)
  • t6 = (Bob, Cancer)

Relation R3:
  • t7 = (Cancer, Hopkins)
  • t8 = (Diabetes, Jude)

Hash buckets:
  • Name: Eve → 0; Alice, Bob → 1
  • Disease: Diab., Flu → 0; Cancer → 1

Reducers, indexed by (Name bucket, Disease bucket):
  • (0, 0): (R1, t3), (R3, t8)
  • (0, 1): (R1, t3), (R3, t7)
  • (1, 0): (R1, t1), (R1, t2), (R2, t4), (R2, t5), (R3, t8)
  • (1, 1): (R1, t1), (R1, t2), (R2, t6), (R3, t7)
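The distribution of tuples over the reducer grid can be sketched as follows. The bucket tables are read off the Hypercube Joins slide rather than computed by a real hash function, and all identifiers (`BUCKETS`, `target_reducers`, `reducers`) are hypothetical. A tuple that lacks one of the two dimensions is replicated to every bucket along that dimension.

```python
from itertools import product

# Bucket assignments as drawn on the slide (an assumption of this sketch;
# a real deployment would derive them from hash functions).
BUCKETS = {
    "Name": {"Eve": 0, "Alice": 1, "Bob": 1},
    "Disease": {"Diabetes": 0, "Flu": 0, "Cancer": 1},
}
DIMENSIONS = ("Name", "Disease")

relations = {
    "R1": {"t1": {"Name": "Alice", "City": "Montreal"},
           "t2": {"Name": "Bob", "City": "London"},
           "t3": {"Name": "Eve", "City": "Tokyo"}},
    "R2": {"t4": {"Name": "Alice", "Disease": "Diabetes"},
           "t5": {"Name": "Bob", "Disease": "Flu"},
           "t6": {"Name": "Bob", "Disease": "Cancer"}},
    "R3": {"t7": {"Disease": "Cancer", "Specialist": "Hopkins"},
           "t8": {"Disease": "Diabetes", "Specialist": "Jude"}},
}


def target_reducers(row):
    """Coordinates of every reducer that must receive this tuple."""
    axes = []
    for dim in DIMENSIONS:
        if dim in row:
            axes.append([BUCKETS[dim][row[dim]]])
        else:
            # Missing dimension: replicate across all of its buckets.
            axes.append(sorted(set(BUCKETS[dim].values())))
    return list(product(*axes))


reducers = {}
for rel, tuples in relations.items():
    for tid, row in tuples.items():
        for coord in target_reducers(row):
            reducers.setdefault(coord, set()).add((rel, tid))

for coord in sorted(reducers):
    print(coord, sorted(reducers[coord]))
```

Each reducer then joins the tuples it holds locally, which is why a single MapReduce round suffices.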

slide-7
SLIDE 7

MapReduce

  • Partitioning input data
  • Scheduling program execution (n machines)
  • Performing the shuffle
  • Handling machine failures

Programmer gives:

  • Input files
  • Map and Reduce

[Figure: Input 1, Input 2 and Input 3 feed Map 1, Map 2 and Map 3; after the shuffle, Reduce 1 and Reduce 2 produce Output 1 and Output 2.]
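The division of labour on this slide can be illustrated with a tiny single-process simulation. This is only a sketch of the programming model: `run_mapreduce`, `map_fn` and `reduce_fn` are hypothetical names, and nothing here reflects Hadoop's actual API.

```python
from collections import defaultdict


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Run map, shuffle and reduce sequentially on one machine."""
    # Map phase: every input record may emit several (key, value) pairs.
    emitted = []
    for record in inputs:
        emitted.extend(map_fn(record))
    # Shuffle phase: group all values that share a key.
    groups = defaultdict(list)
    for key, value in emitted:
        groups[key].append(value)
    # Reduce phase: each key is processed independently.
    output = []
    for key, values in groups.items():
        output.extend(reduce_fn(key, values))
    return output


def map_fn(record):
    name, _disease = record
    yield name, 1


def reduce_fn(name, counts):
    yield name, sum(counts)


records = [("Alice", "Diabetes"), ("Bob", "Flu"), ("Bob", "Cancer")]
counts = dict(run_mapreduce(records, map_fn, reduce_fn))
print(counts)
```

The programmer supplies only `records`, `map_fn` and `reduce_fn`; partitioning, scheduling, shuffling and failure handling belong to the framework.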

slide-8
SLIDE 8

Joins with MapReduce

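Cascade Joins: n relations ⇒ n − 1 MapReduce rounds

[Figure: in the public cloud, the 1st round joins R1 and R2 into Q2, the 2nd round joins Q2 and R3 into Q3, and so on until the (n − 1)-th round joins Qn−1 and Rn into Qn; the user U, in the user's domain, obtains R1 ⋈ . . . ⋈ Rn.]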

slide-9
SLIDE 9

Joins with MapReduce

Hypercube Joins: n relations ⇒ 1 MapReduce round

[Figure: the public cloud holds R1, R2, R3; the user U, in the user's domain, obtains R1 ⋈ R2 ⋈ R3.]

slide-10
SLIDE 10

Security Model

Cloud is honest-but-curious

[Figure: Data owner → (R1, . . . , Rn) → Cloud → (⋈i Ri) → User]

Security properties

  • Secrecy of R1, . . . , Rn and ⋈i Ri
  • User queries ⋈i Ri but cannot learn R1, . . . , Rn

slide-11
SLIDE 11

Contributions

Secure MapReduce Algorithms

  • Cascade
  • Hypercube

Secure-Private (SP) approach

  • Cloud nodes do not learn R1, . . . , Rn
  • Cloud nodes do not learn ⋈i Ri

Collision-Resistant-Secure-Private (CRSP) approach

  • Prevents collusion between cloud and user

slide-12
SLIDE 12

Outline

1. Cryptographic tools
2. Secure Joins with MapReduce
3. Security & Performances
4. Conclusion

slide-14
SLIDE 14

Pseudo-Random Function

Definition

  • f : K × D → R
  • Deterministic
  • Indistinguishable from a random function

Notation

  • fk(m) = f(k, m)
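A keyed, deterministic function with this interface can be sketched with HMAC-SHA256 from the Python standard library. HMAC is a stand-in chosen for this illustration; the slides do not say which PRF the authors instantiate, and the names `prf` and `k` are hypothetical.

```python
import hashlib
import hmac
import secrets


def prf(key: bytes, message: str) -> str:
    """f_k(m): deterministic keyed tag for m, here via HMAC-SHA256."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


k = secrets.token_bytes(32)  # secret key k, never given to the cloud

tag_in_r1 = prf(k, "Bob")
tag_in_r2 = prf(k, "Bob")
tag_alice = prf(k, "Alice")

print(tag_in_r1 == tag_in_r2)  # equal inputs give equal tags
print(tag_in_r1 == tag_alice)  # different inputs give different tags
```

Determinism is what lets a party without the key test two tags for equality, and therefore join on them, without seeing the underlying value.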

slide-15
SLIDE 15

Public-Key Encryption

Definition

  • (pk, sk) ← G(λ)
  • c ← Epk(m)
  • m ← Dsk(c)
  • Dsk(Epk(m)) = m

Notation

  • {m} = Epk(m)
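The three algorithms and the correctness condition Dsk(Epk(m)) = m can be illustrated with a toy ElGamal scheme over a small prime field. This is a teaching sketch only: the parameters are far too weak for real use, it is not the scheme used in the talk, and every name (`keygen`, `encrypt`, `decrypt`, `encode`, `decode`) is hypothetical.

```python
import secrets

# Toy parameters for illustration only: a 127-bit Mersenne prime and base 3.
P = 2**127 - 1
G = 3


def keygen():
    """G(lambda): return a key pair (pk, sk)."""
    sk = secrets.randbelow(P - 2) + 1
    return pow(G, sk, P), sk


def encrypt(pk, m):
    """E_pk(m): randomized encryption of an integer 1 <= m < P."""
    if not 1 <= m < P:
        raise ValueError("message out of range")
    r = secrets.randbelow(P - 2) + 1  # fresh randomness for each ciphertext
    return pow(G, r, P), (m * pow(pk, r, P)) % P


def decrypt(sk, c):
    """D_sk(c): recover m from the ciphertext c = (c1, c2)."""
    c1, c2 = c
    # c1^(P-1-sk) is the inverse of c1^sk modulo the prime P (Fermat).
    return (c2 * pow(c1, P - 1 - sk, P)) % P


def encode(text):
    return int.from_bytes(text.encode("utf-8"), "big")


def decode(m):
    return m.to_bytes((m.bit_length() + 7) // 8, "big").decode("utf-8")


pk, sk = keygen()
c = encrypt(pk, encode("Montreal"))
print(decode(decrypt(sk, c)))
```

Unlike the pseudo-random function, encryption here is randomized, so two ciphertexts of the same value are not expected to be equal.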

slide-17
SLIDE 17

SP Preprocessing

Example

R1 =

Name   City
Alice  Montreal
Bob    London
Cesar  Tokyo

⇒ R̂1 =

fk(Name)   {Name}   {City}
fk(Alice)  {Alice}  {Montreal}
fk(Bob)    {Bob}    {London}
fk(Cesar)  {Cesar}  {Tokyo}

R2 =

Name   Disease
Alice  Diabetes
Bob    Flu
Bob    Cancer

⇒ R̂2 =

fk(Name)   fk(Disease)   {Disease}
fk(Alice)  fk(Diabetes)  {Diabetes}
fk(Bob)    fk(Flu)       {Flu}
fk(Bob)    fk(Cancer)    {Cancer}

R3 =

Disease   Specialist
Cancer    Hopkins
Diabetes  Jude

⇒ R̂3 =

fk(Disease)   {Specialist}
fk(Cancer)    {Hopkins}
fk(Diabetes)  {Jude}
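The preprocessing in this example can be sketched as a function that takes, for each relation, the attributes to tag with fk and the attributes to encrypt. The two attribute lists are copied from the example rather than derived by a rule. HMAC-SHA256 stands in for the PRF, and `placeholder_enc` is NOT encryption: it only wraps the value in braces to mirror the {m} notation. All names are hypothetical.

```python
import hashlib
import hmac


def prf(key, value):
    """Stand-in for f_k: HMAC-SHA256 of the attribute value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def placeholder_enc(value):
    """Placeholder for E_pk; offers no secrecy, for illustration only."""
    return "{" + value + "}"


def preprocess(rows, prf_attrs, enc_attrs, key, enc):
    """Build the protected relation: PRF tags first, then ciphertexts."""
    protected_rows = []
    for row in rows:
        protected = {f"fk({a})": prf(key, row[a]) for a in prf_attrs}
        protected.update({f"{{{a}}}": enc(row[a]) for a in enc_attrs})
        protected_rows.append(protected)
    return protected_rows


key = b"demo key, an assumption of this sketch"
r1 = [{"Name": "Alice", "City": "Montreal"},
      {"Name": "Bob", "City": "London"},
      {"Name": "Cesar", "City": "Tokyo"}]
r2 = [{"Name": "Alice", "Disease": "Diabetes"},
      {"Name": "Bob", "Disease": "Flu"},
      {"Name": "Bob", "Disease": "Cancer"}]

r1_hat = preprocess(r1, ["Name"], ["Name", "City"], key, placeholder_enc)
r2_hat = preprocess(r2, ["Name", "Disease"], ["Disease"], key, placeholder_enc)

print(list(r1_hat[0]))
print(list(r2_hat[0]))
print(r1_hat[1]["fk(Name)"] == r2_hat[1]["fk(Name)"])  # Bob's tags match
```

The matching tags for Bob are what the join later relies on, while the readable values travel only in encrypted form.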

slide-21
SLIDE 21

SP Cascade (R̂1 ⋈ R̂2) ⋈ R̂3

R̂1 =

fk(Name)   {Name}   {City}
fk(Alice)  {Alice}  {Montreal}
fk(Bob)    {Bob}    {London}
fk(Cesar)  {Cesar}  {Tokyo}

⋈ R̂2 =

fk(Name)   fk(Disease)  {Disease}
fk(Alice)  fk(Diab.)    {Diab.}
fk(Bob)    fk(Flu)      {Flu}
fk(Bob)    fk(Cancer)   {Cancer}

⋈ R̂3 =

fk(Disease)  {Specialist}
fk(Cancer)   {Hopkins}
fk(Diab.)    {Jude}

1. R̂1 ⋈ R̂2 =

fk(Name)   {Name}   {City}      fk(Disease)  {Disease}
fk(Alice)  {Alice}  {Montreal}  fk(Diab.)    {Diab.}
fk(Bob)    {Bob}    {London}    fk(Flu)      {Flu}
fk(Bob)    {Bob}    {London}    fk(Cancer)   {Cancer}

2. (R̂1 ⋈ R̂2) ⋈ R̂3 =

fk(Name)   {Name}   {City}      fk(Disease)  {Disease}  {Specialist}
fk(Alice)  {Alice}  {Montreal}  fk(Diab.)    {Diab.}    {Jude}
fk(Bob)    {Bob}    {London}    fk(Cancer)   {Cancer}   {Hopkins}

slide-22
SLIDE 22

SP Cascade

Map function

  • If i = 1: emit $(\pi_{Q_1^f \cap R_2^f}(t),\ (Q_1, t_r))$
  • Else: emit $(\pi_{Q_i^f \cap R_{i+1}^f}(t),\ (R_{i+1}, t_q))$

Reduce function

  • If i = n − 1: emit $(\pi_{Q_{i+1}^f \cap R_{i+2}^f}(t_r \times t_q),\ t_r \times t_q)$
  • Else: emit $(t_r \times t_q,\ t_r \times t_q)$
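One round of the cascade can be sketched as a reduce-side join keyed on the PRF tag of the join attribute. This is a simplified illustration, not the authors' algorithm as specified: map and shuffle are folded into a single grouping step, strings such as "f(Bob)" and "{Bob}" are opaque stand-ins for real tags and ciphertexts, and `join_round` is a hypothetical name.

```python
from collections import defaultdict


def join_round(left, right, join_attr):
    """One MapReduce round of a join on a PRF-tagged attribute."""
    # Map + shuffle: key every tuple by its tag and remember its origin.
    groups = defaultdict(lambda: {"left": [], "right": []})
    for t in left:
        groups[t[join_attr]]["left"].append(t)
    for t in right:
        groups[t[join_attr]]["right"].append(t)
    # Reduce: pair each left tuple with each right tuple sharing the tag.
    return [{**tl, **tr}
            for g in groups.values()
            for tl in g["left"] for tr in g["right"]]


r1_hat = [
    {"fk(Name)": "f(Alice)", "{Name}": "{Alice}", "{City}": "{Montreal}"},
    {"fk(Name)": "f(Bob)", "{Name}": "{Bob}", "{City}": "{London}"},
    {"fk(Name)": "f(Cesar)", "{Name}": "{Cesar}", "{City}": "{Tokyo}"},
]
r2_hat = [
    {"fk(Name)": "f(Alice)", "fk(Disease)": "f(Diab.)", "{Disease}": "{Diab.}"},
    {"fk(Name)": "f(Bob)", "fk(Disease)": "f(Flu)", "{Disease}": "{Flu}"},
    {"fk(Name)": "f(Bob)", "fk(Disease)": "f(Cancer)", "{Disease}": "{Cancer}"},
]
r3_hat = [
    {"fk(Disease)": "f(Cancer)", "{Specialist}": "{Hopkins}"},
    {"fk(Disease)": "f(Diab.)", "{Specialist}": "{Jude}"},
]

q2 = join_round(r1_hat, r2_hat, "fk(Name)")   # 1st round
q3 = join_round(q2, r3_hat, "fk(Disease)")    # 2nd round

for t in q3:
    print(t["{Name}"], t["{City}"], t["{Disease}"], t["{Specialist}"])
```

The reducers only compare tags for equality; they never need to open a ciphertext to produce the joined tuples.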
slide-23
SLIDE 23

SP Hypercube

Relation R1:
  • t1 = (fk(Alice), {Alice}, {Montreal})
  • t2 = (fk(Bob), {Bob}, {London})
  • t3 = (fk(Eve), {Eve}, {Tokyo})

Relation R2:
  • t4 = (fk(Alice), fk(Diab.), {Diab.})
  • t5 = (fk(Bob), fk(Flu), {Flu})
  • t6 = (fk(Bob), fk(Cancer), {Cancer})

Relation R3:
  • t7 = (fk(Cancer), {Hopkins})
  • t8 = (fk(Diab.), {Jude})

Hash buckets:
  • Name: fk(Eve) → 0; fk(Alice), fk(Bob) → 1
  • Disease: fk(Diab.), fk(Flu) → 0; fk(Cancer) → 1

Reducers, indexed by (Name bucket, Disease bucket):
  • (0, 0): (R1, t3), (R3, t8)
  • (0, 1): (R1, t3), (R3, t7)
  • (1, 0): (R1, t1), (R1, t2), (R2, t4), (R2, t5), (R3, t8)
  • (1, 1): (R1, t1), (R1, t2), (R2, t6), (R3, t7)

slide-24
SLIDE 24

SP Hypercube

Map function

  • emit $((h_1(\pi_{X_1^f}(t_r)), \ldots, h_d(\pi_{X_d^f}(t_r))),\ t_r)$

Reduce function

  • emit $(t, t)$

slide-25
SLIDE 25

CRSP Approach

[Figure: data owner, public cloud, proxy and user; the data owner provides the relations Ri, and the user obtains R1 ⋈ · · · ⋈ Rn.]

EpkP({m}) = EpkP(EpkU(m))
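The layered encryption EpkP(EpkU(m)) can be sketched with the same kind of toy ElGamal scheme as a stand-in for a real public-key scheme. Several things here are assumptions of the sketch and not statements from the slides: pkP and pkU are read as the proxy's and the user's public keys, the proxy is assumed to remove the outer layer before forwarding, and the two-component inner ciphertext is wrapped component by component. The parameters are insecure and all names are hypothetical.

```python
import secrets

P = 2**127 - 1  # toy prime modulus, illustration only
G = 3


def keygen():
    sk = secrets.randbelow(P - 2) + 1
    return pow(G, sk, P), sk


def enc(pk, m):
    """Toy ElGamal encryption of an integer 1 <= m < P."""
    r = secrets.randbelow(P - 2) + 1
    return pow(G, r, P), (m * pow(pk, r, P)) % P


def dec(sk, c):
    c1, c2 = c
    return (c2 * pow(c1, P - 1 - sk, P)) % P


def enc_layered(pk_proxy, pk_user, m):
    """E_pkP(E_pkU(m)): encrypt for the user, then wrap for the proxy."""
    inner = enc(pk_user, m)
    return tuple(enc(pk_proxy, part) for part in inner)


def proxy_strip(sk_proxy, outer):
    """Proxy removes the outer layer and obtains only E_pkU(m)."""
    return tuple(dec(sk_proxy, part) for part in outer)


pk_user, sk_user = keygen()
pk_proxy, sk_proxy = keygen()

m = int.from_bytes(b"Diabetes", "big")
stored = enc_layered(pk_proxy, pk_user, m)   # what the cloud holds
forwarded = proxy_strip(sk_proxy, stored)    # what the proxy sends on
recovered = dec(sk_user, forwarded)          # what the user reads

print(recovered.to_bytes(8, "big"))
```

In this sketch the user's secret key alone does not open what the cloud stores, which is the intuition behind resisting a cloud that colludes with the user.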

slide-26
SLIDE 26

CRSP Preprocessing

Example

R1 =

Name   City
Alice  Montreal
Bob    London
Cesar  Tokyo

⇒ R̂1 =

fk(Name)   EpkP({Name})   EpkP({City})
fk(Alice)  EpkP({Alice})  EpkP({Montreal})
fk(Bob)    EpkP({Bob})    EpkP({London})
fk(Cesar)  EpkP({Cesar})  EpkP({Tokyo})

R2 =

Name   Disease
Alice  Diabetes
Bob    Flu
Bob    Cancer

⇒ R̂2 =

fk(Name)   fk(Disease)   EpkP({Disease})
fk(Alice)  fk(Diabetes)  EpkP({Diabetes})
fk(Bob)    fk(Flu)       EpkP({Flu})
fk(Bob)    fk(Cancer)    EpkP({Cancer})

R3 =

Disease   Specialist
Cancer    Hopkins
Diabetes  Jude

⇒ R̂3 =

fk(Disease)   EpkP({Specialist})
fk(Cancer)    EpkP({Hopkins})
fk(Diabetes)  EpkP({Jude})

slide-28
SLIDE 28

Performances

[Figure: running time in minutes (axis from 10 to 80) against number of tuples (660; 1,036; 1,412; 1,788; 2,164) for CRSP Cascade, CRSP Hypercube, SP Cascade, SP Hypercube, Cascade and Hypercube.]

  • Hadoop implementation
  • 1 master + 3 data nodes
  • 4/2 CPUs @ 2.4 GHz
  • 8/4 GB RAM
  • Higgs Twitter dataset
  • RSA-OAEP 2048 bits
  • AES-CTR 128 bits

slide-30
SLIDE 30

Conclusion

  • Secure Cascade and Hypercube algorithms (SP & CRSP)
  • Honest-but-curious adversary
  • Practical implementation

Future work

  • Avoid leakage on same values
  • Security in standard model
  • Cloud-user collusion resistance without a trusted third party

slide-31
SLIDE 31

Thank you for your attention!

Any questions?

Montreal by Pascal Lafourcade (flickr.com/pascalafourcade).

Keep in touch

  • email: matthieu.giraud@uca.fr
  • web: http://sancy.univ-bpclermont.fr/~giraud/