Secure Intersection with MapReduce


Secure Intersection with MapReduce. R. Ciucanu¹, M. Giraud², P. Lafourcade², L. Ye³. ¹ LIFO, INSA Centre Val de Loire, Université d'Orléans. ² [not recoverable]. ³ School of Computer Science and Technology, Harbin Institute of Technology, China. 26 July 2019.


slide-1
SLIDE 1

1/29

Secure Intersection with MapReduce

  • R. Ciucanu¹
  • M. Giraud²
  • P. Lafourcade²
  • L. Ye³

¹ LIFO, INSA Centre Val de Loire, Université d'Orléans

² [not recoverable]

³ School of Computer Science and Technology, Harbin Institute of Technology, China

26 July 2019 @ SECRYPT, Prague

slide-2
SLIDE 2

2/29

Big Data

Cloud Service Provider (CSP)

slide-3
SLIDE 3

3/29

Model 1

Mutual Private Set Intersection (PSI)

Application: avoid double submissions in conferences

Participants' lists: A and B

Result: the holder of A learns A ∩ B; the holder of B learns A ∩ B

slide-4
SLIDE 4

4/29

Model 2

One-way PSI

Application: the FBI wants to detect suspicious passengers of an airline company

Passengers lists: A and B

Result: the holder of A learns A ∩ B; the holder of B learns nothing (∅)

slide-5
SLIDE 5

5/29

Model 3

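Our PSI Model

Application: Interpol wants the most dangerous persons from the FBI and MI6

Suspects lists: A and B

Result: the holder of A learns nothing (∅); the holder of B learns nothing (∅); the third party learns A ∩ B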

slide-6
SLIDE 6

6/29

Example

Suspects lists: Alice, Cesar, Mallory, Oscar, Bob, Mallory

Intersection list: Mallory

slide-7
SLIDE 7

7/29

Outline

  • Motivations
  • MapReduce
  • Intersection with MapReduce
  • Security Model and Cryptographic Tools
  • Secure Intersection with MapReduce
  • Performance Evaluation
  • Conclusion

slide-8
SLIDE 8

8/29

Outline

  • Motivations
  • MapReduce
  • Intersection with MapReduce
  • Security Model and Cryptographic Tools
  • Secure Intersection with MapReduce
  • Performance Evaluation
  • Conclusion

slide-9
SLIDE 9

9/29

MapReduce¹

The MapReduce environment takes care of:
◮ Partitioning input data
◮ Scheduling program execution on a set of machines
◮ Handling machine failures

The programmer specifies:
◮ Map and Reduce functions
◮ Input files

¹ J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In Proceedings of OSDI 2004.

slide-10
SLIDE 10

10/29

MapReduce Example

[Figure: Input 1, Input 2 and Input 3 feed Map 1, Map 2 and Map 3; a Shuffle step routes the map output to Reduce 1 and Reduce 2, which produce Output 1 and Output 2.]

slide-11
SLIDE 11

11/29

MapReduce in 3 Steps

  • 1. Map tasks

Input: ID of chunk
Output: key-value pairs

  • 2. Master Controller

◮ Key-value pairs aggregated and sorted by key
◮ Pairs with same key sent to the same Reduce task

  • 3. Reduce tasks

Input: One key
Output: Combine values associated to the key
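
The three steps above can be sketched in a few lines of single-process Python. This is only an illustration of the data flow, not how a real MapReduce framework is implemented: the names `run_mapreduce`, `word_map` and `word_reduce` are hypothetical, the word-count job is an assumed example, and the map function receives the chunk content directly instead of a chunk ID.

```python
from collections import defaultdict

def run_mapreduce(chunks, map_fn, reduce_fn):
    """Hypothetical single-process driver mirroring the three steps."""
    # 1. Map tasks: every chunk is turned into key-value pairs.
    pairs = [pair for chunk in chunks for pair in map_fn(chunk)]

    # 2. Master controller: aggregate the pairs by key, so that all values
    #    sharing a key reach the same reduce call.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)

    # 3. Reduce tasks: one call per key, visited in sorted key order.
    output = []
    for key in sorted(groups):
        output.extend(reduce_fn(key, groups[key]))
    return output

def word_map(chunk):
    # Emit (word, 1) for every word of the chunk.
    for word in chunk.split():
        yield (word, 1)

def word_reduce(key, values):
    # Combine the values associated with the key.
    yield (key, sum(values))

counts = run_mapreduce(["a b a", "b c"], word_map, word_reduce)
print(counts)
```

In a real deployment the map and reduce calls run on different machines and the grouping step is the network shuffle; only the contract between the three steps is the same.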

slide-12
SLIDE 12

12/29

Outline

  • Motivations
  • MapReduce
  • Intersection with MapReduce
  • Security Model and Cryptographic Tools
  • Secure Intersection with MapReduce
  • Performance Evaluation
  • Conclusion

slide-13
SLIDE 13

13/29

Intersection with MapReduce²

3 participants

NSA: F654, U840, X098
GCHQ: F654, M349, P027
Mossad: F654, M349, U840

² J. Leskovec, A. Rajaraman and J. D. Ullman. Mining of Massive Datasets. Cambridge University Press.

slide-14
SLIDE 14

14/29

Intersection with MapReduce

Data owners: NSA, GCHQ, Mossad

Map (public cloud): one map task per relation (NSA, GCHQ, Mossad), then the Master Controller groups the pairs by key

Key F654: values F654, F654, F654
Key M349: values M349, M349
Key P027: value P027
Key U840: values U840, U840
Key X098: value X098

Reduce: the user (Interpol) receives F654

Reduce function: it returns the value only if #values = #participants
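
A minimal Python sketch of this plain (non-secure) intersection, using the three relations from the example. The function names are hypothetical, and the sketch assumes that no relation contains the same identifier twice; otherwise counting values would not be the same as counting participants.

```python
from collections import defaultdict

relations = {
    "NSA": ["F654", "U840", "X098"],
    "GCHQ": ["F654", "M349", "P027"],
    "Mossad": ["F654", "M349", "U840"],
}

def map_relation(tuples):
    # Map: each tuple t becomes the key-value pair (t, t).
    return [(t, t) for t in tuples]

def shuffle(pairs):
    # Master controller: group the values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_intersection(key, values, n_participants):
    # Reduce: keep the key only if every participant contributed a value.
    return [key] if len(values) == n_participants else []

pairs = [p for tuples in relations.values() for p in map_relation(tuples)]
grouped = shuffle(pairs)
result = []
for key in sorted(grouped):
    result.extend(reduce_intersection(key, grouped[key], len(relations)))

print(result)
```

Here the cloud sees every identifier in the clear, which is the leakage the secure variant removes.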

slide-15
SLIDE 15

15/29

Outline

  • Motivations
  • MapReduce
  • Intersection with MapReduce
  • Security Model and Cryptographic Tools
  • Secure Intersection with MapReduce
  • Performance Evaluation
  • Conclusion

slide-16
SLIDE 16

16/29

Security Model

Cloud is honest-but-curious

[Figure: the Data Owner sends Relations to the Cloud; the Cloud sends the Intersection to the User.]

Without security, the Cloud learns:
◮ Content of relations
◮ Intersection result

slide-17
SLIDE 17

17/29

Cryptographic Tools

Pseudorandom function f : K × D → R
◮ Deterministic
◮ Indistinguishable from a random function

Notation: [m]k = f(k, m)
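
The slides do not say which pseudorandom function is used. As an assumption for illustration, the sketch below instantiates f with HMAC-SHA256 from the Python standard library, a common practical choice; the key values are hypothetical.

```python
import hashlib
import hmac

def prf(key: bytes, message: bytes) -> bytes:
    """[m]k = f(k, m), instantiated here with HMAC-SHA256 (an assumption)."""
    return hmac.new(key, message, hashlib.sha256).digest()

k1 = b"hypothetical-key-1"
k2 = b"hypothetical-key-2"

tag_a = prf(k1, b"F654")
tag_b = prf(k1, b"F654")  # same key, same message
tag_c = prf(k2, b"F654")  # different key, same message

print(tag_a == tag_b, tag_a == tag_c)
```

Determinism is what lets two data owners holding the same key map the same identifier to the same MapReduce key without revealing the identifier itself.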

slide-18
SLIDE 18

18/29

Cryptographic Tools

Asymmetric encryption scheme
◮ (pk, sk) ← G(λ)
◮ c ← E(pk, m)
◮ m ← D(sk, c)

Correctness: D(sk, E(pk, m)) = m

Notation: {m} = E(pk, m)
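
To make the (G, E, D) interface concrete, here is a toy sketch using textbook RSA with tiny fixed primes. It is an assumption for illustration only: the slides do not name the encryption scheme, textbook RSA with such parameters is completely insecure, and the security parameter λ is ignored.

```python
def G(security_parameter=None):
    """Key generation (pk, sk) <- G(lambda); the parameter is ignored here."""
    p, q = 61, 53                # tiny fixed primes, for illustration only
    n = p * q                    # modulus 3233
    phi = (p - 1) * (q - 1)      # 3120
    e = 17                       # public exponent, coprime with phi
    d = pow(e, -1, phi)          # private exponent (Python 3.8+)
    return (n, e), (n, d)

def E(pk, m):
    """Encryption c <- E(pk, m) for an integer 0 <= m < n."""
    n, e = pk
    return pow(m, e, n)

def D(sk, c):
    """Decryption m <- D(sk, c)."""
    n, d = sk
    return pow(c, d, n)

pk, sk = G()
message = 65
ciphertext = E(pk, message)
recovered = D(sk, ciphertext)
print(recovered == message)
```

Anyone holding pk can produce {m}, but only the holder of sk (the final user) can recover m.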

slide-19
SLIDE 19

19/29

Outline

  • Motivations
  • MapReduce
  • Intersection with MapReduce
  • Security Model and Cryptographic Tools
  • Secure Intersection with MapReduce
  • Performance Evaluation
  • Conclusion

slide-20
SLIDE 20

20/29

Secure Intersection with MapReduce

Setting
◮ n relations: R1, ..., Rn
◮ R1 has: k1, ..., kn PRF secret keys, and pk
◮ Ri (for 2 ≤ i ≤ n) has: k1 and ki

Preprocessing
◮ One main relation (R1), using the public key of the final user. For each element x, it computes the key-value pair:

([x]k1, {x} ⊕ [x]k2 ⊕ ... ⊕ [x]kn)

◮ Each other relation Ri computes, for each element x, the key-value pair:

([x]k1, [x]ki)
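
The reason for masking the ciphertext can be written out as a short derivation. The labels v_1 and v_i for the values produced by R1 and Ri are introduced here for illustration and do not appear in the slides; the derivation assumes x belongs to all n relations, so that all n values share the key [x]k1.

```latex
\begin{align*}
v_1 &= \{x\} \oplus \bigoplus_{i=2}^{n} [x]_{k_i},
  \qquad v_i = [x]_{k_i} \quad (2 \le i \le n), \\
v_1 \oplus \bigoplus_{i=2}^{n} v_i
  &= \{x\} \oplus \bigoplus_{i=2}^{n} \bigl( [x]_{k_i} \oplus [x]_{k_i} \bigr)
   = \{x\}.
\end{align*}
```

If some Ri does not contain x, its mask is never supplied, and XORing the remaining values leaves {x} hidden behind a pseudorandom term.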

slide-21
SLIDE 21

21/29

Secure Intersection with MapReduce

Processed relations

NSA∗

  • ([F654]k1, {F654} ⊕ [F654]k2 ⊕ [F654]k3)
  • ([U840]k1, {U840} ⊕ [U840]k2 ⊕ [U840]k3)
  • ([X098]k1, {X098} ⊕ [X098]k2 ⊕ [X098]k3)

GCHQ∗

  • ([F654]k1, [F654]k2)
  • ([M349]k1, [M349]k2)
  • ([P027]k1, [P027]k2)

Mossad∗

  • ([F654]k1, [F654]k3)
  • ([M349]k1, [M349]k3)
  • ([U840]k1, [U840]k3)

slide-22
SLIDE 22

22/29

Secure Intersection with MapReduce

Data owners: NSA∗, GCHQ∗, Mossad∗

Map (public cloud): one map task per processed relation (NSA∗, GCHQ∗, Mossad∗), then the Master Controller groups the pairs by key

Key [F654]k1: values {F654} ⊕ [F654]k2 ⊕ [F654]k3, [F654]k2, [F654]k3
Key [M349]k1: values [M349]k2, [M349]k3
Key [P027]k1: value [P027]k2
Key [U840]k1: values {U840} ⊕ [U840]k2 ⊕ [U840]k3, [U840]k3
Key [X098]k1: value {X098} ⊕ [X098]k2 ⊕ [X098]k3

Reduce: the user Interpol, holding (sk, pk), receives {F654}
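
The whole example can be replayed end to end in Python. This is a sketch under several assumptions that are not in the slides: the PRF is HMAC-SHA256, encryption is toy textbook RSA over two Mersenne primes (insecure, chosen only so that decryption works with the standard library), ciphertexts are padded to the PRF output length so they can be XORed, and the reduce step reuses the "#values = #participants" rule of the plain protocol before XORing the values. All function and key names are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict
from functools import reduce

# Toy textbook RSA built from two Mersenne primes. Illustration only, NOT secure.
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)
SIZE = 32  # ciphertexts are padded to the HMAC-SHA256 output length

def prf(key: bytes, x: str) -> bytes:
    """[x]k, instantiated with HMAC-SHA256 (an assumption)."""
    return hmac.new(key, x.encode(), hashlib.sha256).digest()

def encrypt(x: str) -> bytes:
    """{x} = E(pk, x), returned as a fixed-size byte string."""
    return pow(int.from_bytes(x.encode(), "big"), E, N).to_bytes(SIZE, "big")

def decrypt(c: bytes) -> str:
    """D(sk, c); only the final user holds the private exponent."""
    m = pow(int.from_bytes(c, "big"), D, N)
    return m.to_bytes((m.bit_length() + 7) // 8, "big").decode()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

def preprocess_main(relation, keys):
    """Main relation R1: ([x]k1, {x} xor [x]k2 xor ... xor [x]kn)."""
    pairs = []
    for x in relation:
        value = encrypt(x)
        for k in keys[1:]:
            value = xor(value, prf(k, x))
        pairs.append((prf(keys[0], x), value))
    return pairs

def preprocess_other(relation, k1, ki):
    """Other relation Ri: ([x]k1, [x]ki)."""
    return [(prf(k1, x), prf(ki, x)) for x in relation]

def cloud_intersection(processed, n):
    """Shuffle by key, then XOR the values of every key that has n values."""
    groups = defaultdict(list)
    for pairs in processed:
        for key, value in pairs:
            groups[key].append(value)
    return [reduce(xor, vals) for vals in groups.values() if len(vals) == n]

keys = [b"hypothetical-k1", b"hypothetical-k2", b"hypothetical-k3"]
nsa = preprocess_main(["F654", "U840", "X098"], keys)
gchq = preprocess_other(["F654", "M349", "P027"], keys[0], keys[1])
mossad = preprocess_other(["F654", "M349", "U840"], keys[0], keys[2])

ciphertexts = cloud_intersection([nsa, gchq, mossad], n=3)  # what the cloud outputs
suspects = [decrypt(c) for c in ciphertexts]                # what Interpol recovers
print(suspects)
```

The cloud only ever handles PRF outputs and masked ciphertexts; the identifier F654 appears in the clear only after the user decrypts.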

slide-23
SLIDE 23

23/29

Outline

  • Motivations
  • MapReduce
  • Intersection with MapReduce
  • Security Model and Cryptographic Tools
  • Secure Intersection with MapReduce
  • Performance Evaluation
  • Conclusion

slide-24
SLIDE 24

24/29

Experimental Results

Settings
◮ [product name not recoverable] 3.2.0 / Standalone mode / Streaming
◮ [product name not recoverable] 16.04 LTS
◮ Map and Reduce functions in [language not recoverable]

Hardware
◮ 4 CPU @ 2.4 GHz
◮ 80 GB of disk
◮ 8 GB of RAM

Experiments

  • 1. Varying the number of tuples
  • 2. Varying the number of intersected relations

slide-25
SLIDE 25

25/29

Results: Varying the Number of Tuples

[Plot: CPU time (s), ticks from 500 to 2,000, against the number of tuples (in millions), ticks from 0.5 to 3, for the Standard protocol³ and Secure Intersection.]

³ J. Leskovec, A. Rajaraman and J. D. Ullman. Mining of Massive Datasets. Cambridge University Press.

slide-26
SLIDE 26

26/29

Results: Varying the Number of Intersected Relations

[Plot: CPU time (s), ticks from 100 to 500, against the number of intersected relations, from 2 to 10 (500,000 tuples per relation), for the Standard protocol⁴ and Secure Intersection.]

⁴ J. Leskovec, A. Rajaraman and J. D. Ullman. Mining of Massive Datasets. Cambridge University Press.

slide-27
SLIDE 27

27/29

Outline

  • Motivations
  • MapReduce
  • Intersection with MapReduce
  • Security Model and Cryptographic Tools
  • Secure Intersection with MapReduce
  • Performance Evaluation
  • Conclusion

slide-28
SLIDE 28

28/29

Conclusion and Future Works

Conclusion
◮ Design of secure intersection with MapReduce
◮ Collision resistance
◮ Practical scalability

Future Works
◮ Apache Spark environment
◮ Malicious model

slide-29
SLIDE 29

29/29

Thank you for your attention.

Any questions? pascal.lafourcade@uca.fr