Efficient Private Set Intersection for a Decentralised Web of Trust

SLIDE 1

Efficient Private Set Intersection for a Decentralised Web of Trust

Álvaro García-Recuero October 31, 2017

SLIDE 2

“Privacy-preserving protocols for the WWW in the age of mass surveillance and adversarial learning”

Álvaro García-Recuero Efficient Private Set Intersection for a Decentralised Web of Trust 2 / 40

SLIDE 3

Why is that?

SLIDE 4

Strong and Malicious

Mass-surveillance AND personal data collection by third parties on the WWW are a real threat to liberal societies and citizens! [a]

[a] https://www.theguardian.com/technology/2017/aug/01/data-browsing-habits-brokers

Countermeasures

A truly decentralised WWW will require the network to provide privacy and trust by design.

SLIDE 6

How safe is Big Data?

Adversarial learning

Manipulating or inserting corrupted samples in the dataset to obtain a desired outcome (e.g., financial credit score in OSNs).

De-anonymisation

Possible to use external data sources to re-identify users and their preferences.

Privacy breaches

WoT [a] extension collecting users’ metadata in the browser.

[a] https://en.wikipedia.org/wiki/WOT_Services

SLIDE 9

What is the Web-of-Trust about?

SLIDE 10

What is decentralised PSI useful for?

Trust for a non-public Web-of-Trust

We should be able to establish trust without a centralised Certification Authority (CA).

Going Decentralised

The user should be able to establish direct trust with their peers, similarly to what happens with PGP, GnuPG and others, but without exposing who the signers are, etc.

Why is it desirable?

Centralised data silos are prone to privacy breaches, e.g., through third-party apps such as the WoT plugin.
Governments and powerful authorities, e.g., NSA, GCHQ.

SLIDE 13

Abusing the WWW

SLIDE 14

Definition

Modeling Abuse

Deny, Deceive, Degrade, Disrupt

Government Communications Headquarters
SLIDE 15

Defining Deceive

Modeling Abuse

Supplanting a known user identity (impersonation) to influence other users' behaviour and activities, including assuming false identities (but not pseudonyms).
SYLVESTER: framework for automated interaction and alias management in Online Social Networks.
UNDERPASS: change the outcome of online polls.
SCRAPHEAP CHALLENGE: perfect spoofing of emails from Blackberry targets.
BURLESQUE: capability to send spoofed SMS text messages.

SLIDE 16

Defining Degrade

Modeling Abuse

Disclosing personal and private data of others without their approval so as to harm their public image or reputation.
BIRDSTRIKE: a Twitter monitoring and profile data collection tool.
SPRING BISHOP: finds private photographs of targets on Facebook.

SLIDE 17

Defining Deny

Modeling Abuse

Encouraging self-harm in other users, promoting violence (direct or indirect), terrorism or similar activities.
CLEAN SWEEP: masquerades Facebook wall posts for individuals or entire countries, effectively denying access to information (censorship).
ROLLING THUNDER: distributed denial of service using P2P.

SLIDE 18

Defining Disrupt

Modeling Abuse

Distracting provocations, denial-of-service, flooding with messages, promoting abuse.
BIRDSONG: automated posting of Twitter updates.
CANNONBALL: capability to send repeated text messages to a single target.
PITBULL: enabling large-scale delivery of a tailored message to users of instant messaging services.

SLIDE 19

Abuse detection

SLIDE 20

Abuse ground truth

Trollslayer tool

Repo: github.com/algarecu/trollslayer

SLIDE 21

Mutual Subscriptions

Feature analysis

[Figure: CCDF of mutual followees in log-log scale; x-axis log(x), y-axis log[P(X > x)]; series: acceptable, abusive]

|Subscription ∩ Subscription|: the CCDF shows less overlap between the subscriptions of authors of abusive messages and the subscriptions of their potential victims.
Privacy: a protocol is needed to protect this metadata.
Security? It is hard to prevent an adversary from increasing the overlap with a potential victim's subscriptions if those are public information.
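As an illustration of how such a curve can be derived, the sketch below computes the overlap feature for a pair of subscription sets and the empirical CCDF of a list of overlap values. The function names and the sample data are hypothetical and are not taken from the Trollslayer code.

```python
def overlap(subscriptions_a, subscriptions_b):
    """Size of the intersection of two subscription sets."""
    return len(set(subscriptions_a) & set(subscriptions_b))


def ccdf(values):
    """Return (x, P(X > x)) pairs for the empirical distribution of values."""
    n = len(values)
    return [(x, sum(1 for v in values if v > x) / n) for x in sorted(set(values))]


# Hypothetical overlap values for four (author, recipient) pairs.
sample = [0, 1, 1, 3]
print(ccdf(sample))
```

Plotting both axes on a logarithmic scale, once per label (acceptable, abusive), would give curves of the kind shown on the slide.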

SLIDE 22

Straw-man version

Privacy Protocol

Problem: Alice wants to compute $n := |L_A \cap L_B|$.

Suppose each user has a private key $c_i$ and the corresponding public key is $C_i := g^{c_i}$, where $g$ is some generator.

The set-up is as follows:

$L_A$: set of public keys representing Alice’s subscriptions.
$L_B$: set of public keys representing Bob’s subscriptions.
Alice picks an ephemeral private scalar $t_A \in \mathbb{Z}/p\mathbb{Z}$ (the set of scalars used for a D-H exchange).
Bob picks an ephemeral private scalar $t_B \in \mathbb{Z}/p\mathbb{Z}$ (the set of scalars used for a D-H exchange).

SLIDE 23

Privacy Protocol: straw-man version

Alice: $X_A := \{ C^{t_A} \mid C \in L_A \}$

Alice → Bob: $X_A$

Bob: $X_B := \{ C^{t_B} \mid C \in L_B \}$

Bob: $Y_B := \{ \hat{C}^{t_B} \mid \hat{C} \in X_A \} = \{ C^{t_A \cdot t_B} \mid C \in L_A \}$

Bob → Alice: $X_B, Y_B$

Alice: $Y_A := \{ \hat{C}^{t_A} \mid \hat{C} \in X_B \} = \{ C^{t_B \cdot t_A} \mid C \in L_B \}$

Alice can compute $|Y_A \cap Y_B|$ at linear cost.
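The exchange above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: it simulates both parties in one process and uses a deliberately tiny multiplicative group (safe prime 2039, subgroup of order 1019, generator 4), which is an assumption chosen for readability and offers no security. The names `public_key`, `blind` and `straw_man_cardinality` are hypothetical.

```python
import secrets

# Toy parameters (assumption, NOT secure): P = 2Q + 1 is a safe prime and
# G = 4 generates the subgroup of prime order Q.
P, Q, G = 2039, 1019, 4


def public_key(c):
    """C := g^c for a private key c."""
    return pow(G, c, P)


def blind(elements, t):
    """Raise every element of a set to the scalar t."""
    return {pow(e, t, P) for e in elements}


def straw_man_cardinality(L_A, L_B):
    t_A = secrets.randbelow(Q - 1) + 1  # Alice's ephemeral scalar
    t_B = secrets.randbelow(Q - 1) + 1  # Bob's ephemeral scalar
    X_A = blind(L_A, t_A)               # Alice -> Bob
    X_B = blind(L_B, t_B)               # Bob -> Alice
    Y_B = blind(X_A, t_B)               # Bob -> Alice
    Y_A = blind(X_B, t_A)               # computed locally by Alice
    return len(Y_A & Y_B)


keys = [public_key(c) for c in range(1, 7)]
print(straw_man_cardinality(set(keys[:4]), set(keys[2:5])))
```

Because exponentiation by a non-zero scalar is a bijection on a prime-order group, a key present in both lists ends up as the same doubly blinded value on both sides, so the size of the intersection is preserved.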

SLIDE 24

Straw-man Protocol 1

Attack 1

Attack 1: insertion of sock-puppet accounts to infer the size of the potential victim's contact list.
Solution: defeat it by shuffling the contact list before sending it to the other party.

SLIDE 25

Straw-man Protocol 1

Attack 2

Attack 2: insertion of a sock-puppet account with a marker in the potential perpetrator's list allows inferring set membership in the potential victim’s list (identifying pairs of elements).
Solution: hash the commitments of the reblinded contact list in the reply to the potential perpetrator.

SLIDE 26

Assume a fixed system security parameter $\kappa \ge 1$.

For any list or set $Z$, define $Z' := \{ h(x) \mid x \in Z \}$, e.g., $X'_{B,i}$ is obtained by hashing each element of $X_{B,i}$.

SLIDE 27

Protocol 1: Cut & choose version

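Messages: Alice → Bob: $X_A$; Bob → Alice: $X'_{B,i}, Y'_{B,i}$; Alice → Bob: $J$; Bob → Alice: $X_{B,j}, t_{B,j}$.

1. Alice sends: $X_A := \mathrm{sort}\,[\, C^{t_A} \mid C \in L_A \,]$

2. Bob responds with commitments $X'_{B,i}, Y'_{B,i}$ for $i \in \{1, \dots, \kappa\}$.

3. Alice picks a non-empty random subset $J \subseteq \{1, \dots, \kappa\}$ and sends it to Bob.

4. Bob replies with $X_{B,j}$ for $j \in J$ and $t_{B,j}$ for $j \notin J$.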

SLIDE 28

Cut & choose version of Protocol 1: Verification

For $j \notin J$, Alice checks that $t_{B,j}$ matches the commitment $Y'_{B,j}$.

For $j \in J$, Alice checks the commitment to $X_{B,j}$ and computes: $Y_{A,j} := \{ \hat{C}^{t_A} \mid \hat{C} \in X_{B,j} \}$

Finally, Alice computes: $n = |Y'_{A,j} \cap Y'_{B,j}|$.

Alice checks that the values of $n$ agree for all $j \in J$.
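The following sketch is one possible reading of the cut & choose steps and the verification above, simulated in a single process. It is not the author's implementation: SHA-256 stands in for $h$, the group (safe prime 2039, subgroup order 1019, generator 4) is a toy assumption with no security, and the function names are hypothetical. The detailed protocol in the paper may differ.

```python
import hashlib
import secrets

# Toy group (assumption, NOT secure): P = 2Q + 1 is a safe prime and
# G = 4 generates the subgroup of prime order Q.
P, Q, G = 2039, 1019, 4


def h(x):
    """Hash one group element (SHA-256 is a stand-in for h)."""
    return hashlib.sha256(str(x).encode()).hexdigest()


def blind(elements, t):
    """Raise every element to the scalar t and sort the result."""
    return sorted(pow(e, t, P) for e in elements)


def hashed(elements):
    """Z' := sorted list of h(x) for x in Z."""
    return sorted(h(e) for e in elements)


def random_scalar():
    return secrets.randbelow(Q - 1) + 1


def cut_and_choose(L_A, L_B, kappa=4):
    # Step 1: Alice sends her blinded, sorted list X_A.
    t_A = random_scalar()
    X_A = blind(L_A, t_A)

    # Step 2: Bob commits to kappa independent blindings.
    t_B = [random_scalar() for _ in range(kappa)]
    X_B = [blind(L_B, t) for t in t_B]
    X_B_commit = [hashed(x) for x in X_B]
    Y_B_commit = [hashed(blind(X_A, t)) for t in t_B]

    # Step 3: Alice picks a non-empty random subset J.
    J = {i for i in range(kappa) if secrets.randbelow(2)}
    if not J:
        J = {secrets.randbelow(kappa)}

    # Step 4 and verification.
    counts = set()
    for j in range(kappa):
        if j in J:
            # Bob opens X_B[j]; Alice checks it against the commitment.
            if hashed(X_B[j]) != X_B_commit[j]:
                raise ValueError("X_B commitment mismatch")
            Y_A_hashed = hashed(blind(X_B[j], t_A))
            counts.add(len(set(Y_A_hashed) & set(Y_B_commit[j])))
        else:
            # Bob reveals t_B[j]; Alice recomputes Y'_B[j] from X_A.
            if hashed(blind(X_A, t_B[j])) != Y_B_commit[j]:
                raise ValueError("Y_B commitment mismatch")
    if len(counts) != 1:
        raise ValueError("intersection sizes disagree")
    return counts.pop()


keys = [pow(G, c, P) for c in range(1, 8)]
print(cut_and_choose(keys[:5], keys[3:]))
```

Since both roles run honestly in the same process here, the checks always pass; in a real deployment they are what lets Alice detect a Bob who answers inconsistently across the $\kappa$ rounds.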

SLIDE 29

Privacy Analysis of PSI features

[Figure: CCDF of mutual followers in log-log scale; x-axis log(x), y-axis log[P(X > x)]; series: acceptable, abusive]

|Subscriber ∩ Subscriber|: the CCDF shows that authors of abusive messages are less likely to have common subscribers.
Security? Hard to prevent fake subscribers.
Privacy? Yes, Protocol 1.

SLIDE 30

Privacy Analysis of PSI features

[Figure: CCDF of mutual followers-followees in log-log scale; x-axis log(x), y-axis log[P(X > x)]; series: acceptable, abusive]

The CCDF of |Subscribers ∩ Subscription_r| shows less overlap between the subscriptions of authors of abusive messages and the subscriptions of the potential victims.
Security? We assume it is more difficult for an adversary to increase the overlap for this feature.
Privacy? Yes, our protocol with BLS signatures.

SLIDE 31

Protocol 2: PSI with Subscriber Signatures

Assume subscribers are willing to sign that they are subscribed.
Subscribers provide the signatures, not a certification authority.
BLS signatures are compatible with our blinding, so we integrate them with our cut & choose version of the protocol.
The detailed protocol is in the paper.

SLIDE 32

What is Protocol 2 useful for?

Prove overlap of subscribers without revealing their identities.
Key authentication in a non-public Web-of-Trust (1-hop only).
Unlike PSI-CA from De Cristofaro (2016), no need for a CA!

SLIDE 33

Privacy-preserving features

Section | Feature | Falsification/Adaptation | Crypto helps?
5.1 | # lists | trivial | n/a
    | # subscriptions | trivial | n/a
    | # subscriptions / age | trivial | n/a
    | # subscriptions / # subscribers | trivial | n/a
5.2 | # mentions | costly | n/a
    | # hashtags | costly | n/a
    | # mentions / age | costly | yes
    | # mentions / # messages | costly | n/a
5.3 | message invasive | hard | n/a
5.4 | # messages / age | costly | yes
    | # retweets | costly | n/a
    | # favorited messages | costly | n/a
5.5 | age of account | hard | yes
5.6 | # subscribers | possible | minimally
    | # subscribers / age | possible | minimally
5.7 | subscription ∩ subscription | costly | w. privacy
5.8 | subscriber ∩ subscriber | possible | w. privacy
5.9 | subscribers ∩ subscription_r | very hard | yes
    | subscriptions ∩ subscriber_r | possible | w. privacy

SLIDE 34

Decision Trees with privacy

Objective function is to maximize the area under the P-R curve (AUC).

[Figure: Precision-Recall curve (AUC = 0.48); x-axis Recall, y-axis Precision]

Objective function is to minimize FP and FN rates.

Confusion matrix (rows: true label, columns: predicted label):

             acceptable   abusive
acceptable   0.915        0.085
abusive      0.355        0.645

SLIDE 35

Random Forest with privacy

Objective function is to maximize the area under the P-R curve (AUC).

[Figure: Precision-Recall curve (AUC = 0.48); x-axis Recall, y-axis Precision]

Objective function is to minimize FP and FN rates.

Confusion matrix (rows: true label, columns: predicted label):

             acceptable   abusive
acceptable   0.937        0.063
abusive      0.419        0.581

SLIDE 36

Extra Trees with privacy

Extra Trees is the most balanced between FP and FN, and has the best P-R curve.

Objective function is to maximize the area under the P-R curve (AUC).

[Figure: Precision-Recall curve (AUC = 0.49); x-axis Recall, y-axis Precision]

Objective function is to minimize FP and FN rates.

Confusion matrix (rows: true label, columns: predicted label):

             acceptable   abusive
acceptable   0.795        0.205
abusive      0.194        0.806

SLIDE 37

Gradient Boosting with privacy

Gradient Boosting = Gradient Descent + Boosting.

Objective function is to maximize the area under the P-R curve (AUC).

[Figure: Precision-Recall curve (AUC = 0.45); x-axis Recall, y-axis Precision]

Objective function is to minimize FP and FN rates.

Confusion matrix (rows: true label, columns: predicted label):

             acceptable   abusive
acceptable   0.972        0.028
abusive      0.581        0.419
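To make the comparison between the four classifiers concrete, the sketch below reads the FP and FN rates off the reported confusion matrices and picks the model with the smallest gap between them. It assumes the matrices are row-normalised (each row sums to 1 in the reported values) and that "abusive" is the positive class; the helper names are hypothetical.

```python
# Confusion matrices as reported on the slides.
# Rows: true label (acceptable, abusive); columns: predicted label.
MATRICES = {
    "Decision Trees": [[0.915, 0.085], [0.355, 0.645]],
    "Random Forest": [[0.937, 0.063], [0.419, 0.581]],
    "Extra Trees": [[0.795, 0.205], [0.194, 0.806]],
    "Gradient Boosting": [[0.972, 0.028], [0.581, 0.419]],
}


def fp_fn_rates(matrix):
    """FP rate: acceptable predicted abusive; FN rate: abusive predicted acceptable."""
    return matrix[0][1], matrix[1][0]


def most_balanced(matrices):
    """Name of the model with the smallest |FP rate - FN rate|."""
    def gap(name):
        fp, fn = fp_fn_rates(matrices[name])
        return abs(fp - fn)
    return min(matrices, key=gap)


for name, matrix in MATRICES.items():
    fp, fn = fp_fn_rates(matrix)
    print(f"{name}: FP rate = {fp:.3f}, FN rate = {fn:.3f}")
print("Most balanced:", most_balanced(MATRICES))
```

Under these assumptions the selection agrees with the remark on the Extra Trees slide, while Gradient Boosting has the lowest FP rate at the cost of the highest FN rate.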

SLIDE 38

Method can protect privacy.
Method can handle adaptive adversaries.
Using reduced ground truth almost as Human Score.

SLIDE 39

Data minimisation

SLIDE 40

MinHashes

Data minimisation

Intuition: we can estimate the Jaccard index $J(A, B)$ by counting the number of indexes $i$ such that $h^{A}_{\min}(\cdot) = h^{B}_{\min}(\cdot)$.

Evaluate approximate, privacy-preserving PSI in terms of: (i) computation time, (ii) accuracy of classification.
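A minimal sketch of the MinHash intuition, without the privacy layer: each party keeps, for each of $k$ hash functions, only the minimum hash value over its set, and the fraction of indexes where the two minima agree estimates $J(A, B)$. Deriving the $k$ hash functions from SHA-256 with an index prefix is an assumption made for illustration, and the function names are hypothetical.

```python
import hashlib


def _hash(index, item):
    """The index-th hash function, derived from SHA-256 (illustrative choice)."""
    data = f"{index}:{item}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def minhash_signature(items, k=128):
    """k minimum hash values, one per hash function; items must be non-empty."""
    return [min(_hash(i, x) for x in items) for i in range(k)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of indexes where both signatures hold the same minimum."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


def exact_jaccard(a, b):
    return len(a & b) / len(a | b)


A = set(range(0, 100))
B = set(range(50, 150))
print("exact:", exact_jaccard(A, B))
print("estimate:", estimate_jaccard(minhash_signature(A, 256), minhash_signature(B, 256)))
```

Each signature has a fixed length $k$ regardless of the set size, which is what makes it usable as a compact fingerprint of a contact list.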

SLIDE 41

The Efficient Privacy-Preserving Protocol

Estimating an approximate Jaccard index with MinHashes reduces the computational footprint.
In addition, data minimisation gives our privacy-preserving (PP) protocol for DOSNs just a fingerprint of the one-hop graph metadata.
Note that even centralised social network providers such as LinkedIn stop counting contacts beyond 500 on their site.

SLIDE 42

Results and performance

Features | Timing (ms) | # of hash func. (k) | Error bound
All (using J index) | 3 018 632.98 | – | –
All (using approx. J index) | 2 626 971.92 | 64 | O(1/√k)
All (using approx. J index) | 2 642 225.02 | 128 | O(1/√k)

We see a reduction in computation time for the same set sizes (details in the ASONAM ’17 article).
Supervised learning results with the MinHash-approximated Jaccard index come close to those obtained with the exact, non-approximated PSI features.
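The $O(1/\sqrt{k})$ error bound follows from a standard argument, sketched here under the assumption that the $k$ hash functions behave independently; the numeric values are upper bounds on the standard error, not measurements from the article.

```latex
% Each index i matches with probability J = J(A,B), so the estimator is a
% mean of k Bernoulli(J) indicators Z_i.
\hat{J} = \frac{1}{k} \sum_{i=1}^{k} Z_i,
\qquad
\mathbb{E}[\hat{J}] = J,
\qquad
\operatorname{Var}[\hat{J}] = \frac{J(1-J)}{k} \le \frac{1}{4k}

% Hence the standard error is at most 1/(2\sqrt{k}) = O(1/\sqrt{k}):
\sigma_{\hat{J}} \le \frac{1}{2\sqrt{k}}
\quad\Rightarrow\quad
k = 64:\ \sigma_{\hat{J}} \le 0.0625,
\qquad
k = 128:\ \sigma_{\hat{J}} \le 0.0442
```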

SLIDE 43

Conclusions & Future Work

Our protocol is resistant against malicious adversaries.
Data minimisation reduces the exposure of the training process to malicious adversaries tampering with training samples.
Use our protocol to support a decentralised Web-of-Trust that provides trust but also privacy.

SLIDE 44

Q & A

QUESTIONS?
Contact: algarecu.wordpress.com
Repos: github.com/algarecu

SLIDE 45

Publications

Á. García-Recuero. Efficient Privacy-Preserving Adversarial Learning in Decentralized Online Social Networks. In 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Sydney, Australia.

Á. García-Recuero, J. Burdges, and C. Grothoff. Privacy-preserving abuse detection in future decentralized online social networks. In 11th International ESORICS Workshop in Data Privacy Management, DPM 2016. Springer Lecture Notes in Computer Science.
