Tagvisor: A Privacy Advisor for Sharing Hashtags

Apr 10, 2024

SLIDE 1

Tagvisor: A Privacy Advisor for Sharing Hashtags

Yang Zhang

Joint work with Mathias Humbert, Tahleen Rahman, Cheng-Te Li, Jun Pang and Michael Backes

SLIDE 2

#hashtag

SLIDE 3

#hashtag

SLIDE 4

#hashtag

SLIDE 5

#hashtag

SLIDE 6

#hashtag

#like4like #foodporn #tbt

SLIDE 7

#hashtag

#privacy #locationprivacy

SLIDE 8

#contributions

Attack: location inference with hashtags
Defense: Tagvisor, a privacy advisor to mitigate the privacy threat posed by hashtags

SLIDE 9

#dataset

Collected through Instagram’s APIs
New York, Los Angeles, and London
Hashtags + locations (check-ins)

SLIDE 10

#attack

Bag-of-words for feature representation
Example: posts #a#b#c, #b#c, #a#d -> [1, 1, 1, 0], [0, 1, 1, 0], [1, 0, 0, 1]
Random forest classifier
Multi-class classification, e.g., 498 classes (locations) in New York
One classifier is trained on all posts together
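The bag-of-words encoding on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `bag_of_words` and the alphabetical vocabulary order are assumptions made here. The resulting vectors are what a multi-class classifier such as a random forest would take as input.

```python
def bag_of_words(posts):
    """Encode each post (a list of hashtags) as a binary vector over the vocabulary."""
    # Vocabulary: every distinct hashtag, sorted so the column order is stable.
    vocab = sorted({tag for post in posts for tag in post})
    index = {tag: i for i, tag in enumerate(vocab)}
    vectors = []
    for post in posts:
        vec = [0] * len(vocab)
        for tag in post:
            vec[index[tag]] = 1  # 1 if the hashtag appears in the post
        vectors.append(vec)
    return vocab, vectors

# The three example posts from the slide.
posts = [["#a", "#b", "#c"], ["#b", "#c"], ["#a", "#d"]]
vocab, vectors = bag_of_words(posts)
print(vocab)    # ['#a', '#b', '#c', '#d']
print(vectors)  # [[1, 1, 1, 0], [0, 1, 1, 0], [1, 0, 0, 1]]
```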

SLIDE 11

#attack

SLIDE 12

#attack

SLIDE 13

#tagvisor

A privacy advisor for sharing hashtags
Goal: fool the attacker’s location inference model (an ML classifier)
Three defense mechanisms: hiding, replacement, and generalization (to a location category)
Utility: preserve the semantic meaning of the hashtags

SLIDE 14

#hiding

Delete one hashtag (can be more) from a post on which the attack succeeds
Example: #a#b#c -> #b#c (hide #a), #a#c (hide #b), #a#b (hide #c)
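Enumerating the hiding candidates is a small combinatorial step. The sketch below is one way to do it, assuming the hashtags in a post are distinct; the name `hiding_candidates` and the `max_hidden` parameter are illustrative, not taken from the slides.

```python
from itertools import combinations

def hiding_candidates(hashtags, max_hidden=1):
    """List every hashtag set obtained by hiding up to max_hidden hashtags."""
    candidates = []
    for k in range(1, max_hidden + 1):
        for hidden in combinations(hashtags, k):
            kept = [tag for tag in hashtags if tag not in hidden]
            if kept:  # this sketch never proposes an empty post
                candidates.append(kept)
    return candidates

one_hidden = hiding_candidates(["#a", "#b", "#c"])
print(one_hidden)  # [['#b', '#c'], ['#a', '#c'], ['#a', '#b']]
```

Raising `max_hidden` grows the candidate list quickly, so a small bound keeps the search cheap.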

SLIDE 15

#utility

Semantic meaning
Skip-gram, aka word2vec, trained over all posts’ hashtags
Example hashtag vectors (dimensions d1, d2): #a: [3.1, 1.3], #b: [2.5, 1.9], #c: [4.0, 5.1]
[Figure: the hashtag sets #a#b#c, #a#c, and #a#b plotted in the (d1, d2) vector space]
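To make the utility idea concrete, the sketch below compares hashtag sets in the vector space. It rests on two assumptions that the slide does not spell out: a hashtag set is represented by the mean of its hashtag vectors, and utility loss is the Euclidean distance between the original and the modified set. The vectors are the toy values from the slide; the function names are hypothetical.

```python
import math

# Toy 2-D hashtag vectors from the slide; real word2vec vectors have many more dimensions.
VECTORS = {"#a": [3.1, 1.3], "#b": [2.5, 1.9], "#c": [4.0, 5.1]}

def set_vector(hashtags, vectors):
    """Represent a hashtag set by the mean of its hashtag vectors (an assumption)."""
    dims = len(next(iter(vectors.values())))
    return [sum(vectors[t][d] for t in hashtags) / len(hashtags) for d in range(dims)]

def semantic_distance(original, modified, vectors):
    """Euclidean distance between set vectors; smaller means utility is better preserved."""
    return math.dist(set_vector(original, vectors), set_vector(modified, vectors))

original = ["#a", "#b", "#c"]
dist_ac = semantic_distance(original, ["#a", "#c"], VECTORS)  # #b hidden
dist_ab = semantic_distance(original, ["#a", "#b"], VECTORS)  # #c hidden
print(dist_ac < dist_ab)  # True
```

With these toy vectors, hiding #b moves the set less than hiding #c, so #a#c would be the higher-utility choice of the two.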

SLIDE 16

#replacement

Replacing each hashtag with every possible hashtag makes the search space too big
Bound the candidates to the closest hashtags (with word2vec)
This reduces the search space
Semantic meaning can be preserved
[Figure: replacement applied to #a#b#c, a post on which the attack succeeds]
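The bounded search can be sketched as a nearest-neighbour lookup in the vector space. In the example below, the vectors for #a, #b, and #c come from the utility slide, while #d and #e, the bound `k`, and the function names are hypothetical values added for illustration. Euclidean distance is an assumed closeness measure.

```python
import math

def nearest_hashtags(tag, vectors, k):
    """Return the k hashtags whose vectors are closest to tag's vector."""
    others = [t for t in vectors if t != tag]
    others.sort(key=lambda t: math.dist(vectors[tag], vectors[t]))
    return others[:k]

def replacement_candidates(hashtags, vectors, k=2):
    """Replace one hashtag at a time with one of its k nearest neighbours."""
    candidates = []
    for i, tag in enumerate(hashtags):
        for substitute in nearest_hashtags(tag, vectors, k):
            if substitute in hashtags:
                continue  # skip substitutes that would duplicate an existing hashtag
            candidates.append(hashtags[:i] + [substitute] + hashtags[i + 1:])
    return candidates

# #a, #b, #c are from the slides; #d and #e are made-up neighbours.
vectors = {"#a": [3.1, 1.3], "#b": [2.5, 1.9], "#c": [4.0, 5.1],
           "#d": [3.0, 1.0], "#e": [4.2, 4.8]}
replaced = replacement_candidates(["#a", "#b", "#c"], vectors, k=2)
print(replaced)
```

Each hashtag contributes at most `k` candidates, so the search grows linearly with post length instead of with the size of the whole hashtag vocabulary.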

SLIDE 17

#generalization

Location categories from Foursquare
#centralpark -> #park
Does not apply to all hashtags, e.g., #tbt, #love
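Generalization amounts to a lookup from a location-specific hashtag to its category. The sketch below assumes such a lookup table already exists; only the #centralpark -> #park entry comes from the slide, and the table and function names are hypothetical.

```python
# Hypothetical lookup table; only this one entry appears on the slide.
CATEGORY_OF = {"#centralpark": "#park"}

def generalization_candidates(hashtags, category_of):
    """Generalize one hashtag at a time to its location category, where one exists."""
    candidates = []
    for i, tag in enumerate(hashtags):
        if tag in category_of:  # hashtags such as #tbt have no category and are skipped
            candidates.append(hashtags[:i] + [category_of[tag]] + hashtags[i + 1:])
    return candidates

generalized = generalization_candidates(["#centralpark", "#tbt"], CATEGORY_OF)
print(generalized)  # [['#park', '#tbt']]
print(generalization_candidates(["#tbt", "#love"], CATEGORY_OF))  # []
```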

SLIDE 18

#tagvisor

Check whether the post’s location is inferred correctly
If no, then publish
Else, consider the three defense mechanisms
Pick the hashtag set with the highest utility
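The decision flow on this slide can be sketched as a single function. Everything below is a stand-in under stated assumptions: `toy_predict` plays the role of the attacker's classifier, `toy_loss` is a crude substitute for the word2vec-based utility, and returning `None` when no candidate fools the attacker is a choice made for this sketch, not something the slides specify.

```python
def advise(hashtags, location, predict, candidates, utility_loss):
    """Return the hashtag set to publish, or None if no candidate fools the attacker."""
    if predict(hashtags) != location:
        return hashtags  # location is not inferred correctly: publish unchanged
    # Keep only the candidates on which the attacker's prediction is wrong.
    safe = [c for c in candidates if predict(c) != location]
    if not safe:
        return None  # behaviour chosen for this sketch
    # Highest utility corresponds to the smallest utility loss.
    return min(safe, key=lambda c: utility_loss(hashtags, c))

def toy_predict(hashtags):
    """Hypothetical attacker: guesses Central Park whenever #centralpark is present."""
    return "Central Park" if "#centralpark" in hashtags else "unknown"

def toy_loss(original, candidate):
    """Hypothetical utility loss: number of hashtags that differ between the sets."""
    return len(set(original) ^ set(candidate))

post = ["#centralpark", "#tbt"]
# Candidates would come from hiding, replacement, and generalization.
candidates = [["#tbt"], ["#park", "#tbt"], ["#centralpark"]]
choice = advise(post, "Central Park", toy_predict, candidates, toy_loss)
print(choice)  # ['#tbt']
```

With a semantic utility measure in place of `toy_loss`, the generalized set #park #tbt could well be preferred over plain hiding.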

SLIDE 19

#tagvisor

Obfuscating 2 hashtags is enough!

Obfuscating a bounded number of hashtags

SLIDE 20

#conclusion

First location inference attack with hashtags
Sharing hashtags is not safe!!!
A privacy advisor to mitigate this risk
Minimal risk and maximal utility
Fit for the real-world setting

#thankyou

https://yangzhangalmo.github.io/ @yangzhangalmo