Adversarial Classification on Social Networks



SLIDE 1

Adversarial Classification on Social Networks

Sixie Yu¹, Yevgeniy Vorobeychik¹, Scott Alfeld²

¹ Electrical Engineering and Computer Science, Vanderbilt University

² Computer Science, Amherst College

AAMAS 2018

( Electrical Engineering and Computer Science Vanderbilt University, Computer Science Amherst College ) Adversarial Classification on Social Networks AAMAS 2018 1 / 21

SLIDE 2

Problem Setting

SLIDE 3

Motivation

  • Over 50% of adults in the U.S. regard social media as a primary source of news [holcomb2013news].
  • Over 37 million news stories from the 2016 U.S. presidential election were later proved fake [allcott2017social].
  • Antisocial posts and discussions negatively affect users and damage online communities [cheng2015antisocial].
  • Social network spam and phishing can defraud users and spread malware.

SLIDE 4

Traditional Defense

  • Train a “global” detector from past data and deploy it everywhere.
  • Ignores network structure, propagation of messages, and adversarial behavior.

Not Adequate

  • Adversaries can tune content to avoid detection.
  • Traditional learning approaches ignore network structure, which shapes the impact of detection errors.
  • Being able to detect malicious content at multiple nodes creates a degree of redundancy.

SLIDE 5

Table of Contents

1. Continuous-Time Diffusion
2. Defender Model
3. Attacker Model
4. Stackelberg Game Formulation
5. Solution Approach
6. Experimental Results
7. References

SLIDE 11

Continuous-Time Diffusion

  • The propagation of a message depends on both the network structure and the features of the message, $x$.
  • A message started at a node $s$ propagates to other nodes in a breadth-first-search fashion.
  • The propagation time through an edge $e$ is sampled from a distribution $f_e(t; w_e, x)$.
  • The time taken to affect a node $i$ is the length of the shortest path from $s$ to $i$, where the edge weights are the sampled propagation times.
  • A node is affected if its shortest-path time from $s$ is at most $T$, an externally supplied time window.
  • The influence of a message that initially affects node $s$, denoted $\sigma(s, x)$, is the expected number of affected nodes over the time window $T$.
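
The diffusion model on this slide can be simulated directly. The following is a minimal Python sketch, not the authors' implementation: it samples one propagation time per edge, runs Dijkstra's algorithm from s, counts the nodes reached within the time window T, and averages over samples to get a Monte Carlo estimate of σ(s, x). The function names, the adjacency-dict graph format, and the `sample_delay` callback (a stand-in for a draw from the edge distribution) are assumptions made for illustration.

```python
import heapq
import random


def sample_reach(adj, source, horizon, sample_delay, rng):
    """Draw one diffusion sample and count nodes reached within `horizon`.

    adj: dict mapping node -> iterable of out-neighbors (assumed format).
    sample_delay(u, v, rng): hypothetical stand-in for a draw from the
        propagation-time distribution of edge (u, v).
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    done = set()
    while heap:
        t, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        # Each edge (u, v) is relaxed once, so its delay is sampled once.
        for v in adj.get(u, ()):
            arrival = t + sample_delay(u, v, rng)
            if arrival <= horizon and arrival < best.get(v, float("inf")):
                best[v] = arrival
                heapq.heappush(heap, (arrival, v))
    return len(done)  # the source itself counts as affected


def estimate_influence(adj, source, horizon, sample_delay, n_samples=1000, seed=0):
    """Monte Carlo estimate of the expected number of affected nodes."""
    rng = random.Random(seed)
    total = sum(sample_reach(adj, source, horizon, sample_delay, rng)
                for _ in range(n_samples))
    return total / n_samples


def exponential_delay(rate):
    """Assumed delay model: exponential with the same rate on every edge."""
    return lambda u, v, rng: rng.expovariate(rate)
```

For example, `estimate_influence({0: [1], 1: [2], 2: []}, 0, 1.0, exponential_delay(1.0))` estimates the influence of a message started at node 0 on a three-node path. In the model on the slide the delay distribution also depends on the message features x; here that dependence would have to be folded into `sample_delay`.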

SLIDE 13

Defender Model

Innovations

  • Learn and deploy heterogeneous detectors at different nodes.
  • Explicitly consider both the propagation of messages and adversarial manipulation during learning.

$$U_d = \alpha \sum_{x \in D^-} \sum_{i \in V} \sigma(i, \Theta, x) \;-\; (1 - \alpha) \sum_{x \in D^+} \sigma(s, \Theta, z(x)) \tag{1}$$

  • $D^-$ and $D^+$ are the benign and malicious data, respectively.
  • $\Theta = \{\theta_1, \theta_2, \dots, \theta_{|V|}\}$ are the parameters of the detectors at the different nodes.
  • The expected influence is now a function of the detector parameters ($\Theta$) as well as the manipulated messages ($z(x)$).
  • $x \to z(x)$: adversarial manipulation.
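
To make the structure of the defender's utility in (1) concrete, here is a minimal Python sketch. It assumes the expected influence and the attacker's response are available as callables; `sigma` and `attack` are hypothetical placeholders introduced for illustration, not part of the original slides.

```python
def defender_utility(alpha, theta, benign, malicious, nodes, sigma, attack):
    """Evaluate alpha * (benign influence) - (1 - alpha) * (malicious influence).

    sigma(i, theta, x): hypothetical callable returning the expected influence
        of message x started at node i under detector parameters theta.
    attack(theta, x): hypothetical callable returning the attacker's choice
        (s, z), i.e. a start node and a manipulated message.
    """
    # Benign messages are summed over every possible start node.
    benign_term = sum(sigma(i, theta, x) for x in benign for i in nodes)
    # Malicious messages are evaluated at the attacker's chosen node and manipulation.
    malicious_term = 0.0
    for x in malicious:
        s, z = attack(theta, x)
        malicious_term += sigma(s, theta, z)
    return alpha * benign_term - (1.0 - alpha) * malicious_term
```

The weight `alpha` trades off letting benign content spread against suppressing malicious content.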

SLIDE 15

Attacker Model

Attacker’s actions

  • Find a node $s \in V$ at which to start propagation (reminiscent of the famous influence maximization problem).
  • Transform $x \to z(x)$ in order to avoid detection.

For any original malicious instance $x \in D^+$:

$$\max_{i,\,z}\ \sigma(i, \Theta, z) \quad \text{s.t.} \quad \|z - x\|_p \le \epsilon, \quad \mathbb{1}[\theta_j(z) = 1] = 0,\ \forall j \in V \tag{2}$$

  • $\epsilon$: the attacker’s budget.
  • $\theta_j(z) = 1$: the manipulated message is detected at node $j$.
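
The attacker's problem (2) can be illustrated by brute force. The sketch below is a simplification under stated assumptions: it searches a finite list of candidate manipulations rather than the continuous ball of radius ǫ, and it treats the influence function and the detectors as given callables. The names `candidates`, `detectors`, and `sigma` are hypothetical.

```python
def lp_distance(z, x, p):
    """l_p distance between two equal-length feature vectors."""
    return sum(abs(a - b) ** p for a, b in zip(z, x)) ** (1.0 / p)


def attacker_best_response(x, nodes, candidates, detectors, sigma, eps, p=2):
    """Return (influence, start_node, z) maximizing sigma over feasible attacks.

    candidates: finite list of manipulated vectors z (an assumption; the
        original problem optimizes over a continuous set).
    detectors[j](z): returns 1 if node j flags z, else 0.
    sigma(i, z): hypothetical influence oracle with the detectors held fixed.
    Returns None when no candidate is feasible.
    """
    best = None
    for z in candidates:
        if lp_distance(z, x, p) > eps:
            continue  # manipulation exceeds the attacker's budget
        if any(detectors[j](z) == 1 for j in nodes):
            continue  # detected at some node, so infeasible
        for i in nodes:
            value = sigma(i, z)
            if best is None or value > best[0]:
                best = (value, i, z)
    return best
```

Enumeration only scales to small instances; it is meant to show the two constraints (budget and evasion at every node) and the joint choice of start node and manipulation.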

SLIDE 17

Stackelberg Game

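The interaction between the defender and the attacker is modeled as a Stackelberg game, which proceeds as follows:

  1. The defender first learns $\Theta$ (the parameters of the detectors at the different nodes).
  2. The attacker observes $\Theta$ and constructs its optimal attack against the defender.

$$
\begin{aligned}
\max_{\Theta} \quad & \alpha \sum_{x \in D^-} \sum_{i \in V} \sigma(i, \Theta, x) \;-\; (1 - \alpha) \sum_{x \in D^+} \sigma(s, \Theta, z(x)) \\
\text{s.t.} \quad & \forall x \in D^+ :\ (s, z(x)) \in \arg\max_{j,\,z}\ \sigma(j, \Theta, z) \\
& \forall x \in D^+ :\ \|z(x) - x\|_p \le \epsilon \\
& \forall x \in D^+ :\ \mathbb{1}[\theta_k(z(x)) = 1] = 0,\ \forall k \in V
\end{aligned}
$$

The equilibrium of this game: $\big(\Theta,\ s(\Theta),\ z(x; \Theta)\big)$.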
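
The order of play can be sketched as a leader-follower search. The Python below is an illustrative brute-force version over a finite set of defender strategies, not the solution method used in the talk; `attacker_best_response` and `defender_utility` are hypothetical oracles supplied by the caller.

```python
def solve_stackelberg(theta_grid, attacker_best_response, defender_utility):
    """Brute-force leader-follower search over a finite set of defender strategies.

    attacker_best_response(theta): hypothetical oracle returning the attack
        chosen after the attacker observes theta.
    defender_utility(theta, attack): hypothetical oracle for the defender's
        utility under that attack.
    Returns (value, theta, attack) for the best commitment, or None if the
    grid is empty.
    """
    best = None
    for theta in theta_grid:
        attack = attacker_best_response(theta)    # follower moves second
        value = defender_utility(theta, attack)   # leader anticipates the response
        if best is None or value > best[0]:
            best = (value, theta, attack)
    return best
```

The key point is that the defender's utility is always evaluated at the attacker's best response to the committed Θ, which is what makes the problem bi-level.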

SLIDE 19

Solution Approach

Assumption

  • The defender knows the node being attacked.
  • This assumption enables us to collapse the bi-level optimization into a single-level optimization.
  • Assuming the defender knows that node $s$ will be attacked, we solve the single-level optimization by leveraging the Implicit Function Theorem, which yields the optimal defense strategy $\Theta^*_s$.
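
For reference, the Implicit Function Theorem step can be stated generically. The sketch below assumes, as an illustration rather than a statement of the authors' derivation, that the attacker's optimal manipulation $z^*(\Theta)$ is characterized by a continuously differentiable system of optimality conditions $F(\Theta, z) = 0$ whose Jacobian with respect to $z$ is invertible. The symbol $F$ is introduced here and does not appear in the slides.

```latex
F\bigl(\Theta, z^*(\Theta)\bigr) = 0
\quad\Longrightarrow\quad
\frac{\partial z^*}{\partial \Theta}
= -\left(\frac{\partial F}{\partial z}\right)^{-1} \frac{\partial F}{\partial \Theta},
\qquad
\frac{d U_d}{d \Theta}
= \frac{\partial U_d}{\partial \Theta}
+ \frac{\partial U_d}{\partial z}\,\frac{\partial z^*}{\partial \Theta}.
```

Under these assumptions the attacker's response is differentiable in Θ, so the defender's objective can be optimized with gradient-based methods as a single-level problem.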

Relax the assumption

  • We relax the assumption that the defender knows the node being attacked, and introduce a heuristic algorithm to solve for $\big(\Theta,\ s(\Theta),\ z(x; \Theta)\big)$.
  • Heuristic algorithm: for each node $i \in V$, solve for $\Theta^*_i$, then select

$$\Theta^* = \arg\max_{\Theta^*_i} U_d(\Theta^*_i)$$
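
The heuristic can be sketched as a loop over candidate attack nodes. In the Python below, `solve_given_attack_node` and `evaluate_utility` are hypothetical callables standing in for the single-level solver and for the evaluation of the defender's utility; how each is implemented is not specified here.

```python
def heuristic_defense(nodes, solve_given_attack_node, evaluate_utility):
    """Pick the best of the per-node defenses.

    solve_given_attack_node(i): hypothetical solver returning the defense
        computed under the assumption that node i is attacked.
    evaluate_utility(theta): hypothetical evaluation of the defender's
        utility for a given defense.
    """
    best_theta, best_value = None, float("-inf")
    for i in nodes:
        theta_i = solve_given_attack_node(i)   # defense tailored to an attack at i
        value = evaluate_utility(theta_i)      # how well it does overall
        if value > best_value:
            best_theta, best_value = theta_i, value
    return best_theta, best_value
```

This requires one single-level solve per node, so its cost grows linearly with the number of nodes.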

SLIDE 21

Experiments

  • In our experiments, we consider a specific detection model: logistic regression (LR).
  • $\Theta = \{\theta_1, \theta_2, \dots, \theta_{|V|}\}$: the thresholds of the detectors.
  • We compare our defense strategy against three others:
      • Baseline: learn a single LR model on the training data and deploy it at all nodes.
      • Re-training: iteratively augment the original training data with attacked instances, re-training the LR model each time, until convergence.
      • Personalized-single-threshold: this strategy is only allowed to tune a single node’s threshold.
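
To illustrate what per-node thresholds on a logistic-regression detector could look like, here is a minimal Python sketch. It assumes one shared LR model whose output probability is compared against each node's own threshold; that convention (flag when the score is at least the threshold) and the function names are assumptions, not details taken from the slides.

```python
import math


def lr_score(weights, bias, x):
    """Logistic-regression probability that x is malicious."""
    logit = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-logit))


def flags_by_node(weights, bias, thresholds, x):
    """Return {node: 1 if flagged else 0} for a shared LR model.

    thresholds: dict mapping node -> its threshold. Flagging when
    score >= threshold is an assumed convention.
    """
    score = lr_score(weights, bias, x)
    return {j: int(score >= theta) for j, theta in thresholds.items()}
```

Under this reading, a single global detector corresponds to every node sharing the same threshold, while heterogeneous detectors let the same message be flagged at some nodes and passed at others.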

SLIDE 22

Experiments

Figure: The performance of each defense strategy. Each bar is averaged over 10 random topologies. Left: Barabási-Albert (BA). Right: small-world.

SLIDE 26

Our Contribution

  • Explicitly model the propagation of content through networks as a function of the content's features.
  • Instead of deploying a “global” detector, learn and deploy a collection of heterogeneous detectors that take network structure, propagation of messages, and adversarial behavior into account.
  • Formalize the overall problem as a Stackelberg game between a defender and an attacker.
  • Use the Implicit Function Theorem to design a novel approach for solving the resulting Stackelberg game.

SLIDE 27

Thank you! Email: sixie.yu@vanderbilt.edu Homepage: sixie-yu.org

SLIDE 29

References
