

SLIDE 1

CSE 6240: Web Search and Text Mining, Spring 2020

Message Passing and Node Classification

Prof. Srijan Kumar
SLIDE 2

Outline

  • Main question today: Given a network with labels on some nodes, how do we label all the other nodes?
  • Example: In a network, some nodes are fraudsters and some nodes are fully trusted. How do we find the other fraudsters and trustworthy nodes?

SLIDE 3

Intuition

  • Collective classification: the idea of assigning labels to all nodes in a network together
    – Leverage the correlations in the network!
  • We will look at three techniques today:
    – Relational classification
    – Iterative classification
    – Belief propagation

SLIDE 4

Today’s Lecture

  • Overview of collective classification
  • Relational classification
  • Iterative classification
  • Belief propagation

The lecture slides are adapted from Prof. Jure Leskovec's CS224W slides.

SLIDE 5

Correlations Exist in Networks

Example:

  • Real social network
    – Nodes = people
    – Edges = friendship
    – Node color = race
  • People are segregated by race due to homophily

(Easley and Kleinberg, 2010)

SLIDE 6

Classification with Network Data

  • How can we leverage the correlation observed in networks to help predict user attributes or interests? How do we predict the labels for the nodes in yellow?

SLIDE 7

Motivation

  • Similar entities are typically close together or directly connected:
    – "Guilt-by-association": If I am connected to a node with label X, then I am likely to have label X as well.
    – Example: malicious/benign web pages: malicious web pages link to one another to increase visibility, look credible, and rank higher in search engines

SLIDE 8

Intuition

  • The classification label of a node O in a network may depend on:
    – Features of O
    – Labels of the objects in O's neighborhood
    – Features of the objects in O's neighborhood

SLIDE 9

Guilt-by-association

Given:

  • a graph, and
  • a few labeled nodes

Find: the class (red/green) of the remaining nodes
Assumption: the network has homophily

SLIDE 10

Guilt-By-Association

  • Let X be an n×n (weighted) adjacency matrix over n nodes
  • Let Y ∈ {−1, 0, 1}^n be a vector of labels:
    – 1: positive node (known to be involved in a gene function/biological process)
    – −1: negative node
    – 0: unlabeled node
  • Goal: predict which unlabeled nodes are likely positive

SLIDE 11

Collective Classification

  • Intuition: simultaneous classification of interlinked objects using correlations
  • Several applications:
    – Document classification
    – Part-of-speech tagging
    – Link prediction
    – Optical character recognition
    – Image/3D data segmentation
    – Entity resolution in sensor networks
    – Spam and fraud detection

SLIDE 12

Collective Classification Overview

  • Markov Assumption: the label Yi of a node i depends on the labels of its neighbors Ni:

    P(Yi | i) = P(Yi | Ni)

  • Collective classification involves 3 steps:
    – Local Classifier: assigns initial labels
    – Relational Classifier: captures correlations between nodes
    – Collective Inference: propagates correlations through the network
SLIDE 13

Collective Classification Overview

Local Classifier: assign initial labels

  • Predicts labels based on node attributes/features
  • Classical classification setup
  • Does not use network information

Relational Classifier: capture correlations between nodes

  • Learns a classifier that labels a node from the labels and/or attributes of its neighbors
  • Network information is used

Collective Inference: propagate correlations through the network

  • Apply the relational classifier to each node iteratively
  • Iterate until the inconsistency between neighboring labels is minimized
  • Network structure substantially affects the final prediction

SLIDE 14

Today’s Lecture

  • Overview of collective classification
  • Relational classification
  • Iterative classification
  • Belief propagation
SLIDE 15

Problem Setting

  • How do we predict the labels Yi for the nodes i in yellow?
    – Each node i has a feature vector fi
    – Labels for some nodes are given (+ for green, − for blue)
  • Task: find P(Yi) given the network and the features

SLIDE 16

Probabilistic Relational Classifier

  • Basic idea: the class probability of Yi is a weighted average of the class probabilities of i's neighbors
  • For labeled nodes, initialize with the ground-truth Y labels
  • For unlabeled nodes, initialize Y uniformly
  • Update all nodes in random order until convergence or until the maximum number of iterations is reached

SLIDE 17

Probabilistic Relational Classifier

  • Repeat for each node i and label c (see the code sketch below):

    P(Yi = c) = (1 / |Ni|) · Σ_{j ∈ Ni} W(i,j) · P(Yj = c)

    – W(i,j) is the edge strength from i to j
    – |Ni| is the number of neighbors of i
  • Challenges:
    – Convergence is not guaranteed
    – The model cannot use node feature information
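
To make the update rule concrete, here is a minimal Python sketch of the relational classifier (my illustration, not code from the slides; it assumes an unweighted graph, so W(i,j) = 1, and a fixed node order instead of the random order suggested above):

```python
from collections import defaultdict

def relational_classify(edges, labels, max_iter=100, tol=1e-4):
    """Probabilistic relational classifier on an undirected, unweighted graph.

    edges:  iterable of (i, j) pairs
    labels: dict node -> 0 or 1 for labeled nodes (unlabeled nodes absent)
    Returns a dict node -> P(Y = 1).
    """
    nbrs = defaultdict(set)
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)

    # Initialization: ground truth for labeled nodes, 0.5 for unlabeled ones.
    p = {v: float(labels[v]) if v in labels else 0.5 for v in nbrs}

    for _ in range(max_iter):
        delta = 0.0
        for v in nbrs:
            if v in labels:              # labeled nodes stay clamped
                continue
            new = sum(p[u] for u in nbrs[v]) / len(nbrs[v])
            delta = max(delta, abs(new - p[v]))
            p[v] = new
        if delta < tol:                  # stop early if scores have stabilized
            break
    return p
```

Because updates are applied in place, each node sees its neighbors' freshest values, which is how the worked example on the next slides computes node 4 using node 3's already-updated score.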

SLIDE 18

Example

Initialization: set all labeled nodes to their labels and all unlabeled nodes uniformly.

[Figure: labeled negative nodes 1 and 2 start at P(Y = 1) = 0, labeled positive nodes 6 and 7 at P(Y = 1) = 1, and the unlabeled nodes at P(Y = 1) = 0.5]

SLIDE 19

Example

  • Update in the 1st iteration:
    – For node 3, N3 = {1, 2, 4}:

      P(Y = 1 | N3) = (1/3)(0 + 0 + 0.5) = 0.17

SLIDE 20

Example

  • Update in the 1st iteration:
    – For node 4, N4 = {1, 3, 5, 6}:

      P(Y = 1 | N4) = (1/4)(0 + 0.17 + 0.5 + 1) = 0.42

SLIDE 21

Example

  • Update in the 1st iteration:
    – For node 5, N5 = {4, 6, 7, 8}:

      P(Y = 1 | N5) = (1/4)(0.42 + 1 + 1 + 0.5) = 0.73

SLIDE 22

Example

After iteration 1: P(Y = 1) is 0 for nodes 1 and 2, 0.17 for node 3, 0.42 for node 4, 0.73 for node 5, 0.91 for node 8, and 1.00 for node 9.

SLIDE 23

Example

After iteration 2: P(Y = 1) is 0.14 for node 3, 0.47 for node 4, 0.85 for node 5, 0.95 for node 8, and 1.00 for node 9.

Note: a node all of whose neighbors' values are fixed cannot change its own value.

SLIDE 24

Example

After iteration 3: P(Y = 1) is 0.16 for node 3, 0.50 for node 4, 0.86 for node 5, 0.95 for node 8, and 1.00 for node 9.

SLIDE 25

Example

After iteration 4: P(Y = 1) is 0.16 for node 3, 0.51 for node 4, 0.86 for node 5, 0.95 for node 8, and 1.00 for node 9.

SLIDE 26

Example

  • All scores stabilize after 5 iterations
  • Final labeling:
    – Nodes 5, 8, 9 are + (P(Yi = 1) > 0.5)
    – Node 3 is − (P(Yi = 1) < 0.5)
    – Node 4 is undecided (P(Yi = 1) = 0.5)

SLIDE 27

Today’s Lecture

  • Overview of collective classification
  • Relational classification
  • Iterative classification
  • Belief propagation
SLIDE 28

Iterative Classification

  • Relational classifiers do not use node attributes
    – How can one leverage them?
  • Main idea of iterative classification: classify node i based on its attributes as well as the labels of its neighbor set Ni

SLIDE 29

Iterative Classification: Process

  1. Create a feature vector ai for each node i
  2. Train a classifier to classify using ai
  3. Nodes may have varying numbers of neighbors, so aggregate neighbor information using: count, mode, proportion, mean, exists, etc.

SLIDE 30

Basic Architecture

  • Bootstrap phase
    – Convert each node i to a flat feature vector ai
    – Use a local classifier f(ai) (e.g., SVM, kNN, ...) to compute the best value for Yi
  • Iteration phase: iterate till convergence (see the sketch below)
    – Repeat for each node i:
      • Update the node vector ai
      • Update the label Yi to f(ai); this is a hard assignment
    – Iterate until class labels stabilize or the maximum number of iterations is reached
  • Note: convergence is not guaranteed
    – Run for a maximum number of iterations
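
A minimal Python sketch of this architecture (my illustration, not code from the lecture; it assumes scikit-learn's LogisticRegression as the local classifier f, the proportion of positively labeled neighbors as the aggregate, and that both classes appear among the labeled nodes):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression  # stand-in local classifier

def node_vector(feats, labels, nbrs, i):
    """Flat vector a_i: node attributes plus one neighbor-label aggregate
    (here, the proportion of positively labeled neighbors)."""
    known = [labels[j] for j in nbrs[i] if labels[j] is not None]
    frac_pos = sum(known) / len(known) if known else 0.5
    return np.append(feats[i], frac_pos)

def iterative_classification(feats, labels, nbrs, max_iter=10):
    labels = dict(labels)                  # node -> 0/1, or None if unlabeled
    train = [i for i in labels if labels[i] is not None]
    unlabeled = [i for i in labels if labels[i] is None]

    # Bootstrap phase: train the local classifier f on labeled nodes,
    # then assign an initial hard label to every unlabeled node.
    clf = LogisticRegression()
    clf.fit([node_vector(feats, labels, nbrs, i) for i in train],
            [labels[i] for i in train])
    for i in unlabeled:
        labels[i] = int(clf.predict([node_vector(feats, labels, nbrs, i)])[0])

    # Iteration phase: re-extract vectors and re-classify until labels
    # stabilize or the iteration budget runs out (convergence not guaranteed).
    for _ in range(max_iter):
        changed = False
        for i in unlabeled:
            y = int(clf.predict([node_vector(feats, labels, nbrs, i)])[0])
            if y != labels[i]:
                labels[i], changed = y, True
        if not changed:
            break
    return labels
```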

SLIDE 31

Application of Iterative Classification Framework: Fake Reviewer/Review Detection

REV2: Fraudulent User Prediction in Rating Platforms. Kumar et al., ACM International Conference on Web Search and Data Mining (WSDM), 2018

SLIDE 32

Fake Review Spam

  • Review sites are an attractive target for spam: a one-star increase in rating increases revenue by 5-9%!
  • Often hype/defame spam
  • Paid spammers
SLIDE 33

Fake Review Spam Detection

  • Behavioral analysis
    – individual features, geographic locations, login times, session history, etc.
  • Language analysis
    – use of superlatives, frequent self-references, misspelling rate, many agreement words, ...
  • Behavior and language are easy to fake!
  • Graph structure is hard to fake
    – Graphs capture the relationships between reviewers, reviews, and stores

SLIDE 34
Problem Setup

  • Input: a bipartite rating graph as a weighted signed network:
    – Nodes: users and products
    – Edges: rating scores between −1 and +1
  • Output: the set of users that give fake ratings

[Figure: example rating graph; red edges = −1 rating, green edges = +1 rating]

SLIDE 35

REV2 Solution Formulation

  • Basic idea: users, products, and ratings have intrinsic quality scores:
    – Each user u has a 'fairness' score F(u) ∈ [0, 1]
    – Each product p has a 'goodness' score G(p) ∈ [−1, 1]
    – Each rating has a 'reliability' score R(u, p) ∈ [0, 1]
  • All values are unknown

SLIDE 36

REV2 Solution Formulation

  • How can one calculate the values for all nodes and edges simultaneously?
  • Solution: collective classification

SLIDE 37

Fairness of Users

  • Fixing goodness and reliability, fairness is updated as the average reliability of the user's ratings (formula as in the REV2 paper):

    F(u) = ( Σ_{(u,p) ∈ Out(u)} R(u,p) ) / |Out(u)|

SLIDE 38

Goodness of Products

  • Fixing fairness and reliability, goodness is updated as the reliability-weighted average rating of the product (formula as in the REV2 paper):

    G(p) = ( Σ_{(u,p) ∈ In(p)} R(u,p) · score(u,p) ) / |In(p)|

SLIDE 39

Reliability of Ratings

  • Fixing fairness and goodness, reliability is updated as a blend of the rater's fairness and the rating's agreement with the product's goodness (formula as in the REV2 paper; a sketch combining all three updates follows below):

    R(u,p) = ( γ1 · F(u) + γ2 · (1 − |score(u,p) − G(p)| / 2) ) / (γ1 + γ2)
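
Putting the three update rules together, here is a minimal sketch of the alternating REV2 iteration (my illustration based on the formulas above; checking convergence on fairness alone is a simplification):

```python
from collections import defaultdict

def rev2(ratings, gamma1=1.0, gamma2=1.0, max_iter=100, tol=1e-6):
    """Alternating REV2 updates on a bipartite user-product rating graph.

    ratings: dict mapping (user, product) -> rating score in [-1, 1]
    Returns (F, G, R): fairness, goodness, and reliability scores.
    """
    by_user, by_product = defaultdict(list), defaultdict(list)
    for (u, p) in ratings:
        by_user[u].append(p)
        by_product[p].append(u)

    # Initialization: start every score at its best possible value.
    F = {u: 1.0 for u in by_user}
    G = {p: 1.0 for p in by_product}
    R = {e: 1.0 for e in ratings}

    for _ in range(max_iter):
        # Goodness: reliability-weighted average rating of each product.
        for p in by_product:
            G[p] = sum(R[(u, p)] * ratings[(u, p)]
                       for u in by_product[p]) / len(by_product[p])
        # Reliability: blend the rater's fairness with how well the rating
        # agrees with the product's goodness.
        for (u, p) in ratings:
            agree = 1 - abs(ratings[(u, p)] - G[p]) / 2
            R[(u, p)] = (gamma1 * F[u] + gamma2 * agree) / (gamma1 + gamma2)
        # Fairness: average reliability of each user's ratings.
        delta = 0.0
        for u in by_user:
            new = sum(R[(u, p)] for p in by_user[u]) / len(by_user[u])
            delta = max(delta, abs(new - F[u]))
            F[u] = new
        if delta < tol:
            break
    return F, G, R
```

With both gamma values set to 1 (as on the following slides), a rating that agrees with a product's goodness of 0.67 gets R = (1 + 0.835)/2 ≈ 0.92, matching the iteration shown below.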

SLIDE 40

Initialization: Start with Best Scores

[Figure: all scores start at their best values: F(u) = 1 for every user, G(p) = 1 for every product, R(u,p) = 1 for every rating]

SLIDE 41

Updating Goodness, Iteration 1

[Figure: with all F(u) = 1 and all R(r) = 1, the three products' goodness scores become G(p) = 0.67, 0.67, and −0.67]

SLIDE 42

Updating Reliability, Iteration 1

[Figure: with both gamma values set to 1, ratings that agree with their product's goodness get R(r) = 0.92 and ratings that disagree get R(r) = 0.58; goodness stays at 0.67, 0.67, and −0.67]

SLIDE 43

Update Fairness, Iteration 1

[Figure: each user's fairness becomes the average reliability of their ratings: F(u) = 0.92 for the users whose ratings agree with the products' goodness, and F(u) = 0.58 for the disagreeing user]

SLIDE 44

After Convergence

[Figure: at convergence, F(u) = 0.83 for the fair users and 0.17 for the unfair one; R(r) = 0.83 for reliable ratings and 0.17 for unreliable ones; G(p) = 0.67, 0.67, and −0.67]

SLIDE 45

Properties of REV2 Solution

  • Guaranteed to converge
  • The number of iterations until convergence is upper-bounded
  • Time complexity: linear
SLIDE 46

Performance

  • Low-fairness users = fraudsters
  • 127 of the 150 lowest-fairness users on Flipkart were real fraudsters
  • REV2 is being used in production at Flipkart

SLIDE 47

Linear Scalability

  • Multiple iterations, but linear scalability
SLIDE 48

Today’s Lecture

  • Overview of collective classification
  • Relational classification
  • Iterative classification
  • Belief propagation
SLIDE 49

Loopy belief propagation

  • Intuition: use the neighbors' beliefs about a node to predict the node's label
    – Used to estimate the marginals (beliefs) or the most likely states of all variables (nodes)
  • Iterative process in which neighboring variables "talk" to each other by passing messages
  • When consensus is reached, calculate the final belief
SLIDE 50

Message Passing Basics

Task: count the number of nodes in a graph*
Condition: each node can only interact (pass messages) with its neighbors
Example: a straight-line graph

(adapted from the MacKay (2003) textbook)

* The graph cannot have loops; explanation later.

SLIDE 51

Message Passing Basics

Task: count the number of nodes in a graph
Condition: each node can pass messages to its neighbors
Solution: each node listens to the message from its neighbor, updates it, and passes it forward

[Figure: on a line graph, forward messages read "1 before you", "2 before you", ..., "5 before you"; backward messages read "1 after you", "2 after you", ..., "6 after you"]

SLIDE 52

Message Passing Basics

Each node only sees its incoming messages.

[Figure: a node hears "2 before you" from one side and "3 behind you" from the other, and knows there is 1 of itself. Belief: there must be 2 + 1 + 3 = 6 of us.]

SLIDE 53

Message Passing Basics

Each node only sees its incoming messages.

[Figure: a neighboring node hears "1 before you" and "4 behind you", and knows there is 1 of itself. Belief: there must be 1 + 1 + 4 = 6 of us, the same total as the node that computed 2 + 1 + 3 = 6.]

SLIDE 54

Message Passing in a Tree

Each node receives reports from all branches of the tree.

[Figure: a node hearing "7 here" and "3 here" from two subtrees, plus 1 of itself, reports "11 here" (= 7 + 3 + 1)]

SLIDE 55

Message Passing in a Tree

Each node receives reports from all branches of the tree.

[Figure: a node hearing "3 here" from two subtrees, plus 1 of itself, reports "7 here" (= 3 + 3 + 1)]

SLIDE 56

Message Passing in a Tree

Each node receives reports from all branches of the tree.

[Figure: messages "7 here" and "3 here" again combine into "11 here" (= 7 + 3 + 1)]

SLIDE 57

Message Passing in a Tree

Each node receives reports from all branches of the tree.

[Figure: a node hearing "7 here", "3 here", and "3 here", plus 1 of itself, believes there must be 14 of us]
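
This counting scheme translates directly into code; here is a minimal recursive sketch for trees (my illustration):

```python
def count_nodes(nbrs, root):
    """Count nodes in a tree by message passing: each node reports to its
    parent 1 (itself) plus the reports from all of its other branches."""
    def report(node, parent):
        return 1 + sum(report(child, node)
                       for child in nbrs[node] if child != parent)
    return report(root, None)

# Toy tree: node 0 with two branches.
nbrs = {0: {1, 2}, 1: {0, 3, 4}, 2: {0}, 3: {1}, 4: {1}}
print(count_nodes(nbrs, 0))   # 5
```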


SLIDE 59

Loopy BP algorithm

What message will i send to j?

  • It depends on what i hears from its neighbors k
  • Each neighbor k passes a message to i: k's belief about the state of i

SLIDE 60

Notations

  • Label-label potential matrix ψ: the dependency between a node and its neighbor. ψ(Yi, Yj) equals the probability of a node i being in state Yi given that it has a neighbor j in state Yj
  • Prior belief φ: φi(Yi) is the probability of node i being in state Yi
  • m_i→j(Yj) is i's estimate of j being in state Yj
  • L is the set of all states
SLIDE 61

Loopy BP algorithm

  1. Initialize all messages to 1
  2. Repeat for each node:

     m_i→j(Yj) = Σ_{Yi ∈ L} ψ(Yi, Yj) · φi(Yi) · Π_{k ∈ Ni \ j} m_k→i(Yi)

     (sum over all states; label-label potential × prior × all messages from i's neighbors other than j)

SLIDE 62

Loopy BP algorithm

After convergence:

  bi(Yi) = φi(Yi) · Π_{k ∈ Ni} m_k→i(Yi)

bi(Yi) is i's belief of being in state Yi: the prior times all messages from its neighbors.
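
Here is a minimal Python sketch of the whole loop (my illustration; it uses synchronous message updates and normalizes messages each round for numerical stability):

```python
import numpy as np

def loopy_bp(nbrs, psi, phi, max_iter=50):
    """Loopy belief propagation with discrete states.

    nbrs: dict node -> set of neighbors (undirected graph)
    psi:  (S x S) numpy array; psi[yi, yj] is the label-label potential
    phi:  dict node -> length-S numpy array of prior beliefs
    Returns dict node -> normalized belief vector.
    """
    S = psi.shape[0]
    # 1. Initialize all messages m_{i->j} to 1.
    m = {(i, j): np.ones(S) for i in nbrs for j in nbrs[i]}

    # 2. Repeat: m_{i->j}(yj) = sum_{yi} psi[yi, yj] * phi_i(yi)
    #    * product of messages into i from all neighbors except j.
    for _ in range(max_iter):
        new_m = {}
        for (i, j) in m:
            prod = phi[i].copy()
            for k in nbrs[i]:
                if k != j:
                    prod = prod * m[(k, i)]
            msg = psi.T @ prod               # sums over yi for each yj
            new_m[(i, j)] = msg / msg.sum()  # normalize for stability
        m = new_m

    # After convergence: b_i(yi) = phi_i(yi) * product of incoming messages.
    beliefs = {}
    for i in nbrs:
        b = phi[i].copy()
        for k in nbrs[i]:
            b = b * m[(k, i)]
        beliefs[i] = b / b.sum()
    return beliefs
```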

SLIDE 63

Loopy belief propagation

  • What if our graph has cycles?
    – Messages from different subgraphs are no longer independent!
    – BP will give wrong results

SLIDE 64

BP and Loops

[Figure: messages circulating around a cycle, with belief tables growing from T: 2, F: 1 to T: 4, F: 1]

  • Messages loop around and around: 2, 4, 8, 16, 32, ... The nodes become more and more convinced that these variables are T!
  • BP incorrectly treats a looping message as separate evidence that the variable is T
  • It multiplies the two messages as if they were independent
  • But they do not actually come from independent parts of the graph: one influenced the other (via a cycle)
SLIDE 65

Advantages of Belief Propagation

  • Advantages:
    – Easy to program and parallelize
    – General: can apply to any graphical model with any form of potentials (higher order than pairwise)
  • Challenges:
    – Convergence is not guaranteed (when to stop?), especially with many closed loops
    – Potential functions (parameters) require training to estimate, and learning them by gradient-based optimization can have convergence issues during training

SLIDE 66

Application of belief propagation: Online auction fraud

Netprobe: A Fast and Scalable System for Fraud Detection in Online Auction Networks. Pandit et al., World Wide Web Conference, 2007

SLIDE 67

Online Auction Fraud

  • Auction sites: an attractive target for fraud
  • 63% of complaints to the Federal Internet Crime Complaint Center in the U.S. in 2006
  • Average loss per incident: $385
SLIDE 68

Online Auction Fraud Detection

  • Looking at individual features is an insufficient solution: user attributes, geographic locations, login times, session history, etc.
  • Graph structure is hard to fake
    – It captures the relationships between users
  • Main question: how do fraudsters interact with other users and among each other?
    – In addition to buy/sell relations, are there more complex relations?

SLIDE 69

Feedback Mechanism

  • Each user has a reputation score
  • Users rate each other via feedback
  • Question: how do fraudsters game the feedback system?

SLIDE 70

Auction “Roles” of Users

  • Do they boost each other's reputation?
    – No, because if one is caught, all will be caught
  • Instead, they form near-bipartite cores (2 roles):
    – Accomplice: trades with honest users, looks legitimate
    – Fraudster: trades with accomplices, commits fraud with honest users

SLIDE 71

Detecting auction fraud

  • How do we find near-bipartite cores? How do we find the roles (honest, accomplice, fraudster)?
    – Use belief propagation!
  • How do we set the BP parameters (potentials)?
    – Prior beliefs: from prior knowledge; unbiased if none
    – Compatibility potentials: by insight (see the toy sketch below)
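
As a toy illustration of setting potentials by insight, the near-bipartite-core intuition can be encoded in the label-label potential matrix and fed to the loopy_bp sketch from the earlier slide. The numbers here are hypothetical, not the values from the Netprobe paper:

```python
import numpy as np

# States: 0 = fraudster, 1 = accomplice, 2 = honest.
# Hypothetical potentials: fraudsters trade mostly with accomplices,
# accomplices trade with both fraudsters and honest users,
# honest users trade mostly with other honest users.
psi = np.array([
    [0.05, 0.80, 0.15],
    [0.40, 0.20, 0.40],
    [0.05, 0.25, 0.70],
])

# A tiny toy graph: users 0 and 1 trade with user 2, who trades with 3 and 4.
nbrs = {0: {2}, 1: {2}, 2: {0, 1, 3, 4}, 3: {2}, 4: {2}}

# Unbiased priors everywhere, except node 0 is suspected to be a fraudster.
phi = {v: np.ones(3) / 3 for v in nbrs}
phi[0] = np.array([0.90, 0.05, 0.05])

beliefs = loopy_bp(nbrs, psi, phi)   # sketch from the Loopy BP slide
print(beliefs[2])                    # node 2's estimated role probabilities
```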

SLIDE 72

Belief propagation in action

Initialize all nodes as unbiased

SLIDE 73

Belief propagation in action

Initialize all nodes as unbiased. At each iteration, for each node, compute messages to its neighbors.

SLIDE 74

Belief propagation in action

Initialize all nodes as unbiased. At each iteration, for each node, compute messages to its neighbors. Continue till convergence.

SLIDE 75

Final belief scores = final roles

[Figure: each node's final beliefs give P(fraudster), P(accomplice), P(honest)]

SLIDE 76

Today’s Lecture

  • Overview of collective classification
  • Relational classification
    – Weighted average of neighborhood properties
    – Cannot use node attributes while labeling
  • Iterative classification
    – Uses node features while labeling
  • Belief propagation
    – Message passing to update each node's belief of itself based on its neighbors' beliefs