Toward Relational Learning with Misinformation


SLIDE 1

Arizona State University Data Mining and Machine Learning Lab

Toward Relational Learning with Misinformation

Liang Wu*, Jundong Li*, Fred Morstatter+, Huan Liu*

*Arizona State University +University of Southern California

{wuliang, jundongl, huanliu}@asu.edu, morstatt@usc.edu

SLIDE 2

Classification in Social Media

  • Relational learning aims to classify linked nodes in a graph (social networks)

  • Task: Classification
  • Feature: Attributes, Links
SLIDE 3

Classification in Social Media: Our Task

  • Relational learning aims to classify linked nodes in a graph (social networks)

  • Task: Classification
  • Feature: Attributes, Links
  • Challenge: Data is Inaccurate
SLIDE 4

Social Media Data is Inaccurate and Noisy

  • Attacks of content polluters

– Node attributes cannot reveal the identity

  • Colloquial language of regular users

– Misinformation, inaccurate data

SLIDE 5

Classification with Noisy Data

  • Weighting Nodes
  • Anomalous points receive lower weights

– A larger loss leads to a smaller weight

[Diagram labels: Weighted Learning, Node Weights, Classifier, ?]
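The rule on this slide, that a larger loss leads to a smaller weight, can be sketched as follows. This is only an illustration and not the authors' weighting scheme: the exponential mapping from loss to weight, the `temperature` parameter, and the sample losses are all assumptions chosen to keep the example short.

```python
import math

def loss_based_weights(losses, temperature=1.0):
    """Map per-node losses to weights that sum to 1.

    Assumption (not from the slides): weight_i is proportional to
    exp(-loss_i / temperature), so a larger loss gives a smaller weight.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the smallest loss before exponentiating for numerical stability.
    base = min(losses)
    raw = [math.exp(-(loss - base) / temperature) for loss in losses]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical squared errors for four nodes; the last one looks anomalous.
losses = [0.10, 0.20, 0.15, 4.00]
weights = loss_based_weights(losses)
```

With these made-up losses the anomalous node ends up with the smallest weight, so it contributes least to a weighted training objective.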

SLIDE 6

Classification with Noisy Social Media Data

  • Attacks of content polluters

– Node attributes cannot reveal the identity

  • Colloquial language of regular users

– Misinformation, inaccurate data

SLIDE 7

Robust Classification with Network Information

  • Weighting Nodes with Centrality
  • Authoritative points receive higher weights

– A larger centrality leads to a higher weight

[Diagram labels: Weighted Learning, Node Weights, Classifier, ?]
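Weighting nodes by centrality can be sketched in a few lines. The slide does not say which centrality measure is used, so this example assumes plain degree centrality on an undirected edge list; the function name and the toy graph are hypothetical.

```python
from collections import defaultdict

def degree_centrality_weights(edges):
    """Return {node: weight}, with weights proportional to node degree.

    Assumption (not from the slides): degree is used as the centrality
    measure; any other centrality score could be substituted.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    total = sum(degree.values())
    return {node: deg / total for node, deg in degree.items()}

# Hypothetical undirected graph: node "a" is a hub, node "e" is peripheral.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e")]
node_weights = degree_centrality_weights(edges)
```

Here the hub "a" receives the largest weight. Because the weights depend directly on individual links, a few spurious or missing edges change them, which is the weakness the next slide raises.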

SLIDE 8

Denoising with Social Networks?

  • Links can be noisy
  • Obtaining all links (complete graph) is difficult
SLIDE 9

Community Structures are More Robust

[Figure label: Malicious User]

SLIDE 10

Community Structures are More Robust

[Figure labels: Malicious User, Community Detection, Malicious User]

SLIDE 11

Denoise with Community Structures

SLIDE 12

Community Candidate Generation + Community Selection

SLIDE 13

Community Candidate Generation + Community Selection

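$$
\min_{\mathbf{w},\mathbf{c}} \; \sum_{i=1}^{N} c_i \left(\mathbf{x}_i \mathbf{w} - y_i\right)^2 + \lambda_1 \|\mathbf{w}\|_2^2 + \lambda_2 \sum_{i=0}^{d} \sum_{j=1}^{n_i} \left\|\mathbf{c}_{G_j^i}\right\|_2
\quad \text{subject to} \quad \sum_i c_i = K
$$

  • $\lambda_1 \|\mathbf{w}\|_2^2$: avoid overfitting
  • $\lambda_2 \sum_{i=0}^{d} \sum_{j=1}^{n_i} \|\mathbf{c}_{G_j^i}\|_2$: group Lasso

– L1 norm on the inter-group level
– L2 norm on the intra-group level

d: depth of hierarchy of Louvain method
$n_i$: number of groups on layer i
$\mathbf{c}_{G_j^i}$: nodes of group j on layer i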
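To make the group Lasso term concrete, the sketch below evaluates the sum of per-group L2 norms of the node weights over a community hierarchy. The hierarchy is written by hand as nested index lists, standing in for the layers a Louvain run would produce; the function name and the example vectors are hypothetical.

```python
import math

def hierarchical_group_penalty(c, layers):
    """Sum of L2 norms of c restricted to each group on each layer.

    `layers` is a list of layers; each layer is a list of groups; each
    group is a list of node indices into `c`.
    """
    penalty = 0.0
    for groups in layers:
        for group in groups:
            penalty += math.sqrt(sum(c[i] ** 2 for i in group))
    return penalty

# Hypothetical two-layer hierarchy over four nodes (stand-in for Louvain output).
layers = [
    [[0, 1], [2, 3]],   # finer layer: two communities
    [[0, 1, 2, 3]],     # coarser layer: one merged community
]

# Both weight vectors sum to 1 and have two non-zero entries.
c_one_community = [0.5, 0.5, 0.0, 0.0]   # weight kept inside one community
c_scattered = [0.5, 0.0, 0.5, 0.0]       # weight split across communities

pen_one = hierarchical_group_penalty(c_one_community, layers)
pen_scattered = hierarchical_group_penalty(c_scattered, layers)
```

In this toy case the vector that keeps its weight inside a single community has the smaller penalty, which illustrates how the term favors selecting or discarding nodes community by community rather than one at a time.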

SLIDE 14

Optimization

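Optimize w:

$$
\min_{\mathbf{w}} \; \sum_{i=1}^{n} c_i \left(\mathbf{x}_i \mathbf{w} - y_i\right)^2 + \lambda_1 \|\mathbf{w}\|_2^2
$$

Optimize c:

$$
\min_{\mathbf{w},\mathbf{c}} \; \sum_{i=1}^{n} c_i t_i + \lambda_2 \sum_{i=0}^{d} \sum_{j=1}^{n_i} \left\|\mathbf{c}_{G_j^i}\right\|_2
\quad \text{subject to} \quad \sum_i c_i = 1
$$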
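The step that optimizes w with the node weights c held fixed is a weighted ridge regression, and for a single scalar feature it has a closed form. The sketch below assumes that scalar case for brevity, and it assumes that the per-node term multiplying each weight in the c step is the squared residual under the current w; the slides do not define that term or name a solver for the c step, so that step is not implemented here. All names and data are hypothetical.

```python
def fit_weighted_ridge_1d(x, y, c, lam1):
    """Minimise sum_i c_i * (x_i * w - y_i)**2 + lam1 * w**2 over scalar w.

    Setting the derivative to zero gives
    w = sum(c_i * x_i * y_i) / (sum(c_i * x_i**2) + lam1).
    """
    numerator = sum(ci * xi * yi for ci, xi, yi in zip(c, x, y))
    denominator = sum(ci * xi * xi for ci, xi in zip(c, x)) + lam1
    return numerator / denominator

def per_node_losses(x, y, w):
    """Squared residual of each node under the current w (assumed loss term)."""
    return [(xi * w - yi) ** 2 for xi, yi in zip(x, y)]

# Hypothetical data: y is 2 * x except for the last, corrupted node.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, -8.0]

# With uniform weights the corrupted node drags the slope away from 2.
uniform = [0.25, 0.25, 0.25, 0.25]
w_uniform = fit_weighted_ridge_1d(x, y, uniform, lam1=0.0)

# Setting the corrupted node's weight to zero recovers the clean slope.
trusted = [1 / 3, 1 / 3, 1 / 3, 0.0]
w_trusted = fit_weighted_ridge_1d(x, y, trusted, lam1=0.0)
losses = per_node_losses(x, y, w_trusted)
```

An alternating scheme would repeat the two steps: refit w with the current weights, recompute the per-node losses, then update c under the group penalty and the sum constraint.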

SLIDE 15

Evaluation

Results

[Figure: Macro- and Micro-average of F1-measures with increasing ratio of misinformation (Flickr)]
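For reference, the two averages reported here differ in how per-class counts are combined: macro-F1 averages the per-class F1 scores, while micro-F1 pools true positives, false positives, and false negatives over all classes first. The sketch below assumes single-label predictions for simplicity, which may not match the evaluation setup used in the talk; the labels are made up.

```python
def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class_f1 = []
    total_tp = total_fp = total_fn = 0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class_f1.append(2 * tp / denom if denom else 0.0)
        total_tp += tp
        total_fp += fp
        total_fn += fn
    macro = sum(per_class_f1) / len(per_class_f1)
    micro_denom = 2 * total_tp + total_fp + total_fn
    micro = 2 * total_tp / micro_denom if micro_denom else 0.0
    return macro, micro

# Hypothetical imbalanced labels: class 1 is rare and half of it is missed.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1]
macro_f1, micro_f1 = f1_scores(y_true, y_pred)
```

In this example macro-F1 is lower than micro-F1 because the error falls on the rare class, which macro averaging weights equally with the frequent one.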

SLIDE 16

More Results

[Figure panel: BlogCatalog]

[Figure: Effectiveness of identifying mislabeled instances; panels: BlogCatalog, Flickr]

SLIDE 17

Conclusions

  • A supervised learning method with inaccurate networked data

– Focusing on community structures instead of links
– Can be integrated into other algorithms
– Efficient to solve