Leveraging Joint Interactions for Credibility Analysis in News Communities

SLIDE 1

Leveraging Joint Interactions for Credibility Analysis in News Communities

Subhabrata Mukherjee and Gerhard Weikum Max Planck Institute for Informatics CIKM 2015

SLIDE 2

Motivation

➢ Media plays a crucial role in public dissemination of information

➢ However, people believe there is substantial media bias in news in view of inter-dependencies and cross-ownerships of media companies and other industries (like energy)

➢ 4 out of 5 Americans among younger generations do not trust major news networks [Gallup poll, 2013]

➢ This work: Credibility Analysis of News Communities

SLIDE 3

News Community

➢ A news community is a news aggregator site (e.g., reddit.com, digg.com, newstrust.net) where:

  ➢ Users can give explicit feedback (e.g., rate, review, share) on the quality of news

  ➢ Users can interact (e.g., comment, vote) with each other

➢ However, this adds user subjectivity, as users bring their own bias and perspectives into the framework

➢ Controversial topics create polarization among users, which influences their evaluation

SLIDE 4

Contributions

  • A model that captures the joint interaction between language, topics, users and sources, leading to better prediction than models that consider each in isolation

  • User expertise, source trustworthiness, language objectivity, topical perspective and article credibility mutually reinforce each other

  • A supervised Conditional Random Field model that captures these interactions and handles real-valued ratings

SLIDE 5

Example

[Figure: factor graph with nodes s1, d1, d2, r11, r12, r22, u1, u2, y1, y2 and cliques C1, C2]

FACTORS: Source, Article, Review, User

SLIDE 6

Example

Topic: Climate

[Figure: factor graph with nodes s1, d1, d2, r11, r12, r22, u1, u2, y1, y2 and cliques C1, C2]

FACTORS and their Instantiation:

Source: Alternet.org (progressive/liberal)
Article: "Why do conservatives hate your children?"
Review: Ratings
User: Discussions (liberal vs. conservative)

SLIDE 7

Example

Topic: Climate

[Figure: factor graph with nodes s1, d1, d2, r11, r12, r22, u1, u2, y1, y2 and cliques C1, C2]

FACTORS and their FEATURES:

Source: Viewpoint, Expertise
Article: "Why do conservatives hate your children?"
Review: Ratings
User: Discussions (liberal vs. conservative)

SLIDE 8

Example

Topic: Climate

[Figure: factor graph with nodes s1, d1, d2, r11, r12, r22, u1, u2, y1, y2 and cliques C1, C2]

FACTORS and their FEATURES:

Source: Viewpoint, Expertise
Article: Emotionality, Discourse
Review: Ratings
User: Discussions (liberal vs. conservative)

SLIDE 9

Example

[Figure: factor graph with nodes s1, d1, d2, r11, r12, r22, u1, u2, y1, y2 and cliques C1, C2]

FACTORS and their FEATURES:

Source: Viewpoint, Expertise
Article: Emotionality, Discourse
Review: Ratings, Topic
User: Discussions (liberal vs. conservative)

SLIDE 10

Example

[Figure: factor graph with nodes s1, d1, d2, r11, r12, r22, u1, u2, y1, y2 and cliques C1, C2]

FACTORS and their FEATURES:

Source: Viewpoint, Expertise
Article: Emotionality, Discourse
Review: Topic, Ratings
User: Bias, Viewpoint, Expertise

SLIDE 11

Task

[Figure: factor graph with nodes s1, d1, d2, r11, r12, r22, u1, u2, y1, y2 and cliques C1, C2]

Article Credibility Rating?

FACTORS: Source, Article, Review, User

ATTRIBUTES: Trustworthiness, Objectivity, Credibility, Expertise

SLIDE 12

Credibility Analysis

➢ Given a set of news sources generating news articles, and users reviewing them on different qualitative aspects with mutual interactions:

➢ Jointly rank the sources, articles, and users based on their trustworthiness, credibility, and expertise

SLIDE 13

Credibility of Statements in Health Communities

[S. Mukherjee et al.: KDD'14]

SLIDE 14

Language Features

[Figure: factor graph with nodes s1, d1, d2, r11, r12, r22, u1, u2, y1, y2 and cliques C1, C2; factors Source, Article, Review, User; attribute Objectivity]

Assertives, Factives, Hedges, Implicatives, Report, Discourse, Subjectivity etc.

  • 1. M. Recasens, C. Danescu-Niculescu-Mizil, and D. Jurafsky. Linguistic models for analyzing and detecting biased language. In ACL, 2013.

  • 2. S. Mukherjee, G. Weikum, and C. Danescu-Niculescu-Mizil. People on drugs: Credibility of user statements in health communities. In KDD, 2014.

SLIDE 15

Topic Features

➢ Only 33% of the articles have explicit tags

➢ Use Latent Dirichlet Allocation to learn the latent topic distribution in the corpus of news articles
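To make the Latent Dirichlet Allocation step concrete, here is a minimal sketch of a collapsed Gibbs sampler that returns a per-article topic distribution. It is an illustration only: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the number of topics and the toy tokenized articles are all assumptions, and the slides do not say which LDA implementation or settings were used.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns one topic distribution per doc.

    docs is a list of token lists. All names and defaults are illustrative.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # counts n(d, k)
    topic_word = [Counter() for _ in range(num_topics)]   # counts n(k, w)
    topic_total = [0] * num_topics                        # counts n(k)
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions, usable as topic features.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        theta.append([(doc_topic[d][t] + alpha) / denom for t in range(num_topics)])
    return theta


# Hypothetical toy corpus: each article is already tokenized.
toy_articles = [
    ["climate", "carbon", "warming", "climate"],
    ["election", "vote", "senate", "vote"],
    ["carbon", "emissions", "climate"],
]
topic_features = lda_gibbs(toy_articles, num_topics=2, iters=50)
```

Each row of `topic_features` sums to one and could serve as the topic feature vector of an untagged article.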

SLIDE 16

Source Features

SLIDE 17

User Features

Category      Elements
Engagement    answers, ratings (given / received), comments etc.
Agreement     Inter-user agreement
Topics        perspective and expertise
Interactions  user-user, user-item, user-source

SLIDE 18

[Figure: factor graph with nodes s1, d1, d2, r11, r12, r22, u1, u2, y1, y2 and cliques C1, C2]

Article Credibility Rating?

Source: Source Models
Article: Language Model, Topic Model
Review: Language Model, Topic Model
User: User Models

How to aggregate?

Given a factor, with its features, use Support Vector Regression to learn a model that will predict its rating for an article.
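The per-factor regression step can be sketched as follows. This is a plain-Python linear regressor trained by subgradient descent on the epsilon-insensitive loss, standing in for a library Support Vector Regression; the slides do not state the kernel, the features or the hyperparameters, so `fit_linear_svr`, `predict`, the toy feature values and all defaults are assumptions for illustration.

```python
def fit_linear_svr(X, y, epsilon=0.1, reg=1e-3, lr=0.005, epochs=4000):
    """Fit a linear model (w, b) with the epsilon-insensitive loss.

    Errors smaller than epsilon are ignored; larger ones are penalized
    linearly. An L2 penalty with strength reg is applied to w.
    """
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [reg * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            residual = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            if abs(residual) > epsilon:  # point lies outside the epsilon tube
                sign = 1.0 if residual > 0 else -1.0
                for j in range(dim):
                    grad_w[j] += sign * xi[j] / n
                grad_b += sign / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(model, x):
    """Predicted rating for one feature vector."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


# Hypothetical one-dimensional feature for a single factor and the
# article ratings it should predict.
factor_features = [[0.0], [1.0], [2.0], [3.0], [4.0]]
article_ratings = [1.0, 3.0, 5.0, 7.0, 9.0]
factor_model = fit_linear_svr(factor_features, article_ratings)
```

One such model per factor (source, language, topic, user) yields several rating predictions for the same article, which is what raises the aggregation question on this slide.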

SLIDE 19

Conditional Random Field

Probability Mass Function for discrete labels:

Probability Density Function for continuous ratings:
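The equations from this slide are not preserved in the transcript. For orientation, the textbook forms of a Conditional Random Field are given below; the notation (cliques $c \in \mathcal{C}$, clique energies $\psi_c$, partition function $Z$) is an assumption and may differ from the symbols used in the talk.

```latex
% Discrete labels: probability mass function, normalized by a sum
P(y \mid X) = \frac{1}{Z(X)} \prod_{c \in \mathcal{C}} \exp\bigl(\psi_c(y_c, X_c)\bigr),
\qquad
Z(X) = \sum_{y'} \prod_{c \in \mathcal{C}} \exp\bigl(\psi_c(y'_c, X_c)\bigr)

% Continuous ratings: probability density function, normalized by an integral
p(y \mid X) =
\frac{\prod_{c \in \mathcal{C}} \exp\bigl(\psi_c(y_c, X_c)\bigr)}
     {\int \prod_{c \in \mathcal{C}} \exp\bigl(\psi_c(y'_c, X_c)\bigr)\, dy'}
```

The only structural change for real-valued ratings is that the normalizer becomes an integral, which must be finite for the density to exist.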

SLIDE 20

Energy Function

Clique: source, article, <users>, <reviews>

Clique potential:

  • User Potential
  • Source Potential
  • Language Potential
  • Topic Potential

SLIDE 21

[Equation annotations: user expertise; error of predictor; SVR partitions the user space]

SLIDE 22

Energy Function

[Equation annotations: source trustworthiness; language objectivity; topical perspective]
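The energy function itself is not preserved in the transcript. One plausible reading of the annotations, offered only as a sketch, is a sum of weighted squared errors of the per-factor predictors. The symbols $f_u$, $f_s$, $f_L$, $f_T$ (predictions of the user, source, language and topic models for an article with rating $y$) and the split of $\gamma$ into $\gamma_1, \gamma_2$ are assumptions, as is the reading of $\alpha_u$ as user expertise, $\beta_s$ as source trustworthiness, and $\gamma_1, \gamma_2$ as language objectivity and topical perspective.

```latex
% Assumed quadratic energy for one clique (source s, article d, users <u>, reviews <r>)
\psi\bigl(y; s, d, \langle u \rangle, \langle r \rangle\bigr)
  = - \sum_{u} \alpha_u \,\bigl(y - f_u\bigr)^2
    - \beta_s \,\bigl(y - f_s\bigr)^2
    - \gamma_1 \,\bigl(y - f_L\bigr)^2
    - \gamma_2 \,\bigl(y - f_T\bigr)^2

% Resulting conditional density for the article rating
p(y \mid X) = \frac{\exp\bigl(\psi(y; \cdot)\bigr)}{\int \exp\bigl(\psi(y'; \cdot)\bigr)\, dy'}
```

Under this assumed form, a larger weight means the corresponding predictor's error is penalized more, so its prediction pulls the rating harder.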

SLIDE 23

➢ The joint p.d.f. is a multivariate Gaussian distribution

➢ Σ needs to be positive definite for the inverse to exist → {α, β, γ} > 0

➢ Makes sense: predictor reliability should be positive
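Why positive weights give a Gaussian can be checked on a simplified single-article case. Assume the density is proportional to exp(-Σ_k λ_k (y - f_k)²), where each f_k is one predictor's output and λ_k its weight (standing in for α, β, γ). Completing the square gives a Gaussian with mean Σ_k λ_k f_k / Σ_k λ_k and variance 1 / (2 Σ_k λ_k), which is only a valid density when the weights are positive. The function names and the scalar simplification are assumptions, not the full model from the talk.

```python
import math


def gaussian_from_potentials(predictions, weights):
    """Mean and variance of p(y) proportional to exp(-sum_k w_k * (y - f_k)^2)."""
    if any(w <= 0 for w in weights):
        # Non-positive weights would not give a normalizable Gaussian.
        raise ValueError("every predictor weight must be positive")
    weight_sum = sum(weights)
    mean = sum(w * f for w, f in zip(weights, predictions)) / weight_sum
    variance = 1.0 / (2.0 * weight_sum)
    return mean, variance


def density(y, predictions, weights):
    """Normalized Gaussian density implied by the quadratic potentials."""
    mean, variance = gaussian_from_potentials(predictions, weights)
    return math.exp(-((y - mean) ** 2) / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance
    )


# Hypothetical predictions of two factor models for one article, where the
# first predictor is weighted three times as heavily as the second.
mean, variance = gaussian_from_potentials([2.0, 4.0], [3.0, 1.0])
```

The mean is a reliability-weighted average of the predictions, so a predictor with a larger weight dominates the aggregated rating.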

SLIDE 24

➢ Constrained optimization problem: gradient ascent cannot be used directly

➢ Maximize the log-likelihood with respect to log λk instead of λk

➢ Prediction is the expected value of the function, given by the mean of the multivariate Gaussian distribution:
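The log-parameterization trick can be sketched on the same simplified scalar model, p(y) proportional to exp(-Σ_k λ_k (y - f_k)²). Writing λ_k = exp(θ_k) and running plain gradient ascent on θ_k keeps every λ_k positive without an explicit constraint. The function names, toy data, learning rate and step count below are assumptions for illustration and are not the authors' training setup.

```python
import math


def log_likelihood(lambdas, preds, targets):
    """Average log-likelihood of the targets under the implied Gaussian."""
    total = sum(lambdas)
    ll = 0.0
    for f, y in zip(preds, targets):
        mean = sum(l * fk for l, fk in zip(lambdas, f)) / total
        ll += 0.5 * math.log(total / math.pi) - total * (y - mean) ** 2
    return ll / len(targets)


def fit_log_lambdas(preds, targets, lr=0.05, steps=500):
    """Gradient ascent on theta_k = log(lambda_k); returns the lambdas."""
    thetas = [0.0] * len(preds[0])
    for _ in range(steps):
        lambdas = [math.exp(t) for t in thetas]
        total = sum(lambdas)
        grads = [0.0] * len(thetas)
        for f, y in zip(preds, targets):
            mean = sum(l * fk for l, fk in zip(lambdas, f)) / total
            for k, fk in enumerate(f):
                # Derivative of the log-likelihood with respect to lambda_k.
                d_lambda = 0.5 / total - (y - fk) ** 2 + (fk - mean) ** 2
                # Chain rule: d/d theta_k = lambda_k * d/d lambda_k.
                grads[k] += lambdas[k] * d_lambda / len(targets)
        thetas = [t + lr * g for t, g in zip(thetas, grads)]
    return [math.exp(t) for t in thetas]


def predict_rating(lambdas, f):
    """Prediction = mean of the Gaussian = weighted average of predictors."""
    return sum(l * fk for l, fk in zip(lambdas, f)) / sum(lambdas)


# Hypothetical data: predictor 0 is accurate, predictor 1 is noisy.
toy_targets = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_preds = [[1.1, 2.0], [1.9, 1.0], [3.1, 4.0], [3.9, 3.0], [5.1, 6.0]]
fitted = fit_log_lambdas(toy_preds, toy_targets)
```

On this toy data the accurate predictor ends up with the larger weight, which matches the reading of the weights as predictor reliability.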

SLIDE 25

Experiments: NewsTrust

Data available at: http://www.mpi-inf.mpg.de/impact/credibilityanalysis/

SLIDE 26

Predicting User Ratings

[Table: models grouped by the information they use: Users, Articles, Ratings; +Time; +Review Text; +Review Text and Interactions]

  • 1. Y. Koren. Factorization meets the neighborhood: A multifaceted collaborative filtering model. In KDD, 2008.

  • 2. J. McAuley and J. Leskovec. Hidden factors and hidden topics: Understanding rating dimensions with review text. In RecSys, 2013.

  • 3. J. J. McAuley and J. Leskovec. From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews. In WWW, 2013.

SLIDE 27

Predicting Article Credibility Ratings

SLIDE 28

Predicting Article Credibility Ratings

SLIDE 29

Predicting Article Credibility Ratings

SLIDE 30

Predicting Article Credibility Ratings

SLIDE 31

Ranking Trustworthy News Sources

Ranking Expert Users

SLIDE 32

Sample Output: Most and Least Trusted Sources on Sample Topics

SLIDE 33

Conclusions

➢ Joint interaction between language, topics, users and sources leads to better prediction in multiple tasks

➢ User expertise, source trustworthiness, language objectivity, topical perspective and article credibility mutually reinforce each other

SLIDE 34

Ongoing Work

➢ Analyze temporal evolution of these factors

➢ Communities are inherently dynamic in nature

➢ Source trustworthiness and user expertise change with time

➢ To this end, we propose Experience-aware Item Recommendation for Evolving Review Communities, ICDM 2015.