A Framework for Political Portmanteau Decomposition


SLIDE 1

A Framework for Political Portmanteau Decomposition

Nabil Hossain, Minh Tran, Henry Kautz

Dept. of Computer Science, University of Rochester, NY

nhossain@cs.rochester.edu

SLIDE 2

Political Portmanteau

  • Portmanteau
      • words formed by combining the sounds and meanings of two words
      • brunch = breakfast + lunch; motel = motor + hotel
  • Political portmanteau (PP)
      • portmanteau in which at least one word refers to a political entity
      • libtard = liberal + retard; repugnican = repugnant + republican
      • political framing
      • creative, sticky
      • novel slang; can be used in hate speech
SLIDE 4

Political Portmanteau

  • Portmanteau
      • words formed by combining the sounds and meanings of two words
      • brunch = breakfast + lunch; motel = motor + hotel
  • Political portmanteau (PP)
      • portmanteau in which at least one word refers to a political entity
      • libtard = liberal + retard; repugnican = repugnant + republican
      • offensive; political framing
      • creative, humorous, slang, sticky
      • can be used in hate speech
SLIDE 5

Contributions

  • Framework for identifying political portmanteau from the web
  • Algorithm for PP detection and decomposition into root words
  • First shared dataset of PP
SLIDE 6

Method

  • Extract words from Reddit news comments
  • Apply slang detection algorithm
  • Classify the detected words into PP vs not-PP
  • Decompose detected PP into root words:

X + Y → PP   [ where X or Y is a political term ]

[Figure: pipeline. Reddit Comments → Slang Detection (ICWSM 2018) → Potential Slang]

Hossain, Nabil, Thanh Thuy Trang Tran, and Henry Kautz. "Discovering Political Slang in Readers' Comments." In ICWSM 2018.

SLIDE 7

Method

  • Extract words from Reddit news comments
  • Apply slang detection algorithm
  • Classify the detected words into PP vs not-PP
  • Decompose detected PP into root words:

X + Y → PP   [ where X or Y is a political term ]

[Figure: pipeline. Reddit Comments → Slang Detection (ICWSM 2018) → Potential Slang → PP Detection → PP (libtard) / Not-PP (repub). Additional label: Expert Annotators]

SLIDE 8

Method

  • Extract words from Reddit news comments
  • Apply slang detection algorithm
  • Classify the detected words into PP vs not-PP
  • Decompose detected PP into root words:

E + C → PP   or   C + E → PP

[Figure: pipeline. Reddit Comments → Slang Detection (ICWSM 2018) → Potential Slang → PP Detection → PP (libtard) / Not-PP (repub) → PP Decomposition. Additional label: Expert Annotators]

[Figure: PP Decomposition panel. Labels: Political Entities (liberal, cruz, …); Prefix/suffix match; lib + C = libtard; Wordlist; C = {retard, dotard, custard, …}; Classifier; Comment Context]
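
The prefix/suffix matching step named on this slide can be illustrated with a minimal sketch. It reads E as a political entity and C as a candidate root word drawn from a word list, which is an interpretation of the slide labels rather than something the slides state. The function names, the `min_overlap` threshold, and the rule that the two matched pieces must cover the whole portmanteau are all assumptions made for illustration; the slides do not describe the authors' actual matching rules.

```python
from typing import List, Tuple


def common_prefix_len(a: str, b: str) -> int:
    """Length of the longest prefix shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def common_suffix_len(a: str, b: str) -> int:
    """Length of the longest suffix shared by a and b."""
    return common_prefix_len(a[::-1], b[::-1])


def decompose_candidates(
    pp: str,
    entities: List[str],
    wordlist: List[str],
    min_overlap: int = 3,  # hypothetical threshold, not from the slides
) -> List[Tuple[str, str, str]]:
    """Return (first_word, second_word, pattern) candidates for a portmanteau.

    pattern is "E+C" when the entity supplies the prefix and
    "C+E" when the entity supplies the suffix.
    """
    pp = pp.lower()
    results = []
    for entity in entities:
        # E + C: the portmanteau starts like the entity and ends like C.
        k = common_prefix_len(pp, entity)
        if k >= min_overlap:
            for word in wordlist:
                s = common_suffix_len(pp, word)
                # Assumed rule: prefix and suffix together cover the whole word.
                if s >= min_overlap and k + s >= len(pp):
                    results.append((entity, word, "E+C"))
        # C + E: the portmanteau starts like C and ends like the entity.
        s = common_suffix_len(pp, entity)
        if s >= min_overlap:
            for word in wordlist:
                k = common_prefix_len(pp, word)
                if k >= min_overlap and k + s >= len(pp):
                    results.append((word, entity, "C+E"))
    return results


entities = ["liberal", "cruz", "republican"]
wordlist = ["retard", "dotard", "custard", "repugnant"]
libtard = decompose_candidates("libtard", entities, wordlist)
repugnican = decompose_candidates("repugnican", entities, wordlist)
```

With these toy lists, `libtard` produces three E+C candidates (liberal + retard, liberal + dotard, liberal + custard), matching the candidate set C shown on the slide, and `repugnican` produces the single C+E candidate repugnant + republican. Choosing among several candidates is the job of the classifier stage, which this sketch does not attempt.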

SLIDE 9

Model Details

  • distribution Model: no contextual features
      • Edit distances, word length, usage frequency
      • capture sound blending and word popularity
  • XGBoost: uses pre-trained GloVe word vector features from comments
      • also uses distribution model features
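
The non-contextual features listed above (edit distances, word length, usage frequency) can be sketched for a single candidate decomposition as follows. This is an illustrative assumption, not the authors' feature set: the slides do not say which edit distance is used, how frequencies are scaled, or how features are combined. Levenshtein distance, the `log1p` scaling, the feature names, and the toy frequency counts are all hypothetical choices.

```python
import math
from typing import Dict


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def candidate_features(
    pp: str, first: str, second: str, freq: Dict[str, int]
) -> Dict[str, float]:
    """Non-contextual features for the decomposition pp = first + second."""
    return {
        # Edit distances: a rough proxy for how much sound is shared.
        "edit_first": levenshtein(pp, first),
        "edit_second": levenshtein(pp, second),
        # Word lengths.
        "len_pp": len(pp),
        "len_first": len(first),
        "len_second": len(second),
        # Usage frequency (log-scaled): a rough proxy for word popularity.
        "log_freq_first": math.log1p(freq.get(first, 0)),
        "log_freq_second": math.log1p(freq.get(second, 0)),
    }


# Toy counts for illustration only; not taken from any real corpus.
toy_freq = {"liberal": 1000, "retard": 200, "custard": 5}
features = candidate_features("libtard", "liberal", "retard", toy_freq)
```

A feature dictionary like this could be computed for each candidate pair and passed to a classifier, optionally alongside contextual features such as word vectors.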


SLIDE 12

Results

  • distribution Model: no contextual features
      • Edit distances, word length, usage frequency
      • capture sound blending and word popularity
  • XGBoost: uses pre-trained GloVe word vector features from comments
      • also uses distribution model features

[Figure: plots titled "PP Decomposition Accuracy" and "PP Detection Accuracy"; axis label β]

Questions: nhossain@cs.rochester.edu
Website: https://cs.rochester.edu/u/nhossain