Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments - PowerPoint PPT Presentation



SLIDE 1

Part-of-Speech Tagging for Twitter:

Annotation, Features, and Experiments

presented by:

Pragati Shah, Sally Gao, Kennan Grant

SLIDE 2

Overview

  1. Introduction
  2. Problem
  3. Methodology
  4. Results
  5. Extensions


SLIDE 3

1. Introduction

Primary goals and results


Goals:
  ○ Enable richer text analysis of Twitter and other social media platforms
  ○ Provide a case study on how to rapidly engineer a core NLP system for new datasets

Results:
  ○ ~90% accuracy on test corpus
  ○ Openly accessible annotated corpus and trained POS tagger

SLIDE 4

2. Problem

Why do we need a Twitter POS tagger?


Twitter has 328 million monthly active users and is a fruitful source of user-generated content. However, POS tagging for Twitter is challenging:

  1. Conversational tone
  2. Unconventional orthography
  3. Character limit (280 now; 140 at the time of the paper)

SLIDE 5

3. Methodology

Summary

  1. Define Tagging Scheme: develop tag set and manually annotate corpus (1,827 manually tagged tweets)
  2. Create Features: create additional features to incorporate into the model
  3. Build Tagger: Conditional Random Field (CRF)
  4. Evaluate: cross-validate and compare tagging accuracy against the Stanford tagger

SLIDE 6

3. Methodology

Tagset Development

Aim: Develop intuitive tagset to maximize tagging consistency


Steps:

  1. Design coarse tagset: {Standard tags} + {Twitter-specific tags}.
  2. Tokenize with a Twitter tokenizer, and tag with the Stanford POS tagger.
  3. Correct the automatic predictions of Step 2 with manual annotation.
  4. Revise tokenization and tagging guidelines.
  5. Correct annotations from Step 3.
  6. Calculate annotator agreement.
  7. Make a final sweep to correct errors.

SLIDE 7

3. Methodology

Tagset Development

Cohen’s Kappa (κ)
  ◎ Measures inter-rater reliability
  ◎ i.e. the agreement between two raters who each classify N items into C mutually exclusive categories


In the paper, κ = 0.914
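
Cohen’s kappa from the definition above can be sketched in a few lines (a minimal illustration, not the exact computation used in the paper):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of items the raters label identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

Two raters who agree on 3 of 4 tags, with the marginals above, score well below the paper’s 0.914, which is why κ = 0.914 indicates very high annotator agreement.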

SLIDE 8

3. Methodology

Tagging Scheme

Final Tagging Scheme: 25 tags

◎ Standard POS tags: nouns, pronouns, verbs, adjectives, etc.
◎ Combined POS tags: {nominal, proper noun} × {verb, possessive}
◎ Twitter/online-specific tags: #, @, URLs & email addresses, emoticons, and discourse markers
◎ Miscellaneous category tag (G): multiword abbreviations, partial words, artifacts of tokenization errors, miscellaneous symbols, possessive endings


SLIDE 9

3. Methodology

Tagging Scheme


Tag   Description                              Example
S     Nominal + possessive                     someone’s
^     Proper noun                              usa
M     Proper noun + verbal                     Mark’ll
!     Interjection                             lol, haha, yea
#     Hashtag*                                 #acl
@     At-mention                               @BarackObama
E     Emoticon                                 :-)
G     Other: abbreviations, foreign words,     ily [I love you], ♫
      possessive endings, symbols, garbage

*35% of hashtags were tagged with something other than #

SLIDE 10

3. Methodology

Conditional Random Field

◎ Discriminative undirected probabilistic graphical model
  ○ Models global dependencies across the tag sequence
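
In the standard linear-chain CRF formulation (the general form, not notation taken from this paper), the model scores the whole tag sequence y for a tweet x at once:

```latex
p(y \mid x) = \frac{1}{Z(x)} \exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \right)
```

Here the f_k are local feature functions over adjacent tags and the observed tokens, the λ_k are learned weights, and Z(x) normalizes over all possible tag sequences, which is what lets the model trade off tagging decisions globally rather than token by token.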


SLIDE 11

3. Methodology

Feature Engineering

CRF enables the incorporation of arbitrary local features.

Base features:
  ◎ A feature for each word type
  ◎ Features to check whether the word contains digits or hyphens
  ◎ Suffix features
  ◎ Features looking at capitalization patterns
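
The base feature set can be sketched as a token-to-feature-dict function; the feature names and the suffix lengths below are illustrative, not the paper’s actual feature templates:

```python
def base_features(token):
    """Local features for one token: word identity, digit/hyphen
    checks, suffixes, and capitalization (illustrative sketch)."""
    feats = {
        "word=" + token.lower(): True,
        "has_digit": any(ch.isdigit() for ch in token),
        "has_hyphen": "-" in token,
        "init_cap": token[:1].isupper(),
        "all_caps": token.isupper() and len(token) > 1,
    }
    # Suffixes of length 1-3, a cheap proxy for morphology.
    for k in (1, 2, 3):
        if len(token) >= k:
            feats["suffix=" + token[-k:].lower()] = True
    return feats
```

In a CRF toolkit, a dict like this would be attached to each position in the tweet and conjoined with the candidate tags during training.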


SLIDE 12

3. Methodology

Feature Engineering

◎ TwOrth: Twitter orthography
  ○ Regex-style rules to detect @-mentions, hashtags, URLs
◎ Names: Frequently capitalized tokens
  ○ Twitter users are inconsistent in their use of capitalization
  ○ Likelihood of capitalization =
◎ TagDict: Traditional tag dictionary
  ○ Features for POS tags from a traditional tag dictionary (PTB)
◎ DistSim: Distributional similarity
  ○ Representation of term similarity via distributional features
  ○ Used 1.9 million tokens from 134,000 unlabeled tweets for the 10,000 most common terms
◎ Metaph: Phonetic normalization
  ○ Used the Metaphone algorithm (1990) to create a coarse phonetic normalization, e.g. “lmao,” “lmaoo,” “lmaooo” all map to LM
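
Two of these features are easy to illustrate with short stdlib sketches. The regexes below are assumptions in the spirit of TwOrth, not the paper’s actual rules, and the repeat-collapsing function is only a crude stand-in for Metaphone, which performs a full phonetic encoding:

```python
import re

# Hypothetical regex-style rules for Twitter orthography (TwOrth-like).
MENTION = re.compile(r"^@\w+$")
HASHTAG = re.compile(r"^#\w+$")
URL = re.compile(r"^(?:https?://|www\.)\S+$", re.IGNORECASE)

def tworth_features(token):
    """Binary orthographic features for one token."""
    return {
        "is_mention": bool(MENTION.match(token)),
        "is_hashtag": bool(HASHTAG.match(token)),
        "is_url": bool(URL.match(token)),
    }

def collapse_repeats(token):
    """Crude stand-in for phonetic normalization: collapse runs of
    repeated characters so "lmao", "lmaoo", "lmaooo" share one form."""
    return re.sub(r"(.)\1+", r"\1", token)
```

The point of both feature families is the same: map the long tail of noisy Twitter spellings onto a small set of shared feature values the CRF can learn from.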


SLIDE 13

3. Methodology

Evaluation

Training set: 1,000 tweets (14,542 tokens)
Development set: 327 tweets (4,770 tokens)
Test set: 500 tweets (7,124 tokens)

◎ Trained the Stanford tagger on the labeled data
◎ Tuned the Gaussian prior on the development data
◎ In addition to the tagger with the full feature set, performed feature ablation experiments (removing one feature category at a time)


SLIDE 14

4. Results

Tagging Accuracy

Relative error reduction of 25% compared to the Stanford tagger
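
The relative-error-reduction arithmetic is easy to check; the 0.86 and 0.895 accuracies below are hypothetical, chosen only to land in the ballpark of the slide’s ~90% and 25% figures:

```python
def relative_error_reduction(baseline_acc, new_acc):
    """Share of the baseline's errors eliminated by the new tagger."""
    baseline_err = 1.0 - baseline_acc
    return (baseline_err - (1.0 - new_acc)) / baseline_err

# A hypothetical baseline at 86% accuracy vs. a tagger at 89.5%:
# errors shrink from 14% to 10.5%, a 25% relative reduction.
```

Relative error reduction is the fairer comparison here because absolute accuracy gains look small once both systems are near 90%.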


[Chart: tagging accuracy of the CRF tagger with the full feature set vs. the Stanford tagger, plus feature ablation experiments]

SLIDE 15

4. Results

Challenges

◎ Despite the NAMES feature, the system struggles to identify proper nouns with non-standard capitalization
◎ The recall of proper nouns is only 71%
◎ The system also struggles with the miscellaneous category, G (accuracy of 26%)


SLIDE 16

5. Extensions and Uses

◎ Cited by 739 according to Google Scholar
◎ Owoputi et al. (2013):
  ○ Developed improved annotation guidelines
  ○ Improved annotations in the Gimpel et al. corpus
  ○ Twitter tagging improved from 90% to 93% accuracy (state-of-the-art results) using large-scale unsupervised word clustering and new lexical features
◎ Mohammad et al. (2013):
  ○ Used the Gimpel et al. POS tagger to build a state-of-the-art Twitter sentiment classifier
◎ Lamb et al. (2013):
  ○ Used the Gimpel et al. POS tagger to surveil the spread of flu infections on Twitter


SLIDE 17

Thanks!

Any questions?
