

SLIDE 1

Feature Design for Polarity Classification

Presentation for FEAST (Saarland University), 22nd April 2009
By Michael Wiegand, Spoken Language Systems (IRTG)

SLIDE 2

Outline of Talk

  • Introduction to Polarity Classification
  • Semi-Supervised Learning for Document-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Supervised Learning for Sentence-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Topic-Related Sentence-Level Classification
  • Conclusion

SLIDE 3

Outline of Talk

  • Introduction to Polarity Classification
  • Semi-Supervised Learning for Document-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Supervised Learning for Sentence-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Topic-Related Sentence-Level Classification
  • Conclusion

SLIDES 4–5

What is Polarity Classification?

  • Polarity Classification is a subtask in Opinion Mining
  • 2 different types of text classification in Opinion Mining:
    • Subjectivity Detection
      • Does a text represent an opinion or a fact?
      • subjective vs. objective
    • Polarity Classification
      • Given an opinionated text, is the opinion expressed in the text positive or negative?
      • positive vs. negative

SLIDE 6

Why Polarity Classification?

  • Increasingly more opinionated content on the web (Web 2.0) → need for retrieving/classifying this kind of content
  • What makes polarity classification difficult?
    • Different from common topic classification
      • Different kinds of cues, i.e. polar expressions; not necessarily frequent content words!
      • Punctuation cues (!, ")
      • Ambiguity of polar expressions (the same expression can be positive in one context and negative in another)

SLIDE 7

Outline of Talk

  • Introduction to Polarity Classification
  • Semi-Supervised Learning for Document-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Supervised Learning for Sentence-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Topic-Related Sentence-Level Classification
  • Conclusion

SLIDES 8–17

Semi-Supervised Learning - an Illustration

[Figure-only slides: a step-by-step illustration of semi-supervised learning, in which a classifier trained on a few labeled data points is refined using the structure of the unlabeled data. A hedged code sketch of the idea follows.]

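The deck's experiments later use Spectral Graph Transduction [Joachims 2003]. As a hedged stand-in (SGT has no scikit-learn implementation), the sketch below uses the related graph-based LabelSpreading method to show the mechanics the illustration conveys: a few labeled points plus many unlabeled ones, classified jointly.

```python
# Minimal semi-supervised sketch (assumed setup, not the talk's pipeline):
# a few labeled examples plus many unlabeled ones, classified transductively.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.semi_supervised import LabelSpreading

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

rng = np.random.RandomState(0)
labels = np.full(len(y), -1)               # -1 marks unlabeled points
labeled_idx = rng.choice(len(y), size=20, replace=False)
labels[labeled_idx] = y[labeled_idx]       # keep only 20 labels

model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(X, labels)                       # uses labeled AND unlabeled data

accuracy = (model.transduction_ == y).mean()
print(f"transductive accuracy with 20 labels: {accuracy:.3f}")
```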
SLIDE 18

Outline of Talk

  • Introduction to Polarity Classification
  • Semi-Supervised Learning for Document-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Supervised Learning for Sentence-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Topic-Related Sentence-Level Classification
  • Conclusion

SLIDE 19

The Task

  • Document-level text classification of reviews
  • Decide whether a document is either a positive or a negative review
  • Use labeled and unlabeled documents for training
  • All documents, both labeled and unlabeled, are assumed to be subjective
  • All documents, both labeled and unlabeled, are either positive or negative reviews

SLIDE 20

Outline of Talk

  • Introduction to Polarity Classification
  • Semi-Supervised Learning for Document-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Supervised Learning for Sentence-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Topic-Related Sentence-Level Classification
  • Conclusion

SLIDE 21

Related Work

  • Supervised Learning:
    • Different algorithms and feature selection/extraction methods [Pang 2002; Salvetti 2006; Ng 2006; Gamon 2004]
  • Unsupervised Learning:
    • Induction of polarity lexicons (i.e. identification of polar expressions) using web mining [Turney 2002]
  • Semi-Supervised Learning:
    • Extending Turney's web-mining approach with labeled data [Beineke 2004]
    • EM in the context of domain adaptation [Aue 2005]
SLIDE 22

Contribution of this Work

  • An extensive study of semi-supervised learning for polarity classification
  • Comparison of different feature sets
  • Evaluation on various domains
SLIDE 23

Outline of Talk

  • Introduction to Polarity Classification
  • Semi-Supervised Learning for Document-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Supervised Learning for Sentence-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Topic-Related Sentence-Level Classification
  • Conclusion

SLIDE 24

Why is feature selection more important in semi-supervised learning than in supervised learning?

  • Less information is contained in small labeled datasets → the intrinsic predictiveness of features is important
  • Inappropriate feature sets may lead semi-supervised classifiers astray
  • In polarity classification there is the danger that topic information interferes

SLIDE 25

Different Types of Feature Selection Methods

  • Standard baseline:
    • Frequently occurring words in the collection
    • Exclude stopwords
  • Polarity lexicons:
    • Open-domain lexicons containing polar expressions, i.e. words carrying a prior polarity (e.g. nice, hate, superb, awful)
  • Adjectives & adverbs:
    • Parts of speech with the highest proportion of polar expressions

SLIDES 26–28

The Different Feature Sets

  Feature set                           Selection             # Features
  Top non-stopwords                     Automatic selection   2000
  Top adjectives & adverbs              Automatic selection   600
  Appraisal Groups [Whitelaw 2005]      Polarity lexicon      2014
  General Inquirer [Stone 1967]         Polarity lexicon      2882
  Subjectivity Lexicon [Wilson 2005]    Polarity lexicon      4615
  SentiWordNet [Esuli 2006]             Polarity lexicon      11366

(Building the two automatically selected sets is sketched below.)
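A hedged sketch of how the two automatically selected feature sets above could be built (top-k non-stopwords and top-k adjectives/adverbs); the tokenizer, tagger and counting scheme are assumptions, not the talk's actual tools:

```python
# Sketch: select the k most frequent non-stopwords, optionally restricted
# to adjectives/adverbs via POS tags. Requires nltk data: punkt,
# averaged_perceptron_tagger, stopwords.
from collections import Counter
import nltk
from nltk.corpus import stopwords

def top_features(documents, k, pos_prefixes=None):
    """Return the k most frequent word types, optionally filtered by POS."""
    stop = set(stopwords.words("english"))
    counts = Counter()
    for doc in documents:
        for word, tag in nltk.pos_tag(nltk.word_tokenize(doc.lower())):
            if word in stop or not word.isalpha():
                continue
            if pos_prefixes and not tag.startswith(pos_prefixes):
                continue
            counts[word] += 1
    return [w for w, _ in counts.most_common(k)]

docs = ["The movie was surprisingly good .", "An awful , boring plot ."]
bow = top_features(docs, k=2000)                    # top non-stopwords
adj_adv = top_features(docs, k=600,
                       pos_prefixes=("JJ", "RB"))   # adjectives & adverbs
```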

SLIDE 29

Outline of Talk

  • Introduction to Polarity Classification
  • Semi-Supervised Learning for Document-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Supervised Learning for Sentence-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Topic-Related Sentence-Level Classification
  • Conclusion

SLIDE 30

Set-up of Experiments

  • Two different datasets:
    • IMDb movie reviews: 1000 positive and 1000 negative reviews (benchmark dataset) [Pang 2002]
    • RateItAll reviews (several domains): 900 positive and 900 negative reviews per domain
  • Learn models with varying amounts of labeled documents (20, 50, 100, 200 etc.)
  • 1000 unlabeled documents (which also serve as labeled test data [Aue 2005; Joachims 1999; Joachims 2003])
  • Average results over 20 random samples (protocol sketched below)
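A hedged sketch of the protocol just described (n labeled plus 1000 unlabeled documents, evaluated transductively, averaged over 20 random samples); LabelSpreading again stands in for SGT:

```python
# Sketch of the evaluation protocol: vary the number of labeled documents,
# classify 1000 unlabeled ones transductively, average 20 random samples.
import numpy as np
from sklearn.semi_supervised import LabelSpreading

def run_protocol(X, y, n_labeled, n_unlabeled=1000, n_samples=20, seed=0):
    rng = np.random.RandomState(seed)
    accuracies = []
    for _ in range(n_samples):
        idx = rng.permutation(len(y))[: n_labeled + n_unlabeled]
        labels = np.full(len(idx), -1)             # -1 = unlabeled
        labels[:n_labeled] = y[idx[:n_labeled]]    # reveal n_labeled labels
        model = LabelSpreading(kernel="knn").fit(X[idx], labels)
        predictions = model.transduction_[n_labeled:]
        accuracies.append((predictions == y[idx[n_labeled:]]).mean())
    return float(np.mean(accuracies))

# e.g.: for n in (20, 50, 100, 200): print(n, run_protocol(X, y, n))
```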
SLIDE 31

Set-up of Experiments

  • Word normalization by applying stemming
  • Part-of-speech tagging
  • Use SVMLight for Support Vector Machines [SVMs] (i.e. the supervised classifier)
  • Use SGTLight for Spectral Graph Transduction [SGT] (i.e. the semi-supervised classifier)

SLIDES 32–33

Decomposing the Best Configuration

[Chart slides: performance of the best configuration broken down by feature set; chart labels lost in extraction.]

On this domain, Appraisal Groups (AG) work best.

SLIDES 34–37

Semi-Supervised Learning: Different Feature Sets

  • On different domains, trained on small amounts of labeled documents
  • In each domain, compare the best feature set with adjectives/adverbs

[Accuracy table; the row (domain) and column (feature set) labels were lost in extraction; cells marked (?) were unreadable:

  67.75  73.88  (?)    74.77  73.95  73.74  74.51
  62.38  67.20  68.45  68.40  (?)    68.44  68.79
  57.96  61.83  57.57  59.80  60.62  58.53  (?)
  57.95  57.48  65.44  68.37  (?)    65.09  68.05
  59.81  62.60  67.39  67.10  66.14  (?)    66.58
  61.17  64.60  66.92  67.69  67.83  67.22  68.30 ]

Note: just using a standard feature set may not improve performance at all (when switching to SGT).

SLIDE 38

Why do adjectives & adverbs do so well?

  • High proportion of polar expressions among these parts of speech
  • In-domain words are used
    • In-domain polar adjectives (e.g. from the food domain) do not appear in open-domain polarity lexicons
  • Stemming allows capturing polar expressions from other parts of speech
    • A polar adjective and its derived noun are both reduced to the same stem (see the stemming sketch below)
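To make the stemming point concrete: NLTK's Porter stemmer (the talk does not name its stemmer, and the slide's example words were lost; "delicious"/"tasty" are stand-ins) conflates a polar adjective with its derived noun, so one feature covers both surface forms:

```python
# Stemming conflates a polar adjective with its derived noun, so a single
# stem feature covers both. (Example words are stand-ins, not the slide's.)
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["delicious", "deliciousness", "tasty", "tastiness"]:
    print(f"{word:14s} -> {stemmer.stem(word)}")
# Both members of each pair print the same stem.
```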

SLIDE 39

Summary of Experiments in Semi-Supervised Classification

  • Semi-supervised learning produces better results than supervised learning in polarity classification when only few labeled data are used
  • Polarity lexicons and in-domain adjectives & adverbs offer better performance than frequent non-stopwords
  • Specific polarity lexicons can significantly outperform other feature sets on particular domains (e.g. Appraisal Groups (AG) on the movie domain)
  • On average, in-domain adjectives & adverbs produce the best performance

SLIDE 40

Outline of Talk

  • Introduction to Polarity Classification
  • Semi-Supervised Learning for Document-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Supervised Learning for Sentence-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Topic-Related Sentence-Level Classification
  • Conclusion

SLIDE 41

Outline of Talk

  • Introduction to Polarity Classification
  • Semi-Supervised Learning for Document-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Supervised Learning for Sentence-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Topic-Related Sentence-Level Classification
  • Conclusion

SLIDE 42

The Task

  • Decide whether a given sentence expresses either an overall positive or negative opinion
  • Learn exclusively from labeled data
  • All sentences to be classified are assumed to be subjective, carrying either positive or negative overall polarity

SLIDE 43

Why Polarity Classification at Sentence-Level?

  • More fine-grained polarity classification than document-level classification is needed for NLP tasks such as:
    • Question Answering
    • Text Summarization
  • Subjectivity/polarity is usually not uniformly distributed across a document [McDonald 2006]

SLIDES 44–45

Difference to Document-Level Classification

[Figure-only slides: a long document sketched as running text, with individual polar expressions highlighted.]

SLIDE 46

Difference to Document-Level Classification

[Figure: the same document sketch, with the ambiguous word "like" highlighted.]

  • At document level, text classification relies very much on redundancy
  • The correct interpretation of every word is NOT necessary
    • One does not have to determine whether "like" is a verb (and hence a positive polar expression) or just a preposition
    • The unambiguous polar expressions suffice for classification

SLIDES 47–49

Difference to Document-Level Classification

[Figure: a single sentence containing the ambiguous word "like".]

  • At sentence level, the correct interpretation of every word is more important!
  • There is less information contained in a sentence
  • One needs to determine whether "like" is a verb (and hence a positive polar expression) or just a preposition

SLIDE 50

Outline of Talk

  • Introduction to Polarity Classification
  • Semi-Supervised Learning for Document-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Supervised Learning for Sentence-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Topic-Related Sentence-Level Classification
  • Conclusion

SLIDE 51

Related Work

  • Linguistic features for polarity classification:
    • [Gamon 2004; Ng 2006]: linguistic features in document-level polarity classification
    • [Wilson 2005; Choi 2008]: contextual disambiguation in expression-level polarity classification
  • Sentence-level polarity classification:
    • [Meena 2007]: conjunctions as features for sentence-level polarity classification of product reviews
    • [Moilanen 2007]: deep syntactic features for polarity classification of headlines

SLIDE 52

Contribution of this Work

  • Inspection of various linguistic features derived from sentence structure
  • Various combinations of linguistic features, polarity features and bag-of-words features (unlike [Moilanen 2007])

SLIDES 53–58

Feature Hierarchy

[Figure-only slides: a hierarchy of the feature types used in this work, with one node highlighted per slide - bag of words, linguistic (word-level) features, prior polarity features, and linguistic sentence-level features.]

SLIDES 59–63

Linguistic Word-Level Features

  • Linguistic features: describe linguistic properties of words
    • (Generalized) parts of speech: POS tags
    • WordNet hypernyms
    • Depth level
    • Main predicate
    • Clause type
  • Example of a linguistic word-level feature: a word and its part of speech yield the feature <word>_is_a_<pos>_in_this_sentence (feature extraction sketched below)
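A hedged sketch of the word-level feature template above, using NLTK's tokenizer and tagger (assumptions; the talk's tools are not named):

```python
# Emit linguistic word-level features of the form
# "<word>_is_a_<pos>_in_this_sentence" (template from the slide).
# Requires nltk data: punkt, averaged_perceptron_tagger.
import nltk

def word_level_features(sentence):
    tokens = nltk.word_tokenize(sentence)
    return [f"{word.lower()}_is_a_{tag}_in_this_sentence"
            for word, tag in nltk.pos_tag(tokens)]

print(word_level_features("I like spinach"))
# e.g. ['i_is_a_PRP_in_this_sentence', 'like_is_a_VBP_in_this_sentence', ...]
```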

SLIDE 64

Linguistic Word-Level Features - Motivation

  • Part of speech:
    • Some polar cues are ambiguous, e.g. like can be either a polar verb or a preposition
      • Polar verb: "I like spinach." (positive)
      • Preposition: "The durian fruit smells like hell." (negative)
  • WordNet hypernyms:
    • Hypernyms are a means of generalization
      • car → motor vehicle
    • Since data for sentence-level text classification are sparse, adding hypernyms of the words in the sentence may compensate for this (see the WordNet sketch below)
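The hypernym idea, sketched with NLTK's WordNet interface (sense selection here is a naive first-sense heuristic, an assumption):

```python
# Generalization features from the WordNet hypernym chain of a noun.
# Requires nltk data: wordnet.
from nltk.corpus import wordnet as wn

def hypernym_features(word):
    synsets = wn.synsets(word, pos=wn.NOUN)
    if not synsets:
        return []
    features, current = [], synsets[0]     # naive: first sense only
    while current.hypernyms():
        current = current.hypernyms()[0]
        features.append(f"hypernym={current.name()}")
    return features

print(hypernym_features("car"))
# e.g. ['hypernym=motor_vehicle.n.01', 'hypernym=self-propelled_vehicle.n.01', ...]
```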

SLIDES 65–75

Depth Level Feature - Motivation

Example sentence: "Fortunately, the Court has multiple, independent bases to reject the Government's Motion."

[Figure: the parse tree of this sentence, annotated with the depth level of each node.]

  • The sentence contains both a positive and a negative polar expression (fortunately vs. reject)
  • Overall polarity: positive
  • Determine the depth level of both polar expressions!
  • The positive polar expression occurs at a lower depth level than the negative one
  • The positive polar expression is preferred since it is at a lower depth level (see the sketch below)
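A toy sketch of the depth-level preference: collect the tree depth of each polar expression and let the shallowest one decide. The nested-list "parse" and the two-word lexicon are illustrative assumptions only:

```python
# Prefer the polar expression attached highest (lowest depth) in the tree.
POLARITY = {"fortunately": "+", "reject": "-"}   # tiny stand-in lexicon

def polar_depths(tree, depth=0, found=None):
    """Collect (depth, polarity, word) for polar words in a nested-list tree."""
    if found is None:
        found = []
    for child in tree:
        if isinstance(child, list):
            polar_depths(child, depth + 1, found)
        elif child.lower() in POLARITY:
            found.append((depth, POLARITY[child.lower()], child))
    return found

# Toy bracketing of the slide's example sentence (structure illustrative only).
tree = ["Fortunately",
        [["the", "Court"],
         ["has", [["multiple", "independent", "bases"],
                  ["to", ["reject", ["the", "Government", "'s", "Motion"]]]]]]]

print(min(polar_depths(tree)))   # (0, '+', 'Fortunately') -> overall positive
```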

SLIDE 76

Other Prominence Features - Motivation

  • Main predicate:
    • The main predicate is more predictive of polarity than other grammatical relations
    • [Example sentence about the Pakistani government and President Bush; the highlighted main predicate was lost in extraction.]
  • Clause type:
    • Expressions from a main clause are more important than expressions from subordinate clauses
    • [Example sentence with a subordinate although-clause; highlighting lost in extraction.]
    • Some types of conjunctions weaken or strengthen a clause
SLIDES 77–78

Feature Hierarchy

[Figure repeated as a section marker: the feature hierarchy, now highlighting prior polarity features.]
SLIDE 79

Prior Polarity Features

  • Explicit polarity (counting sketched below):
    • Number of positive polar expressions in the sentence
    • Number of negative polar expressions in the sentence
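A minimal sketch of these two counts, assuming a polarity lexicon is available as a word-to-sign mapping (the tiny dict here is a stand-in; the talk's sentence-level experiments use the Subjectivity Lexicon [Wilson 2005]):

```python
# Explicit prior-polarity features: counts of positive and negative
# polar expressions per sentence, via lexicon lookup.
PRIOR = {"nice": "+", "superb": "+", "hate": "-", "awful": "-"}  # stand-in

def prior_polarity_features(tokens):
    positive = sum(1 for t in tokens if PRIOR.get(t.lower()) == "+")
    negative = sum(1 for t in tokens if PRIOR.get(t.lower()) == "-")
    return {"n_positive": positive, "n_negative": negative}

print(prior_polarity_features("The hotel was nice but the food was awful".split()))
# {'n_positive': 1, 'n_negative': 1}
```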

SLIDE 80

Feature Hierarchy

[Figure repeated: the feature hierarchy, now highlighting linguistic sentence-level features.]

SLIDE 81

Linguistic Sentence-Level Features

  • Prior polarity features in conjunction with a particular linguistic property (a toy sketch follows)
  • Linguistic properties:
    • Part of speech
    • Depth level
    • Main predicate
    • Clause type
    • Polarity strength
    • In negation scope? (not nice)
    • Intensified? (very nice)
    • In modal scope? (would be nice)
    • Modified by positive/negative polar expression?
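A toy sketch of combining prior polarity with one local linguistic property; the "previous token" treatment of negation and intensification is a crude assumption, not the talk's method:

```python
# Sentence-level features: prior polarity combined with negation scope
# or intensification, detected here by a naive previous-token heuristic.
PRIOR = {"nice": "+", "awful": "-"}
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very", "extremely", "really"}

def sentence_level_features(tokens):
    features = []
    for i, token in enumerate(tokens):
        polarity = PRIOR.get(token.lower())
        if polarity is None:
            continue
        prev = tokens[i - 1].lower() if i > 0 else ""
        if prev in NEGATORS:
            features.append(f"negated_{polarity}_polar_expression")
        elif prev in INTENSIFIERS:
            features.append(f"intensified_{polarity}_polar_expression")
        else:
            features.append(f"{polarity}_polar_expression")
    return features

print(sentence_level_features("The room was not nice".split()))
# ['negated_+_polar_expression']
```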
SLIDE 82

Outline of Talk

  • Introduction to Polarity Classification
  • Semi-Supervised Learning for Document-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Supervised Learning for Sentence-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Topic-Related Sentence-Level Classification
  • Conclusion

SLIDE 83

Set-up of Experiments

  • Use a modified version of the MPQA corpus [Wiebe 2003]
  • Prior polarity of lexical units and their strength is obtained from the Subjectivity Lexicon [Wilson 2005]
  • Use SVMLight in standard configuration [Joachims 1999]
  • 10-fold cross-validation
  • Use χ² feature selection on linguistic word-level features
  • χ² feature selection is done on every training partition separately in cross-validation (see the pipeline sketch below)
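A hedged sketch of per-fold χ² selection: wrapping vectorizer, selector and classifier in a Pipeline makes cross-validation re-fit the selection on each training partition, so it never sees the held-out fold (the vectorizer and classifier choices are assumptions):

```python
# chi^2 feature selection inside cross-validation, re-fit per training fold.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

pipeline = Pipeline([
    ("vectorize", CountVectorizer(binary=True)),
    ("select", SelectKBest(chi2, k=3)),     # tiny k for the toy vocabulary
    ("svm", LinearSVC()),
])

docs = ["a great film", "an awful bore"] * 50     # toy data
labels = [1, 0] * 50
print(cross_val_score(pipeline, docs, labels, cv=10).mean())
```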

SLIDES 84–88

Results

Word-level feature sets (accuracy):

  Bag of words [qualifier lost]                       67.2
  Bag of words                                        68.6
  Bag of words + linguistic word-level features       73.1

Sentence-level feature sets (accuracy):

  Prior polarity                                                 70.4
  Prior polarity + linguistic sentence-level features            73.8
  Prior polarity + bag of words                                  75.4
  Prior polarity + bag of words + ling. word-level features      76.7
  Prior polarity + bag of words + ling. sent.-level features     76.8
  Prior polarity + bag of words + all linguistic features        77.5
SLIDES 89–90

Summary of Experiments

  • Bag-of-words features give pretty low performance
  • Linguistic word-level features using polarity information increase performance when used in combination with bag-of-words features
  • Slightly better performance can be achieved with explicit polarity features in combination with linguistic sentence-level features
  • Bag-of-words features with (basic) prior polarity features are significantly better than either all word-level features or all sentence-level features
  • Best overall performance: use all feature types!
SLIDE 91

Outline of Talk

  • Introduction to Polarity Classification
  • Semi-Supervised Learning for Document-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Supervised Learning for Sentence-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Topic-Related Sentence-Level Classification
  • Conclusion

SLIDE 92

Topic-Related Sentence-Level Classification

  • Task: Extract sentences containing a specific polarity towards a given topic from a corpus
  • Example:
    • Query: Find negative opinions about Mozart
    • Relevant sentence: [example lost in extraction]
  • Relevant feature types:
    • Disambiguating polarity features
    • Relational features describing the dependency between topic word and polar expression
      • Spatial distance (sketched below)
      • Syntactic relations
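One of the relational features above, sketched: the spatial (token) distance between the topic word and the nearest polar expression; lexicon and tokenization are stand-in assumptions:

```python
# Spatial distance between the topic word and the closest polar expression.
PRIOR = {"boring": "-", "brilliant": "+"}   # stand-in lexicon

def topic_polar_distance(tokens, topic):
    """Smallest token distance from the topic word to any polar expression."""
    topic_pos = [i for i, t in enumerate(tokens) if t.lower() == topic.lower()]
    polar_pos = [i for i, t in enumerate(tokens) if t.lower() in PRIOR]
    if not topic_pos or not polar_pos:
        return None
    return min(abs(i - j) for i in topic_pos for j in polar_pos)

tokens = "I find Mozart rather boring compared to Bach".split()
print(topic_polar_distance(tokens, "Mozart"))   # 2
```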
SLIDE 93

Outline of Talk

  • Introduction to Polarity Classification
  • Semi-Supervised Learning for Document-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Supervised Learning for Sentence-Level Classification
    • The Task
    • Related Work
    • Feature Design
    • Experiments
  • Topic-Related Sentence-Level Classification
  • Conclusion

SLIDE 94

Conclusion

  • Feature design is crucial for polarity classification
  • Don't rely on the learning algorithms you are using!
  • Two sources of knowledge are helpful:
    • Explicit knowledge of the prior polarity of words/expressions → polarity lexicons
    • Linguistic knowledge
SLIDE 95

References (Semi-Supervised Document-Level Learning)

  • Philip J. Stone, Dexter C. Dunphy, Marshall S. Smith, and Daniel M. Ogilvie. The General Inquirer: A Computer Approach to Content Analysis. MIT Press, 1967.
  • T. Joachims. Transductive Inference for Text Classification using Support Vector Machines. In Proceedings of ICML, 1999.
  • K. Nigam, A. McCallum, S. Thrun, and T. Mitchell. Text Classification from Labeled and Unlabeled Documents using EM. Machine Learning, 2000.
  • B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. In Proceedings of EMNLP, 2002.
  • P. Turney. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews. In Proceedings of ACL, 2002.
  • P. Turney and M. Littman. Measuring Praise and Criticism: Inference of Semantic Orientation from Association. ACM Transactions on Information Systems, 2003.
  • J. Wiebe, T. Wilson, and C. Cardie. Annotating Expressions of Opinions and Emotions in Language. 2003.

SLIDE 96

References (Semi-Supervised Document-Level Learning)

  • T. Joachims. Transductive Learning via Spectral Graph Partitioning. In Proceedings of ICML, 2003.
  • P. Beineke, T. Hastie, and S. Vaithyanathan. The Sentimental Factor: Improving Review Classification via Human-Provided Information. In Proceedings of ACL, 2004.
  • B. Pang and L. Lee. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. In Proceedings of ACL, 2004.
  • M. Gamon. Sentiment Classification on Customer Feedback Data: Noisy Data, Large Feature Vectors, and the Role of Linguistic Analysis. In Proceedings of COLING, 2004.
  • Casey Whitelaw, Navendu Garg, and Shlomo Argamon. Using Appraisal Groups for Sentiment Analysis. In Proceedings of CIKM, 2005.
  • T. Wilson, J. Wiebe, and P. Hoffmann. Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. In Proceedings of HLT/EMNLP, 2005.
  • A. Aue and M. Gamon. Customizing Sentiment Classifiers to New Domains: A Case Study. In Proceedings of RANLP, 2005.
  • Andrea Esuli and Fabrizio Sebastiani. SentiWordNet: A Publicly Available Lexical Resource for Opinion Mining. In Proceedings of LREC, 2006.

SLIDE 97

References (Supervised Sentence-Level Classification)

  • T. Joachims. Making Large-Scale SVM Learning Practical. In Advances in Kernel Methods - Support Vector Learning. MIT Press, 1999.
  • S. Scott and S. Matwin. Text Classification Using WordNet Hypernyms. In Proceedings of the COLING-ACL Workshop on Usage of WordNet in Natural Language Processing Systems, 1998.
  • B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. In Proceedings of EMNLP, 2002.
  • J. Wiebe, T. Wilson, and C. Cardie. Annotating Expressions of Opinions and Emotions in Language. 2003.
  • M. Gamon. Sentiment Classification on Customer Feedback Data: Noisy Data, Large Feature Vectors, and the Role of Linguistic Analysis. In Proceedings of COLING, 2004.
  • T. Wilson, J. Wiebe, and P. Hoffmann. Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. In Proceedings of HLT/EMNLP, 2005.

SLIDE 98

References (Supervised Sentence-Level Classification)

  • S. Matsumoto, H. Takamura, and M. Okumura. Sentiment Classification Using Word Sub-sequences and Dependency Sub-trees. In Proceedings of PAKDD, 2005.
  • V. Ng, S. Dasgupta, and S. M. Niaz Arifin. Examining the Role of Linguistic Knowledge Sources in the Automatic Identification and Classification of Reviews. In Proceedings of COLING/ACL, 2006.
  • R. McDonald, K. Hannan, T. Neylon, and M. Wells. Structured Models for Fine-to-Coarse Sentiment Analysis. In Proceedings of ACL, 2007.
  • K. Moilanen and S. Pulman. Sentiment Composition. In Proceedings of RANLP, 2007.
  • A. Meena and T. V. Prabhakar. Sentence Level Sentiment Analysis in the Presence of Conjuncts Using Linguistic Analysis. In Proceedings of ECIR, 2007.
  • Y. Choi and C. Cardie. Learning with Compositional Semantics as Structural Inference for Subsentential Sentiment Analysis. In Proceedings of EMNLP, 2008.