Feature Design for Polarity Classification

Presentation for FEAST (Saarland University)
22nd April 2009
Michael Wiegand, Spoken Language Systems (IRTG)
Outline of Talk
- Introduction to Polarity Classification
- Semi-Supervised Learning for Document-Level Classification
  - The Task
  - Related Work
  - Feature Design
  - Experiments
- Supervised Learning for Sentence-Level Classification
  - The Task
  - Related Work
  - Feature Design
  - Experiments
- Topic-Related Sentence-Level Classification
- Conclusion
What is Polarity Classification?
- Polarity Classification is a subtask in Opinion Mining
- 2 different types of text classification in Opinion Mining:
  - Subjectivity Detection: does a text represent an opinion or a fact?
  - Polarity Classification: given an opinionated text, is the opinion expressed in the text positive or negative?
Why Polarity Classification?
- Increasingly more opinionated content on the web (Web 2.0) → need for retrieving/classifying this kind of content
- What makes polarity classification difficult?
  - Different from common topic classification
  - Different kind of cues (polar expressions, punctuation such as "!"); not necessarily frequent content words!
  - Ambiguity of polar expressions (e.g. "like" as a polar verb vs. "like" as a preposition)
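The cue-based view sketched above can be illustrated with a minimal lexicon-count classifier; the tiny word lists below are illustrative, not taken from the talk:

```python
# Minimal polarity-classifier sketch: count prior-polarity cues rather
# than topic words. The two tiny lexicons are illustrative only.
POSITIVE = {"nice", "superb", "great", "fortunately"}
NEGATIVE = {"awful", "hate", "reject", "boring"}

def classify(text):
    tokens = text.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return "positive" if pos >= neg else "negative"
```

Note that such a classifier is blind to exactly the difficulties listed above: it cannot tell the verb "like" from the preposition "like".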
Semi-Supervised Learning - an Illustration
[Sequence of figure-only slides illustrating how a few labeled data points plus many unlabeled ones can shape the decision boundary.]
The Task
- Document-level text classification of reviews
- Decide whether a document is either a positive or a negative review
- Use labeled and unlabeled documents for training
- All documents, both labeled and unlabeled, are assumed to be subjective
- All documents, both labeled and unlabeled, are either positive or negative reviews
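The talk's experiments use Spectral Graph Transduction for this setting; as a generic illustration of how unlabeled documents can help, here is a self-training sketch on toy 1-D data (not the method from the talk, and the data are invented):

```python
# Self-training sketch: a nearest-centroid classifier is retrained after
# absorbing its most confident predictions on unlabeled points.
# Illustrative only; the talk itself uses Spectral Graph Transduction.

def centroids(labeled):
    pos = [x for x, y in labeled if y == "pos"]
    neg = [x for x, y in labeled if y == "neg"]
    return sum(pos) / len(pos), sum(neg) / len(neg)

def self_train(labeled, unlabeled, rounds=3, margin=1.0):
    labeled = list(labeled)
    for _ in range(rounds):
        c_pos, c_neg = centroids(labeled)
        # label unlabeled points whose centroid-distance margin is large
        confident = [(x, "pos" if abs(x - c_pos) < abs(x - c_neg) else "neg")
                     for x in unlabeled
                     if abs(abs(x - c_pos) - abs(x - c_neg)) >= margin]
        if not confident:
            break
        labeled += confident
        taken = {x for x, _ in confident}
        unlabeled = [x for x in unlabeled if x not in taken]
    return centroids(labeled)

# two labeled documents, four unlabeled ones (as 1-D points)
c_pos, c_neg = self_train([(0.0, "neg"), (10.0, "pos")], [1.0, 2.0, 8.5, 9.5])
```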
Related Work
- Supervised Learning:
  - Different algorithms and feature selection/extraction methods [Pang 2002; Salvetti 2006; Ng 2006; Gamon 2004]
- Unsupervised Learning:
  - Induction of polarity lexicons (i.e. identification of polar expressions) using web mining [Turney 2002]
- Semi-Supervised Learning:
  - Extending Turney's web-mining approach with labeled data [Beineke 2004]
  - EM in the context of domain adaptation [Aue 2005]
Contribution of this Work
- Extensive study of semi-supervised learning for polarity classification
- Comparison of different feature sets
- Evaluation on various domains
Why is feature selection more important in semi-supervised learning than in supervised learning?
- Less information is contained in small labeled datasets → the intrinsic predictiveness of features is important
- Inappropriate feature sets may lead semi-supervised classifiers astray
- In polarity classification there is the danger that topic information interferes
Different Types of Feature Selection Methods
- Standard baseline:
  - Frequently occurring words in the collection
  - Exclude stopwords
- Polarity lexicons:
  - Open-domain lexicons containing polar expressions, i.e. words carrying a prior polarity (e.g. nice, hate, superb, awful)
- Adjectives & adverbs:
  - Parts of speech with the highest proportion of polar expressions
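The "standard baseline" feature set described above can be sketched as follows; the stopword list and the cut-off N are illustrative, not the ones used in the talk:

```python
# Sketch of the baseline feature set: the N most frequent non-stopwords
# in the document collection. Stopword list and N are illustrative.
from collections import Counter

STOPWORDS = {"the", "a", "is", "and", "of", "it"}

def top_nonstopwords(documents, n=2000):
    counts = Counter(
        tok for doc in documents
        for tok in doc.lower().split()
        if tok not in STOPWORDS
    )
    return [word for word, _ in counts.most_common(n)]
```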
The Different Feature Sets
- Top non-stopwords: automatic selection, 2000 features
- Top adjectives & adverbs: automatic selection, 600 features
- Appraisal Groups: polarity lexicon, 2014 features [Whitelaw 2005]
- General Inquirer: polarity lexicon, 2882 features [Stone 1967]
- Subjectivity Lexicon: polarity lexicon, 4615 features [Wilson 2005]
- SentiWordNet: polarity lexicon, 11366 features [Esuli 2006]
Set up of Experiments
- Two different datasets:
  - IMDB movie reviews: 1000 positive and 1000 negative reviews (benchmark dataset) [Pang 2002]
  - RateItAll reviews (several domains): 900 positive and 900 negative reviews per domain
- Learn models with varying amounts of labeled documents (20, 50, 100, 200 etc.)
- 1000 unlabeled documents (also serve as labeled test data [Aue 2005; Joachims 1999; Joachims 2003])
- Average results of 20 random samples
Set up of Experiments
- Word normalization by applying stemming
- Part-of-speech tagging
- Use SVMLight for Support Vector Machines [SVMs] (i.e. the supervised classifier)
- Use SGTLight for Spectral Graph Transduction [SGT] (i.e. the semi-supervised classifier)
Decomposing Best Configuration
[Figure-only slides.]
On this domain, Appraisal Groups work best.
Semi-Supervised Learning: Different Feature Sets
- Accuracy on different domains, with models trained on labeled and unlabeled documents
[Table: per-domain accuracies for the supervised SVM and the semi-supervised SGT under each feature set; the column headers and the boldfaced best scores are not recoverable from this transcript.]
- Just using a standard feature set may not improve performance at all (when switching to SGT)
- On each domain, compare the best feature set with adjectives & adverbs
Why do adjectives & adverbs do so well?
- High proportion of polar expressions among these parts of speech
- In-domain words are used
  - In-domain polar adjectives (e.g. from the food domain) do not appear in open-domain polarity lexicons
- Stemming allows capturing polar expressions from other parts of speech
  - An adjective and a noun sharing the same stem are both reduced to the same feature
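The stemming effect described above can be sketched with a toy suffix-stripping stemmer (not the stemmer used in the talk); the word pair "delightful"/"delight" is an illustrative example of an adjective and a noun conflated into one feature:

```python
# Toy suffix-stripping stemmer sketch: shows how stemming conflates
# polar expressions across parts of speech, e.g. the adjective
# "delightful" and the noun "delight" map to the same feature.
SUFFIXES = ("fulness", "ful", "ness", "ing", "ed", "s")

def stem(word):
    word = word.lower()
    for suf in SUFFIXES:
        # keep at least a 3-character stem
        if word.endswith(suf) and len(word) - len(suf) >= 3:
            return word[: -len(suf)]
    return word
```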
Summary of Experiments in Semi-Supervised Classification
- Semi-supervised learning produces better results than supervised learning in polarity classification when only few labeled data are used
- Polarity lexicons and in-domain adjectives & adverbs offer better performance than frequent non-stopwords
- Specific polarity lexicons can significantly outperform other feature sets on particular domains (e.g. Appraisal Groups on the movie domain)
- On average, in-domain adjectives & adverbs produce the best performance
The Task
- Decide whether a given sentence expresses either an overall positive or a negative opinion
- Learn exclusively from labeled data
- All sentences to be classified are assumed to be subjective and to carry either positive or negative overall polarity
Why Polarity Classification at Sentence-Level?
- More fine-grained polarity classification than document-level classification is needed for NLP tasks such as:
  - Question Answering
  - Text Summarization
- Subjectivity/polarity is usually not uniformly distributed across a document [McDonald 2006]
Difference to Document-Level Classification
[Figure: a long document containing many polar expressions vs. a single sentence containing the ambiguous word "like".]
- At document level, text classification relies very much on redundancy
  - The correct interpretation of every word is NOT necessary
  - One does not have to determine whether "like" is a verb (and hence a positive polar expression) or just a preposition
  - The unambiguous polar expressions suffice for classification
- At sentence level, the correct interpretation of every word is more important!
  - There is less information contained in a sentence
  - One needs to determine whether "like" is a verb (and hence a positive polar expression) or just a preposition
Related Work
- Linguistic features for polarity classification:
  - [Gamon 2004; Ng 2006]: linguistic features for document-level polarity classification
  - [Wilson 2005; Choi 2008]: contextual disambiguation for expression-level polarity classification
- Sentence-level polarity classification:
  - [Meena 2007]: conjunctions as features for sentence-level polarity classification of product reviews
  - [Moilanen 2007]: deep syntactic features for polarity classification of headlines
Contribution of this Work
- Inspection of various linguistic features derived from sentence structure
- Various combinations of linguistic features, polarity features and bag-of-words features (unlike [Moilanen 2007])
Feature Hierarchy
[Figure: hierarchy of feature types - bag of words, linguistic features (word-level and sentence-level), and prior polarity features; the repeated slides highlight one branch at a time.]
Linguistic Word-Level Features
- Linguistic features describe linguistic properties of words:
  - (Generalized) parts of speech
  - WordNet hypernyms
  - Depth level
  - Main predicate
  - Clause type
- Example of a linguistic word-level feature:
  - word: like; part of speech: verb → feature: like_is_a_verb_in_this_sentence
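Generating features of the form shown above can be sketched as follows; the tiny hand-tagged sentence and the coarse tag names are illustrative:

```python
# Sketch of linguistic word-level feature generation: each
# (word, part-of-speech) pair becomes one binary feature string of the
# form "<word>_is_a_<pos>_in_this_sentence". Toy tagged input only.
def word_level_features(tagged_sentence):
    return [f"{word.lower()}_is_a_{pos}_in_this_sentence"
            for word, pos in tagged_sentence]

feats = word_level_features([("I", "pronoun"), ("like", "verb"), ("spinach", "noun")])
```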
Linguistic Word-Level Features - Motivation
- Part of speech:
  - Some polar cues are ambiguous, e.g. "like" can be either a polar verb or a preposition
    - Polar verb: "I like spinach." (positive)
    - Preposition: "The durian fruit smells like hell." (negative)
- WordNet hypernyms:
  - Hypernyms are a means of generalization (e.g. car → motor vehicle)
  - Since sentence-level text classification is sparse, adding hypernyms of the words in the sentence may compensate for this
Depth Level Feature - Motivation
- Example sentence: "Fortunately, the Court has multiple, independent bases to reject the Government's Motion."
- The sentence contains both a positive and a negative polar expression ("fortunately" and "reject")
- Determine the depth level of both polar expressions in the parse tree
- The positive polar expression is at a lower depth level than the negative polar expression
- The positive polar expression is preferred since it is at a lower depth level → overall polarity: positive
Other Prominence Features - Motivation
- Main predicate:
  - The main predicate is more predictive towards polarity than other grammatical relations
  - e.g. "The Pakistani government … President Bush and his … ."
- Clause type:
  - Expressions from a main clause are more important than expressions from subordinate clauses
  - e.g. "[Although he had …] [ … ]."
  - Some types of conjunctions weaken (e.g. although) or strengthen a clause
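The depth-level preference illustrated above can be sketched on a toy nested structure; the nested-list "parse tree" and the two-word lexicon are illustrative stand-ins for a real parser and lexicon:

```python
# Sketch of the depth-level idea: find the parse-tree depth of each
# polar expression and prefer the shallower one for the overall label.
def polar_depths(tree, positive, negative, depth=0):
    """Yield (word, polarity, depth) for polar words in a nested list."""
    for node in tree:
        if isinstance(node, list):
            yield from polar_depths(node, positive, negative, depth + 1)
        elif node.lower() in positive:
            yield node, "pos", depth
        elif node.lower() in negative:
            yield node, "neg", depth

# "Fortunately, the Court has [multiple independent bases
#  [to reject the Government 's Motion]]" as a toy nested structure:
tree = ["Fortunately", "the", "Court", "has",
        ["multiple", "independent", "bases",
         ["to", "reject", "the", "Government", "'s", "Motion"]]]

hits = sorted(polar_depths(tree, {"fortunately"}, {"reject"}),
              key=lambda h: h[2])
overall = hits[0][1]  # the shallowest polar expression wins
```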
Prior Polarity Features
- Explicit polarity:
  - Number of positive polar expressions in the sentence
  - Number of negative polar expressions in the sentence
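The explicit prior-polarity features above amount to per-sentence counts against a prior-polarity lexicon; the tiny lexicon below is illustrative:

```python
# Sketch of the explicit prior-polarity features: per-sentence counts of
# positive and negative polar expressions, via a (tiny, illustrative)
# prior-polarity lexicon.
PRIOR = {"fortunately": "pos", "nice": "pos", "reject": "neg", "awful": "neg"}

def prior_polarity_features(sentence):
    tags = [PRIOR.get(tok.lower()) for tok in sentence.split()]
    return {"n_pos": tags.count("pos"), "n_neg": tags.count("neg")}
```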
Linguistic Sentence-Level Features
- Prior polarity features in conjunction with a particular linguistic property
- Linguistic properties:
  - Part of speech
  - Depth level
  - Main predicate
  - Clause type
  - Polarity strength
  - In negation scope? (not nice)
  - Intensified? (very nice)
  - In modal scope? (might be nice)
  - Modified by positive/negative polar expression?
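One of the sentence-level properties above, negation scope, can be sketched as a restricted prior-polarity count; the window-based scope heuristic and the word lists are illustrative assumptions, not the talk's implementation:

```python
# Sketch of one linguistic sentence-level feature: the number of positive
# polar expressions that fall inside a negation scope, approximated here
# as the few tokens following a negator. Word lists are illustrative.
NEGATORS = {"not", "never", "no"}
POSITIVE = {"nice", "great"}

def negated_positive_count(sentence, window=3):
    toks = sentence.lower().split()
    count = 0
    for i, tok in enumerate(toks):
        if tok in POSITIVE and any(t in NEGATORS
                                   for t in toks[max(0, i - window):i]):
            count += 1
    return count
```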
Set up of Experiments
- Use a modified version of the MPQA corpus [Wiebe 2003]
- Prior polarity of lexical units and their strength is obtained from the Subjectivity Lexicon [Wilson 2005]
- Use SVMLight in standard configuration [Joachims 1999]
- 10-fold cross-validation
- Use χ² feature selection on linguistic word-level features
  - χ² feature selection is done on every training partition separately in cross-validation
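The χ² selection step above scores each binary feature against the two polarity classes; a self-contained sketch of the 2x2 chi-square statistic (toy data, not the talk's tooling):

```python
# Sketch of chi-square feature scoring for binary word-level features:
# build the 2x2 contingency table of (feature present?) x (class positive?)
# and compute the chi-square statistic; high-scoring features are kept.
def chi2_score(feature_on, labels):
    """feature_on, labels: parallel boolean lists (feature present, class pos)."""
    n = len(labels)
    a = sum(f and y for f, y in zip(feature_on, labels))        # present, pos
    b = sum(f and not y for f, y in zip(feature_on, labels))    # present, neg
    c = sum((not f) and y for f, y in zip(feature_on, labels))  # absent, pos
    d = n - a - b - c                                           # absent, neg
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom
```

A perfectly class-correlated feature scores n, a class-independent one scores 0; doing this separately per training partition, as on the slide, avoids leaking test information into the selection.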
Results

Word-level features (accuracy):
- Bag of words: 67.2
- Bag of words: 68.6
- Bag of words + linguistic word-level features: [best value; garbled in transcript]

Sentence-level features (accuracy):
- Prior polarity: 70.4
- Prior polarity + linguistic sentence-level features: 73.8
- Prior polarity + bag of words: 75.4
- Prior polarity + bag of words + ling. word-level features: 76.7
- Prior polarity + bag of words + ling. sent.-level features: 76.8
- Prior polarity + bag of words + all linguistic features: 77.1
Summary of Experiments
- Bag-of-words features give pretty low performance
- Linguistic word-level features using polarity information increase performance when used in combination with bag-of-words features
- Slightly better performance can be achieved with explicit polarity features in combination with linguistic sentence-level features
- Bag-of-words features with (basic) prior polarity features are significantly better than both all word-level features and all sentence-level features
- Best overall performance: use all feature types!
Topic-Related Sentence-Level Classification
- Task: extract sentences containing a specific polarity towards a given topic from a corpus
- Example:
  - Query: find negative opinions about Mozart
  - Relevant sentence: […]
- Relevant feature types:
  - Disambiguating polarity features
  - Relational features describing the dependency between topic word and polar expression:
    - Spatial distance
    - Syntactic relations
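The "spatial distance" relational feature above can be sketched as the token distance between the topic word and the nearest polar expression; the word list and example sentence are illustrative:

```python
# Sketch of the spatial-distance relational feature: token distance
# between a topic word and the nearest (here: negative) polar expression.
# Lexicon and example are illustrative only.
NEGATIVE = {"dull", "hate", "boring"}

def topic_polar_distance(sentence, topic):
    toks = [t.lower().strip(".,") for t in sentence.split()]
    t_idx = [i for i, t in enumerate(toks) if t == topic.lower()]
    p_idx = [i for i, t in enumerate(toks) if t in NEGATIVE]
    if not t_idx or not p_idx:
        return None  # no topic mention or no polar expression
    return min(abs(i - j) for i in t_idx for j in p_idx)
```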
Conclusion
- Feature design is crucial for polarity classification
- Don't rely on the learning algorithms you are using!
- Two sources of knowledge are helpful:
  - Explicit knowledge of the prior polarity of words/expressions → polarity lexicons
  - Linguistic knowledge
References (Semi-Supervised Document-Level Learning)
- Philip J. Stone, Dexter C. Dunphy, Marshall S. Smith, and Daniel M. Ogilvie. The General Inquirer: A Computer Approach to Content Analysis. MIT Press, 1967.
- T. Joachims. Transductive Inference for Text Classification using Support Vector Machines. In Proceedings of ICML, 1999.
- K. Nigam, A. McCallum, S. Thrun, and T. Mitchell. Text Classification from Labeled and Unlabeled Documents using EM. Machine Learning, 2000.
- B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. In Proceedings of EMNLP, 2002.
- P. Turney. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews. In Proceedings of ACL, 2002.
- P. Turney and M. Littman. Measuring Praise and Criticism: Inference of Semantic Orientation from Association. ACM Transactions on Information Systems, 2003.
- J. Wiebe, T. Wilson, and C. Cardie. Annotating Expressions of Opinions and Emotions in Language. 2003.
References (Semi-Supervised Document-Level Learning, continued)
- T. Joachims. Transductive Learning via Spectral Graph Partitioning. In Proceedings of ICML, 2003.
- P. Beineke, T. Hastie, and S. Vaithyanathan. The Sentimental Factor: Improving Review Classification via Human-Provided Information. In Proceedings of ACL, 2004.
- B. Pang and L. Lee. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. In Proceedings of ACL, 2004.
- M. Gamon. Sentiment Classification on Customer Feedback Data: Noisy Data, Large Feature Vectors, and the Role of Linguistic Analysis. In Proceedings of COLING, 2004.
- Casey Whitelaw, Navendu Garg, and Shlomo Argamon. Using Appraisal Groups for Sentiment Analysis. In Proceedings of CIKM, 2005.
- T. Wilson, J. Wiebe, and P. Hoffmann. Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. In Proceedings of HLT/EMNLP, 2005.
- A. Aue and M. Gamon. Customizing Sentiment Classifiers to New Domains: A Case Study. In Proceedings of RANLP, 2005.
- Andrea Esuli and Fabrizio Sebastiani. SentiWordNet: A Publicly Available Lexical Resource for Opinion Mining. In Proceedings of LREC, 2006.
References (Supervised Sentence-Level Classification)
- T. Joachims. Making Large-Scale SVM Learning Practical. In B. Schölkopf, C. Burges, and A. Smola (eds.), Advances in Kernel Methods - Support Vector Learning. MIT Press, 1999.
- S. Scott and S. Matwin. Text Classification Using WordNet Hypernyms. In Proceedings of the Workshop on Usage of WordNet in Natural Language Processing Systems, 1998.
- B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. In Proceedings of EMNLP, 2002.
- J. Wiebe, T. Wilson, and C. Cardie. Annotating Expressions of Opinions and Emotions in Language. 2003.
- M. Gamon. Sentiment Classification on Customer Feedback Data: Noisy Data, Large Feature Vectors, and the Role of Linguistic Analysis. In Proceedings of COLING, 2004.
- T. Wilson, J. Wiebe, and P. Hoffmann. Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. In Proceedings of HLT/EMNLP, 2005.
References (Supervised Sentence-Level Classification, continued)
- S. Matsumoto, H. Takamura, and M. Okumura. Sentiment Classification Using Word Sub-sequences and Dependency Sub-trees. In Proceedings of PAKDD, 2005.
- V. Ng, S. Dasgupta, and S. M. Niaz Arifin. Examining the Role of Linguistic Knowledge Sources in the Automatic Identification and Classification of Reviews. In Proceedings of COLING/ACL, 2006.
- R. McDonald, K. Hannan, T. Neylon, and M. Wells. Structured Models for Fine-to-Coarse Sentiment Analysis. In Proceedings of ACL, 2007.
- K. Moilanen and S. Pulman. Sentiment Composition. In Proceedings of RANLP, 2007.
- A. Meena and T. V. Prabhakar. Sentence Level Sentiment Analysis in the Presence of Conjuncts Using Linguistic Analysis. In Proceedings of ECIR, 2007.
- [final entry truncated in the source]