slide-1
SLIDE 1

Exploring Demographic Language Variations to Improve Multilingual Sentiment Analysis in Social Media

Svitlana Volkova¹, Theresa Wilson² and David Yarowsky¹,²

¹ Center for Language and Speech Processing, Johns Hopkins University
² Human Language Technology Center of Excellence

slide-9
SLIDE 9

Motivation

Demographic language variations (DLV) have been studied by sociolinguists for decades (Picard, 1997; Gefen & Ridings, 2005; Holmes & Meyerhoff, 2004; Macaulay, 2006; Tagliamonte, 2006).

DLV have recently been explored in personal email communication, blog posts, and public discussions (Boneva et al., 2001; Mohammad & Yang, 2011; Eisenstein et al., 2010; Bamman et al., 2012).

We propose to study differences in subjective language in social media to support commercial applications:

  • personalized recommendation systems and targeted online advertising (Fan & Chang, 2009),
  • detecting helpful product reviews (Ott et al., 2011),
  • tracking sentiment in real time (Resnik, 2013),
  • large-scale, low-cost, passive polling (O’Connor et al., 2010).

  • S. Volkova, T. Wilson, D. Yarowsky (JHU)

Demographic Language Variations in Twitter 2 / 35

slide-15
SLIDE 15

Motivation

Male ♂ and Female ♀ Twitter users use subjective terms differently:

  • ♀+ “Chocolate is my weakness”
  • ♂− “Clearly they know our weakness. Argggg....”

slide-20
SLIDE 20

Goal

  • I. Explore gender bias in the use of subjective language in Twitter:
      • investigate multilingual subjective lexical variations;
      • cross-cultural emoticon and hashtag usage.

  • II. Incorporate gender bias into models to improve sentiment analysis for English, Spanish, and Russian:
      • demonstrate that simple, binary features representing author gender are insufficient for gender-dependent sentiment analysis.

slide-30
SLIDE 30

Data

Automatic gender label prediction using user first name morphology (precision is above 0.98 across languages).

Sentiment labels from Mechanical Turk (5 annotations per tweet):

  • Positive: Как же приятно просто лечь в постель после тяжелого дня... (It is a great pleasure to go to bed after a long day at work...)
  • Negative: Уважаемый господин Прохоров купите эти выборы! (Dear Mr. Prokhorov just buy the elections!)
  • Both: Затолкали меня на местном рынке! но зато закупилась подарками для всей семьи :) (It was crowded at the local market! But I got presents for my family :-))
  • Neutral: Киев очень старый город (Kiev is a very old city).

slide-33
SLIDE 33

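Metrics: Lexical Evaluation across Genders

Term $t_i$ subjectivity:

$$p_{t_i}(\mathrm{subj} \mid g) = \frac{c(t_i, P, g) + c(t_i, N, g)}{c(t_i, g)}$$

Term $t_i$ polarity:

$$p_{t_i}(+ \mid g) = \frac{c(t_i, P, g)}{c(t_i, P, g) + c(t_i, N, g)}$$

Polarity change across genders:

$$\Delta p^{+}_{t_i} = \left| p_{t_i}(+ \mid F) - p_{t_i}(+ \mid M) \right| \quad \text{s.t.} \quad \left| 1 - \frac{tf^{\mathrm{subj}}_{t_i}(F)}{tf^{\mathrm{subj}}_{t_i}(M)} \right| \le \lambda, \quad tf^{\mathrm{subj}}_{t_i}(M) \ne 0,$$

where $\lambda$ controls term frequency similarity.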
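The three metrics above can be computed directly from labeled counts. The following is a minimal Python sketch, not the authors' implementation. The function names, the `(tokens, label, gender)` input format, the label code `"O"` for tweets that are neither positive nor negative, counting each term once per tweet, and reading tf^subj as the raw count of subjective (P or N) occurrences are all assumptions made for illustration.

```python
from collections import Counter


def count_terms(tweets):
    """Collect c(t, label, g) and c(t, g) from (tokens, label, gender) triples.

    Assumed label codes: "P" positive, "N" negative, "O" anything else.
    """
    by_label = Counter()  # c(t, label, g)
    total = Counter()     # c(t, g)
    for tokens, label, gender in tweets:
        for term in set(tokens):  # count each term once per tweet (assumption)
            by_label[(term, label, gender)] += 1
            total[(term, gender)] += 1
    return by_label, total


def subj_count(by_label, term, gender):
    """c(t, P, g) + c(t, N, g); also used here as tf^subj_t(g) (assumption)."""
    return by_label[(term, "P", gender)] + by_label[(term, "N", gender)]


def subjectivity(by_label, total, term, gender):
    """p_t(subj | g), or None when the term never occurs for this gender."""
    if total[(term, gender)] == 0:
        return None
    return subj_count(by_label, term, gender) / total[(term, gender)]


def polarity(by_label, term, gender):
    """p_t(+ | g), or None when the term has no subjective occurrences."""
    subj = subj_count(by_label, term, gender)
    if subj == 0:
        return None
    return by_label[(term, "P", gender)] / subj


def polarity_change(by_label, term, lam):
    """|p_t(+|F) - p_t(+|M)| if the frequency constraint holds, else None."""
    tf_f = subj_count(by_label, term, "F")
    tf_m = subj_count(by_label, term, "M")
    # tf_m must be non-zero for the ratio; tf_f must be non-zero for p(+|F).
    if tf_m == 0 or tf_f == 0:
        return None
    if abs(1 - tf_f / tf_m) > lam:
        return None
    return abs(polarity(by_label, term, "F") - polarity(by_label, term, "M"))
```

A small λ keeps only terms that both genders use about equally often in subjective tweets, so a large polarity change is less likely to come from a frequency imbalance alone.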

slide-37
SLIDE 37

Lexical Evaluation across Genders for English

[Figure. Legend: terms from LI, terms from the bootstrapped lexicon LB (△), and hashtags.]

slide-39
SLIDE 39

Lexical Evaluation for Spanish and Russian

Spanish:

  • fiasco, triunfar (succeed) and #britneyspears are used F+ but M−;
  • horooriza (horrifies), #metallica and #latingrammy are used F− but M+.

Russian:

  • мечтайте (dream!), магический (magical) and совет (advice) are used F+ but M−;
  • исскушение (temptation), сложны (complicated), #iphones and #spartak (soccer team) are used F− but M+.

slide-40
SLIDE 40

How can gender differences in subjective language help subjectivity and polarity classification in social media?

slide-43
SLIDE 43

Rule-based Subjectivity Classifiers

Gender-Independent (Riloff & Wiebe, 2003; Volkova et al., 2013): GIndRB

$$\mathrm{subj} = \begin{cases} 1 & \text{if } w \cdot f \ge 0.5, \\ 0 & \text{otherwise.} \end{cases}$$

Gender-Dependent: GDepRB

$$\mathrm{subj} = \begin{cases} 1 & \text{if } w^{M} \cdot f^{M} \ge 0.5 \wedge M, \\ 1 & \text{if } w^{F} \cdot f^{F} \ge 0.5 \wedge F, \\ 0 & \text{otherwise.} \end{cases}$$
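The subjectivity rules can be sketched in a few lines of Python. This is an illustrative reading, not the authors' code: it assumes that f holds the counts of lexicon terms in a tweet and that w holds one weight per lexicon term, so that w · f is a sum of term weights over the tokens. The function names and the dictionary-based lexicons are hypothetical.

```python
def weighted_score(tokens, lexicon):
    """Dot product w . f, with f as term counts and w as lexicon weights."""
    return sum(lexicon.get(token, 0.0) for token in tokens)


def subj_gind(tokens, lexicon, threshold=0.5):
    """Gender-independent rule: 1 if w . f >= threshold, else 0."""
    return 1 if weighted_score(tokens, lexicon) >= threshold else 0


def subj_gdep(tokens, gender, lexicons, threshold=0.5):
    """Gender-dependent rule: use the weights for the author's gender."""
    return 1 if weighted_score(tokens, lexicons[gender]) >= threshold else 0


# Toy weights (made up for illustration only).
tweet = ["chocolate", "is", "my", "weakness"]
shared = {"weakness": 0.5}
by_gender = {"F": {"weakness": 1.0}, "M": {"weakness": 0.25}}

print(subj_gind(tweet, shared))          # same weights for every author
print(subj_gdep(tweet, "F", by_gender))  # weights chosen by author gender
print(subj_gdep(tweet, "M", by_gender))
```

The only difference between the two rules is which weight table is consulted, which is why a term that is a strong clue for one gender can fall below the threshold for the other.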

slide-45
SLIDE 45

Rule-based Subjectivity Classification Results

Start with LI and incrementally add Emoticons, Adjectives, AdveRbs, Verbs, Nouns from LB.

slide-48
SLIDE 48

Rule-based Polarity Classifiers

Gender-Independent (Riloff & Wiebe, 2003; Volkova et al., 2013): GIndRB

$$\mathrm{pol} = \begin{cases} 1 & \text{if } w^{+} \cdot f^{+} \ge w^{-} \cdot f^{-}, \\ 0 & \text{otherwise.} \end{cases}$$

Gender-Dependent: GDepRB

$$\mathrm{pol} = \begin{cases} 1 & \text{if } w^{M+} \cdot f^{M+} \ge w^{M-} \cdot f^{M-} \wedge M, \\ 1 & \text{if } w^{F+} \cdot f^{F+} \ge w^{F-} \cdot f^{F-} \wedge F, \\ 0 & \text{otherwise.} \end{cases}$$
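A comparable sketch for the polarity rules follows. As before, it is only an illustration under assumptions: positive and negative lexicons are modeled as hypothetical dictionaries from term to weight, and the function names are invented here. The toy weights mirror the "weakness" example from the Motivation slide and are not taken from the data.

```python
def polarity_gind(tokens, pos_lexicon, neg_lexicon):
    """Return 1 if the positive score w+ . f+ is at least w- . f-, else 0."""
    pos = sum(pos_lexicon.get(token, 0.0) for token in tokens)
    neg = sum(neg_lexicon.get(token, 0.0) for token in tokens)
    return 1 if pos >= neg else 0


def polarity_gdep(tokens, gender, pos_lexicons, neg_lexicons):
    """Apply the same comparison with the lexicons for the author's gender."""
    return polarity_gind(tokens, pos_lexicons[gender], neg_lexicons[gender])


# Toy gender-specific lexicons (assumed values for illustration).
pos_lexicons = {"F": {"weakness": 1.0}, "M": {}}
neg_lexicons = {"F": {}, "M": {"weakness": 1.0}}

tweet = ["they", "know", "our", "weakness"]
print(polarity_gdep(tweet, "F", pos_lexicons, neg_lexicons))
print(polarity_gdep(tweet, "M", pos_lexicons, neg_lexicons))
```

Because the comparison uses ≥, a tweet with no lexicon hits at all (both scores zero) is assigned 1 under this rule.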

slide-50
SLIDE 50

Rule-based Polarity Classification Results

Start with LI and incrementally add Emoticons, Adjectives, AdveRbs, Verbs, Nouns from LB.

slide-52
SLIDE 52

Experimental Setup

Gender-Independent features: $V$ - unigram counts; $L_I$, $L_B$ - set-count features from the original and bootstrapped lexicons; and $E$ - emoticons.

$$f^{\mathrm{GInd}}_{\mathrm{subj}} = [L_I, L_B, E, V];$$

$$f^{\mathrm{GInd}}_{\mathrm{pol}} = [L_I^{+}, L_B^{+}, E^{+}, L_I^{-}, L_B^{-}, E^{-}, V].$$

Gender-Dependent joint features:

$$f^{\mathrm{GDep\text{-}J}}_{\mathrm{subj}} = [L_I^{M}, L_B^{M}, E^{M}, L_I^{F}, L_B^{F}, E^{F}, V];$$

$$f^{\mathrm{GDep\text{-}J}}_{\mathrm{pol}} = [L_I^{M+}, L_B^{M+}, E^{M+}, L_I^{F+}, L_B^{F+}, E^{F+}, L_I^{M-}, L_B^{M-}, E^{M-}, L_I^{F-}, L_B^{F-}, E^{F-}, V].$$
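To make the layout of these vectors concrete, here is a minimal Python sketch for the subjectivity features. It is not the authors' feature extractor. In particular, the slides do not say how the gender-specific slots are filled; this sketch assumes that the lexicon counts go into the slots of the author's gender and that the other gender's slots stay at zero. The function names and the toy lexicons are hypothetical.

```python
def set_count(tokens, term_set):
    """Number of tokens in the tweet that belong to the given term set."""
    return sum(1 for token in tokens if token in term_set)


def unigram_counts(tokens, vocab):
    """V: count of each vocabulary word, in the fixed order of `vocab`."""
    return [tokens.count(word) for word in vocab]


def features_gind_subj(tokens, L_I, L_B, E, vocab):
    """Gender-independent vector [L_I, L_B, E, V] for one tweet."""
    lex = [set_count(tokens, L_I), set_count(tokens, L_B), set_count(tokens, E)]
    return lex + unigram_counts(tokens, vocab)


def features_gdep_joint_subj(tokens, gender, L_I, L_B, E, vocab):
    """Joint vector [L_I^M, L_B^M, E^M, L_I^F, L_B^F, E^F, V] for one tweet.

    Assumption: counts fill the slots of the author's gender; the other
    gender's slots are zero.
    """
    lex = [set_count(tokens, L_I), set_count(tokens, L_B), set_count(tokens, E)]
    zeros = [0] * len(lex)
    male = lex if gender == "M" else zeros
    female = lex if gender == "F" else zeros
    return male + female + unigram_counts(tokens, vocab)


# Toy lexicons and vocabulary (made up for illustration).
L_I, L_B, E = {"love"}, {"awesome"}, {":)"}
vocab = ["love", "this", "hate"]
tweet = ["love", "this", ":)"]

print(features_gind_subj(tweet, L_I, L_B, E, vocab))
print(features_gdep_joint_subj(tweet, "F", L_I, L_B, E, vocab))
```

Under this reading, a classifier can learn separate weights for the same lexicon clue depending on author gender, while the unigram block V stays shared.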

slide-55
SLIDE 55

Subjectivity Classification Results using SL

slide-58
SLIDE 58

Polarity Classification Results using SL

slide-62
SLIDE 62

Summary

  • Empirical study of differences in subjective language between male and female users in Twitter.
  • Analysis of hashtag and emoticon usage across cultures.
  • Incorporating author gender as a model component can significantly improve subjectivity and polarity classification for multiple languages in social media.

Data: http://www.cs.jhu.edu/~svitlana/data/data_emnlp2013.zip

slide-63
SLIDE 63

References I

Bamman, D., Eisenstein, J., & Schnoebelen, T. (2012). Gender in Twitter: Styles, stances, and social networks. Computing Research Repository.

Boneva, B., Kraut, R., & Frohlich, D. (2001). Using email for personal relationships: The difference gender makes. American Behavioral Scientist, 45(3), 530-549.

Eisenstein, J., O’Connor, B., Smith, N. A., & Xing, E. P. (2010). A latent variable model for geographic lexical variation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’10) (pp. 1277-1287).

Fan, T. K., & Chang, C. H. (2009). Sentiment-oriented contextual advertising. Advances in Information Retrieval, 5478, 202-215.

slide-64
SLIDE 64

References II

Gefen, D., & Ridings, C. M. (2005). If you spoke as she does, sir, instead of the way you do: A sociolinguistics perspective of gender differences in virtual communities. SIGMIS Database, 36(2), 78-92.

Holmes, J., & Meyerhoff, M. (2004). The handbook of language and gender. Blackwell Publishing.

Macaulay, R. (2006). Pure grammaticalization: The development of a teenage intensifier. Language Variation and Change, 18(03), 267-283.

Mohammad, S., & Yang, T. (2011). Tracking sentiment in mail: How genders differ on emotional axes. In Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA’11) (pp. 70-79).

slide-65
SLIDE 65

References III

O’Connor, B., Eisenstein, J., Xing, E. P., & Smith, N. A. (2010). A mixture model of demographic lexical variation. In Proceedings of NIPS Workshop on Machine Learning in Computational Social Science (pp. 1-7).

Ott, M., Choi, Y., Cardie, C., & Hancock, J. T. (2011). Finding deceptive opinion spam by any stretch of the imagination. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (pp. 309-319).

Picard, R. W. (1997). Affective computing. MIT Press.

Resnik, P. (2013). Getting real(-time) with live polling. (http://vimeo.com/68210812)

Riloff, E., & Wiebe, J. (2003). Learning extraction patterns for subjective expressions. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’03) (pp. 105-112).

slide-66
SLIDE 66

References IV

Tagliamonte, S. A. (2006). Analysing sociolinguistic variation (1st ed.). Cambridge University Press.

Volkova, S., Wilson, T., & Yarowsky, D. (2013). Exploring sentiment in social media: Bootstrapping subjectivity clues from multilingual Twitter streams. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL ’13) (pp. 505-510).
