Natural Language Processing for Framing (Noah Smith, University of Washington)



slide-1
SLIDE 1

Natural Language Processing for Framing

Noah Smith
University of Washington
nasmith@cs.washington.edu

Collaborators: David Bamman (UCB), Amber Boydstun (UCD), Dallas Card (CMU), Justin Gross (UMass), Brendan O’Connor (UMass), Philip Resnik (UMD)

May 24, 2016

These slides: http://tinyurl.com/framing-noah

slide-2
SLIDE 2

Outline

◮ Motivation: a study in which we’re using NLP
◮ Building a text classifier:
  1. Define the classes
  2. Annotate training examples
  3. Featurize data
◮ Brief tangent: creating new features
  4. Learn to classify
  5. Evaluate the classifier
◮ Looking ahead

slide-3
SLIDE 3

Some Terminology

Natural language processing (NLP): Algorithms that do useful things (for someone) with text (or other linguistic data).

Framing is choosing “a few elements of perceived reality and assembling a narrative that highlights connections among them to promote a particular interpretation.” Entman (1993, 2007)

slide-4
SLIDE 4

Media Framing and Public Opinion

◮ We know that framing works . . . sometimes.
◮ Lack of systematic tests of framing effects on public opinion

When do media framing and public opinion covary?

slide-6
SLIDE 6

Hypotheses

H1: Issue Salience
The covariance between media framing of immigration and public opinion will be stronger during periods of time when immigration is highly salient in the media, relative to periods of time when the issue is not highly salient.

H2: Frame Competition
The more diffuse media coverage of immigration is across competing frames, the weaker the covariance between media framing of the issue and public opinion about the issue will be.

slide-8
SLIDE 8

Variables

◮ Public mood (dependent variable) from Stimson (2014) (higher is more liberal)

[Figure: public mood over time, 1952–2012; vertical axis ticks from 40 to 75]

slide-9
SLIDE 9

Text Corpus

◮ 13 U.S. newspapers (e.g., NYT, USA Today)
◮ 1980–2012 (132 quarters)
◮ 38,283 articles
◮ Annotated for tone (pro/anti-immigration) and 14 emphasis framing “dimensions”
  ◮ Random subset of 4,154 manually annotated
  ◮ Automatic annotation of the rest
  ◮ (More about this later!)

slide-12
SLIDE 12

Variables

◮ Public mood (dependent variable) from Stimson (2014) (higher is more liberal)

From the text corpus (38,283 articles):

◮ Media tone: count(pro) – count(anti)

[Figure: three panels (Pro, Neutral, Anti) over 1980–2010; vertical axis ticks at 250, 500, 750]

slide-14
SLIDE 14

Variables

◮ Public mood (dependent variable) from Stimson (2014) (higher is more liberal)

From the text corpus (38,283 articles):

◮ Media tone: count(pro) – count(anti)
◮ High salience: ≥ 350 articles published in the quarter? (binary)

H1: Public mood ∝ Media tone × High salience
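
The two text-derived variables on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the input format (one `(quarter, tone)` pair per article), the quarter labels, and the function name `quarterly_variables` are hypothetical, and the slide does not say whether media tone was normalized before entering the regression, so raw counts are used here.

```python
from collections import Counter

SALIENCE_THRESHOLD = 350  # articles per quarter, as stated on the slide


def quarterly_variables(articles):
    """Compute media tone and high salience per quarter.

    `articles` is an iterable of (quarter, tone) pairs, where tone is one of
    "pro", "neutral", "anti" (a hypothetical input format).
    """
    tone_counts = {}
    for quarter, tone in articles:
        tone_counts.setdefault(quarter, Counter())[tone] += 1

    variables = {}
    for quarter, counts in tone_counts.items():
        total_articles = sum(counts.values())
        variables[quarter] = {
            # Media tone: count(pro) - count(anti)
            "media_tone": counts["pro"] - counts["anti"],
            # High salience: 1 if at least 350 articles appeared in the quarter
            "high_salience": int(total_articles >= SALIENCE_THRESHOLD),
        }
    return variables


# Toy data (invented for illustration, not from the corpus)
toy = (
    [("2006Q2", "pro")] * 200
    + [("2006Q2", "anti")] * 120
    + [("2006Q2", "neutral")] * 60
    + [("1985Q1", "pro")] * 10
    + [("1985Q1", "anti")] * 25
)
result = quarterly_variables(toy)
```

In the toy data, the first quarter has 380 articles and so counts as highly salient, while the second has only 35.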

slide-16
SLIDE 16

Framing Dimensions over Time


slide-17
SLIDE 17

Variables

◮ Public mood (dependent variable) from Stimson (2014) (higher is more liberal)

From the text corpus (38,283 articles):

◮ Media tone: count(pro) – count(anti)
◮ High salience: ≥ 350 articles published in the quarter? (binary)
◮ Frame competition: Shannon entropy across emphasis framing dimensions in the quarter

H2: Public mood ∝ –(Media tone × Frame competition)
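
Frame competition is defined above as the Shannon entropy of the distribution over framing dimensions in a quarter. A minimal sketch follows, assuming the input is a mapping from dimension name to the number of articles using it (a hypothetical format) and using the natural logarithm, since the slide does not state the log base.

```python
import math


def frame_competition(frame_counts):
    """Shannon entropy (in nats) of the distribution over framing dimensions.

    `frame_counts` maps a framing dimension to its article count in one quarter.
    """
    total = sum(frame_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in frame_counts.values():
        if count > 0:  # terms with zero probability contribute nothing
            p = count / total
            entropy -= p * math.log(p)
    return entropy


# Coverage concentrated in one frame: no competition
concentrated = frame_competition({"Economic": 10})
# Coverage spread evenly over four frames: entropy is log(4)
diffuse = frame_competition({"Economic": 5, "Political": 5, "Morality": 5, "Legality": 5})
```

Entropy is lowest when one frame dominates and highest when coverage is spread evenly, which matches the "diffuse coverage" wording of H2.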

slide-18
SLIDE 18

Regression

                                   coefficient   standard error
Public mood (lagged)                      0.83             0.05
Media tone                              222.09           108.53
High salience                             0.30             1.26
Media tone × high salience                9.57             5.00   (H1)
Frame competition                       –10.06            10.86
Media tone × frame competition          –87.41            43.60   (H2)
Constant                                 32.48            27.61

N = 132; adjusted R² = 0.759; RMSE = 3.772 (on the original slide, coefficients with p < 0.05 appear in bold)
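
The table can be read as one linear model with two interaction terms. The form below is a sketch inferred from the row labels, not a specification given in the slides: the symbol names are hypothetical, and the one-quarter lag is an assumption, since the slide says only "lagged".

```latex
\begin{aligned}
\text{mood}_t ={}& \beta_0 + \beta_1\,\text{mood}_{t-1} + \beta_2\,\text{tone}_t + \beta_3\,\text{sal}_t
                   + \beta_4\,(\text{tone}_t \times \text{sal}_t) \\
                 &+ \beta_5\,\text{comp}_t + \beta_6\,(\text{tone}_t \times \text{comp}_t) + \varepsilon_t \\[4pt]
\frac{\partial\,\text{mood}_t}{\partial\,\text{tone}_t}
              ={}& \beta_2 + \beta_4\,\text{sal}_t + \beta_6\,\text{comp}_t
              \approx 222.09 + 9.57\,\text{sal}_t - 87.41\,\text{comp}_t
\end{aligned}
```

Under this reading, H1 corresponds to a positive interaction coefficient (estimated at 9.57) and H2 to a negative one (estimated at –87.41): the association between tone and mood grows in high-salience quarters and shrinks as frame competition rises.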

slide-19
SLIDE 19

Discussion

◮ Public opinion on immigration ∝ Media tone on immigration
  ◮ . . . more when immigration is a salient issue
  ◮ . . . less when frame competition is high
◮ Still to be accounted for:
  ◮ Demographic shifts
  ◮ Major events
  ◮ . . .

slide-21
SLIDE 21

Text Classification

Mosteller and Wallace (1963) automatically inferred the authors of the disputed Federalist Papers. Many other examples:

◮ News: politics vs. sports vs. business vs. technology ...
◮ Reviews of films, restaurants, products: positive vs. negative
◮ Email: spam vs. not
◮ What is the reading level of a piece of text?
◮ Will a scientific paper be cited?
◮ Will a piece of proposed legislation pass?

slide-22
SLIDE 22

Media Frames Codebook: Framing Dimensions

Boydstun et al. (2014)

Economic: costs, benefits, or other financial implications
Capacity and resources: availability of physical, human, or financial resources
Morality: religious or ethical implications
Fairness and equality: balance or distribution of rights, responsibilities, and resources
Legality, constitutionality and jurisprudence: rights, freedoms, and the authority of government
Policy prescription and evaluation: discussion of specific policies aimed at addressing problems
Crime and punishment: effectiveness and implications of laws and their enforcement
Security and defense: threats to welfare of the individual, community, or nation
Health and safety: health care, sanitation, and public safety
Quality of life: threats and opportunities for the individual’s health, happiness, and well-being
Cultural identity: traditions, customs, or values of a social group in relation to a policy issue
Public opinion: attitudes and opinions of the general public, including polling and demographics
Political: considerations related to politics and politicians, including lobbying, elections, and attempts to sway voters
External regulation and reputation: international reputation or foreign policy of the United States
Other: any coherent group of frames not covered by the above categories

slide-24
SLIDE 24

Media Frames Corpus

Card et al. (2015)

◮ Articles selected by keyword search across thirteen newspapers, 1980–2012, on three issues
◮ Annotated for primary framing dimension, overall tone (i.e., stance on the issue, pro-/anti-/neutral), and arbitrary spans that evoke framing dimensions
  ◮ 5,549 (immigration)
  ◮ 6,298 (same-sex marriage)
  ◮ 4,077 (smoking)
◮ https://github.com/dallascard/media_frames_corpus

slide-25
SLIDE 25

Example (Denver Post, 2006)


[WHERE THE JOBS ARE]Economic [Critics of illegal immigration can make many cogent arguments to support the position that the U.S. Congress and the Colorado legislature must develop effective and well-enforced immigration policies that will restrict the number of people who migrate here legally and illegally.]Policy prescription [It’s true that all forms of [immigration exert influence over our economic and cultural make-up.]Cultural identity In some ways, immigration improves our economy by adding laborers, taxpayers and consumers, and in other ways immigration detracts from our economy by increasing the number of students, health care recipients and other beneficiaries of public services.]Economic [Some economists say that immigrants, legal and illegal, produce a net economic gain, while others say that they create a net loss]Economic. There are rational arguments to support both sides of this debate, and it’s useful and educational to hear the varying positions.

slide-26
SLIDE 26

Example (Denver Post, 2006)


[WHERE THE JOBS ARE]Economic [Critics of illegal immigration can make many cogent arguments to support the position that the U.S. Congress and the Colorado legislature must develop effective and well-enforced immigration policies that will restrict the number of people who migrate here legally and illegally.]Public opinion [It’s true that all forms of immigration exert influence over our economic and [cultural make-up.]Cultural identity In some ways, immigration improves our economy by adding laborers, taxpayers and consumers, and in other ways [immigration detracts from our economy by increasing the number of students, health care recipients and other beneficiaries of public services.]Capacity]Economic [Some economists say that immigrants, legal and illegal, produce a net economic gain, while others say that they create a net loss.]Economic There are rational arguments to support both sides of this debate, and it’s useful and educational to hear the varying positions.

slide-27
SLIDE 27

Interannotator Agreement

Card et al. (2015)

[Figure: Krippendorff’s alpha (vertical axis, 0.0–1.0) by annotation round (horizontal axis, 10–40), across Stage 1, Stage 2, and Stage 3, with one series each for Immigration, Smoking, and Same-sex marriage]
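
For readers unfamiliar with the statistic, Krippendorff’s alpha is 1 minus the ratio of observed to expected disagreement. The sketch below covers only the simplest case (nominal labels, exactly two annotators, no missing annotations); it is not the computation used for the corpus, whose details are not given in the slides. The function name and the toy labels are hypothetical.

```python
from collections import Counter


def krippendorff_alpha_nominal(labels_a, labels_b):
    """Krippendorff's alpha for two annotators, nominal labels, no missing data."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")

    n_values = 2 * len(labels_a)  # total number of labels assigned
    value_counts = Counter(labels_a) + Counter(labels_b)

    # Observed disagreement: each disagreeing item yields two ordered label pairs.
    disagreeing_pairs = 2 * sum(a != b for a, b in zip(labels_a, labels_b))
    observed = disagreeing_pairs / n_values

    # Expected disagreement if labels were paired at random (without replacement).
    expected_pairs = sum(
        value_counts[c] * value_counts[k]
        for c in value_counts
        for k in value_counts
        if c != k
    )
    expected = expected_pairs / (n_values * (n_values - 1))

    if expected == 0:
        # Only one label was ever used; alpha is undefined. Returning 1.0 is a
        # convention chosen for this sketch.
        return 1.0
    return 1.0 - observed / expected


annotator_1 = ["Economic", "Economic", "Political", "Political"]
annotator_2 = ["Economic", "Political", "Political", "Political"]
alpha = krippendorff_alpha_nominal(annotator_1, annotator_2)
```

With one disagreement in four items, the observed disagreement is 0.25 and the expected disagreement is 30/56, giving an alpha of 8/15.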

slide-33
SLIDE 33

“Featurizing” Text Data

words: protest, rally, poll, march, protester, boycott, voter

Blei et al. (2003) topics: “protest/religion” (rally, rallies, marchers, church, los); “non-profits” (raza, advocacy, sierra, coalition)

Schneider and Smith (2015) multiwords: deal with, los angeles, a number of, day laborers, immigration status, new yorkers, federal officials, hunger strike, service employees international union

FrameNet/Das et al. (2014) predicates: Taking sides, Emotion directed, Attack, Colonization, Discussion, Judgment communication

. . . , Manning et al. (2014) syntactic classes, dependencies, sentiment, and named entities; Brown et al. (1992) clusters; Wikipedia page titles (Singh et al., 2012), . . .
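
The simplest of these feature types, word features, can be sketched as follows. This is an illustration only: whitespace tokenization, the added bigram features, and the function name `featurize` are assumptions made here, and the richer features listed above (topics, multiwords, frame-semantic predicates) require the cited tools and are not reproduced.

```python
from collections import Counter


def featurize(text, downcase=True, binarize=False):
    """Map a document to a dictionary of unigram and bigram features."""
    if downcase:
        text = text.lower()
    tokens = text.split()  # whitespace tokenization, a deliberate simplification

    counts = Counter(tokens)  # unigram counts
    counts.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))  # bigram counts

    if binarize:
        # Record only presence, not frequency
        return {feature: 1 for feature in counts}
    return dict(counts)


features = featurize("Protest march protest")
```

Each document becomes a sparse mapping from feature name to value, which is the form a linear classifier consumes.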

slide-34
SLIDE 34

Pop Quiz: Which Dimension’s Classifier Was That?

Economic: costs, benefits, or other financial implications
Capacity and resources: availability of physical, human, or financial resources
Morality: religious or ethical implications
Fairness and equality: balance or distribution of rights, responsibilities, and resources
Legality, constitutionality and jurisprudence: rights, freedoms, and the authority of government
Policy prescription and evaluation: discussion of specific policies aimed at addressing problems
Crime and punishment: effectiveness and implications of laws and their enforcement
Security and defense: threats to welfare of the individual, community, or nation
Health and safety: health care, sanitation, and public safety
Quality of life: threats and opportunities for the individual’s health, happiness, and well-being
Cultural identity: traditions, customs, or values of a social group in relation to a policy issue
Public opinion: attitudes and opinions of the general public, including polling and demographics
Political: considerations related to politics and politicians, including lobbying, elections, and attempts to sway voters
External regulation and reputation: international reputation or foreign policy of the United States

slide-38
SLIDE 38

Framing through Personas

◮ Bamman et al. (2013) introduced a latent variable model for clustering mentions of entities into personas
◮ Preprocessing: dependency parsing, multiword analysis, named entity recognition, pronominal coreference, and lots of filtering.
◮ Applied to the immigration articles (k = 50), we find “personas” for:
  ◮ Immigrants
  ◮ Government agencies, administrations, police/officials, courts/judges
  ◮ Political candidates
  ◮ Employers
  ◮ Universities/schools
  ◮ Protesters
  ◮ Terrorists
  ◮ Green cards, laws/policies, law suits
  ◮ Countries
  ◮ Information sources
  ◮ Elián González

slide-39
SLIDE 39

Immigrant “Personas”

Persona 2: deport live come detain leave hold arrest release face arrive go take flee allow tell want send get return old woman man people immigrant family

Persona 44: foreign skilled american hire high-tech allow bring temporary many find need import work domestic hire recruit get pay new seasonal worker company student immigrant employer

Persona 50: illegal criminal deport commit arrest immigrant convict convict legal commit deport serious release identify hold arrest detain violent undocumented remove immigrant alien crime people deportation

slide-41
SLIDE 41

Learning Classifiers for Framing in Text

◮ 4,154 training documents, with annotations converted to presence/absence per framing dimension (10% held out for testing)
◮ Logistic regression to predict presence/absence for each dimension
◮ Bayesian optimization (Yogatama et al., 2015) with the ladder (Blum and Hardt, 2015):
  ◮ Which features to include?
  ◮ Minimum word count
  ◮ Binarize, tfidf, or no transformation?
  ◮ Downcase?
  ◮ Paragraph and sentence “pseudodocuments,” with weights (Zaidan et al., 2007)
  ◮ Regularization strength (ℓ1 and ℓ2) for each feature type
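
To make the "one logistic regression per dimension" step concrete, here is a minimal pure-Python sketch of a binary classifier trained by batch gradient descent with ℓ2 regularization only. It is not the authors’ implementation: the hyperparameter values, function names, and toy documents are invented for illustration, and the ℓ1 penalty and the Bayesian optimization of representation choices are omitted.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(examples, labels, l2=0.1, learning_rate=0.5, epochs=200):
    """Fit weights for one framing dimension (label 1 = dimension present).

    `examples` is a list of {feature: value} dictionaries; `labels` is a list of 0/1.
    """
    weights, bias = {}, 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w, grad_b = {}, 0.0
        for features, y in zip(examples, labels):
            z = bias + sum(weights.get(f, 0.0) * v for f, v in features.items())
            error = sigmoid(z) - y  # gradient of the log loss with respect to z
            grad_b += error / n
            for f, v in features.items():
                grad_w[f] = grad_w.get(f, 0.0) + error * v / n
        for f in set(grad_w) | set(weights):
            # The l2 term shrinks every weight toward zero
            gradient = grad_w.get(f, 0.0) + l2 * weights.get(f, 0.0)
            weights[f] = weights.get(f, 0.0) - learning_rate * gradient
        bias -= learning_rate * grad_b
    return weights, bias


def predict_proba(features, weights, bias):
    """Probability that the framing dimension is present in a document."""
    return sigmoid(bias + sum(weights.get(f, 0.0) * v for f, v in features.items()))


# Toy binarized documents for a hypothetical "Public opinion" classifier
docs = [
    {"protest": 1, "rally": 1},
    {"poll": 1, "voter": 1},
    {"tax": 1, "cost": 1},
    {"job": 1, "wage": 1},
]
present = [1, 1, 0, 0]
weights, bias = train_logistic_regression(docs, present)
p_present = predict_proba({"rally": 1, "poll": 1}, weights, bias)
p_absent = predict_proba({"cost": 1, "wage": 1}, weights, bias)
```

In the setup described above, one such classifier would be trained per framing dimension, each with its own feature set and regularization strength.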

slide-44
SLIDE 44

Evaluation

1. Accuracy of binary classifiers, across 14 dimensions, computed on test cases where annotators agree: 90.4% ± 5.0
   Better measure, F1: 67.1% ± 12.9

[Figure: F1 as a function of the number of positive training examples; horizontal axis 200–1200, vertical axis 0.5–0.8]

slide-48
SLIDE 48

Evaluation

1. Accuracy of binary classifiers, across 14 dimensions, computed on test cases where annotators agree: 90.4% ± 5.0
   Better measure, F1: 67.1% ± 12.9
2. Absolute error in aggregate proportion estimation: 4.8% ± 1.8; 1.5% ± 1.3 with Hopkins and King (2010) correction
3. Different features selected for every dimension!
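
A small example shows why F1 is the better measure when a framing dimension is rare, and how aggregate proportion error differs from both. The data below are invented for illustration, and the proportion error shown is the uncorrected one; the Hopkins and King (2010) correction is not implemented here.

```python
def accuracy(gold, predicted):
    """Fraction of documents labeled correctly."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def f1_score(gold, predicted):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(g == 1 and p == 1 for g, p in zip(gold, predicted))
    fp = sum(g == 0 and p == 1 for g, p in zip(gold, predicted))
    fn = sum(g == 1 and p == 0 for g, p in zip(gold, predicted))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def proportion_error(gold, predicted):
    """Absolute error in the estimated share of positive documents."""
    return abs(sum(predicted) / len(predicted) - sum(gold) / len(gold))


# 100 test documents, 10 of which use the framing dimension
gold = [1] * 10 + [0] * 90
always_absent = [0] * 100                        # never predicts the dimension
mixed = [1] * 5 + [0] * 5 + [1] * 5 + [0] * 85   # 5 hits, 5 misses, 5 false alarms

scores = {
    name: (accuracy(gold, pred), f1_score(gold, pred), proportion_error(gold, pred))
    for name, pred in [("always_absent", always_absent), ("mixed", mixed)]
}
```

Both toy classifiers reach 90% accuracy, but the one that never predicts the dimension has an F1 of 0 while the other has an F1 of 0.5. The second also estimates the aggregate proportion exactly despite its individual errors, which is why proportion estimation is evaluated separately.
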
slide-49
SLIDE 49

Features Selected

[Table: features selected for each framing dimension. The column alignment of individual cells is not recoverable; the entries that survive are listed per row.]

Columns (feature types): uni., bi., JK, POS, NER, dep., sent., fr., Br., AM., Wl., LDA, P, AS, AP

Capacity & resources: L, L
Crime & punishment: L
Cultural identity: L, SS
Economic: L
External regulation: W, MWE
Fairness & equality:
Health & safety: L
Legality, jurisdiction: SS
Morality:
Policy prescription: L, SS
Political: L, W
Public sentiment: L, L, MWE
Quality of life: W, W, SS
Security & defense: W, W, SS

slide-51
SLIDE 51

Looking Ahead

◮ Taken together, frames imply a rich landscape of perspectives. Can we map it?
◮ Apps for “unframing” the news and other political discourse
◮ Fine-grained frames?
◮ Frame retrieval: expert describes frame, retrieve instances from a corpus
◮ Framing as a strategic choice (Sim et al., 2015b,a)
◮ An inventory of the linguistic tools for framing and attention?
  ◮ Syntax (Greene and Resnik, 2009)
  ◮ Frame semantics (Fillmore, 1982)
  ◮ Discourse (Grosz and Sidner, 1986)
  . . . or just representation learning?

slide-52
SLIDE 52

Thank you!

Sponsors: NSF, Google, UW Innovation Award
More details: Bamman et al. (2013); Boydstun et al. (2014); Card et al. (2015)
These slides: http://tinyurl.com/framing-noah

slide-53
SLIDE 53

References I

Bamman, D., O’Connor, B., and Smith, N. A. (2013). Learning latent personas of film characters. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022.

Blum, A. and Hardt, M. (2015). The ladder: A reliable leaderboard for machine learning competitions. http://arxiv.org/abs/1502.04585.

Boydstun, A. E., Card, D., Gross, J. H., Resnik, P., and Smith, N. A. (2014). Tracking the development of media frames within and across policy issues. Presented at the American Political Science Association.

Brown, P. F., deSouza, P. V., Mercer, R. L., Pietra, V. J. D., and Lai, J. C. (1992). Class-based n-gram models of natural language. Computational Linguistics, 18(4):467–479.

Card, D., Boydstun, A. E., Gross, J. H., Resnik, P., and Smith, N. A. (2015). The Media Frames Corpus: Annotations of frames across issues. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Das, D., Chen, D., Martins, A. F. T., Schneider, N., and Smith, N. A. (2014). Frame-semantic parsing. Computational Linguistics, 40(1):9–56.

Entman, R. M. (1993). Framing: Toward clarification of a fractured paradigm. Journal of Communication, 43(4):51–58.

slide-54
SLIDE 54

References II

Entman, R. M. (2007). Framing bias: Media in the distribution of power. Journal of Communication, 57:163–173.

Fillmore, C. (1982). Frame semantics. In Linguistics in the Morning Calm, pages 111–137. Hanshin.

Greene, S. and Resnik, P. (2009). More than words: Syntactic packaging and implicit sentiment. In Proceedings of HLT-NAACL.

Grosz, B. J. and Sidner, C. L. (1986). Attention, intentions, and the structure of discourse. Computational Linguistics, 12(3):175–204.

Hopkins, D. J. and King, G. (2010). A method of automated nonparametric content analysis for social science. American Journal of Political Science, 54(1):229–247.

Manning, C. D., Surdeanu, M., Bauer, J., Finkel, J. R., Bethard, S., and McClosky, D. (2014). The Stanford CoreNLP natural language processing toolkit. In Proc. of ACL.

Mosteller, F. and Wallace, D. L. (1963). Inference in an authorship problem: A comparative study of discrimination methods applied to the authorship of the disputed Federalist Papers. Journal of the American Statistical Association, 58(302):275–309.

Schneider, N. and Smith, N. A. (2015). A corpus and model integrating multiword expressions and supersenses. In Proceedings of NAACL.

Sim, Y., Routledge, B. R., and Smith, N. A. (2015a). A utility model of authors in the scientific community. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.

slide-55
SLIDE 55

References III

Sim, Y., Routledge, B. R., and Smith, N. A. (2015b). The utility of text: The case of amicus briefs and the Supreme Court. In Proceedings of the AAAI Conference on Artificial Intelligence.

Singh, S., Subramanya, A., Pereira, F., and McCallum, A. (2012). Wikilinks: A large-scale cross-document coreference corpus labeled via links to Wikipedia. Technical Report UMASS-CS-2012-015, University of Massachusetts.

Stimson, J. (2014). Public policy mood data.

Yogatama, D., Kong, L., and Smith, N. A. (2015). Bayesian optimization of text representations. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Zaidan, O., Eisner, J., and Piatko, C. D. (2007). Using “annotator rationales” to improve machine learning for text categorization. In Proceedings of HLT-NAACL.

slide-58
SLIDE 58

What about Deep Learning?

Deep learning refers to a set of tools for discovering continuous features, based mostly on neural networks (non-linear, parameterized functions).

Pros:

◮ Features that improve accuracy, when you have enough data.

Cons:

◮ Computational expense
◮ Lack of interpretability

Cf. Bamman personas, Blei topics, and Brown clusters, which offer discrete features based on probabilistic graphical models and explicit assumptions.