Sentence stress in presidential speeches



SLIDE 1

Sentence stress in presidential speeches

39th Annual Meeting of the DGfS
Workshop on Prosody in Syntactic Encoding
Saarbrücken, March 10, 2017

Arto Anttila, Timothy Dozat, Daniel Galbraith, and Naomi Shapiro

SLIDE 2

Why are sentences stressed the way they are?

We are gòing to begìn to áct, begìnning TODÁY.

(Ronald Reagan, Inaugural Address, January 20, 1981, Sentence 21)

SLIDE 3

Two kinds of sentence stress (Jespersen 1920: 212-222)

(a) Mechanical stress, rhythmic stress, “physiological”

(rhythmischer Druck, Einheitsdruck)

(b) Meaningful stress, semantic stress, “psychological”

(Wertdruck, Neuheitsdruck, Gegensatzdruck)

SLIDE 4

Semantic stress

In the Gilmore Girls universe, Luke and Lorelai seemed inevitable. He served the coffee; she needed the coffee. (Correction: NEEDED the coffee.)

Entertainment Weekly, November 17, 2016
http://www.ew.com/article/2016/11/17/gilmore-girls-luke-originally-woman

SLIDE 5

Mechanical stress

How much did they pay you for participating in the experiment? Five francs. (Ladd 1996: 166)

SLIDE 6

Semantic stress is related to new information

How is information packaged in the sentence?

(a) Evenly spread (Uniform Information Density)

(Levy and Jaeger 2007; Jaeger 2010)

(b) Piles up towards the end (Communicative Dynamism)

(Prague School, e.g., Firbas 1971)

(c) Seeks out stress peaks (Stress-Information Alignment)

(Bolinger 1957, 1972; Calhoun 2010; Cohen Priva 2012)

SLIDE 7

(a) Uniform Information Density

SLIDE 8

(b) Communicative Dynamism

SLIDE 9

(c) Stress-Information Alignment

SLIDE 10

Bolinger (1957: 235)

“The recipe for reconciling the two functions [semantic and mechanical] is simple: the writer should make them coincide as nearly as he can by maneuvering the semantic heavy stress into the position of the mechanical loud stress; that is, toward the end.”

SLIDE 11

Stress-Information Alignment: A Proposal

(a) Phrasal stress is assigned by syntax.

(Chomsky, Halle, and Lukoff 1956; Chomsky and Halle 1968; Liberman and Prince 1977; Cinque 1993)

(b) Information seeks out stress peaks, especially in good prose.

(Bolinger 1957, 1972; Calhoun 2010; Cohen Priva 2012)

stress = metrical strength

SLIDE 12

Plan of work

  1. Find a text performed by an individual (script + audio + video).
     Inaugural addresses of Carter (1977), Reagan (1981), Bush Sr. (1989), Clinton (1993), Bush Jr. (2001), Obama (2009)

  2. Assign mechanical stress to the text by computer.
     (MetricalTree, Dozat 2015-7)

  3. Collect perceived stress judgments from native speakers.
     (MetricGold, Shapiro 2016-7)

  4. Figure out to what extent perceived stress is explained by
     (i) the mechanical stress contour
     (ii) the distribution of information

SLIDE 13

Why is this interesting?

Sentence stress is difficult to pin down:

  • not represented in writing
  • hard to measure by phonometric methods

Yet it exists and is a hidden variable in many studies. Understanding sentence stress may help solve other linguistic puzzles.

SLIDE 14

Preview of findings

  1. Both kinds of stress matter, but not uniformly:
     • Noun and adjective stresses tend to be loud and mechanical.
     • Verb and function word stresses tend to be soft and semantic.
  2. Stress levels vary significantly across parts of speech:
     nouns > adjectives > verbs > function words

SLIDE 15

1. Predicting mechanical stress

SLIDE 16

Rules vs. variability

The Nuclear Stress Rule (NSR) and the Compound Stress Rule (CSR)

(Chomsky and Halle 1968, Liberman and Prince 1977, Cinque 1993)

Sentence stress is variable. Why?

  • Free will
  • Ambiguity in lexical stress results in variation in phrasal stress:

    • unstressed words (e.g., expletive it)
    • stress-ambiguous words (e.g., in, into)
    • stressed words (e.g., balloon)

Assumption: No variation in the phrasal stress rules themselves.

SLIDE 17

The stress rules (Chomsky and Halle 1968)

The Nuclear Stress Rule (NSR): Assign [1 stress] to the rightmost vowel bearing the feature [1 stress]. Applies to phrases (NP, VP, AP, S).

The Compound Stress Rule (CSR): Skip over the rightmost word and assign [1 stress] to the rightmost remaining [1 stress] vowel; if there is no [1 stress] to the left of the rightmost word, then try again without skipping the word. Applies to words (N, A, V).

SLIDE 18

Sample derivation

[[[John's] [[[black] [board]] [eraser]]] [was stolen]]

John's   black   board   eraser   was   stolen
  1        1       1       1              1

SLIDE 19

First cycle

[[[John's] [[[black] [board]] [eraser]]] [was stolen]]

John's   black   board   eraser   was   stolen
  1        1       1       1              1
           1       2

SLIDE 20

Second cycle

[[[John's] [[[black] [board]] [eraser]]] [was stolen]]

John's   black   board   eraser   was   stolen
  1        1       1       1              1
           1       2
           1       3       2

SLIDE 21

Third cycle

[[[John's] [[[black] [board]] [eraser]]] [was stolen]]

John's   black   board   eraser   was   stolen
  1        1       1       1              1
           1       2
           1       3       2
  2        1       4       3

SLIDE 22

Final cycle

[[[John's] [[[black] [board]] [eraser]]] [was stolen]]

John's   black   board   eraser   was   stolen
  1        1       1       1              1
           1       2
           1       3       2
  2        1       4       3
  3        2       5       4              1

SLIDE 23

Liberman and Prince’s (1977) version

The rules are defined on local syntactic trees as follows. In a configuration [A B]:
  • if the constituent is a phrase, B is strong (= NSR)
  • if the constituent is a word, B is strong iff it branches (= CSR)

SLIDE 24

Syntax

  • To assign phrasal stress we need a syntactic parse.
  • We used the Stanford Parser (Chen and Manning 2014)
  • http://nlp.stanford.edu/software/lex-parser.shtml
SLIDE 25

Lexical stress

(a) Unstressed words: it
    Unstressed tags: CC, PRP$, TO, UH, DT
    Unstressed deps: det, expl, cc, mark
(b) Ambiguous words: this, that, these, those
    Ambiguous tags: MD, IN, PRP, WP$, PDT, WDT, WP, WRB
    Ambiguous deps: cop, neg, aux, auxpass
(c) All other words, tags, and deps are stressed.
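
The three-way classification could be written as a lookup along the following lines. The lists are copied from the slide, but the order in which they are consulted (word lists first, then tags, then dependency labels) is an assumption made for this sketch; the slides do not say how conflicts are resolved. The function name is likewise hypothetical.

```python
UNSTRESSED_WORDS = {"it"}
UNSTRESSED_TAGS = {"CC", "PRP$", "TO", "UH", "DT"}
UNSTRESSED_DEPS = {"det", "expl", "cc", "mark"}
AMBIGUOUS_WORDS = {"this", "that", "these", "those"}
AMBIGUOUS_TAGS = {"MD", "IN", "PRP", "WP$", "PDT", "WDT", "WP", "WRB"}
AMBIGUOUS_DEPS = {"cop", "neg", "aux", "auxpass"}


def lexical_stress_class(word, tag, dep):
    """Classify a token as 'unstressed', 'ambiguous', or 'stressed'."""
    w = word.lower()
    # Assumed precedence: word-level lists override tags, tags override deps.
    checks = [
        (w, UNSTRESSED_WORDS, AMBIGUOUS_WORDS),
        (tag, UNSTRESSED_TAGS, AMBIGUOUS_TAGS),
        (dep, UNSTRESSED_DEPS, AMBIGUOUS_DEPS),
    ]
    for value, unstressed, ambiguous in checks:
        if value in unstressed:
            return "unstressed"
        if value in ambiguous:
            return "ambiguous"
    return "stressed"


print(lexical_stress_class("this", "DT", "det"))
print(lexical_stress_class("balloon", "NN", "dobj"))
```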

SLIDE 26

Phrasal stress

A sentence has 2^n stress paths, where n = the number of ambiguous words.

Example: I ask you to share with me today the majesty of this moment

(Richard Nixon, Inaugural Address, January 20, 1969, Sentence 2)

Stress paths: 2^6 = 64
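
A short sketch of where the count comes from: each ambiguous word can independently be stressed or unstressed. The set of six ambiguous words below is an assumption, obtained by applying the tag lists from the previous slide under a standard Penn Treebank tagging (I, you, me as PRP; with, of as IN; this as an ambiguous word).

```python
from itertools import product

words = "I ask you to share with me today the majesty of this moment".split()
AMBIGUOUS = {"I", "you", "with", "me", "of", "this"}  # assumed tagging

positions = [i for i, w in enumerate(words) if w in AMBIGUOUS]

# One stress path = one True/False (stressed/unstressed) choice per ambiguous word
paths = list(product([False, True], repeat=len(positions)))

print(len(positions), len(paths))
```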

SLIDE 27

Phrasal stress

Instead of examining all parses we limit ourselves to the following:

Model 1: All ambiguous words → unstressed
Model 2: All monosyllabic ambiguous words → unstressed; all polysyllabic ambiguous words → stressed
Model 3: All ambiguous words → stressed
Model 4: The ensemble model (= mean model)
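
A minimal sketch of how the models differ for a single ambiguous word. The function names are hypothetical, and reading Model 4 as the per-word mean of the contours from Models 1-3 is an assumption based only on the label "mean model".

```python
def resolve_ambiguous(n_syllables, model):
    """Return True if an ambiguous word counts as stressed under Model 1-3."""
    if model == 1:
        return False
    if model == 2:
        return n_syllables > 1
    if model == 3:
        return True
    raise ValueError("model must be 1, 2, or 3")


def ensemble_mean(contours):
    """Per-word mean of equally long stress contours (assumed Model 4)."""
    return [sum(values) / len(values) for values in zip(*contours)]


# Syllable counts for the stress-ambiguous examples "in" and "into"
AMBIGUOUS = {"in": 1, "into": 2}
for model in (1, 2, 3):
    print(model, {w: resolve_ambiguous(n, model) for w, n in AMBIGUOUS.items()})
```

Only Model 2 separates the two prepositions: "in" stays unstressed while "into" is stressed.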

SLIDE 28

the savings of many years in thousands of families are gone

(FDR, Inaugural Address, March 4, 1933, Sentence 19)

SLIDE 29

2. Perceived stress

SLIDE 30

What is perceived stress?

Perceived stress = syllable prominence felt by a native speaker

Syllable prominence is “for the large part the work of the perceiver, generating his internal accent pattern on the basis of a strategy by which he assigns structures to the utterances. These structures, however, are not fabrications of the mind only, for they can be related to sound cues.”

(van Katwijk 1974: 5, cited in Baart 1987: 4)

SLIDE 31

No attempt to eliminate variation

  • Two native speakers may perceive the same prominence contour differently: transcriptions reflect the grammar of the annotators.
  • Variation is not noise, but data. We did not attempt to eliminate variation from transcriptions, beyond loose annotation guidelines.
  • Interannotator reliability is good (Cronbach’s alpha = 0.85).
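
For reference, Cronbach's alpha for k annotators is k/(k-1) × (1 − Σ per-annotator variance / variance of the summed scores). The sketch below is a generic implementation of that formula with sample variances; how the reported value was actually computed (software, number of annotators, transformations) is not stated on the slides, and the function name is illustrative.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """ratings: one list of scores per annotator, all over the same words."""
    k = len(ratings)
    if k < 2:
        raise ValueError("need at least two annotators")
    item_variances = sum(variance(scores) for scores in ratings)
    totals = [sum(scores) for scores in zip(*ratings)]
    return k / (k - 1) * (1 - item_variances / variance(totals))
```

Identical transcriptions give alpha = 1; the value falls as the annotators' contours diverge.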
SLIDE 32

The Metric Gold annotation interface

SLIDE 33

Predicted stress (the mean model)

“We are going to begin to act, beginning today” (Reagan 1981)

SLIDE 34

Perceived stress (Annotator 1)

“We are going to begin to act, beginning today” (Reagan 1981)

SLIDE 35

Predicted vs. perceived stress (Annotator 1)

SLIDE 36

Predicted vs. perceived stress (Annotator 2)

SLIDE 37

The information-theoretic view

“The error of attributing to syntax what belongs to semantics comes from concentrating on the commonplace. In phrases like bóoks to write, wórk to do, clóthes to wear, fóod to eat, léssons to learn, gróceries to get - as they occur in most contexts - the verb is highly predictable: food is to eat, clothes are to wear, work is to do, lessons are to learn. Less predictable verbs are less likely to be de-accented - where one has léssons to learn, one will probably have pássages to mémorize. It is only incidental that the syntax favors one or the other accent pattern.”

(Bolinger 1972, p. 634)

SLIDE 38

Approximating the information of a word

doc.freq      Document lexical frequency
d.cp.1        Document conditional probability (unigram)
d.cp.2        Document conditional probability (bigram)
d.cp.3        Document conditional probability (trigram)
d.inform.2    Document informativity (bigram)
d.inform.3    Document informativity (trigram)
corpus.freq   Corpus lexical frequency
c.cp.1        Corpus conditional probability (unigram)
c.cp.2        Corpus conditional probability (bigram)
c.cp.3        Corpus conditional probability (trigram)
c.inform.2    Corpus informativity (bigram)
c.inform.3    Corpus informativity (trigram)
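
The slides do not define these measures, so the following is only a sketch under assumptions: bigram conditional probability is taken as P(word | previous word) estimated from raw counts, and informativity as the average surprisal of a word over all the contexts it occurs in (in the spirit of Cohen Priva 2012). The base-2 logarithm and the absence of smoothing are further assumptions, and the function name is hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def bigram_measures(tokens):
    """Return (cond_prob, informativity) estimated from one token list."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    contexts = Counter(tokens[:-1])
    # P(w | prev) = count(prev, w) / count(prev as a left context)
    cond_prob = {(prev, w): n / contexts[prev] for (prev, w), n in bigrams.items()}
    surprisal_sum = defaultdict(float)
    occurrences = Counter()
    for (prev, w), n in bigrams.items():
        surprisal_sum[w] -= n * log2(cond_prob[(prev, w)])
        occurrences[w] += n
    # Informativity: mean surprisal of w across its observed contexts
    informativity = {w: surprisal_sum[w] / occurrences[w] for w in occurrences}
    return cond_prob, informativity


tokens = "we are going to begin to act beginning today".split()
cond_prob, informativity = bigram_measures(tokens)
print(cond_prob[("to", "begin")], informativity["begin"])
```

A document-level measure would use the tokens of one speech; a corpus-level measure would use a larger reference corpus.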

SLIDE 39

Perceived stress vs. corpus frequency (Annotator 1)

SLIDE 40

Perceived stress vs. corpus frequency (Annotator 2)

SLIDE 41

Perceived stress vs. bigram informativity (Annotator 1)

SLIDE 42

Perceived stress vs. bigram informativity (Annotator 2)

SLIDE 43

Informativity vs. linear position (Prague school)

(Pearson correlation = 0.02, p = 0.01643)

SLIDE 44

Information vs. predicted stress (Bolinger 1957)

(Pearson correlation = 0.40, p < 2.2e-16)
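
For readers who want to reproduce correlations of this kind on their own data, a plain implementation of the Pearson coefficient is sketched below. It is a generic textbook formula, not the code behind the reported values, and the function name is illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

Here `xs` would hold one informativity value per word and `ys` the corresponding stress values (predicted or perceived).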

SLIDE 45

Bolinger correlations (stress vs. information)

             Perceived (A1)   Perceived (A2)   Predicted
Bush, Jr.    0.5989862        0.5461876        0.4721894
Bush, Sr.    0.5623127        0.5184216        0.4511763
Carter       0.5693488        0.5368974        0.4670454
Clinton      0.5652638        0.5078555        0.4798756
Obama        0.5259391        0.5275555        0.470289
Reagan       0.5326005        0.5143632        0.4535813

green = highest score, red = lowest score

SLIDE 46

Regression modeling: Predicting perceived stress

Favors:
  • high bigram informativity
  • high mechanical stress
  • being a noun
  • late sentence position (but this depends on the annotator)

Disfavors:
  • being a verb
  • being a function word

SLIDE 47

Modeling perceived stress (Annotator 1)

lm(formula = annotator1.log ~ c.inform.2 + mmean + category + widx,
    data = all.presidents.data.core)

Residuals:
    Min      1Q  Median      3Q     Max
-1.4781 -0.3108 -0.0468  0.2918  1.5699

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.7171844  0.0232962  30.786  < 2e-16 ***
c.inform.2    0.0647866  0.0021650  29.924  < 2e-16 ***
mmean         0.0634744  0.0049675  12.778  < 2e-16 ***
categoryFUNC -0.4661109  0.0168874 -27.601  < 2e-16 ***
categoryNOUN  0.1014131  0.0155456   6.524 7.17e-11 ***
categoryVERB -0.1791090  0.0162495 -11.022  < 2e-16 ***
widx          0.0015645  0.0004046   3.867 0.000111 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3911 on 10945 degrees of freedom
  (30 observations deleted due to missingness)
Multiple R-squared: 0.5026, Adjusted R-squared: 0.5023
F-statistic: 1843 on 6 and 10945 DF, p-value: < 2.2e-16
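
To see how the Annotator 1 coefficients combine, the fitted equation can be written out directly. This is a sketch: the coefficient values are copied from the output above, but the reference level of `category` is assumed to be adjectives (the one category without its own coefficient), and the result is on the model's log scale, whose exact transformation is not given on the slides. The function name is hypothetical.

```python
INTERCEPT = 0.7171844
B_INFORM = 0.0647866   # c.inform.2: corpus bigram informativity
B_MMEAN = 0.0634744    # mmean: mechanical stress from the mean model
B_WIDX = 0.0015645     # widx: word index (linear position)
B_CATEGORY = {
    "ADJ": 0.0,        # assumed reference level
    "FUNC": -0.4661109,
    "NOUN": 0.1014131,
    "VERB": -0.1791090,
}


def predict_log_stress(inform, mmean, category, widx):
    """Fitted value of annotator1.log for one word."""
    return (INTERCEPT + B_INFORM * inform + B_MMEAN * mmean
            + B_CATEGORY[category] + B_WIDX * widx)


# Same informativity, mechanical stress, and position; only the category differs
for cat in ("NOUN", "ADJ", "VERB", "FUNC"):
    print(cat, round(predict_log_stress(5.0, 2.0, cat, 10), 4))
```

With all other predictors held fixed, the fitted values order the categories as nouns > adjectives > verbs > function words.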

SLIDE 48

Modeling perceived stress (Annotator 2)

lm(formula = annotator2.log ~ c.inform.2 + mmean + category + widx,
    data = all.presidents.data.core)

Residuals:
    Min      1Q  Median      3Q     Max
-1.2963 -0.2193 -0.1133  0.2089  1.5097

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.5530634  0.0219427  25.205  < 2e-16 ***
c.inform.2    0.0695084  0.0020395  34.082  < 2e-16 ***
mmean         0.0346362  0.0046788   7.403 1.43e-13 ***
categoryFUNC -0.5283504  0.0158994 -33.231  < 2e-16 ***
categoryNOUN  0.0695119  0.0146388   4.748 2.08e-06 ***
categoryVERB -0.1949493  0.0153017 -12.740  < 2e-16 ***
widx          0.0004550  0.0003811   1.194    0.233
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3685 on 10946 degrees of freedom
  (29 observations deleted due to missingness)
Multiple R-squared: 0.5382, Adjusted R-squared: 0.5379
F-statistic: 2126 on 6 and 10946 DF, p-value: < 2.2e-16

SLIDE 49

The relative importance of predictors of perceived stress

SLIDE 50

The relative importance of predictors of perceived stress

SLIDE 51

The POS effect: Perceived stress vs. informativity

SLIDE 52

The POS effect: Perceived stress vs. informativity

SLIDE 53

“confronting” vs. “have”

We must show courage in a time of blessing by confronting problems instead of passing them on to future generations. (stress = 5)
Can we solve the problems confronting us? (stress = 3)
We don't have to talk late into the night about which form of government is better. (stress = 5)
We have every right to dream heroic dreams. (stress = 0)
(There are 51 examples like this in the corpus.)

inform(confronting) = 6.054915
inform(have) = 3.809777

SLIDE 54

The POS effect: Perceived stress vs. corpus frequency

SLIDE 55

The POS effect: Perceived stress vs. corpus frequency

SLIDE 56

Mean stresses for different parts of speech

SLIDE 57

Partial effects: Perceived stress in nouns

SLIDE 58

Partial effects: Perceived stress in verbs

SLIDE 59

3. General implications

SLIDE 60

Lexical frequency effects

Observation: High-frequency words reduce, low-frequency words don’t.

Conjecture:
  • Low-frequency words (especially nouns) are high in information and tend to occur in nuclear stress positions.
  • Hence they get high levels of phrasal stress.
  • Stress prevents reduction.
  • Hence low-frequency words (especially nouns) resist reduction.

If this is correct, lexical frequency effects reflect crystallized phrasal stress (cf. Coetzee and Kawahara 2013).

SLIDE 61

Lexical category effects

There are accentual differences among parts of speech (Ladd 1996). Here’s one “accentability hierarchy”:

nouns > other lexical words > function words

Accent rule:

  • Accent is placed on the most accentable element of the focused constituent.
  • If two elements belong to the same category, the one further to the right in the sentence is more accentable.

SLIDE 62

Two accentability hierarchies (cited in Baart 1987)

  • command verbs > quantifiers > nouns > sentence adverbs > adjectives > main verbs > negatives > pronouns > auxiliary verbs > copulatives > relatives > possessive determiners > prepositions > conjunctions > articles (Lea 1979)

  • sentential adverbs > negatives > dummy auxiliaries in positive sentences > quantifiers > certain modals > adjectives > regular adverbs > nouns > negative contractions > verbs > demonstrative pronouns > prepositions > auxiliaries > articles (O'Shaughnessy and Allen 1983)

SLIDE 63

What on earth are these hierarchies?

  • Are they primitives of grammar?
  • Perhaps they reflect the typical distribution of mechanical stress: nuclear stress falls typically on nouns, less typically on verbs, and least typically on function words.

  • Conjecture: Accentability hierarchies derive from sentence stress.
SLIDE 64

More lexical category effects

  • A phonological “privilege scale” N > A > V manifests itself in segmental phonology and is near-universal (Smith 2011).
  • Mean word lengths in the presidents corpus:

                  SEGMENTS    SYLLABLES   FEET
    Nouns:        5.641571    2.115994    1.098343
    Adjectives:   5.153664    1.973995    1.056738
    Verbs:        3.862823    1.40507     1.027336

  • Smith (1997) proposes special “noun faithfulness” constraints.
SLIDE 65

Recall Bolinger’s proposal: Accent goes by information

(a) I have LESSONS to learn. I have PASSAGES to MEMORIZE.
(b) Those are CRAWLING things. Those are CRAWLING INSECTS.
(c) I've got to SEE a guy. I've got to SEE a DOCTOR.

But the verbs also differ phonologically (monosyllable vs. polysyllable).

SLIDE 66

Number of segments: Nouns

SLIDE 67

Number of segments: Verbs

SLIDE 68

More lexical category effects: Foot structure

  • Nouns tend to be exhaustively footed; adjectives and verbs variably so; function words tend to be extrametrical.

Finnish:
    /tavara-i-ta/   (tá.va)(ròi.ta)   ‘thing-PL-PAR’   noun        vowel and consonant preserved
    /avara-i-ta/    (á.va)ri.a        ‘wide-PL-PAR’    adjective   vowel and consonant deleted

SLIDE 69

4. Summary

SLIDE 70

Summary of results

  1. Both mechanical and semantic stress are real:
     • Noun and adjective stresses tend to be loud and mechanical.
     • Verb and function word stresses tend to be soft and semantic.
  2. Stress levels vary significantly across parts of speech:
     nouns > adjectives > verbs > function words
  3. Speculation: Lexical frequency and lexical category effects reflect the differential distribution of sentence stress.

SLIDE 71

References (1)

Baart, Joan. 1987. Focus, Syntax, and Accent Placement. Ph.D. dissertation, University of Leiden.
Bolinger, Dwight L. 1957. Maneuvering for Stress and Intonation. College Composition and Communication 8(4), 234-238.
Bolinger, Dwight L. 1972. Accent is predictable (if you are a mind reader). Language 48, 633-644.
Bybee, Joan L. 2001. Phonology and Language Use. Cambridge University Press, Cambridge, U.K.
Calhoun, Sasha. 2010. How does informativeness affect prosodic prominence? Language and Cognitive Processes 25(7-9), 1099-1140.
Chen, Danqi and Christopher D. Manning. 2014. A Fast and Accurate Dependency Parser using Neural Networks. Proceedings of EMNLP 2014.
Cinque, Guglielmo. 1993. A null-theory of phrase and compound stress. Linguistic Inquiry 24, 239-298.

SLIDE 72

References (2)

Chomsky, Noam, Morris Halle and Fred Lukoff. 1956. ‘On accent and juncture in English’, in M. Halle et al. (eds.), For Roman Jakobson: Essays on the occasion of his sixtieth birthday, Mouton & Co., The Hague, pp. 65-80.
Chomsky, Noam and Morris Halle. 1968. The Sound Pattern of English, Harper and Row, New York.
Coetzee, Andries and Shigeto Kawahara. 2013. ‘Frequency biases in phonological variation’, Natural Language and Linguistic Theory 31, 47-89.
Cohen Priva, Uriel. 2012. Sign and signal: Deriving linguistic generalizations from information utility. Unpublished doctoral dissertation, Stanford University.
Firbas, Jan. 1971. On the concept of communicative dynamism in the theory of functional sentence perspective. Sbornik Prací Filosofické Fakulty Brněnské University (Studia Minora Facultatis Philosophicae Universitatis Brunensis) A-19, 135-144.
Gussenhoven, Carlos. 1983. Focus, mode and the nucleus. Journal of Linguistics 19(2), 377-417.
Jaeger, T. Florian. 2010. Redundancy and reduction: Speakers manage syntactic information density. Cognitive Psychology 61(1), 23-62.
SLIDE 73

References (3)

Jespersen, Otto. 1920. Lehrbuch der Phonetik: Mit 2 Tafeln. BG Teubner.
Ladd, D. Robert. 1996. Intonational Phonology. Cambridge University Press.
Levy, Roger P. and T. Florian Jaeger. 2007. Speakers optimize information density through syntactic reduction. In Proceedings of the 20th Conference on Advances in Neural Information Processing Systems (NIPS 2007), ed. J. C. Platt, D. Koller, Y. Singer, S. T. Roweis, pp. 849-56. Curran Assoc., Red Hook, N.Y.
Liberman, Mark, and Alan Prince. 1977. On stress and linguistic rhythm. Linguistic Inquiry 8(2), 249-336.
Smith, Jennifer L. 1997. Noun faithfulness: On the privileged behavior of nouns in phonology. Rutgers Optimality Archive, http://ruccs.rutgers.edu/roa.
Smith, Jennifer L. 2011. Category-specific effects. The Blackwell Companion to Phonology 4, 2439-2463.

SLIDE 74

Credits

Thanks for collaboration:

  • Alex Wade

Thanks for funding:

  • Stanford University, Office of the Vice-Provost for Undergraduate Education
  • The Roberta Bowman Denning Initiative Committee, H&S Dean’s Office

Thanks for advice, comments, suggestions, and criticisms:

  • Jared Bernstein, Joan Bresnan, Penny Eckert, Ryan Heuser, Paul Kiparsky, Mark Liberman