SLIDE 1

Weighting the Constraints on Word-order Variation in German

Markus Bader & Jana Häussler, University of Konstanz

QITL 2 - Osnabrück 2006

SLIDE 2

Introduction

The expression of focus – syntax vs. phonology

Sentence-focus (Context: What happened?): Rightmost stress and canonical word-order in English and Italian

(1)
  • a. English: [F John has LAUGHED]
  • b. Italian: [F Gianni ha RISO]

Subject-focus (Context: Who has laughed?): English retains canonical word-order, Italian retains rightmost stress

(2)
  • a. English: [F JOHN] has laughed
  • b. Italian: Ha riso [F GIANNI]

SLIDE 3

Introduction

Sentence-focus in German (Context: What happened?): Rightmost stress and canonical word-order

(3) [F Jan hat GELACHT]
    John has laughed

Subject-focus in German (Context: Who has laughed?): Either canonical word-order or rightmost stress is retained

(4)
  • a. [F JAN] hat gelacht
       John has laughed
  • b. Gelacht hat [F JAN]
       laughed has John

SLIDE 4

Introduction

Strategies to express focus
  • English: fixed word-order and flexible focus assignment
  • Italian: flexible syntax and fixed focus structure
  • German: flexible syntax and flexible focus assignment

Flexibility in German includes the ordering between subject and object:

Who called the minister?

(5)
  • a. [F Der VATER] hat den Pfarrer angerufen
       the father-NOM has the minister-ACC called
  • b. Den Pfarrer hat [F der VATER] angerufen
       the minister-ACC has the father-NOM called

SLIDE 5

Introduction

Questions of the current study
  • What are the relevant factors determining the order between subject and object in German? (Are all factors suggested in the literature indeed necessary to determine word-order in German?)
  • What is the weight of each of these factors?
  • Do the factors (or their weights) differ for embedded vs. main clauses?

SLIDE 6

Introduction

Outline of the talk
  • The grammar of word-order variation
  • Previous corpus studies of word-order variation in German
  • The current corpus study
  • A comparison with results from language comprehension
  • General discussion

SLIDE 7

The grammar of word-order variation

‘Word-order’ freedom involving the prefield (SpecCP), which is only possible in main clauses:

(6)
  • a. Der Vater hat den Pfarrer besucht
       the father-NOM has the minister-ACC visited
  • b. Den Pfarrer hat der Vater besucht
       the minister-ACC has the father-NOM visited

SLIDE 8

The grammar of word-order variation

‘Word-order’ freedom involving the middlefield (between C° and the clause-final verb(s)):

(7)
  • a. Sicher hat der Vater den Pfarrer besucht
       Surely has the father-NOM the minister-ACC visited
  • b. Sicher hat den Pfarrer der Vater besucht
       Surely has the minister-ACC the father-NOM visited

(8)
  • a. dass der Vater den Pfarrer besucht hat
       that the father-NOM the minister-ACC visited has
  • b. dass den Pfarrer der Vater besucht hat
       that the minister-ACC the father-NOM visited has

SLIDE 9

The grammar of word-order variation

For German, SO is considered to be the canonical word-order. Two sources have been proposed for using OS:
  • Information structure:
      • In order to have focus in its default preverbal position, a non-focused object may be moved to the left.
      • In order to have the topic in its preferred initial position, an object in topic function might be moved to the left.
  • Argument structure: Particular verbs (e.g. psych-verbs, unaccusative verbs) license the use of OS.

SLIDE 10

The grammar of word-order variation

Factors that have been suggested to affect word-order:
  • Discourse-related properties:
      • definiteness: definite < indefinite
      • focus/background
      • topichood: topic first
  • Semantic properties:
      • agency: agent < non-agent
      • animacy: animate < inanimate
  • Length: favoring OS when the object is shorter than the subject
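The precedence preferences listed above can be read as checks on a given linear order. The following is a minimal sketch under stated assumptions: the `Arg` record, its field names, and the `violations` helper are hypothetical illustrations, not part of the study, and only three of the listed preferences (definiteness, agency, animacy) are encoded.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Arg:
    """Hypothetical annotation of one argument (field names are assumptions)."""
    role: str       # "S" or "O"
    definite: bool
    animate: bool
    agent: bool


def violations(first: Arg, second: Arg) -> List[str]:
    """Names of the precedence preferences violated when `first` precedes `second`."""
    violated = []
    if not first.definite and second.definite:
        violated.append("definite < indefinite")
    if not first.agent and second.agent:
        violated.append("agent < non-agent")
    if not first.animate and second.animate:
        violated.append("animate < inanimate")
    return violated


# Arguments of "dass dem Kind ein Witz eingefallen ist" (unaccusative verb, no agent).
witz = Arg(role="S", definite=False, animate=False, agent=False)  # ein Witz
kind = Arg(role="O", definite=True, animate=True, agent=False)    # dem Kind

so_violations = violations(witz, kind)  # subject before object
os_violations = violations(kind, witz)  # object before subject
```

For this sentence type, the SO order violates the definiteness and animacy preferences while the OS order violates none of the three, which is the kind of configuration in which OS is expected.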

SLIDE 11

Previous corpus studies

Hoberg (1981): Base word order

(N–D–A)_pron – ((N–D–A)_+ani – (N–D–A)_−ani)_nom – (N,D,A)_FN

  • pronominal argument < non-pronominal argument
  • ordering of pronouns: NOM < ACC < DAT
  • non-pronominal arguments: animate < inanimate
  • Semantically opaque arguments are adjacent to the verb (‘Funktionsverbgefüge’)

SLIDE 12

The current corpus study

Data base
  • Newspaper corpus of the IDS (Mannheim)
  • We queried for den (object introduced by the definite article den) with further restrictions (see below).

Motivation for den: combination of corpus and comprehension studies
  • Focus of comprehension studies: syntactic ambiguity resolution
  • Depending on the noun, objects with den are ambiguous between dative and accusative.

SLIDE 13

The current corpus study

Sentences were randomly sampled by the COSMAS system.

Four sets: 2 clause types x 2 positional restrictions
  • dass (...) den: embedded clauses, unconstrained with respect to the position of the object
  • dass den: embedded clauses, object-initial (immediately following the complementizer)
  • den: main clauses, unconstrained with respect to the position of the object
  • Den: main clauses, object-initial

Sentences in which den was not a verbal argument were subsequently removed (e.g. den within PPs).

SLIDE 14

The current corpus study

The sentence sets were annotated for the following properties:
  • case
  • voice
  • animacy
  • definiteness
  • pronominality
  • length of subject and object (number of words)

SLIDE 15

The current corpus study

Table 1: Overview of embedded clauses

                 Unconstrained   Object-initial
Total                 1178            930
S=Nominal              835            838
S=Pronominal           335
no S                     8             91
1 NP Arg                 8             91
2 NP Args             1019            828
3 NP Args              151             11
4 NP Args

SLIDE 16

The current corpus study

Table 2: Overview of main clauses

                 Unconstrained   Object-initial
Total                  668            827
S=Nominal              518            602
S=Pronominal           146            198
no S                     4             27
1 NP Arg                 4             27
2 NP Args              559            719
3 NP Args              104             80
4 NP Args                1              1

SLIDE 17

The current corpus study

The position of subject pronouns:
  • Embedded clauses: In our data, a pronominal subject precedes the non-pronominal den-object without exception. This confirms earlier observations of a categorical constraint “S[+pron] < O[−pron]” in the middlefield.
  • Main clauses: Both orderings are attested, but when the object precedes a pronominal subject, the object is always in the prefield. This is expected given that the prefield and the middlefield-initial position have different syntactic characteristics.

SLIDE 18

Focus of this Presentation

Factors determining word-order in German

Comparison of word-order in the middlefield and word-order involving the prefield:
  • Word-order variation in the middlefield: corpora containing embedded clauses (possible differences between the middlefield in main and embedded clauses will not be discussed)
  • Word-order variation involving the prefield: corpora containing main clauses, with all sentences in which both subject and object are in the middlefield removed

Note: In the following, we will present data for sentences with a non-pronominal subject only.

SLIDE 19

Word-order: Basic Results

Table 3: Percentages of SO-sentences by case and nr. of arguments

            Middlefield (emb.)        Prefield (main)
            Accusative   Dative       Accusative   Dative
2 Args.       99.25      50.93          89.2        49.2
3 Args.       98.65      96.10          87.5        94.3
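Percentages of this kind are per-cell proportions over the annotated sentences. The following is a minimal sketch of the computation, assuming a record layout of (case, number of arguments, order); the records are hypothetical toy data, not corpus counts, and `percent_so` is an illustrative name.

```python
from collections import defaultdict

# Hypothetical annotated sentences: (case, number of arguments, order).
sentences = [
    ("ACC", 2, "SO"), ("ACC", 2, "SO"), ("ACC", 2, "SO"), ("ACC", 2, "OS"),
    ("DAT", 2, "SO"), ("DAT", 2, "OS"),
    ("DAT", 3, "SO"), ("DAT", 3, "SO"),
]


def percent_so(records):
    """Percentage of SO sentences within each (case, nr. of arguments) cell."""
    totals = defaultdict(int)
    so_counts = defaultdict(int)
    for case, n_args, order in records:
        totals[(case, n_args)] += 1
        if order == "SO":
            so_counts[(case, n_args)] += 1
    return {cell: 100.0 * so_counts[cell] / totals[cell] for cell in totals}


result = percent_so(sentences)
```

Each cell is normalized by its own total, so the values in a row or column of such a table need not sum to 100.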

SLIDE 20

Word-order: Basic Results

Table 4: Percentages of accusative-sentences by order and nr. of arguments

            Middlefield (emb.)        Prefield (main)
            Sub>Obj   Obj>Sub         Sub>Obj   Obj>Sub
2 Args.      88.0     5.4 (6.4)        91.6     56.2 (76.0)
3 Args.      49.7     25.0 (0)         48.5     71.4 (67.0)

SLIDE 21

Word-order: Basic Results

Summary for 2-argument sentences:
  • OS-order in the middlefield occurs mainly with dative objects.
  • For dative objects, there is no difference between middlefield and prefield.
  • For accusative objects, prefield OS is much more frequent than middlefield OS.

Summary for 3-argument sentences:
  • Sentences with three arguments were almost always realized with SO-order.

SLIDE 22

Lexical-Conceptual Factors

Animacy in accusative sentences

[Figure: Distribution of Animacy Features. Percentage of sentences (axis ticks 20 to 100) for each animacy configuration: S[+an] O[+an], S[+an] O[−an], S[−an] O[+an], S[−an] O[−an]. Series: SO Middlefield, SO Prefield, OS Middlefield, OS Prefield.]


SLIDE 24

Lexical-Conceptual Factors

Animacy in dative sentences

[Figure: Distribution of Animacy Features. Percentage of sentences (axis ticks 20 to 100) for each animacy configuration: S[+an] O[+an], S[+an] O[−an], S[−an] O[+an], S[−an] O[−an]. Series: SO Middlefield, SO Prefield, OS Middlefield, OS Prefield.]


SLIDE 26

Lexical-Conceptual Factors

Animacy in OS sentences

[Figure: Distribution of Animacy Features. Percentage of sentences (axis ticks 20 to 100) for each animacy configuration: S[+an] O[+an], S[+an] O[−an], S[−an] O[+an], S[−an] O[−an]. Series: Acc Middle, Acc Pre, Dat Middle, Dat Pre.]

SLIDE 27

Lexical-Conceptual Factors

Summary:
  • In the middlefield:
      • OS-order occurs mainly with S[−animate] and O[+animate].
      • Animate subjects occur mainly in the order SO.
  • For dative objects, middlefield OS and prefield OS pattern together.
  • For accusative objects, prefield OS and middlefield SO pattern together.

SLIDE 28

Verb-Related Factors

What kind of constructions occur with inanimate subject and animate object?
  • Passivized ditransitive verbs (cf. (9)) → dative object
  • Unaccusative verbs (cf. (10)) → dative object

(9) . . . dass dem Kind ein Bär geschenkt wurde
          that the child-DAT a bear-NOM given was

(10) . . . dass dem Kind ein Witz eingefallen ist
           that the child-DAT a joke-NOM come-to-mind is

SLIDE 29

Verb-Related Factors

What kind of constructions occur with inanimate subject and animate object?
  • Object-experiencer verbs take either an accusative (cf. (11)) or a dative object (cf. (12)).

(11) . . . dass das Kind der Witz gelangweilt hat
           that the child-ACC the joke-NOM bored has

(12) . . . dass dem Kind der Witz gefallen hat
           that the child-DAT the joke-NOM pleased has

SLIDE 30

Verb-Related Factors

Table 5: Percentages of passive usage and sein-auxiliary by order

              Middlefield (emb.)        Prefield (main)
              Sub>Obj   Obj>Sub         Sub>Obj   Obj>Sub
% Pass.         9.1     46.7 (41.0)      18.0     21.0 (37.0)
% ‘sein’       27.1     69.6 (66.7)      27.8     33.3 (43.9)

SLIDE 31

Verb-Related Factors

Summary:
  • Passivized ditransitive verbs and unaccusative verbs account for a substantial share of OS sentences with dative objects.
  • These constructions are not compatible with accusative case. This accounts for the finding that OS-order in the middlefield is rare with accusative objects.

Topics under current investigation:
  • Can the factor ‘animacy’ and the verb-related factors (passivization, unaccusativity) be reduced to the lexical semantics of verbs?

SLIDE 32

Constituent Length

Question: Does constituent weight affect the ordering between subject and object?
  • So far, we have only computed weight in terms of constituent length measured by number of words.
  • Phenomena like extraposition have not yet been taken into account.

SLIDE 33

Constituent Length

Table 6: Mean difference between length of object and length of subject (measured in words)

              Middlefield (emb.)      Prefield (main)
              Sub>Obj   Obj>Sub       Sub>Obj   Obj>Sub
Accusative     0.84     −0.45         −0.02      0.34
Dative         0.48      0.05         −0.45      0.02

SLIDE 34

Constituent Length

[Figure: two density plots of the difference in length (O−S), one for middlefield SO sentences and one for middlefield OS sentences; horizontal axis ticks from −10 to 25.]

SLIDE 35

Constituent Length

[Figure: two density plots of the difference in length (O−S), one for prefield SO sentences and one for prefield OS sentences; horizontal axis ticks from −40 to 20.]

SLIDE 36

Constituent Length

Table 7: Mean length of subject (measured in words)

              Middlefield (emb.)      Prefield (main)
              Sub>Obj   Obj>Sub       Sub>Obj   Obj>Sub
Accusative     2.67      3.91          3.94      4.34
Dative         2.82      3.16          3.32      4.43

SLIDE 37

Constituent Length

Table 8: Mean length of object (measured in words)

              Middlefield (emb.)      Prefield (main)
              Sub>Obj   Obj>Sub       Sub>Obj   Obj>Sub
Accusative     3.51      3.45          3.92      4.68
Dative         3.30      3.21          2.86      4.45
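As a consistency check, the mean differences in Table 6 should equal the object means in Table 8 minus the subject means in Table 7, assuming all three tables were computed over the same sentences. The following sketch performs that subtraction; the dictionary layout and the function name are illustrative choices.

```python
# Mean lengths in words, copied from Table 7 (subject) and Table 8 (object).
# Column order: middlefield SO, middlefield OS, prefield SO, prefield OS.
subject_len = {
    "Accusative": [2.67, 3.91, 3.94, 4.34],
    "Dative": [2.82, 3.16, 3.32, 4.43],
}
object_len = {
    "Accusative": [3.51, 3.45, 3.92, 4.68],
    "Dative": [3.30, 3.21, 2.86, 4.45],
}


def length_differences(obj, subj):
    """Object mean minus subject mean for every cell, rounded to two decimals."""
    return {
        case: [round(o - s, 2) for o, s in zip(obj[case], subj[case])]
        for case in obj
    }


diffs = length_differences(object_len, subject_len)
```

The computed differences agree with Table 6 to within 0.01 (a rounding effect). The negative cells are accusative OS in the middlefield and both SO cells for the prefield, i.e. the cells in which the object is on average shorter than the subject.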

SLIDE 38

Constituent Length

[Figure: two density plots of object length, one for middlefield OS sentences and one for prefield OS sentences; horizontal axis ticks from 5 to 30.]

SLIDE 39

Constituent Length

Summary:
  • In the middlefield, there is a slight tendency for initial subjects to be shorter than non-initial subjects, but no effect for objects.
  • For prefield sentences, no clear tendency is visible. Subjects are somewhat shorter when in the prefield, but the reverse is true for objects.
  • Overall, there is a tendency for “short before long” in the middlefield but “long before short” when S or O is in the prefield.

SLIDE 40

A Logistic Regression Model

A logistic regression model with the following factors was fitted to the sentence set “embedded clauses with order unconstrained”:

  object case               accusative or dative
  animacy of subject        animate or inanimate
  animacy of object         animate or inanimate
  voice                     active or passive
  perfect auxiliary         ‘haben’ (have) or ‘sein’ (be)
  determiner of subject     indefinite or definite
  nr of arguments           2 or 3
  ∆ length (O minus S)

SLIDE 41

A Logistic Regression Model

                          Estimate    Pr(>|z|)
(Intercept)                0.7332     0.59955
object case = DAT         −2.1063     3.4e-05 ***
sAni = inanimate          −1.8847     1.1e-05 ***
oAni = inanimate           1.9801     7.8e-06 ***
voice = passive           −1.3986     0.00842 **
aux = ‘sein’ (be)         −1.4053     0.00067 ***
sDet = definite            1.5183     5.0e-05 ***
nr of arguments            1.2614     0.03964 *
∆ length (O minus S)      −0.0443     0.42250
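To see how such estimates combine, the sketch below turns them into a predicted probability of SO order with the standard logistic function. Several things here are assumptions rather than facts from the slides: the signs of the estimates as used in the code, the 0/1 dummy coding of every categorical predictor, the coding of "nr of arguments" as 1 for three-argument sentences, and all names (`COEF`, `p_so`).

```python
import math

# Estimates copied from the table above. Assumptions (not stated on the slide):
# negative signs as given here, every categorical predictor coded as a 0/1
# dummy, "three_arguments" = 1 for three-argument sentences, and the fitted
# probability being the probability of SO order.
COEF = {
    "intercept": 0.7332,
    "object_dative": -2.1063,
    "subject_inanimate": -1.8847,
    "object_inanimate": 1.9801,
    "passive": -1.3986,
    "aux_sein": -1.4053,
    "subject_definite": 1.5183,
    "three_arguments": 1.2614,
    "delta_length": -0.0443,
}


def p_so(**features):
    """Predicted probability of SO order for the given predictor values."""
    logit = COEF["intercept"]
    for name, value in features.items():
        logit += COEF[name] * value
    return 1.0 / (1.0 + math.exp(-logit))


# Reference cell: accusative object, animate arguments, active voice, 'haben',
# indefinite subject, two arguments, equal length.
reference = p_so()

# Dative object, inanimate subject, 'sein' auxiliary (an unaccusative-like profile).
dative_unaccusative = p_so(object_dative=1, subject_inanimate=1, aux_sein=1)
```

Under these assumptions the reference cell lies above 0.5 and the dative, inanimate-subject, ‘sein’ profile lies far below it, which matches the direction of the corpus findings for unaccusative verbs with dative objects.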

SLIDE 42

A Logistic Regression Model

  • The sentence set contains 86% SO-sentences.
  • The model correctly classifies 96% of all sentences (with p < 0.5 taken as OS and p ≥ 0.5 as SO).
  • Correctness broken down by order and case:

            SO      OS
    ACC    1.00    0.0
    DAT    0.85    0.9

  • When applied to the set ‘embedded clauses with OS-sentences only’, the model correctly classified 80% of all sentences (0% ACC, 86% DAT).
  • When applied to the set ‘main clauses with OS-sentences only’, the model correctly classified 15% of all sentences (0% ACC, 61% DAT).
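The per-cell correctness figures follow from applying the 0.5 threshold to each predicted probability and comparing the result with the observed order. The following is a minimal sketch with hypothetical toy probabilities (not model output); `classify` and `accuracy_by_cell` are illustrative names.

```python
from collections import defaultdict


def classify(p_so, threshold=0.5):
    """p >= threshold is taken as SO, p < threshold as OS."""
    return "SO" if p_so >= threshold else "OS"


def accuracy_by_cell(items):
    """items: iterable of (case, observed order, predicted probability of SO)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for case, observed, p in items:
        totals[(case, observed)] += 1
        if classify(p) == observed:
            hits[(case, observed)] += 1
    return {cell: hits[cell] / totals[cell] for cell in totals}


# Hypothetical toy predictions, for illustration only.
toy = [
    ("ACC", "SO", 0.97),
    ("ACC", "OS", 0.90),
    ("DAT", "SO", 0.60),
    ("DAT", "SO", 0.40),
    ("DAT", "OS", 0.10),
]
cells = accuracy_by_cell(toy)
```

A correctness of 0 in the ACC/OS cell, as in the toy data and in the breakdown above, means that the model never predicts OS for an accusative object. Note also that always guessing SO would already be correct for 86% of the sentence set, so the 96% figure is best read against that baseline.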

SLIDE 43

A Note on Comprehension

Method:
  • Procedure: Speeded grammaticality judgments
  • Material: Ambiguous and unambiguous sentences of various syntactic structures with den-objects. Only results for unambiguous embedded clauses are shown.

Table 9: Selected results (% correct) from experiments with den-objects

              S[+an]/O[+an]         S[−an]/O[+an]
              Sub>Obj   Obj>Sub     Sub>Obj   Obj>Sub
Accusative      90        70          93        95
Dative          93        86          90        94

SLIDE 44

A Note on Comprehension

For comparison: Difficult garden-path sentences like (13) receive mean judgments of about 50%.

(13) . . . dass Maria die Kinder besucht haben
           that Maria-ACC the children-NOM visited have

Conclusion:
  • Even the rarest kinds of OS-sentences are clearly comprehensible, as shown in particular by the comparison to garden-path sentences.
  • To some degree, the rareness of OS-structures is reflected in the comprehension results (in particular, ACC with S[+an]/O[+an]).

SLIDE 45

Summary

Our corpus results are compatible with a rather conventional syntactic analysis which claims:
  • The grammar can base-generate both SO- and OS-structures in the middlefield. The particular order is determined by argument structure properties of the verb.
  • The prefield has to be filled by movement.
  • Movement allows the deviation from the base-generated order:
      • Movement within the middlefield (‘scrambling’)
      • Fronting to SpecCP

SLIDE 46

Summary

The source of OS in our corpus
  • In the middlefield, OS is by and large restricted to base-generation. Scrambling is rare (although people clearly can comprehend scrambled sentences).
  • OS with O in the prefield has two sources:
      • An OS structure is base-generated in the middlefield and O, as the highest argument, is fronted to SpecCP by default.
      • An SO structure is base-generated in the middlefield and O is fronted to SpecCP for discourse reasons.
