
TL/MAPLL2014 @ UT August 13th, 2014 1/32

Corpus Frequency of Relative Clause Association in Japanese

Toshiyuki YAMADA Douglas ROLAND Manabu ARAI Yuki HIROSE (The University of Tokyo)

  • 0. Overview

Relative Clause Association Ambiguity

(global structural ambiguity)

(1) [医師が触診している]少女の兄

[isi-ga syokusinsiteiru] syôzyo-no ani

doctor-NOM palpating girl-GEN brother

Structure: [RC] N1 N2 (two association sites available after the RC)

‘the brother of the girl [(that) the doctor is palpating]’

Structure: NP1 NP2 [RC] (attachment ambiguity: two attachment sites available prior to the RC)

Puzzle: Unforced Revision?

If we believe in Revision as Last Resort (Fodor & Frazier, 1980), the N1 association should be maintained.

[isi-ga syokusinsiteiru] syôzyo-no ani

doctor palpating girl-GEN brother

[RC] N1 N2

Initial N1 association, then unforced revision to the N2 association (Yamada et al., 2014)

Why is the N2 association preferred?

Possibility: Structural Frequency Bias

RC association ambiguity: [RC] N1-GEN N2

  • Does a structural frequency bias lead to the N2 association preference?

Roadmap

  • 0. Overview
  • 1. Introduction
  • 2. Yamada et al. (2014)
  • 3. The current study
  • 4. Discussion
  • 5. Conclusions

  • 1. Introduction

Structural Frequency Effects

RC attachment ambiguity (Cuetos & Mitchell, 1988): NP1 of-NP2 [RC]

  • native English speakers: low attachment preference
  • native Spanish speakers: high attachment preference

Corpus frequency (Mitchell et al., 1992)

  • English: 62% for low attachment (low preferred)
  • Spanish: 60% for high attachment (high preferred)

Frequency effects in PP attachment (e.g., Katsika, 2011 for Greek; cf. Gibson et al., 1996 for English)

Previous Findings on RC Association

Mixed results

  • N1 association preference (on-line evidence)
  • N2 association preference (off-line evidence)

(e.g., Kamide & Mitchell, 1997; Miyamoto et al., 2004; Uetsuki, 2007; Nakano, 2008)

A possibility: on-line revision from the N1 association to the N2 association

Yamada et al. (2014): suggestive eye-tracking evidence

(cf. Bai et al., 2014: no commitment to the N1 association)

  • 2. Yamada et al. (2014)

On-line N2 Association Preference

(2a) N2-incompatible

[Isi-ga tatta ima syokusinsiteiru] syôzyo-no ani-ga baiten-de kaimonositeiru.

doctor-NOM just now palpating girl-GEN brother-NOM store-at shopping

‘The brother of the girl that the doctor is palpating just now is shopping at a store.’

(2b) N2-incompatible

[Isi-ga tatta ima syokusinsiteiru] kanzya-no ani-ga baiten-de kaimonositeiru.

doctor-NOM just now palpating patient-GEN brother-NOM store-at shopping

‘The brother of the patient that the doctor is palpating just now is shopping at a store.’

(2c) N2-compatible

[Isi-ga tatta ima syokusinsiteiru] syôzyo-no ani-ga isu-de zittositeiru.

doctor-NOM just now palpating girl-GEN brother-NOM chair-at sitting still

‘The brother of the girl that the doctor is palpating just now is sitting still on a chair.’

(2d) N2-compatible

[Isi-ga tatta ima syokusinsiteiru] kanzya-no ani-ga isu-de zittositeiru.

doctor-NOM just now palpating patient-GEN brother-NOM chair-at sitting still

‘The brother of the patient that the doctor is palpating just now is sitting still on a chair.’

Unforced Revision

[RC] N1-GEN N2

  • initial commitment to the N1 association
  • revision to the N2 association

Evidence: processing difficulty when the sentence-final main predicate is incompatible with the N2 association.

Although the N1 association is grammatical, Japanese comprehenders are willing to make an unforced revision to the N2 association at the point of N2.

  • 3. The Current Study

Research Question 1

Is there a structural frequency bias towards the N2 association?

Prediction: a strong frequency bias towards the N2 association when the N1 and N2 associations are both plausible, which is the situation most similar to the experimental items in Yamada et al. (2014).

Method

  • corpus: Kyoto University Text Corpus

(http://nlp.ist.i.kyoto-u.ac.jp/EN/index.php?Kyoto%20University%20Text%20)

38,400 samples (i.e., sentences) in total

  • target samples: verb + N1 + no (GEN) + N2

4034 samples collected with restrictions, e.g.,

  • 1. verb dependent on either N1 or N2
  • 2. nouns could not be tame ‘for’, koto ‘the fact (that)’, nado ‘etc.’, naka ‘inside’

  • procedure: false positive matches were excluded

861 samples remained for analysis
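
The search pattern and the two listed restrictions can be sketched in code. This is only an illustrative sketch, not the authors' extraction script: the `Token` class, its fields, and the token-level `head` index are hypothetical simplifications (the corpus's actual annotation format is not described on the slide), and the dependency shown for the example phrase is an arbitrary choice for demonstration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Nouns excluded by restriction 2 on the slide (romanized forms).
EXCLUDED_NOUNS = {"tame", "koto", "nado", "naka"}


@dataclass
class Token:
    form: str  # romanized surface form (hypothetical representation)
    pos: str   # coarse tag: "verb", "noun", "particle", ...
    head: int  # index of the token this one depends on (-1 for the root)


def find_candidates(tokens: List[Token]) -> List[Tuple[int, int, int]]:
    """Return (verb, N1, N2) index triples matching verb + N1 + no + N2."""
    hits = []
    for i in range(len(tokens) - 3):
        verb, n1, gen, n2 = tokens[i:i + 4]
        if verb.pos != "verb" or n1.pos != "noun" or n2.pos != "noun":
            continue
        if gen.pos != "particle" or gen.form != "no":
            continue
        # Restriction 2: neither noun may be one of the excluded forms.
        if n1.form in EXCLUDED_NOUNS or n2.form in EXCLUDED_NOUNS:
            continue
        # Restriction 1: the verb must depend on either N1 or N2.
        if verb.head not in (i + 1, i + 3):
            continue
        hits.append((i, i + 1, i + 3))
    return hits


# 'osyuusita sanrin-no tuti'; the head of the verb is set to N2 here
# purely for illustration.
sentence = [
    Token("osyuusita", "verb", 3),
    Token("sanrin", "noun", 3),
    Token("no", "particle", 1),
    Token("tuti", "noun", -1),
]
print(find_candidates(sentence))
```

Matches found this way would still need the manual step described above, since the slide reports that false positives were excluded by hand.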

Data Analysis: Ambiguous RC Association

(5) Both N1 and N2 associations are plausible

押収した山林の土

osyuusita sanrin-no tuti

seized forest-GEN soil

‘the soil of the forest that was seized’

Data Analysis: Forced RC Association

(6) a. N1 association forced

「オンリーワン」哲学を実践してきた三井氏の答え

“Onrii wan” tetugaku-o zissensitekita Mitui-si-no kotae

only one philosophy-ACC have.practiced M.-Mr.-GEN answer

‘The answer of Mr. Mitsui that has practiced “Only one” philosophy’

b. N2 association forced

低く抑えた男の声

Hikuku osaeta otoko-no koe

low made male-GEN voice

‘The voice of the male that is made low’

Data Analysis: N1 or N2 Association

(7) a. N1 association intended

この外交の意味と原則を理解しない政治家の言動

Kono gaikou-no imi to gensoku-o rikaisinai seizika-no gendou

this diplomacy-GEN meaning and principle-ACC understand.not politician-GEN behavior

‘The behavior of the politician that does not understand the meaning and principle of this diplomacy’

b. N2 association intended

時代の変化に合った事業の見直し

Zidai-no henka-ni atta zigyou-no minaosi

times-GEN change-for suitable project-GEN revision

‘the revision of the project that is suitable for the change accompanying the times’

Frequency did not seem to explain RC association preferences in Japanese

[RC] N1の (GEN) N2

Association      N2    N1
All examples     427   408

Frequency bias: English Spanish Japanese

But there was an N2 bias when only ambiguous examples were considered

[RC] N1の (GEN) N2

Association               N2    N1
Ambiguous examples only   336   237

Comprehenders may take ambiguity into account in generating association preferences.
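
One way to gauge the two sets of counts (427 vs. 408 for all examples, 336 vs. 237 for ambiguous examples only) is an exact two-sided binomial test against an even split. The slides do not say which statistical test, if any, the authors used, so the following is only a sketch under that assumption; the function names are hypothetical.

```python
from math import comb


def n2_share(n2: int, n1: int) -> float:
    """Proportion of N2 associations among all counted associations."""
    return n2 / (n1 + n2)


def binomial_two_sided_p(k: int, n: int) -> float:
    """Exact two-sided binomial test of k successes in n trials against p = 0.5."""
    k_hi = max(k, n - k)
    # Upper-tail count; the null distribution is symmetric, so double it.
    tail = sum(comb(n, i) for i in range(k_hi, n + 1))
    return min(1.0, 2 * tail / 2 ** n)


# Counts taken from the slides: (N2 associations, N1 associations).
counts = {
    "all examples": (427, 408),
    "ambiguous examples only": (336, 237),
}

for label, (n2, n1) in counts.items():
    share = n2_share(n2, n1)
    p = binomial_two_sided_p(n2, n1 + n2)
    print(f"{label}: N2 share = {share:.3f}, two-sided p = {p:.4g}")
```

Under this test the overall split is compatible with no bias, while the ambiguous-only split departs from 50/50, which matches the pattern described on the slides.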

  • 4. Discussion

Weak Bias towards the N2 Association

Finding 1: no bias overall, but an N2 bias when both associations were plausible

Contrary to the prediction, the bias towards the N2 association was not strong enough to override the semantic preference for the N1 association in Yamada et al. (2014).

The structural frequency bias is not the only source, but comprehenders may take ambiguity into account in generating association preferences.

  • 3. The Current Study

Research Question 2

Do we observe the N2 bias when the situation is most similar to the experimental items in Yamada et al. (2014)?

Several interacting factors might affect comprehenders’ processing of [RC] N1-GEN N2

  • length of RC (short vs. long) (e.g., Hirose et al., 1998)
  • RC-type (subject- vs. object-extracted) (e.g., Uetsuki, 2006)
  • animacy of N1 and N2
  • properties of N1 and N2: (i) discourse status and (ii) uniqueness

Possible Reasons for Modifying a Noun

[RC] N1の (GEN) N2

  • 1. Grounding a new referent (e.g., Fox & Thompson, 1990): new referents are more likely to be modified
  • 2. Identifying an old and non-unique referent: old and non-unique referents are more likely to be modified

(Two forces push in opposite directions, but they are reasonable ones in the discourse.)

N2 is necessarily modified by N1

Data Analysis: Discourse Status

Criterion: If the noun was either explicitly mentioned, or the existence of its referent was implied, in the five preceding sentences, the N1/N2 was coded as discourse-old.

(8) N1 coded as discourse-old

gakkou-o riyou (preceding context)

school-ACC use

syakaisihon-no (the referent was implied)

social.capital-GEN
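
The coding criterion can be expressed as a small decision rule. The sketch below is an assumption-laden illustration, not the authors' procedure: explicit mention is approximated by a plain substring match on romanized text, and whether a referent is "implied" is passed in as a human judgment (`implied`), since the slides give no automatic way to decide it. The function name and parameters are hypothetical.

```python
from typing import Sequence


def is_discourse_old(noun: str, preceding: Sequence[str],
                     implied: bool = False, window: int = 5) -> bool:
    """Code a noun as discourse-old if it is mentioned in, or implied by,
    the last `window` preceding sentences."""
    recent = preceding[-window:] if window > 0 else []
    # Crude stand-in for "explicitly mentioned": substring match.
    mentioned = any(noun in sentence for sentence in recent)
    return mentioned or implied


context = ["gakkou-o riyou"]  # preceding context from example (8)

# 'syakaisihon' is not mentioned explicitly, so the annotator's judgment decides.
print(is_discourse_old("syakaisihon", context, implied=True))
print(is_discourse_old("syakaisihon", context))
print(is_discourse_old("gakkou", context))
```

Keeping the implied-referent judgment as an explicit argument makes clear which part of the coding depends on the annotator rather than on the text.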

Data Analysis: Uniqueness

Criterion: If a referent of N1/N2 (e.g., politicians) needed further modification to be identified, it was coded as non-unique; otherwise, as in the case of a proper name, it was coded as unique.

(9) a. N2 coded as non-unique

(nihon-no) seizika ‘the politician (of Japan)’ (Japanese politicians: still non-unique)

b. N1 coded as unique

Syaraku(-no nazo) ‘the mystery of Sharaku’

Research Question 2

Do we observe the N2 bias when the situation is most similar to the experimental items in Yamada et al. (2014)?

Prediction: a frequency bias towards the N2 association when the referents of both N1 and N2 are (i) discourse-new and (ii) non-unique
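
Checking this prediction amounts to restricting the coded samples to those where both nouns are discourse-new and non-unique, then counting associations in that subset. The sketch below assumes a hypothetical record layout (`CodedSample`), and the records in `toy_samples` are made up for illustration; they are not corpus data.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CodedSample:
    association: str  # "N1" or "N2"
    n1_new: bool      # N1 coded as discourse-new
    n2_new: bool      # N2 coded as discourse-new
    n1_unique: bool   # N1 coded as unique
    n2_unique: bool   # N2 coded as unique


def matches_experimental_items(s: CodedSample) -> bool:
    """Both nouns discourse-new and non-unique, as in the prediction."""
    return s.n1_new and s.n2_new and not s.n1_unique and not s.n2_unique


def association_counts(samples: Iterable[CodedSample]) -> Counter:
    """Count N1 vs. N2 associations within the matching subset."""
    return Counter(s.association for s in samples
                   if matches_experimental_items(s))


# Made-up records for illustration only.
toy_samples = [
    CodedSample("N2", True, True, False, False),
    CodedSample("N1", True, True, False, False),
    CodedSample("N2", True, True, False, False),
    CodedSample("N1", False, True, False, False),  # N1 is discourse-old
    CodedSample("N2", True, True, True, False),    # N1 is unique
]
print(association_counts(toy_samples))
```

The prediction would be supported if, in the real coded data, the N2 count in this subset clearly exceeded the N1 count.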

N2 more likely to be discourse-new

[RC] N1の (GEN) N2

Discourse Status   N2    N1
New                82    67
Old                18    33
Total              100   100

Suggests that discourse-new referents are more likely to be modified

But discourse status did not affect RC association

[RC] N1の (GEN) N2

                   Association
Discourse Status   N2    N1
Only N1 new        5     4
Only N2 new        30    24
Both new           52    63
Both old           13    9

Discourse status: related to N1-N2 modification, but not to RC association

N2 more likely to be non-unique

[RC] N1の (GEN) N2

Uniqueness    N2    N1
Non-unique    83    59
Unique        17    41
Total         100   100

Suggests that old and non-unique referents are more likely to be modified

But uniqueness did not affect RC association

[RC] N1の (GEN) N2

                       Association
Uniqueness             N2    N1
Only N1 non-unique
Only N2 non-unique     28    35
Both non-unique        55    59
Neither non-unique     17    6

Uniqueness: related to N1-N2 modification, but not to RC association

  • 4. Discussion

No Effects of N1/N2 Properties

Finding 2: neither discourse status nor uniqueness affected RC association

Contrary to the prediction, no bias towards the N2 association was found when both referents of N1 and N2 were (i) discourse-new and (ii) non-unique, as in Yamada et al. (2014).

While N1 may play a role in identifying N2 in the construction [RC] N1-GEN N2, the RC does not seem to play the same role.

  • 5. Conclusions

Summary

Puzzle: unforced revision

[RC] N1-GEN N2: N1 association, then N2 association (why?)

Answer: the frequency bias cannot account for the N2 association preference (Yamada et al., 2014)

  • weak bias towards the N2 association when both associations are semantically plausible
  • comprehenders may take ambiguity into account in generating RC association preferences

Future Issues

We need a better understanding of the role of RC:

Why do speakers choose to modify either N1 or N2 with a RC?

We need a better understanding of association biases in comprehension:

Do comprehenders really have a strong bias towards the N2 association?

Are the eye-movement patterns in Yamada et al. (2014) due to factors other than corpus frequency?

Future Issues

It is still possible that distributional factors such as the following may lead to the N2 association bias:

  • 1. prosodic (phonetic-phonological) factors such as length of RC, length of N1 and N2 (e.g., Hirose et al., 1998; Nakano & Kahraman, 2013)
  • 2. morpho-syntactic factors such as RC types (i.e., subject/object-extracted), the matrix positions of N1 (i.e., subject/object), inherent adjunctness of N1 to N2 (e.g., Uetsuki, 2006)
  • 3. semantico-pragmatic factors such as semantic relations between RC and N1 and between N1 and N2 (e.g., Aoyama & Inoue, 2005)

erratum in our paper text

Thank you very much for your attention!

Comments welcome: toshiyamada@hotmail.co.jp

Selected References

Bai, Chunhua, Yuki Kobayashi, and Yuki Hirose (2014) Parsing of ambiguous relative clauses in Japanese: An event-related potentials study. IEICE Technical Report, TL2014-12, 1-6.

Fodor, Janet Dean and Lyn Frazier (1980) Is the human sentence parsing mechanism an ATN? Cognition, 8, 417-459.

Fox, Barbara A. and Sandra A. Thompson (1990) A discourse explanation of the grammar of relative clauses in English conversation. Language, 66(2), 297-316.

Gibson, Edward, Carson T. Schütze, and Ariel Salomon (1996) The relationship between the frequency and the processing complexity of linguistic structure. Journal of Psycholinguistic Research, 25(1), 59-92.

Hirose, Yuki, Atsu Inoue, Janet Dean Fodor, and Dianne C. Bradley (1998) Adjunct attachment ambiguity in Japanese: The role of constituent weight. Poster presented at the 11th Annual CUNY Conference on Human Sentence Processing, New Brunswick, NJ, March.

Katsika, Kalliopi (2011) Attachment preferences and corpus frequencies in PP ambiguities: Evidence from Greek. Selected Papers from the 19th ISTAL.

See our Proceedings paper for other references.