SLIDE 1
LSA 2018 | January 6, 2018
Feature change is not like deletion: Saltation in Harmonic Grammar
Jennifer L. Smith UNC Chapel Hill www.unc.edu/~jlsmith
SLIDE 2 Overview of the talk
- What happens when we model saltation (phonological derived-environment effects) in
Harmonic Grammar (HG)?
- We find that there are two distinct types of saltation:
- One that cannot be modeled in HG
- One that can
- This result identifies a domain where HG and OT make distinct empirical predictions
- This also has implications for theories of features and featural faithfulness
SLIDE 3
1. The problem: Saltation, cumulative constraint interaction, and HG
SLIDE 4 (1) Saltation process: schematic abbreviation A–*B –C
(a) Phonological process has the form /A/→[C]
(b) ‘Skips over’ potential outcome B: *(/A/→[B])
(c) Even though:
- /A/ is more similar to [B] than it is to [C]
- /B/ surfaces faithfully, so [B] is not generally illegal
- Term saltation < Minkova (1993), Lass (1997) (diachronic); Hayes & White (2015) (synchronic)
- Also known as phonological non-derived environment blocking or phonological derived-environment effects (e.g., Kiparsky 1993; Łubowicz 2002)
SLIDE 5 (2) Classical OT (Prince & Smolensky 1993/2004) cannot model saltation (Łubowicz 2002)
(a) A constraint ranking that maps /A/→[C] will map /B/→*[C] also
(b) A constraint ranking that maps /B/→[B] will map /A/→*[B] also (assuming *A)
(3) OT analyses of saltation typically involve cumulative constraint interaction
(a) Local constraint conjunction (LCC) (Łubowicz 2002; Ito & Mester 2003)
(b) Comparative Markedness (McCarthy 2003) also essentially encodes cumulative interaction between faithfulness and markedness
SLIDE 6 (4) What happens when we look at saltation in Harmonic Grammar (HG) (Legendre,
Miyata, & Smolensky 1990; Pater 2009, 2016)?
(a) Cumulative constraint interaction is intrinsic to HG
- HG constraints are weighted, rather than ranked
- The weighted violations for each candidate are summed to determine the candidate’s harmony score (H)
(b) But Hayes & White (2015) have argued that saltation should not be something that the phonological grammar can easily model
- H&W argue that saltation processes are unnatural, so there must be a
learning bias against saltation patterns
- H&W specifically propose that devices like LCC (of markedness & faithfulness)
that easily model saltation should not be included in an OT grammar
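The harmony computation described in (4a) can be sketched in a few lines. This is a minimal illustration, not part of the original handout: the function names, the toy constraints C1 and C2, and the weights are all assumptions chosen for the example.

```python
# Minimal sketch of HG evaluation (names and numbers are illustrative assumptions).
# A candidate is a dict mapping constraint names to violation counts.

def harmony(violations, weights):
    """Harmony H: the negated weighted sum of a candidate's violations."""
    return -sum(weights[name] * count for name, count in violations.items())

def hg_winner(candidates, weights):
    """The HG winner is the candidate with the highest (least negative) harmony."""
    return max(candidates, key=lambda cand: harmony(candidates[cand], weights))

# Hypothetical toy tableau: two constraints, two candidates.
toy_weights = {"C1": 3, "C2": 1}
toy_candidates = {
    "cand_a": {"C1": 1},  # one violation of the heavier constraint
    "cand_b": {"C2": 2},  # two violations of the lighter constraint
}

scores = {cand: harmony(v, toy_weights) for cand, v in toy_candidates.items()}
best = hg_winner(toy_candidates, toy_weights)
print(scores, best)
```

Because violations are summed rather than compared constraint by constraint, the number of violations matters as well as the weight of the constraint violated.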
SLIDE 7 (c) Does this mean that HG makes pathological predictions about saltation?
- No! The type of saltation most commonly seen in the literature cannot arise from cumulative constraint interaction (gang effects) in HG.
(5) I further identify a potential distinction between two types of saltation
(a) Feature-scale saltation: A–*B –C, where A, B, C are all segments
(b) Deletion saltation: A–*B –Ø, where A, B are segments and Ø is null
SLIDE 8
(6) Overview of results:
(a) HG cannot model feature-scale saltation, but can model deletion saltation
(b) By contrast, the two types are equivalent in OT: both are possible with LCC, impossible without LCC
(c) If we find differences in learnability between the two types, this would be empirical support for HG over OT
SLIDE 9
2. Cumulative constraint interaction in HG: ATOs and gang effects
SLIDE 10 (7) Because HG uses weighted constraints, multiple lower-weighted constraints can ‘gang up’ to prevail against a higher-weighted constraint (not possible in OT)
- However, this only happens under particular circumstances
SLIDE 11 (8) HG gang effects only arise under asymmetric trade-offs (ATOs) (Pater 2009, 2016)
/input/             C1     C2     C3     C4     H
                    w=5    w=4    w=3    w=2
→ (i) winner               –1                   –4
  (ii) competitor                 –1     –1     –5
- Informally, an ATO occurs when:
(see Pater 2009, 2016 for more rigorous description)
(a) Some competitor does better than the winner on a higher-weighted constraint
- Here, the competitor (ii) does better on C2 than the winner (i)
(b) But, the competitor has a greater number of unshared violations of lower-weighted constraints
- Here, the competitor has violations of C3 and C4 that the winner does not
- 1 violation of C2 vs. 2 violations of C3, C4—asymmetric (not 1:1)
- A violation of only C3 or only C4 is not enough to overcome a violation of C2—but
C3 and C4 together can ‘gang up’ on C2
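The arithmetic behind the gang effect in (8) can be checked directly. The sketch below uses the weights from the tableau (5, 4, 3, 2); the two single-violation candidates are hypothetical additions, included only to show that neither C3 nor C4 alone outweighs C2.

```python
# Weights as given in tableau (8).
WEIGHTS = {"C1": 5, "C2": 4, "C3": 3, "C4": 2}

def harmony(violations):
    """Negated weighted sum of violation counts."""
    return -sum(WEIGHTS[name] * count for name, count in violations.items())

h_winner = harmony({"C2": 1})                # candidate (i)
h_competitor = harmony({"C3": 1, "C4": 1})   # candidate (ii)

# Hypothetical candidates (not in the tableau) with only one of the two
# lower-weighted violations: each would beat the winner, so no single
# lower-weighted constraint overcomes C2 by itself.
h_only_c3 = harmony({"C3": 1})
h_only_c4 = harmony({"C4": 1})

print(h_winner, h_competitor, h_only_c3, h_only_c4)
```

Here 3 + 2 > 4, so the competitor's two violations together cost more than the winner's single violation of C2, although 3 < 4 and 2 < 4 individually.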
SLIDE 12 (9) Implications of ATOs (example): There is no ‘coda threshold’ in HG (Pater 2009, 2016)
- It can never be the case that n codas are allowed but >n codas are deleted, because the trade-off between MAX and NOCODA is symmetric
/CVC...CVC/ C1 NOCODA MAX C4 H
w=5 w=i w=k w=2
? (i) CV_...CV_ –1×n –n×i
? (ii) CVC...CVC –1×n –n×k
- Whether (i) or (ii) wins simply depends on whether i or k is a higher weight
- This is independent of n, the number of potential codas: no gang effect
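The independence from n noted in (9) can be illustrated numerically. This sketch compares only the two candidates in the tableau (all codas deleted vs. all codas kept); the parameter names `w_nocoda` and `w_max` and the sample weights are assumptions made for the example.

```python
def harmony_delete_all(n, w_max):
    """All n codas deleted: n violations of MAX."""
    return -n * w_max

def harmony_keep_all(n, w_nocoda):
    """All n codas kept: n violations of NOCODA."""
    return -n * w_nocoda

def deletion_wins(n, w_nocoda, w_max):
    return harmony_delete_all(n, w_max) > harmony_keep_all(n, w_nocoda)

# With NOCODA weighted above MAX, deletion wins for every n;
# with the weights reversed, it wins for no n.
outcomes_nocoda_heavier = {deletion_wins(n, w_nocoda=3, w_max=2) for n in range(1, 50)}
outcomes_max_heavier = {deletion_wins(n, w_nocoda=2, w_max=3) for n in range(1, 50)}
print(outcomes_nocoda_heavier, outcomes_max_heavier)
```

Both harmony scores scale with n at the same rate, so n cancels out of the comparison and no threshold can emerge.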
SLIDE 13 (10) Crucially, shared violations can never contribute to gang effects (Pater 2009, 2016)
- See (17) below—this is a key difference between HG gang effects and LCC in OT
/input/           C1     C2     C3     C4             H
                  w=5    w=4    w=3    w=2
  (i) output1            –1            –1 | shared
→ (ii) output2                  –1     –1 | shared
(a) No ATO here: the ‘trade-off’ is symmetric between 1 violation of C2 and 1 of C3
(b) Without an ATO, there is no assignment of weights that will allow C3 and C4 to gang up on C2 so as to make candidate (i) win (given w(C2) > w(C3), w(C2) > w(C4))
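One way to see why shared violations are inert in (10) is to compute the harmony difference between the two candidates while varying the weight of the shared constraint. In this sketch the violation profiles follow the tableau as read here (output1 violates C2 and C4, output2 violates C3 and C4); the range of C4 weights tried is an arbitrary assumption.

```python
def harmony(violations, weights):
    """Negated weighted sum of violation counts."""
    return -sum(weights[name] * count for name, count in violations.items())

output1 = {"C2": 1, "C4": 1}  # candidate (i)
output2 = {"C3": 1, "C4": 1}  # candidate (ii)

def margin(w_c4):
    """H(output1) - H(output2) for a given weight of the shared constraint C4."""
    weights = {"C1": 5, "C2": 4, "C3": 3, "C4": w_c4}
    return harmony(output1, weights) - harmony(output2, weights)

# The margin stays at w(C3) - w(C2) = -1 however heavy C4 becomes.
margins = [margin(w) for w in (0, 2, 10, 1000)]
print(margins)
```

The C4 term appears in both harmony scores and cancels, so output1 loses by the same margin regardless of w(C4).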
SLIDE 14
3. Feature-scale saltation and cumulative constraint interaction
SLIDE 15 (11) Feature-scale saltation | A–*B –C ɡ–*k –x | Coda /ɡ/→[x] (skipping *[k]), in Colloquial Northern German
- Data from Ito & Mester (2003: 274, 291)
/ɡ/   /tso:ɡ/   [tso:x], *[tso:k]                  ‘pulled’, 1sg./1pl.
      /tʀu:ɡ/   [tʀu:x], *[tʀu:k]    [tʀu:ɡ-ən]    ‘carried’, 1sg./1pl.
      /fly:ɡ/   [flu:x], *[flu:k]    [fly:ɡ-ə]     ‘flight’, sg./pl.
                [dɪk], *[dɪx]        [dɪk-ə]       ‘fat’, pred./attrib.pl.
- Points on scale /A/, (*B), [C] are all segments on continuum defined by features
                  A = /ɡ/     B = (*k)    C = [x]
[±voice]          [+voice]    [–voice]    [–voice]
[±continuant]     [–cont]     [–cont]     [+cont]
- Other cases include: Polish (Rubach 1984)
Sestu Campidanian Sardinian (Bolognesi 1998)
SLIDE 16 3.1 Feature-scale saltation with local constraint conjunction (LCC) in OT
(12) LCC analysis of saltation in OT follows Łubowicz (2002), Ito & Mester (2003)
- Works in the same way for both feature-scale and deletion saltation
- Tableaus all show unviolated *A at top—this is what drives /A/ to change
SLIDE 17 (13) LCC analysis for feature-scale saltation: A–*B –C | /A/→[C] (skipping *B)
- ɡ–*k –x | Coda /ɡ/→[x] (skipping *[k]), in Colloquial Northern German
/tso:ɡ/        *A           *B & ID(*A→B)        ID(*B→C)       *B          ID(*A→B)
               *VOIOBCODA   *DORSST & ID[±voi]   IDENT[±cont]   *DORSSTOP   IDENT[±voi]
→ (i) tso:x                                      *                          *
  (ii) tso:k                * W                  L              * W         *
(a) Candidate (i) is the intended (saltation) winner
(b) The crucial violation is IDENT(*B→C) because this favors the competitor, (ii)
- Here, IDENT[±cont] is violated by the winner (i) but not the competitor (ii)
(c) We know IDENT(*B→C) » *B because underlying /B/ doesn’t shift to [C]
- Here: IDENT[±cont] » *DORSSTOP, because /k/ does not become [x]
(d) LCC provides *B & IDENT(*A→B): Surface [B] is banned only if unfaithful
- Here: [k] loses to [x] (satisfying *DORSSTOP) only if IDENT[±voi] violated
(14) The OT tableau in (13) without the conjoined constraint (or something analogous) cannot model saltation
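The contrast in (13) and (14) can be sketched with a small strict-domination evaluator. This is a schematic illustration under stated assumptions: candidates are labelled A, B, C rather than with the German segments, the faithful candidate for /A/ is given only a *A violation, and the helper names (`ot_winner`, `add_conjunction`, `with_lcc`) are hypothetical.

```python
def ot_winner(candidates, ranking):
    """Strict domination: compare violation profiles lexicographically, top constraint first."""
    def profile(cand):
        return tuple(candidates[cand].get(c, 0) for c in ranking)
    return min(candidates, key=profile)

def add_conjunction(violations, first, second, name):
    """Locally conjoined constraint: one mark when both conjuncts are violated."""
    out = dict(violations)
    if violations.get(first, 0) and violations.get(second, 0):
        out[name] = 1
    return out

def with_lcc(candidates):
    return {cand: add_conjunction(v, "*B", "ID(A→B)", "*B&ID(A→B)")
            for cand, v in candidates.items()}

# Schematic candidate sets for inputs /A/ and /B/.
FROM_A = {"A": {"*A": 1},
          "B": {"*B": 1, "ID(A→B)": 1},
          "C": {"ID(B→C)": 1, "ID(A→B)": 1}}
FROM_B = {"B": {"*B": 1}, "C": {"ID(B→C)": 1}}

RANKING_LCC = ["*A", "*B&ID(A→B)", "ID(B→C)", "*B", "ID(A→B)"]
RANKING_PLAIN = ["*A", "ID(B→C)", "*B", "ID(A→B)"]

a_with_lcc = ot_winner(with_lcc(FROM_A), RANKING_LCC)
b_with_lcc = ot_winner(with_lcc(FROM_B), RANKING_LCC)
a_without_lcc = ot_winner(FROM_A, RANKING_PLAIN)
print(a_with_lcc, b_with_lcc, a_without_lcc)
```

With the conjoined constraint, /A/ maps to C while /B/ stays B; without it, the same ranking sends /A/ to B, which is the non-saltatory outcome.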
SLIDE 18
3.2 Feature-scale saltation in HG: No ATO, no gang effect
SLIDE 19 (15) No ATO in feature-scale saltation: A–*B –C | /A/→[C] (skipping *B)
- ɡ–*k –x | Coda /ɡ/→[x] (skipping *[k]), in Colloquial Northern German
/tso:ɡ/        *A                  ID(*B→C)            *B                  ID(*A→B)
               *VOIOBCODA          IDENT[±cont]        *DORSSTOP           IDENT[±voi]
               w(*A)>w(ID(*A→B))   w(ID(*B→C))>w(*B)   w(ID(*B→C))>w(*B)   w(*A)>w(ID(*A→B))
?→ (i) tso:x                       –1                                      –1 | shared
   (ii) tso:k                                          –1                  –1 | shared
(a) Candidate (i) is the intended (saltation) winner
(b) The crucial violation (highest weighted) for (i) is IDENT(*B→C)
- Here, the saltation candidate (i) violates IDENT[±cont], but (ii) does not
(c) We know w(IDENT(*B→C)) > w(*B) because underlying /B/ doesn’t shift to [C]
- Here, w(IDENT[±cont]) > w(*DORSSTOP), because /k/ does not become [x]
SLIDE 20 (15), continued: No ATO in feature-scale saltation
(d) There is no ATO between candidates (i) and (ii)
- IDENT(*A→B) violation is shared—cannot be part of a gang effect!
- Here: IDENT[±voi] is violated by both [x] and [k]
- Unshared violations: IDENT(*B→C) for (i) vs. *B for (ii)
- Here: IDENT[±cont] for (saltation) [x] vs. *DORSSTOP for (‘skipped’) [k]
- This is a 1:1 relation, not asymmetric
- Under no weighting conditions can (i) win by a gang effect
(e) There are no compatible weighting conditions under which (i) can win
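The claim in (15e) follows algebraically: /B/→[B] requires w(ID(*B→C)) > w(*B), while /A/→[C] beating [B] requires w(ID(*B→C)) < w(*B) once the shared ID(*A→B) term cancels. As an illustrative check only, the sketch below searches a small grid of integer weights for any assignment that yields both mappings. The schematic candidate sets and the grid range 1 to 6 are assumptions for the example.

```python
from itertools import product

CONSTRAINTS = ("*A", "ID(A→B)", "ID(B→C)", "*B")

# Schematic candidate sets for inputs /A/ and /B/ (feature-scale case).
FROM_A = {"A": {"*A": 1},
          "B": {"*B": 1, "ID(A→B)": 1},
          "C": {"ID(B→C)": 1, "ID(A→B)": 1}}
FROM_B = {"B": {"*B": 1}, "C": {"ID(B→C)": 1}}

def harmony(violations, weights):
    return -sum(weights[name] * count for name, count in violations.items())

def wins(target, candidates, weights):
    """True if `target` has strictly higher harmony than every other candidate."""
    h = harmony(candidates[target], weights)
    return all(h > harmony(v, weights)
               for cand, v in candidates.items() if cand != target)

# Look for weights giving the saltation pattern: /A/ -> C and /B/ -> B.
solutions = []
for values in product(range(1, 7), repeat=len(CONSTRAINTS)):
    weights = dict(zip(CONSTRAINTS, values))
    if wins("C", FROM_A, weights) and wins("B", FROM_B, weights):
        solutions.append(weights)

print(len(solutions))
```

The grid search is finite and so does not prove the result by itself, but it agrees with the algebra: the two requirements on w(ID(*B→C)) and w(*B) contradict each other.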
SLIDE 21
(16) Result: HG does not make feature-scale saltation easy to model
(a) Cumulative constraint interaction is an intrinsic part of HG
(b) But it is limited, because it arises only under an ATO
(c) Because feature-scale saltation has no ATO, HG does not (pathologically) predict this saltation pattern for free
SLIDE 22 (17) The key difference between LCC and HG is the role of shared constraint violations
(a) HG analysis: saltation candidate cannot win via gang effect
- The IDENT(*A→B) violation is shared by the two candidates—no ATO
- The fact that this violation is shared comes from the feature scale:
in ɡ–*k –x, the change in [±voi] is ‘inherited’ by [x]
- So no cumulative interaction with *B, no way to gang up on IDENT(*B→C)
(b) LCC analysis—saltation candidate can win
- *B & IDENT(*A→B) contributes its own * marks to the evaluation
- So the fact that the violation of IDENT(*A→B) is shared has no relevance; all
that matters is that the ‘skipped’ candidate violates both IDENT(*A→B) and *B
SLIDE 23
4. Deletion saltation and cumulative constraint interaction
SLIDE 24 (18) Deletion saltation | A–*B –Ø ɡ–*k –Ø | Coda /ɡ/→Ø (skipping *[k]) after /ŋ/, in Standard German
- Data from Ito & Mester (2003: 274, 289)
/dɪftɔŋɡ/ [dɪftɔŋ_], *[dɪftɔŋk] ‘diphthong’
‘to diphthongise’
[baŋk] ‘bank’
- Endpoint of scale (outcome [C]) is null
A = /ɡ/     B = (*k)    C = Ø
[+voice]    [–voice]    n/a
SLIDE 25
4.1 Deletion saltation with local constraint conjunction (LCC) in OT
SLIDE 26 (19) LCC analysis for deletion saltation: A–*B –Ø | /A/→Ø (skipping *B)
- ɡ–*k –Ø | Coda /ɡ/→Ø (skipping *[k]) after /ŋ/, in Standard German
/dɪftɔŋɡ/        *A      *B & ID(*A→B)           MAX    *B           ID(*A→B)
                 *NG#    *CMPLXCODA & ID[±voi]          *CMPLXCODA   IDENT[±voi]
→ (i) dɪftɔŋ_                                    *
  (ii) dɪftɔŋk           * W                     L      * W          * W
(a) Candidate (i) is the intended (saltation) winner; crucial violation is MAX
(b) We know MAX » *B because underlying /B/ doesn’t delete
- Here: MAX » *COMPLEXCODA, because /k/ survives in a complex coda
(c) We know MAX » IDENT(*A→B) because /A/ doesn’t generally delete
- Here: MAX » IDENT[±voi], because /ɡ/ generally devoices rather than deleting
(d) LCC provides *B & IDENT(*A→B): [B] banned only if unfaithful
- Here: [k] loses to deletion (satisfying *CMPLXCODA) only if IDENT[±voi] violated
(20) Again, we see that (i) cannot win without the conjoined constraint
SLIDE 27
(21) OT predicts the same status of feature-scale saltation and deletion saltation with respect to learnability:
(a) Can model both with LCC
(b) Can model neither without LCC
SLIDE 28
4.2 Deletion saltation in HG: ATO and gang effect
SLIDE 29 (22) ATO in deletion saltation: A–*B –Ø | /A/→Ø (skipping *B)
- ɡ–*k –Ø | Coda /ɡ/→Ø (skipping *[k]) after /ŋ/, in Standard German
/dɪftɔŋɡ/        *A                  MAX               *B                ID(*A→B)
                 *NG#                                  *COMPLEXCODA      IDENT[±voi]
                 w(*A)>w(ID(*A→B))   w(MAX)>w(*B)      w(MAX)>w(*B)      w(*A)>w(ID(*A→B))
→ (i) dɪftɔŋ_                        –1                                  no violation!
  (ii) dɪftɔŋk                                         –1                –1
(a) Candidate (i) is the intended (saltation) winner; crucial violation is MAX
(b) This time, the IDENT(*A→B) violation is not shared: deletion satisfies IDENT (as defined by McCarthy & Prince 1995)
- Here: Deletion candidate avoids the IDENT[±voi] violation incurred by [k]
(c) Asymmetric trade-off between MAX and { *B + IDENT(*A→B) }: If w(MAX) < w(*B) + w(IDENT(*A→B))—gang effect—then (i) can win
- Here: The combined weights of *COMPLEXCODA, IDENT[±voi] can overcome MAX
(23) Result: HG can model deletion saltation via a straightforward gang effect
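The gang effect in (22) can be checked with concrete numbers. The weights below are hypothetical, chosen only to satisfy w(*B) < w(MAX) < w(*B) + w(ID(*A→B)) with *A weighted highest; the candidate sets are schematic, and the faithful candidate for /A/ is given only a *A violation.

```python
# Schematic candidate sets for the deletion case; "Ø" is the deletion candidate.
FROM_A = {"A": {"*A": 1},
          "B": {"*B": 1, "ID(A→B)": 1},
          "Ø": {"MAX": 1}}
FROM_B = {"B": {"*B": 1}, "Ø": {"MAX": 1}}

def harmony(violations, weights):
    return -sum(weights[name] * count for name, count in violations.items())

def hg_winner(candidates, weights):
    return max(candidates, key=lambda cand: harmony(candidates[cand], weights))

# Hypothetical weights: MAX (5) outweighs *B (3) alone, but not *B + ID (3 + 3).
gang = {"*A": 10, "MAX": 5, "*B": 3, "ID(A→B)": 3}
# Same weights, but with ID too light for the gang to reach MAX (3 + 1 < 5).
no_gang = {"*A": 10, "MAX": 5, "*B": 3, "ID(A→B)": 1}

a_gang, b_gang = hg_winner(FROM_A, gang), hg_winner(FROM_B, gang)
a_no_gang = hg_winner(FROM_A, no_gang)
print(a_gang, b_gang, a_no_gang)
```

Under the first weighting /A/ deletes while /B/ surfaces faithfully, which is the deletion-saltation pattern; under the second, /A/ simply maps to [B].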
SLIDE 30
4.3 Deletion saltation in HG: Implications for models of features and faithfulness
SLIDE 31 (24) Gang effects for deletion saltation depend on models of features and faithfulness
(a) Is featural faithfulness regulated via IDENT, or MAX/DEP? (McCarthy & Prince 1995)
- IDENT[F]: Corresponding segments must match for [F]
- Satisfied under deletion: no corresponding segment = no mismatch
- MAX[F]/DEP[F]: Features themselves are in a correspondence relation and incur violations of MAX[F] when deleted and DEP[F] when inserted
(b) Are features binary, or privative?
- If features are binary, (for example) [ɡ] is [+voi] while [k] is [–voi]
- If features are privative, (for example) [ɡ] is [voi] while [k] is unspecified (or [k] is [spread glottis] while [ɡ] is unspecified, etc.)
SLIDE 32 (25) With IDENT[±F] (as above): Unshared violations are asymmetric, so gang effects are possible
- [k] incurs a violation of IDENT[±voi] that the deletion candidate does not share
/dɪftɔŋɡ/        *NG#   MAX-seg   *COMPLEXCODA   IDENT[±voi]
→ (i) dɪftɔŋ_           –1                       no violation!
  (ii) dɪftɔŋk                    –1             –1
(26) With MAX/DEP[F] but binary features [±F]: Unshared violations are asymmetric, so gang effects are possible
- [k] incurs a violation of DEP[–voi] that the deletion candidate does not share
/dɪftɔŋɡ/        *NG#   MAX-other   *COMPLEXCODA   DEP[–voi]   MAX[+voi]
→ (i) dɪftɔŋ_           –1                                     –1 | shared
  (ii) dɪftɔŋk                      –1             –1          –1 | shared
- MAX-other encapsulates MAX-seg and all relevant MAX[F] constraints
- Note: this shows that the formal definition of ATO (Pater 2009, 2016) needs refining
SLIDE 33 (27) With MAX/DEP[F] and privative features [F], if [F] is deleted: Unshared violations are symmetric—gang effects not possible
- No DEP[F] violation for [k] this time, because no [–voi] feature to be added
(If feature change involves [F] insertion rather than [F] deletion, there is still an ATO)
/dɪftɔŋɡ/        *NG#   MAX-other   *COMPLEXCODA   MAX[voi]
→ (i) dɪftɔŋ_           –1                         –1 | shared
  (ii) dɪftɔŋk                      –1             –1 | shared
SLIDE 34 (28) Summary: HG makes deletion saltation easy to model, under certain assumptions
(a) An ATO arises for deletion saltation when a feature-change candidate violates faithfulness constraints that a deletion candidate does not violate
- True with IDENT: Only deletion satisfies IDENT[±F]
- True with binary [±F]: Only deletion satisfies DEP[–αF]
- Not true with MAX[F]/DEP[F] and privative [F]: MAX[F] violation is shared by segment-deletion and feature-deletion candidates, so no ATO
(b) With an ATO, deletion-saltation candidate is a potential winner in HG
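The comparison across (25) to (27) can be summarized by cancelling shared violations and counting what remains for each candidate. The sketch below treats "the feature-change candidate has more unshared violations than the deletion candidate" as a rough stand-in for an ATO; this is a simplification of the informal description in (8), not Pater's formal definition, and the model labels are names invented for the example.

```python
from collections import Counter

def unshared(cand_a, cand_b):
    """Cancel shared violations; return what is left for each candidate."""
    a, b = Counter(cand_a), Counter(cand_b)
    return a - b, b - a  # Counter subtraction keeps only positive counts

def asymmetric(deletion, feature_change):
    """Rough ATO proxy: feature-change candidate has more unshared violations."""
    left_del, left_change = unshared(deletion, feature_change)
    return sum(left_change.values()) > sum(left_del.values())

# Violation profiles as in tableaux (25), (26), (27): (deletion, devoicing).
MODELS = {
    "IDENT, binary": ({"MAX-seg": 1},
                      {"*COMPLEXCODA": 1, "IDENT[±voi]": 1}),
    "MAX/DEP, binary": ({"MAX-other": 1, "MAX[+voi]": 1},
                        {"*COMPLEXCODA": 1, "DEP[–voi]": 1, "MAX[+voi]": 1}),
    "MAX/DEP, privative": ({"MAX-other": 1, "MAX[voi]": 1},
                           {"*COMPLEXCODA": 1, "MAX[voi]": 1}),
}

ato_by_model = {label: asymmetric(deletion, change)
                for label, (deletion, change) in MODELS.items()}
print(ato_by_model)
```

Only the privative MAX/DEP model leaves a 1:1 trade-off after cancellation, matching the summary in (28).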
SLIDE 35
5. Conclusions and implications
SLIDE 36 (29) Results: HG and saltation
(a) HG cannot model feature-scale saltation
(b) HG can model deletion saltation (under appropriate models of features, faithfulness)
(30) Hayes & White (2015) propose that there is a learning bias against saltation
(a) Difficult to learn in the laboratory
(b) May be diachronically unstable
- But the cases they consider all involve feature-scale saltation
(31) H&W implement their learning bias in OT by excluding any formalism that makes saltation straightforward to model (such as LCC)
- Results here show that HG is compatible with this proposal, at least for feature-scale saltation
SLIDE 37
(32) How can we account for the fact that feature-scale saltation is attested?
(a) Diachronic origin involves multiple stages: no single sound change creates a saltation pattern (Hayes & White 2015, citing Minkova 1993, Labov 1994, Lass 1997)
(b) Synchronic grammar must have a way to overcome the anti-saltation learning bias, given appropriate data (see Hayes & White 2015 for one proposal)
SLIDE 38 (33) Distinguishing the two saltation types creates an arena for testing the predictions of HG vs. OT
(a) Feature-scale and deletion saltation are equivalent in OT
- Both are possible with LCC, impossible without LCC
(b) If we find empirical differences between the two saltation types, this would be support for HG over OT
SLIDE 39 (34) Is deletion saltation typologically rare (even compared to feature-scale saltation)?
- I know of only the Standard German case discussed here
- No deletion saltation examples are discussed by Hayes & White (2015)
- But in any case, typological frequency is influenced by factors other than
grammar-internal biases...
SLIDE 40 (35) Key empirical question for future research:
- Is deletion saltation easier to learn than feature-scale saltation?
(a) If yes, this would be empirical support for:
- HG (vs. OT without LCC)
- IDENT[±F] and/or binary features (vs. MAX/DEP[F] with privative [F])
(b) If no, at least one of the above assumptions should be reconsidered
Acknowledgements
Many thanks to Elliott Moreton, Armin Mester, Junko Ito, and students in Phonology 3 at UC Santa Cruz (Winter 2017) for comments and discussion.
SLIDE 41
References
Hayes, Bruce, & James White. 2015. Saltation and the P-map. Phonology 32: 267–302.
Ito, Junko, & Armin Mester. 2003. On the sources of opacity in OT: Coda processes in German. In Caroline Féry & Ruben van de Vijver (eds.), The syllable in Optimality Theory, 271–303. Cambridge: Cambridge University Press.
Kiparsky, Paul. 1993. Blocking in non-derived environments. In Sharon Hargus & Ellen Kaisse (eds.), Studies in Lexical Phonology, 277–31. Phonetics and Phonology 4. San Diego: Academic Press.
Lass, Roger. 1997. Historical linguistics and language change. Cambridge: Cambridge University Press.
Legendre, Géraldine, Yoshiro Miyata, & Paul Smolensky. 1990. Can connectionism contribute to syntax?: Harmonic grammar, with an application. Report CU-CS-485-90. Computer Science Department, U. of Colorado at Boulder.
Łubowicz, Anna. 2002. Derived environment effects in Optimality Theory. Lingua 112: 243–280.
McCarthy, John. 2003. Comparative markedness. Theoretical Linguistics 29: 1–51.
McCarthy, John, & Alan Prince. 1995. Faithfulness and reduplicative identity. In Jill Beckman, Suzanne Urbanczyk, & Laura Walsh Dickey (eds.), Papers in Optimality Theory, 249–384. UMOP 18.
Minkova, Donka. 1993. On leapfrogging in historical phonology. In Jaap van Marle (ed.), Historical Linguistics 1991, 211–228. Amsterdam: John Benjamins.
Pater, Joe. 2009. Weighted constraints in generative linguistics. Cognitive Science 33: 999–1035.
Pater, Joe. 2016. Universal Grammar with weighted constraints. In John McCarthy & Joe Pater (eds.), Harmonic Grammar and Harmonic Serialism. London: Equinox Press.
Prince, Alan & Paul Smolensky. 1993. Optimality Theory: Constraint interaction in generative grammar. Ms., Rutgers University & University of Colorado, Boulder. [Published 2004, Malden, Mass: Blackwell.]
Smolensky, Paul. 1995. On the internal structure of the constraint component Con of UG. Handout from colloquium presented at UCLA. Rutgers Optimality Archive #86 [http://roa.rutgers.edu].