Roberts/HT 2014 Week 3


Extensions of OT:

A problem: opacity

There are four possible interactions between rules that affect the same structures (terminology due to Kiparsky 1968):


Feeding: the first rule creates the structure the second rule targets.

Rule                 Output
E → D / A _          CAD
A → B / C _ D        CBD

Bleeding: the first rule destroys the structure the second rule targets.

Rule                 Output
D → ∅ / _#           CA
A → B / C _ D        (does not apply)

Counterfeeding: the inverse of feeding. The second rule creates the target “too late” for the first rule to affect it. The first rule underapplies.

Rule                 Output
A → B / C _ D        (does not apply)
E → D / A _          CAD

Counterbleeding: the inverse of bleeding. The second rule destroys the target “too late” to prevent the first rule from applying. The first rule overapplies.

Rule                 Output
A → B / C _ D        CBD
D → ∅ / _#           CB
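The four orderings can be checked mechanically. What follows is a minimal sketch, assuming the toy rules are encoded as regular-expression substitutions and that the underlying forms are /CAE/ for the (counter)feeding pair and /CAD/ for the (counter)bleeding pair. The names `derive`, `E_TO_D`, `A_TO_B` and `DROP_D` are illustrative only.

```python
import re

# Toy rewrite rules as (pattern, replacement) pairs.
E_TO_D = (r"(?<=A)E", "D")       # E → D / A _
A_TO_B = (r"(?<=C)A(?=D)", "B")  # A → B / C _ D
DROP_D = (r"D$", "")             # D → ∅ / _ #


def derive(underlying, rules):
    """Apply each rule once, in the order given, and return every step."""
    steps = [underlying]
    for pattern, replacement in rules:
        steps.append(re.sub(pattern, replacement, steps[-1]))
    return steps


# Underlying forms are assumptions of this sketch.
feeding = derive("CAE", [E_TO_D, A_TO_B])
bleeding = derive("CAD", [DROP_D, A_TO_B])
counterfeeding = derive("CAE", [A_TO_B, E_TO_D])
counterbleeding = derive("CAD", [A_TO_B, DROP_D])

for name, steps in [("feeding", feeding), ("bleeding", bleeding),
                    ("counterfeeding", counterfeeding),
                    ("counterbleeding", counterbleeding)]:
    print(name, " > ".join(steps))
```

Swapping the order of the same two rules is all that separates each transparent interaction from its opaque counterpart.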

Classic OT (OT as originally formulated by Prince and Smolensky 1993) can only model the transparent interactions: feeding and bleeding.

Nevertheless, counterbleeding and counterfeeding are attested, e.g. in American English, where Pre-Fortis Clipping and (in Canada) Diphthong Raising are counterbled by [t]-Flapping.

Pre-Fortis Clipping: a vowel/diphthong preceding a voiceless (fortis) consonant is realized as shorter than one preceding a voiced (lenis) consonant. ride → [ɹɑɪd]; write → [ɹɑ̆ɪt].
Canadian Diphthong Raising: a historical development of Pre-Fortis Clipping. Diphthongs in the fortis context have higher on-glides: write → [ɹəɪt] (see Chomsky and Halle 1968: 342).
[t]-Flapping: the coronal stops /t/ and /d/ are realized as [ɾ] when between vowels in a syllable that is not foot-initial (roughly).

The interaction between these rules is opaque in that rider [ɹɑɪɾɚɹ] and writer [ɹɑ̆ɪɾɚɹ]/[ɹəɪɾɚɹ] are not homophones. Pre-Fortis Clipping or Canadian Raising applies, even though the [t] that conditions them is eliminated by Flapping.

If we try to model this interaction in Classic OT, we arrive at a harmonic bounding condition: [ɹəɪɾɚɹ] cannot possibly win.


ClipDiph: Requires the distance between elements of diphthongs occurring before voiceless obstruents to be minimal. (Bermúdez-Otero, 2003, p. 8)
ClearDiph: Requires the distance between elements of diphthongs to be maximal. (Bermúdez-Otero, 2003, p. 8)
*ˈVtV: Penalises the occurrence of tV when it is preceded by a stressed vowel.
Input: /ɹɑɪtɚɹ/ ClipDiph *ˈVtV ClearDiph Ident
a. ɹɑɪtɚɹ *! *
b. ɹəɪtɚɹ *! *
c. 💣 ɹɑɪɾɚɹ *
d. ɹəɪɾɚɹ *! **

Note that candidate c's violations are a proper subset of those of candidate d. Candidate d, which is the attested form, therefore cannot win under any ranking of these constraints.
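The harmonic bounding claim can be verified by brute force over all 4! rankings. The sketch below assumes a reading of the tableau in which each candidate's marks are assigned to the constraints it plausibly violates; the profile for candidate b in particular is an assumption. Candidates are keyed by their tableau letters, and `optimal` and `bounds` are illustrative names.

```python
from itertools import permutations

CONSTRAINTS = ["ClipDiph", "*VtV", "ClearDiph", "Ident"]

# Violation counts; constraints not listed have zero violations.
# Candidate b's profile is an assumed reading of the tableau.
VIOLATIONS = {
    "a": {"ClipDiph": 1, "*VtV": 1},               # ɹɑɪtɚɹ
    "b": {"*VtV": 1, "ClearDiph": 1, "Ident": 1},  # ɹəɪtɚɹ
    "c": {"Ident": 1},                             # ɹɑɪɾɚɹ
    "d": {"ClearDiph": 1, "Ident": 2},             # ɹəɪɾɚɹ
}


def optimal(ranking):
    """Return the candidate whose violation vector is lexicographically least."""
    def vector(cand):
        return tuple(VIOLATIONS[cand].get(c, 0) for c in ranking)
    return min(VIOLATIONS, key=vector)


def bounds(x, y):
    """True if x is never worse than y and differs from it somewhere."""
    vx = [VIOLATIONS[x].get(c, 0) for c in CONSTRAINTS]
    vy = [VIOLATIONS[y].get(c, 0) for c in CONSTRAINTS]
    return all(p <= q for p, q in zip(vx, vy)) and vx != vy


possible_winners = {optimal(r) for r in permutations(CONSTRAINTS)}
print(sorted(possible_winners))
print(bounds("c", "d"))
```

Because c bounds d, the result for d does not depend on how the other candidates' marks are read: d is absent from the set of possible winners.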

Proposed solutions

Sympathy Theory (McCarthy 1999; summary in Kager 1999:387‒392)

Sympathy theory handles opacity by enforcing similarity between pairs of output candidates.

One faithfulness constraint is used as the selector constraint (marked ✯). This constraint selects the sympathetic candidate (marked ❀), which is the most harmonic candidate that has no violations of the selector constraint.

Similarity between the other candidates and the sympathetic candidate is enforced by sympathy constraints, which follow the same Max, Dep and Ident schemas as IO-faithfulness constraints, and are also marked with ❀.

Input: /ɹɑɪtɚɹ/ ClipDiph *ˈVtV ❀Ident-V ClearDiph ✯Ident-C Ident-V
a. ɹɑɪtɚɹ *! * *
b. ɹəɪtɚɹ *! *
c. ɹɑɪɾɚɹ *! *
d. ɹəɪɾɚɹ * * *
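The two-step logic of Sympathy (first pick ❀, then evaluate everything with the sympathy constraint in place) can be sketched as follows. This is a toy model under stated assumptions: each candidate is reduced to its diphthong and medial consonant, every constraint is binary, and ❀ is chosen using the hierarchy without the sympathy constraint. All function names are hypothetical.

```python
# Candidates reduced to (diphthong, medial consonant); all else is held constant.
INPUT = ("ɑɪ", "t")
CANDIDATES = {
    "a": ("ɑɪ", "t"),  # ɹɑɪtɚɹ
    "b": ("əɪ", "t"),  # ɹəɪtɚɹ
    "c": ("ɑɪ", "ɾ"),  # ɹɑɪɾɚɹ
    "d": ("əɪ", "ɾ"),  # ɹəɪɾɚɹ
}


def clip_diph(c):
    return int(c == ("ɑɪ", "t"))   # unraised diphthong before fortis [t]


def no_vtv(c):
    return int(c[1] == "t")        # *ˈVtV


def clear_diph(c):
    return int(c[0] == "əɪ")       # raised diphthong


def ident_c(c):
    return int(c[1] != INPUT[1])   # the selector constraint ✯Ident-C


def ident_v(c):
    return int(c[0] != INPUT[0])


def best(names, ranking):
    """Most harmonic candidate among `names` under `ranking`."""
    return min(names, key=lambda n: tuple(f(CANDIDATES[n]) for f in ranking))


def sympathy_eval():
    base = [clip_diph, no_vtv, clear_diph, ident_c, ident_v]
    # Step 1: ❀ is the most harmonic candidate that obeys the selector.
    obeying = [n for n, c in CANDIDATES.items() if ident_c(c) == 0]
    flower = best(obeying, base)

    # Step 2: ❀Ident-V penalises a diphthong that differs from ❀'s.
    def symp_ident_v(c):
        return int(c[0] != CANDIDATES[flower][0])

    full = [clip_diph, no_vtv, symp_ident_v, clear_diph, ident_c, ident_v]
    return flower, best(list(CANDIDATES), full)


flower, winner = sympathy_eval()
print(flower, winner)
```

In this sketch candidate b is selected as ❀, and its raised diphthong is what allows d to beat c in the second step.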

Output-output correspondence

Opacity effects are more often (though not exclusively) to be found in morphologically complex environments. It has been suggested that the grammar should enforce phonological similarity between morphologically related forms (see e.g. the notion of uniform exponence in Kenstowicz 1996).

The theory of output-output correspondence in Benua (1997) formalises this notion in OT. The claim is that for every morphological alternation in which one form is phonologically opaque, there is a transparent base form, together with a family of OO-faithfulness constraints enforcing similarity to it.

Input: /ɹɑɪtɚɹ/ ClipDiph *ˈVtV OO-Ident-V ClearDiph Ident-C Ident-V
a. ɹəɪt ~ ɹɑɪtɚɹ *! * *
b. ɹəɪt ~ ɹəɪtɚɹ *! *
c. ɹəɪt ~ ɹɑɪɾɚɹ *! *
d. ɹəɪt ~ ɹəɪɾɚɹ * * *

Note, however, that the opaque interaction between Raising and Flapping is not confined to morphologically complex environments: we also have mitre [məɪɾɚɹ]!

Stratal Optimality Theory…

Stratal OT (Bermúdez-Otero 1999, Kiparsky 2000) imports the mechanism for dealing with the morphology-phonology interface from Lexical Phonology and Morphology into Optimality Theory (in fact, Kiparsky’s original name for it was LPM-OT).

Lexical Phonology deals with morphological effects by incorporating the notion of domain. Each domain in LPM has its own co-phonology, with its own set of ordered rewrite rules.


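Stratal OT assumes three domains: the stem level, the word level and the phrase level. For each domain there is a separate Optimality-Theoretic co-phonology, with its own constraint ranking. The output of each level is the input to the next, so there are two (and only two!) intermediate representations between UR and SR.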

Raising and Flapping: once more with counterbleeding

Stem-level constraint ranking:

Input: /ɹɑɪt/ ClipDiph ClearDiph Ident-V
a. ɹɑɪt *! *
b. ɹəɪt *

The suffix -er is appended at the word level. We assume that the word-level co-phonology outputs [ɹəɪtɚɹ], which is taken as the input to the phrase-level co-phonology, along with any other words in the utterance.

Phrase-level constraint ranking:

Ident-V, *ˈVtV » Ident-C
Input: /ðə ɹəɪtɚɹ əv bʊks/   | Ident-V | *ˈVtV | Ident-C
a. ðə ˌɹɑɪtɚɹ əv ˈbʊks       | *!      | *     |
b. ðə ˌɹəɪtɚɹ əv ˈbʊks       |         | *!    |
c. ðə ˌɹɑɪɾɚɹ əv ˈbʊks       | *!      |       |
d. ðə ˌɹəɪɾɚɹ əv ˈbʊks       |         |       | *
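The stratal derivation amounts to chaining Classic OT evaluations, each one faithful to the output of the previous level rather than to the UR. The sketch below is a toy model under stated assumptions: forms are reduced to (diphthong, medial consonant), constraints are binary, the candidate sets are supplied by hand, and the word level is assumed to change nothing, as in the text. The function `evaluate` is a hypothetical helper.

```python
from itertools import product


def evaluate(inp, candidates, ranking):
    """Classic OT evaluation for one stratum: lexicographic minimum of violations."""
    constraints = {
        "ClipDiph": lambda c: int(c == ("ɑɪ", "t")),
        "ClearDiph": lambda c: int(c[0] == "əɪ"),
        "*VtV": lambda c: int(c[1] == "t"),
        # Faithfulness is relative to this stratum's input, not to the UR.
        "Ident-V": lambda c: int(c[0] != inp[0]),
        "Ident-C": lambda c: int(c[1] != inp[1]),
    }
    return min(candidates, key=lambda c: tuple(constraints[k](c) for k in ranking))


VOWELS, CONSONANTS = ("ɑɪ", "əɪ"), ("t", "ɾ")
ALL_FORMS = list(product(VOWELS, CONSONANTS))
PHRASE_RANKING = ["Ident-V", "*VtV", "Ident-C"]

# Stem level, input /ɹɑɪt/: only the diphthong varies among the candidates.
stem_out = evaluate(("ɑɪ", "t"), [(v, "t") for v in VOWELS],
                    ["ClipDiph", "ClearDiph", "Ident-V"])

# Word level: -er is added; assumed here to leave the form unchanged.
word_out = stem_out

# Phrase level: the input is the word-level output.
phrase_out = evaluate(word_out, ALL_FORMS, PHRASE_RANKING)

# For contrast: the phrase-level ranking applied directly to the UR.
flat_out = evaluate(("ɑɪ", "t"), ALL_FORMS, PHRASE_RANKING)

print(stem_out, phrase_out, flat_out)
```

The contrast between `phrase_out` and `flat_out` shows where the opacity comes from: the phrase-level Ident-V protects the raised diphthong only because raising is already present in its input.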

Predictions of Stratal OT

Organising our model of phonology in this way has certain empirical consequences.

Opacity only possible between levels

Each level’s co-phonology is a Classic OT machine. Classic OT cannot handle opacity; therefore, generalisations with the same domain cannot interact opaquely.

So far this prediction holds up: wherever we observe opacity, we find that the rule that applies first is sensitive to more kinds of boundary than the ones that apply later.

Factorial typology now factorial-cubed typology

To specify a grammar in full, you need to know three constraint rankings.

For each level (assuming they all use the same Con), there are n! possible rankings.

Rankings are formally independent of one another, so there are (n!)³ possible combinations of stem-level, word-level and phrase-level rankings.

This seems like a lot, but there are two reasons not to be worried:

  1. Depending on how many new constraints it adds, a theory that tries to handle opacity in a single level may end up with more possible rankings than Stratal OT. The factorial series grows faster than the exponential series, so (3n)! is actually bigger than (n!)³ for values of n > 1.

    Therefore, a method of dealing with opacity that involves multiplying |Con| by more than about 2.35 will actually predict more possible grammars than Stratal OT.

  2. In any case, although there is no formal coupling between the levels in Stratal OT, we tend to find that, diachronically, the word-level ranking arises out of previous generations’ phrase-level rankings, and the stem-level ranking out of their word-level rankings. This restricts the number of possible Stratal OT grammars we expect to find in practice even further, and is known as the life cycle of phonological generalisations.
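The comparison in point 1 is easy to check numerically. The following sketch counts ranking combinations with exact integer arithmetic and finds, for a given |Con| = n, the smallest single-level constraint set whose factorial typology exceeds that of three independent strata. The function names are illustrative.

```python
from math import factorial


def stratal_grammars(n, levels=3):
    """Ranking combinations for `levels` independent strata over n constraints."""
    return factorial(n) ** levels


def break_even_size(n, levels=3):
    """Smallest m such that m! exceeds (n!)**levels."""
    target = stratal_grammars(n, levels)
    m = n
    while factorial(m) <= target:
        m += 1
    return m


for n in (5, 10, 20, 50):
    m = break_even_size(n)
    print(f"n = {n}: single-level Con of size {m} (multiplier {m / n:.2f})")
```

In this sketch the break-even multiplier is not a constant: it rises with n (2.0 at n = 5, 2.1 at n = 10), so any single figure should be read as an approximation for a particular size of Con.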

Diachronic predictions — the life cycle

Over time, phrase-level processes tend to become word-level, and word-level processes tend to become stem-level. This observation predates the terminology I’m using, going back at least as far as Baudouin de Courtenay (1895).

An example of this in action is Latin rhotacism (Roberts 2012):

Phrase-level: intervocalic /s/ tends to be voiced (not directly attested in Latin, but a common precursor to rhotacism in other languages; see Catford 2001).
Word-level generalisation: /s/ surfaces as [r] between vowels. Endings containing [VsV] sequences become lexically [VrV] at this stage, e.g. -ārum, -ōrum, -ērum from earlier [aːzom] etc.
Stem-level generalisation: rhotacism becomes sensitive to morpheme boundaries, e.g. in de-sili-ō ‘I jump down’, ni-sī ‘unless’. This replaces the earlier word-level generalisation (cf. Touratier 1975).
Lexical listing: finally, the domain of the generalisation becomes impossible to learn (e.g. because loanwords like basis ‘pedestal’ and cisium ‘cabriolet’ obscure its environment beyond recognition). At this point, the generalisation is lexicalised, i.e. surface [r] is taken to reflect underlying /r/ wherever it is found.


Baudouin de Courtenay, Jan N. I. (1895) Versuch einer Theorie phonetischer Alternationen: ein Kapitel aus der Psychophonetik. Strasbourg: Trübner.

Benua, Laura (1997) Transderivational identity: phonological relations between words. PhD thesis, University of Massachusetts at Amherst.

Bermúdez-Otero, Ricardo (1999) Constraint interaction in language change: quantity in English and Germanic. PhD thesis, University of Manchester and Universidad de Santiago de Compostela.

Catford, J. C. (2001) “On Rs, rhotacism and paleophony.” Journal of the International Phonetic Association 31:171‒186.

Chomsky, Noam and Halle, Morris (1968) The sound pattern of English. New York: Harper and Row.

Kager, René (1999) Optimality Theory. Cambridge University Press.

Kenstowicz, Michael (1996) “Base-Identity and Uniform Exponence: alternatives to cyclicity.” in Durand, Jacques and Bernard Laks (eds.) Current trends in phonology: models and methods. University of Salford: ESRI.

Kiparsky, Paul (1968) “Linguistic universals and linguistic change.” in Bach, Emmon W., Robert T. Harms and Charles J. Fillmore (eds.) Universals in Linguistic Theory. London: Holt, Rinehart and Winston. pp. 170‒202.

Kiparsky, Paul (2000) “Opacity and Cyclicity.” The Linguistic Review 17:351‒365.

McCarthy, John J. (1999) “Sympathy and phonological opacity.” Phonology 16:331‒399.

Prince, Alan S. and Smolensky, Paul (1993) [2004] Optimality Theory: constraint interaction in generative grammar. Oxford: Blackwell.

Roberts, Philip J. (2012) “Latin rhotacism: a case study in the life cycle of phonological processes.” Transactions of the Philological Society 110:80‒93.

Smolensky, Paul (1995) On the structure of the constraint component Con of UG. Handout of talk at UCLA, 4/7/95. Available from the Rutgers Optimality Archive as ROA-86.

Touratier, Christian (1975) “Rhotacisme Synchronique du latin classique et Rhotacisme diachronique.” Glotta 53:246‒281.