SLIDE 1

Learning morphology and phonology

John Goldsmith University of Chicago MoDyCo/Paris X

SLIDE 2

Learning morphology and phonology

John Goldsmith University of Chicago MoDyCo/Paris X

All the particular properties that give a language its unique phonological character can be expressed in numbers.

• Nicolai Trubetzkoy, Grundzüge der Phonologie

SLIDE 3

Acknowledgments

My thanks for many conversations to Aris Xanthos, Yu Hu, Mark Johnson, Carl de Marcken, Bernard Laks, Partha Niyogi, Jason Riggle, Irina Matveeva, and others…

SLIDE 4

Roadmap

• 1. Unsupervised word segmentation
• 2. MDL: Minimum Description Length
• 3. Unsupervised morphological analysis (model; heuristics)
• 4. Elaborating the morphological model
• 5. Improving the phonological model: categories (consonants/vowels); vowel harmony
• 6. What kind of linguistics is this?
SLIDE 5
• 0. Why mathematics? Why phonology?

One answer: mathematics provides an alternative to cognitivism, the view that linguistics is a cognitive science. Cognitivism is the latest form, in linguistics, of psychologism, a view that has faded in and out of favor in all of the social sciences for the last 150 years: the view that the way to understand x is to understand how people analyze x.

SLIDE 6
• This work provides an answer to the challenge: if linguistics is not a science of what goes on in a speaker's head, then what is it a science of?
SLIDE 7
  • 1. Word segmentation

The inventory of words in a language is a major component of the language, and very little of it (if any) can be attributed to universal grammar, or be viewed as part of the essence of language. So how is it learned?

SLIDE 8
  • 1. Word segmentation

Reporting work by Michael Brent and by Carl de Marcken at MIT in the mid-1990s.

SLIDE 9

Okay, Ginger! I’ve had it! You stay out of the garbage! Understand, Ginger? Stay out of the garbage, or else! Blah blah, Ginger! Blah blah blah blah blah blah Ginger blah blah blah blah blah blah blah…

SLIDE 10
  • 1. Word segmentation
• Strategy: We assume that a speaker has a lexicon, with a probability distribution assigned to it, and that the parse assigned to a string is the parse with the greatest probability.
• That is already a (partial) hypothesis about word-parsing: given a lexicon, choose the parse with the greatest probability.
• It can also serve as part of a hypothesis about lexicon-selection.

SLIDE 11

Assume an alphabet A. An utterance is a string of letters, an element of A*; a corpus is a set of utterances. The language model used is a multigram model (variable-length words). A lexicon L is a pair of objects (L, p_L): a set L ⊂ A*, and a probability distribution p_L defined on A* for which L is the support of p_L. We call the elements of L the words.

• We insist that A ⊂ L: all individual letters are words.
• We define a sentence as a member of L*.
• Each sentence can be uniquely associated with an utterance (an element of A*) by a mapping F:

SLIDE 12

[Diagram: the mapping F from L* (all strings of words; the lexicon side) to A* (all strings of letters; the alphabet side).]

SLIDE 13

[Diagram: F maps the sentence "au début était le verbe" (in L*, all strings of words) to the utterance "audébutétaitleverbe" (in A*, all strings of letters).]

SLIDE 14

[Diagram: F maps the sentence S = "au début était le verbe" to the utterance U = "audébutétaitleverbe".] If F(S) = U, then we say that S is a parse of U.

SLIDE 15
• The distribution p over L is extended to a distribution p* over L* in the natural way:
  – We assume a probability distribution λ over sentence length l.
• If S is a sentence of length l = |S|, then

$$p^*(S) = \lambda(l)\,\prod_{i=1}^{l} p(S[i])$$

SLIDE 16

Now we can define the probability of a corpus, given a lexicon. U is an utterance; L, a lexicon. You might think it should be the sum of the probabilities of the parses of U; that would be reasonable. Calculating either the max or the sum requires dynamic programming techniques.

$$p(U \mid L) = \max_{q \in parses(U)} pr(q)$$

or, summing instead:

$$p(U \mid L) = \sum_{q \in parses(U)} pr(q)$$
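Both quantities yield to the same left-to-right dynamic program. A minimal sketch in Python, assuming a toy lexicon dict from words to probabilities (the words and numbers are illustrative, not from the talk); replacing the max over j by a sum gives the sum over parses:

```python
import math

def best_parse(u, lexicon):
    """Viterbi-style DP: the highest-probability parse of utterance u.

    lexicon maps each word to its probability p_L(w). Every single
    letter is assumed to be in the lexicon, so a parse always exists.
    A parse is scored by the product of its word probabilities (the
    length distribution lambda is omitted for brevity).
    """
    n = len(u)
    best = [(-math.inf, 0)] * (n + 1)      # best[i] = (logprob, backpointer)
    best[0] = (0.0, 0)
    for i in range(1, n + 1):
        for j in range(i):
            w = u[j:i]
            if w in lexicon and best[j][0] > -math.inf:
                score = best[j][0] + math.log(lexicon[w])
                if score > best[i][0]:
                    best[i] = (score, j)
    words, i = [], n                        # recover words from backpointers
    while i > 0:
        j = best[i][1]
        words.append(u[j:i])
        i = j
    return best[n][0], words[::-1]

lex = {c: 0.01 for c in set("audébutétaitleverbe")}   # letters as words
lex.update({"au": 0.1, "début": 0.05, "était": 0.05, "le": 0.1, "verbe": 0.05})
print(best_parse("audébutétaitleverbe", lex))
```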

SLIDE 17

Best lexicon for a corpus?

You might expect that the best lexicon for a corpus would be the lexicon that assigns the highest probability to the joint object which is the corpus C:

$$\hat{L} = \arg\max_{L \subset A^*} pr(C \mid L)$$

But no: such a lexicon would simply consist of all the members of the corpus. A sentence is its own best probability model.

SLIDE 18
• 2. Minimum Description Length (MDL) analysis

MDL is an approach to statistical analysis that assumes that, prior to analyzing any data, we have a universe of possible models (= UG); each element G ∈ UG is a probabilistic model for the set of possible corpora; and a prior distribution π(·) has been defined over UG, based on the length of the shortest binary encoding of each G, where the encoding method has the prefix property:

$$\pi(G) = 2^{-length(Enc(G))}$$

SLIDE 19

2.1 Bayes’ rule

$$pr(G \mid C) = \frac{pr(C \mid G)\,\pi(G)}{pr(C)} = \frac{p_G(C)\,\pi(G)}{\int_{g \in UG} p_g(C)\,\pi(g)\,dg}$$

SLIDE 20

$$\log pr(G \mid C) = \log p_G(C) - H(G) - K$$

where $\log p_G(C)$ is the log probability of the corpus under grammar G, $H(G)$ is the length of G's encoding, and K is a constant (this is Bayes' rule from the previous slide, taken in logs).

SLIDE 21

$$\log pr(G \mid C) = \log p_G(C) - H(G) - K$$

We already figured out how to compute the first term, given G = (L, p). For the second term, the length of G's encoding:

$$H(G) = \sum_{w \in G} |w| \cdot \log 26$$

SLIDE 22

How one talks in MDL…

It is sensible to call $-\log prob(X) = \log\frac{1}{prob(X)}$ the information content of an item X, and also to refer to that quantity as the optimal compressed length of X. In light of that, we can call the following quantity the description length of corpus C, given grammar G:

$$-\log prob(C \mid G) + length(Enc(G))$$

= compressed length of corpus + compressed length of grammar
= −log prob(G | C) + a constant

SLIDE 23

How one talks in MDL…

The description length of corpus C, given grammar G: −log prob(C | G) + length(Enc(G)) = compressed length of corpus + compressed length of grammar = −log prob(G | C) + a constant = the evaluation metric of early generative grammar.

SLIDE 24

MDL dialect

• MDL analysis: find the grammar G for which the total description length is the smallest: the compressed length of the data, given G, plus the compressed length of G.

SLIDE 25

Essence of MDL

SLIDE 26

2.2 Search heuristic

Easy! Start small: the initial lexicon is A. If l1 and l2 are in L, and l1.l2 occurs in the corpus, add l1.l2 to the lexicon if that modification decreases the description length. Similarly, remove l3 from the lexicon if that decreases the description length.
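A minimal sketch of that loop in Python. The description_length here is a crude stand-in (greedy longest-match parsing, log2(26) bits per lexicon letter), and the removal step is omitted; it illustrates the shape of the heuristic, not Linguistica's actual implementation:

```python
import math
from collections import Counter

def description_length(corpus, lexicon):
    """DL = compressed length of corpus + compressed length of lexicon.
    Greedy longest-match parsing stands in for the Viterbi parse."""
    counts = Counter()
    for u in corpus:
        i = 0
        while i < len(u):
            for k in range(len(u) - i, 0, -1):
                if u[i:i+k] in lexicon:
                    counts[u[i:i+k]] += 1
                    i += k
                    break
    z = sum(counts.values())
    corpus_bits = -sum(c * math.log2(c / z) for c in counts.values())
    lexicon_bits = sum(len(w) for w in lexicon) * math.log2(26)
    return corpus_bits + lexicon_bits

def grow_lexicon(corpus, alphabet):
    lexicon = set(alphabet)                 # start small: letters only
    improved = True
    while improved:
        improved = False
        dl = description_length(corpus, lexicon)
        for l1 in sorted(lexicon):          # try concatenations l1.l2
            for l2 in sorted(lexicon):
                cand = l1 + l2
                if cand not in lexicon and any(cand in u for u in corpus):
                    trial = description_length(corpus, lexicon | {cand})
                    if trial < dl:          # keep it only if DL goes down
                        lexicon.add(cand)
                        dl, improved = trial, True
    return lexicon

corpus = ["thedogjumps", "thedogsjump", "thedogjumped"] * 40
print(sorted(grow_lexicon(corpus, set("abcdefghijklmnopqrstuvwxyz")), key=len))
```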

SLIDE 27

MDL: tells us when to stop growing the lexicon

If we search for words in a bottom-up fashion, we need a criterion for when to stop making bigger pieces. MDL plays that role in this approach.

SLIDE 28

A little example to fix ideas…

Lexicon 1: {a, b, … s, t, u, … z}
Lexicon 2: {a, b, … s, t, th, u, … z}

How do these two multigram models of English compare? Why is Number 2 better?

SLIDE 29

Notation: [t] = count of t; [h] = count of h; [th] = count of th; Z = total number of words (tokens):

$$Z = \sum_{l \in lexicon} [l]$$

Log probability of the corpus:

$$\log pr(C) = \sum_{m \in lexicon} [m]\,\log\frac{[m]}{Z}$$

SLIDE 30

Log probability of the corpus under each lexicon:

All letters are separate (Lexicon 1):

$$[t]_1 \log\frac{[t]_1}{Z_1} + [h]_1 \log\frac{[h]_1}{Z_1} + \sum_{m \neq t,h} [m]\,\log\frac{[m]}{Z_1}$$

th is treated as a separate chunk (Lexicon 2):

$$[t]_2 \log\frac{[t]_2}{Z_2} + [h]_2 \log\frac{[h]_2}{Z_2} + [th]\,\log\frac{[th]}{Z_2} + \sum_{m \neq t,h} [m]\,\log\frac{[m]}{Z_2}$$

where

$$[t]_2 = [t]_1 - [th], \qquad [h]_2 = [h]_1 - [th], \qquad Z_2 = Z_1 - [th]$$

SLIDE 31

Define Δf as log(f₂/f₁); then

$$\Delta \log pr(C) = -Z\,\Delta Z + [t]\,\Delta pr(t) + [h]\,\Delta pr(h) + [th]\,\log pr_2(th)$$

This is positive if Lexicon 2 is better.
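The comparison can also be checked numerically, without the decomposition: compute log pr(C) under each lexicon directly from slide 29's formula and subtract. A toy example in Python (the corpus is invented; 'th' is chunked greedily):

```python
import math
from collections import Counter

def corpus_logprob(counts):
    """log pr(C) = sum over words m of [m] * log([m]/Z)  (slide 29)."""
    z = sum(counts.values())
    return sum(c * math.log2(c / z) for c in counts.values())

text = "thecatsawtheothermoth"                 # toy corpus

# Lexicon 1: every letter is its own word
lex1 = Counter(text)

# Lexicon 2: additionally treat 'th' as a single chunk
n_th = text.count("th")
lex2 = Counter(text)
lex2["t"] -= n_th
lex2["h"] -= n_th
lex2["th"] = n_th
lex2 = +lex2                                   # drop zero counts

delta = corpus_logprob(lex2) - corpus_logprob(lex1)
print(f"log pr2(C) - log pr1(C) = {delta:.3f}; positive => Lexicon 2 wins")
```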

SLIDE 32

Effect of having fewer "words" altogether: the term −Z ΔZ in

$$\Delta \log pr(C) = -Z\,\Delta Z + [t]\,\Delta pr(t) + [h]\,\Delta pr(h) + [th]\,\log pr_2(th)$$

This is positive if Lexicon 2 is better.

SLIDE 33

Effect of the frequency of /t/ and /h/ decreasing: the terms [t] Δpr(t) + [h] Δpr(h).

This is positive if Lexicon 2 is better.

SLIDE 34

Effect of /th/ being treated as a unit rather than as separate pieces: the term [th] log pr₂(th).

This is positive if Lexicon 2 is better.

SLIDE 35

2.3 Results

• The Fulton County Grand Ju ry s aid Friday an investi gation of At l anta 's recent prim ary e lection produc ed no e videnc e that any ir regul ar it i e s took place .
• Thejury further s aid in term - end present ment s thatthe City Ex ecutive Commit t e e , which had over - all charg e ofthe e lection , d e serv e s the pra is e and than k softhe City of At l anta forthe man ner in whichthe e lection was conduc ted.

Some chunks are too big (Thejury, thatthe, ofthe, forthe); some are too small (e lection, At l anta).

SLIDE 36

Summary

• 1. Word segmentation is possible, using (1) variable-length strings (multigrams), (2) a probabilistic model of a corpus, and (3) a search for maximum likelihood, if (4) we use MDL to tell us when to stop adding to the lexicon.
• 2. The results are interesting, but they suffer from being incapable of modeling real linguistic structure beyond simple chunks.

SLIDE 38

Question:

Will we find that types of linguistic structure correspond naturally to ways of improving our MDL model, either to increase the probability of the data or to decrease the size of the grammar?
SLIDE 39
  • 3. Morphology (primo)

Problem: Given a set of words, find the best morphological structure for the words – where "best" means it maximally agrees with linguists (where they agree with each other!). Because we are going from larger units to smaller units (words to morphemes), the probability of the data is certain to decrease. The improvement will come from drastically shortening the grammar, i.e. from discovering regularities.

SLIDE 40

Naïve MDL

Corpus: jump, jumps, jumping; laugh, laughed, laughing; sing, sang, singing; the, dog, dogs (total: 61 letters).

Analysis:
Stems: jump, laugh, sing, sang, dog (20 letters)
Suffixes: s, ing, ed (6 letters)
Unanalyzed: the (3 letters)
Total: 29 letters
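The letter counts are easy to verify (Python):

```python
corpus = ["jump", "jumps", "jumping", "laugh", "laughed", "laughing",
          "sing", "sang", "singing", "the", "dog", "dogs"]
stems = ["jump", "laugh", "sing", "sang", "dog"]
suffixes = ["s", "ing", "ed"]
unanalyzed = ["the"]

print(sum(map(len, corpus)))                          # 61 letters, unanalyzed
print(sum(map(len, stems + suffixes + unanalyzed)))   # 20 + 6 + 3 = 29 letters
```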

SLIDE 41

Model/heuristic

1st approximation: a morphology is:

• 1. a list of stems,
• 2. a list of affixes (prefixes, suffixes), and
• 3. a list of pointers indicating which combinations are permissible.

Unlike the word segmentation problem, we now have no obvious search heuristics. These are very important (for that reason), and I will not talk about them.
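To fix ideas, here is the pointer idea (a signature, in the terms introduced below) as a minimal Python data structure; the class and the example are mine, for illustration, not Linguistica's own code:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Signature:
    """A set of stems sharing exactly the same set of suffixes."""
    stems: frozenset
    suffixes: frozenset          # 'NULL' stands for the empty suffix

    def words(self):
        """Generate every word the signature licenses."""
        for t in self.stems:
            for f in self.suffixes:
                yield t if f == "NULL" else t + f

sig = Signature(frozenset({"walk", "jump"}), frozenset({"NULL", "ed", "ing"}))
print(sorted(sig.words()))
# ['jump', 'jumped', 'jumping', 'walk', 'walked', 'walking']
```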

SLIDE 42

Size of model

M[orphology] = { Stems T, Affixes F, Signatures Σ }

$$|M| = |T| + |F| + |\Sigma|$$

(the lengths of the stem list, the affix list, and the signature list: the extensivity of the model). For a string s, |s| = length(s) · log 26; or, using letter frequencies, $|s| = \sum_{i=1}^{|s|} -\log freq(s[i])$. Then

$$|T| = \sum_{t \in T} |t|, \qquad |F| = \sum_{f \in F} |f|, \qquad |\Sigma| = \sum_{\sigma \in \Sigma} |\sigma|$$

What is a signature, and what is its length?

SLIDE 43

What is a signature?

A signature pairs a set of stems with the set of suffixes they all take. Examples:

• Suffixes NULL.ed.ing, with stems account, appeal, attack, … (40 more)
• Suffixes NULL.e.es.s, with stems élevé, équipé, étonnant, … (78 more)

SLIDE 44

What is the length (=information content) of a signature?

A signature is an ordered pair of two sets of pointers: (i) a set of pointers to stems, and (ii) a set of pointers to affixes. The length of a pointer p is −log freq(p). So the total length of the signatures is:

$$\sum_{\sigma \in Sigs} \Bigg( \sum_{t \in Stems(\sigma)} \log\frac{[W]}{[t]} + \sum_{f \in Suffixes(\sigma)} \log\frac{[\sigma]}{[f\ \mathrm{in}\ \sigma]} \Bigg)$$

(the outer sum runs over signatures; the inner sums over stem pointers and suffix pointers).
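A small numeric sketch of that computation (Python). The toy counts are invented, and the two frequency baselines, the whole corpus for stem pointers versus the signature's own tokens for suffix pointers, reflect my reading of the formula above:

```python
import math

def pointer_len(count, total):
    """A pointer to x costs -log2 freq(x) bits."""
    return -math.log2(count / total)

def signatures_len(sigs, stem_counts, W):
    """Stem pointers are priced against the whole corpus (W word tokens);
    suffix pointers against the signature's own token count."""
    total = 0.0
    for stems, suffix_counts, sig_tokens in sigs:
        total += sum(pointer_len(stem_counts[t], W) for t in stems)
        total += sum(pointer_len(c, sig_tokens) for c in suffix_counts.values())
    return total

# toy signature NULL.ed.ing over stems walk/jump in a 100-token corpus
sigs = [({"walk", "jump"}, {"NULL": 10, "ed": 6, "ing": 4}, 20)]
print(round(signatures_len(sigs, {"walk": 12, "jump": 8}, 100), 2), "bits")
```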

SLIDE 45

Generation 1 Linguistica

http://linguistica.uchicago.edu

Initial pass: assumes that words are composed of 1 or 2 morphemes; finds all cases where signatures exist with at least 2 stems and 2 affixes, e.g. stems {walk, jump} with suffixes {NULL, ed, ing}.

SLIDE 46

Generation 1

Then it refines this initial approximation in a large number of ways, always trying to decrease the description length of the initial corpus.

SLIDE 47

SLIDE 48

Refinements

• 1. Correct errors in segmentation.
• 2. Create signatures with only one observed stem: we have NULL, ed, ion, s as suffixes, but only one stem (act) with exactly those suffixes.

Re-cutting example: suffixes {ion, ive} with stems aggress, attent, affirm, … (20 stems) ⇒ suffixes {on, ve} with stems aggressi, attenti, affirmati, … (20 stems).
SLIDE 49
• 3. Find recursive structure: allow stems to be analyzed.

Minilexicon 1: Words₁ → Stems₁, Affixes₁, Signatures₁
Minilexicon 2: Words₂ = Stems₁ → Stems₂, Affixes₂, Signatures₂

SLIDE 50

French roots

SLIDE 51

SLIDE 52
• 4. Detect allomorphy

Signature: <e>ion.NULL, with stems composite, concentrate, corporate, détente, discriminate, evacuate, inflate, opposite, participate, probate, prosecute, tense.

What is this? composite and composition: composite → composit → composit + ion. The program infers that ion deletes a stem-final 'e' before attaching.

SLIDE 53
• 3. Summary

Works very well on European languages. Challenges:

• 1. Works very poorly on languages with richer morphologies (average number of morphemes per word >> 2). (Most languages have rich morphologies.)
• 2. Various other deficiencies.
SLIDE 54
  • 4. Morphology (secundo)

The initial bootstrap in the previous version does not even work on most languages, where the expected morphology contains sequences of 5 or more morphemes.

SLIDE 55

Swahili verb (morpheme position classes):

Subject marker: ni, u, a, tu, wa
Tense marker: li, ka, ta, taka, na
Object marker: ni, ku, m, tu, wa
Root: pend, fik, sem, som, on, limb, chaku
Voice (active/passive): w, Ø
Final vowel: a
SLIDE 62

Finite state automaton (FSA)

[Diagram: an FSA with a prefix state (PF1) and suffix states (SF1, SF2, SF3).]
SLIDE 63

Signature: as an FSA, it reduces false positives, e.g. stems {walk, jump} with suffixes {NULL, ed, ing}.

[Diagram: the corresponding FSA, with prefix states (PF1, PF3) and suffix states (SF1, SF2, SF3).]
SLIDE 64

Generalize the signature…

[Diagram: morphemes M1–M9 arranged in a sequential FSA.] A sequential FSA: each state has a unique successor.
SLIDE 65

Alignments

[Diagram: an alignment of morphemes m1–m4 over states 1–4.]
SLIDE 66

Alignments: String edit distance algorithm

n i l i - - m u p e n d a
n i t a k a m u p e n d a

(nilimupenda and nitakamupenda, aligned letter by letter)

SLIDE 67

Alignments: make cuts

n i | l i - - | m u p e n d a
n i | t a k a | m u p e n d a

(the cuts isolate ni, li/taka, and mupenda)
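The alignment above is the classic edit-distance dynamic program with a backtrace; a compact Python sketch ('-' marks an insertion or deletion):

```python
def align(a, b):
    """String edit distance plus one optimal alignment of a and b."""
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i-1][j] + 1, d[i][j-1] + 1,
                          d[i-1][j-1] + (a[i-1] != b[j-1]))
    pairs, i, j = [], n, m                    # backtrace
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i-1][j-1] + (a[i-1] != b[j-1]):
            pairs.append((a[i-1], b[j-1])); i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i-1][j] + 1:
            pairs.append((a[i-1], "-")); i -= 1
        else:
            pairs.append(("-", b[j-1])); j -= 1
    return d[n][m], pairs[::-1]

dist, pairs = align("nilimupenda", "nitakamupenda")
print(dist)
print(" ".join(x for x, _ in pairs))
print(" ".join(y for _, y in pairs))
```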

SLIDE 68

Elementary alignment

[Diagram: an elementary alignment: two sequential FSAs over states 1–4 sharing morphemes m1 and m4, differing in m2/m3.]
SLIDE 69

Collapsing elementary alignments

[Diagram: two elementary alignments with identical contexts (m1 … m4); the differing morphemes m2/m3 and m7/m8 line up in the same slot.]

SLIDE 70

Two or more sequential FSAs with identical contexts are collapsed:

[Diagram: the collapsed FSA, with m2/m3 and m7/m8 as alternatives on the shared path.]

SLIDE 71
• 3. Further collapsing FSAs

a-{na, li}-yesema and a-{na, li}-mfuata collapse into a-{na, li}-{yesema, mfuata}.
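The collapsing step itself is simple to state: two sequential FSAs merge iff they differ in at most one position, taking the union there. A sketch in Python (each FSA is a tuple of morpheme sets, one per position; the representation is mine):

```python
def collapse(t1, t2):
    """Return the collapsed FSA if t1 and t2 have identical contexts
    (differ in at most one position), else None."""
    assert len(t1) == len(t2)
    diff = [k for k in range(len(t1)) if t1[k] != t2[k]]
    if len(diff) > 1:
        return None                       # contexts differ: no collapse
    if not diff:
        return t1                         # identical FSAs
    k = diff[0]
    return t1[:k] + (t1[k] | t2[k],) + t1[k+1:]

fsa1 = ({"a"}, {"na", "li"}, {"yesema"})
fsa2 = ({"a"}, {"na", "li"}, {"mfuata"})
print(collapse(fsa1, fsa2))
# ({'a'}, {'na', 'li'}, {'yesema', 'mfuata'})  (set order may vary)
```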

SLIDE 72

4.3 Top templates: 8,200 Swahili words

State 1 / State 2 / State 3:
• {a, wa} (sg., pl. human subject markers) + 246 stems
• {ku, hu} (infinitive, habitual markers) + 51 stems
• {wa} (pl. subject marker) + {ka, li} (tense markers) + 25 stems
• {a} (sg. subject marker) + {ka, li} (tense markers) + 29 stems
• {a} (sg. subject marker) + {ka, na} (tense markers) + 28 stems
• 37 strings + {w (passive marker), Ø} + {a}

SLIDE 73

Precision and recall

Method                 Precision  Recall  F-score
String edit distance   0.77       0.57    0.65
Stem-affix             0.54       0.14    0.22
Affix-stem             0.68       0.20    0.31

SLIDE 74

Collapsed templates (one template + the other template → collapsed template; % found on Yahoo search):

1. {a}-{ka,na}-{stems} + {a}-{ka,ki}-{stems} → {a}-{ka,ki,na}-{stems}: 86% (37/43)
2. {wa}-{ka,na}-{stems} + {wa}-{ka,ki}-{stems} → {wa}-{ka,ki,na}-{stems}: 95% (21/22)
3. {a}-{ka,ki,na}-{stems} + {wa}-{ka,ki,na}-{stems} → {a,wa}-{ka,ki,na}-{stems}: 84% (154/183)
4. {a}-{liye,me}-{stems} + {a}-{liye,li}-{stems} → {a}-{liye,li,me}-{stems}: 100% (21/21)
5. {a}-{ki,li}-{stems} + {wa}-{ki,li}-{stems} → {a,wa}-{ki,li}-{stems}: 90% (36/40)
6. {a}-{lipo,li}-{stems} + {wa}-{lipo,li}-{stems} → {a,wa}-{lipo,li}-{stems}: 90% (27/30)
7. {a,wa}-{ki,li}-{stems} + {a,wa}-{lipo,li}-{stems} → {a,wa}-{ki,lipo,li}-{stems}: 74% (52/70)
8. {a}-{na,naye}-{stems} + {a}-{na,ta}-{stems} → {a}-{na,ta,naye}-{stems}: 80% (12/15)

SLIDE 75
• 4.1 Evaluating the robustness of these templates (sequential FSAs)

• Measure: How many letters do we save by expressing words in a template rather than by writing each one out individually?

For the template a-{na, li}-{yesema, mfuata}: its four words, written out individually, take 36 letters; the template lists each morpheme once, for 17 letters. Answer: 36 − 17 = 19.
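That count can be reproduced mechanically (Python):

```python
from itertools import product

template = [{"a"}, {"na", "li"}, {"yesema", "mfuata"}]

words = ["".join(p) for p in product(*template)]              # the 4 words
spelled_out = sum(len(w) for w in words)                      # 4 x 9 = 36
in_template = sum(len(m) for slot in template for m in slot)  # 1 + 4 + 12 = 17

print(spelled_out, in_template, spelled_out - in_template)    # 36 17 19
```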

SLIDE 76

Most edges are convergent…

verbs: {com-, cre-} + {-emos, -es, -e}
adjectives: {car-, pequeñ-, rubi-, negr-} + {-a-, -o-} + {-s, -Ø}

SLIDE 77

But some diverge (Spanish):

Participle-forming suffix

SLIDE 78

English has much the same:

SLIDE 79
  • 4. Summary

We need to enrich the heuristics and consider a broader set of possible grammars. With that, the possible improvements seem unlimited at this point. Focus: decrease the length of the analysis, especially the length of the substance (the morphemes) described.

SLIDE 80
  • 5. Phonology

So far we have said little about phonology. We have assumed no interesting probabilistic model of segment (= phoneme) placement: a 0th- or 1st-order Markov model. But we can shorten the length of the grammar by taking this into consideration.

SLIDE 81

These slides present material done jointly with Aris Xanthos and with Jason Riggle.

SLIDE 82

A much more interesting model:

[Diagram: a two-state model with states C and V; the transition probabilities are x and 1−x out of C, and y and 1−y out of V.]

That is for the state transitions; the same model applies to emissions: both states emit all of the symbols, but with different probabilities….
SLIDE 83

[Diagram: the same two-state model; state C emits symbols with probabilities c1 … c8, state V with v1 … v8, where Σᵢ cᵢ = 1 and Σᵢ vᵢ = 1.]
SLIDE 84

The question is…

• How could we obtain the best probabilities for x and y (the transition probabilities), and all of the emission probabilities for the two states?
• Bear in mind: each state generates all of the symbols. The only way to ensure that a state does not generate a symbol s is to assign a zero probability to the emission of s in that state.
SLIDE 85

Hidden Markov model

With a well-understood training algorithm (Baum-Welch, a form of EM), an HMM will find optimal parameters for generating the data, so as to assign it the highest probability. How does it organize the phonological data?
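A minimal sketch of such a model in Python: the forward algorithm computes the probability the two-state HMM assigns to a word, which is exactly the quantity Baum-Welch re-estimation raises. All the numbers here are illustrative, not trained values:

```python
states = ("C", "V")
start = {"C": 0.5, "V": 0.5}
T = {"C": {"C": 0.3, "V": 0.7},            # transition probabilities
     "V": {"C": 0.8, "V": 0.2}}
E = {"C": {"t": 0.3, "k": 0.3, "s": 0.3},  # emission probabilities
     "V": {"a": 0.4, "i": 0.3, "o": 0.2}}

def emit(s, ch):
    return E[s].get(ch, 0.01)              # every state emits every symbol

def forward(word):
    """Total probability of the word, summed over all state paths."""
    alpha = {s: start[s] * emit(s, word[0]) for s in states}
    for ch in word[1:]:
        alpha = {s2: sum(alpha[s1] * T[s1][s2] for s1 in states) * emit(s2, ch)
                 for s2 in states}
    return sum(alpha.values())

print(forward("takosi"))
```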

SLIDE 86

English FSA

SLIDE 87

Pr(State 1 → State 1), Pr(State 2 → State 2)

Rhythm, syllabification

SLIDE 88


SLIDE 89

English: Log ratios of the emission probabilities of the 2 states:

$$\log \frac{p_1(\phi)}{p_2(\phi)}$$

[Chart: symbols sorted by this log ratio, from negative to positive.]
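Once trained, the split can be read off by sorting symbols on this log ratio; a sketch with made-up emission probabilities for the two states:

```python
import math

# emission probabilities of the two trained states (illustrative numbers)
p1 = {"t": 0.30, "k": 0.30, "s": 0.30, "a": 0.03, "i": 0.04, "o": 0.03}
p2 = {"t": 0.01, "k": 0.01, "s": 0.01, "a": 0.40, "i": 0.30, "o": 0.27}

for x in sorted(p1, key=lambda x: math.log(p1[x] / p2[x])):
    r = math.log(p1[x] / p2[x])
    label = "(state 1: consonant-like)" if r > 0 else "(state 2: vowel-like)"
    print(f"{x}: {r:+.2f}", label)
```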

SLIDE 90

SLIDE 91

French: Log ratios of the emission probabilities of the 2 states:

$$\log \frac{p_1(\phi)}{p_2(\phi)}$$

[Chart: symbols sorted by this log ratio, from positive to negative.]

SLIDE 92

SLIDE 93


SLIDE 94

Finnish: Log ratios of the emission probabilities of the 2 states:

$$\log \frac{p_1(\phi)}{p_2(\phi)}$$

[Chart: symbols sorted by this log ratio, from positive to negative.]

SLIDE 95

Finnish vowels and their harmony

SLIDE 96

SLIDE 97

SLIDE 98

SLIDE 99

SLIDE 100

SLIDE 101
• 6. What kind of linguistics is this?

It is an approach to linguistic analysis which is non-cognitivist: it makes no claims about hidden or occult properties of the human system (for which linguistic tools are not designed to provide answers). It welcomes psychologists, without claiming to replace them or to do their job.

SLIDE 102

It asks linguists to study language as a natural phenomenon, and to evaluate their success like any other natural science.

I have not addressed two important areas of phonology: automatic morphophonology, and the geometry of phonological representations. Those will have to wait until next time.

SLIDE 103
• 6. What kind of linguistics is this?

Facts about a language L may be divided into (type 1) those facts that are particular to L, and (type 2) those that are shared by all languages. In all likelihood, type 1 information is vastly larger than type 2 information.

SLIDE 104

SLIDE 105

Type 2 information is: universal; in all likelihood, not learned, and not even learnable in a short time period; innate; not influenced by historical or cultural concerns.

SLIDE 106

It seems clear to me that linguistics is the study of both Type 1 and Type 2 information. Much of the focus in linguistic theory has been on Type 2 information (what is common to all acquisition paths). This work focuses on Type 1.

SLIDE 107

Linguistics seeks the essence common to all languages. This essence can exist nowhere other than in the biological nature of the human being. This essence does not need to be learned. This essence can probably not be learned (in a reasonable time). This essence is UG.

SLIDE 108
• Linguistics seeks to analyze each human language. Languages vary, due to their history, to their speakers' history, and to the ends to which they are put. Finding ways to characterize each language adequately is the primary goal of linguistics; it is best accomplished by analyzing linguistic data in the same way that other sciences proceed, ceteris paribus.

SLIDE 109