
In: Eisner, J., L. Karttunen and A. Thériault (eds.), Finite-State Phonology: Proc. of the 5th Workshop of the ACL Special Interest Group in Computational Phonology (SIGPHON), pp. 22-33, Luxembourg, Aug. 2000. [Online proceedings version: small corrections and clarifications to printed version.]

Easy and Hard Constraint Ranking in Optimality Theory:* Algorithms and Complexity

Jason Eisner
Dept. of Computer Science / University of Rochester
Rochester, NY 14607-0226 USA / jason@cs.rochester.edu

Abstract

We consider the problem of ranking a set of OT constraints in a manner consistent with data. (1) We speed up Tesar and Smolensky's RCD algorithm to be linear on the number of constraints. This finds a ranking so each attested form $x_i$ beats or ties a particular competitor $y_i$. (2) We also generalize RCD so each $x_i$ beats or ties all possible competitors.

Alas, neither ranking as in (2) nor even generation has any polynomial algorithm unless P = NP; i.e., one cannot improve qualitatively upon brute force: (3) Merely checking that a single (given) ranking is consistent with given forms is coNP-complete if the surface forms are fully observed and $\Delta^p_2$-complete if not. Indeed, OT generation is OptP-complete. (4) As for ranking, determining whether any consistent ranking exists is coNP-hard (but in $\Delta^p_2$) if the forms are fully observed, and $\Sigma^p_2$-complete if not.

Finally, we show (5) generation and ranking are easier in derivational theories: P, and NP-complete.

1 Introduction

Optimality Theory (OT) is a grammatical paradigm that was introduced by Prince and Smolensky (1993) and suggests various computational questions, including learnability.

Following Gold (1967) we might ask: Is the language class $\{L(G) : G \text{ is an OT grammar}\}$ learnable in the limit? That is, is there a learning algorithm that will converge on any OT-describable language $L(G)$ if presented with an enumeration of its grammatical forms?

In this paper we consider an orthogonal question that has been extensively investigated by Tesar and Smolensky (1996), henceforth T&S. Rather than asking whether a learner can eventually find an OT grammar compatible with an unbounded set of positive data, we ask: How efficiently can it find a grammar (if one exists) compatible with a finite set of positive data?

Sections 3–5 present successively more realistic versions of the problem (sketched in the abstract). The easiest version turns out to be easier than previously known. The harder versions turn out to be harder than previously known.

2 Formalism

An OT grammar $G$ consists of three elements, any or all of which may need to be learned:

• a set $L$ of underlying forms produced by a lexicon or morphology,
• a function Gen that maps any underlying form to a set of candidates, and
• a vector $\vec{C} = \langle C_1, C_2, \ldots C_n \rangle$ of constraints, each of which is a function from candidates to the natural numbers $\mathbb{N}$.

$C_i$ is said to rank higher than (or outrank) $C_j$ in $\vec{C}$ iff $i < j$. We say $x$ satisfies $C_i$ if $C_i(x) = 0$, else $x$ violates $C_i$.

The grammar $G$ defines a relation that maps each $u \in L$ to the candidate(s) $x \in \mathrm{Gen}(u)$ for which the vector $\vec{C}(x) \stackrel{\mathrm{def}}{=} \langle C_1(x), C_2(x), \ldots C_n(x) \rangle$ is lexicographically minimal. Such candidates are called optimal.

One might then say that the grammatical forms are the pairs $(u, x)$ of this relation. But for simplicity of notation and without loss of generality, we will suppose that the candidates $x$ are rich enough that $u$ can always be recovered from $x$.^1 Then $u$ is redundant and we may simply take the candidate $x$ to be the grammatical form. Now the language $L(G)$ is simply the image of $L$ under $G$. We will write $u_x$ for the underlying form, if any, such that $x \in \mathrm{Gen}(u_x)$.

An attested form of the language is a candidate $x$ that the learner knows to be grammatical (i.e., $x \in L(G)$). $y$ is a competitor of $x$ if they are both in the same candidate set: $u_x = u_y$. If $x, y$ are competitors with $\vec{C}(y) < \vec{C}(x)$, we say that $y$ beats $x$ (and then $x$ is not optimal).

* Many thanks go to Lane and Edith Hemaspaandra for references to the complexity literature, and to Bruce Tesar for comments on an earlier draft.

^1 This is necessary in any case if $C_j(x)$ is to depend on (all of) the underlying form $u$. In general, we expect that each candidate $x \in \mathrm{Gen}(u)$ encodes an alignment of the underlying form $u$ with some possible surface form $s$, and $C_j(x)$ evaluates this pair on some criterion.
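The definitions of Section 2 (violation vectors, "beats", and optimal candidates) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the paper: candidates are modeled as (underlying, surface) pairs so that the underlying form is recoverable from the candidate, and the three toy constraints `no_coda`, `max_io`, and `dep_io` are hypothetical stand-ins for real OT constraints. The sketch relies on the fact that Python compares tuples lexicographically.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: a candidate is an (underlying, surface) pair, so the
# underlying form u_x can always be recovered from the candidate x.
Candidate = Tuple[str, str]
Constraint = Callable[[Candidate], int]  # candidate -> violation count

def violation_vector(ranking: Sequence[Constraint], x: Candidate) -> Tuple[int, ...]:
    """Return <C_1(x), ..., C_n(x)> for the given ranking."""
    return tuple(c(x) for c in ranking)

def beats(ranking: Sequence[Constraint], y: Candidate, x: Candidate) -> bool:
    """y beats x if they are competitors and C(y) < C(x) lexicographically."""
    same_candidate_set = y[0] == x[0]
    return same_candidate_set and violation_vector(ranking, y) < violation_vector(ranking, x)

def optimal(ranking: Sequence[Constraint], candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates whose violation vector is lexicographically minimal."""
    best = min(violation_vector(ranking, x) for x in candidates)
    return [x for x in candidates if violation_vector(ranking, x) == best]

# Hypothetical toy constraints ("." separates syllables in the surface form).
def segments(surface: str) -> int:
    return len(surface.replace(".", ""))

def no_coda(x: Candidate) -> int:
    """One violation per syllable that ends in a consonant."""
    return sum(1 for syl in x[1].split(".") if syl and syl[-1] not in "aeiou")

def max_io(x: Candidate) -> int:
    """One violation per deleted underlying segment (crude length-based count)."""
    return max(0, len(x[0]) - segments(x[1]))

def dep_io(x: Candidate) -> int:
    """One violation per inserted surface segment (crude length-based count)."""
    return max(0, segments(x[1]) - len(x[0]))

cands = [("pat", "pat"), ("pat", "pa"), ("pat", "pa.ta")]
print(optimal([no_coda, max_io, dep_io], cands))  # [('pat', 'pa.ta')]
print(optimal([no_coda, dep_io, max_io], cands))  # [('pat', 'pa')]
print(optimal([max_io, dep_io, no_coda], cands))  # [('pat', 'pat')]
```

Reordering the same three constraints changes which candidate is optimal, which is the sense in which only the ranking needs to be learned once the constraint set is fixed.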

An ordinary learner does not have access to attested forms, since observing that $x \in L(G)$ would mean observing an utterance's entire prosodic structure and underlying form, which ordinarily are not vocalized. An attested set of the language is a set $X$ such that the learner knows that some $x \in X$ is grammatical (but not necessarily which $x$). The idea is that a set is attested if it contains all possible candidates that are consistent with something a learner heard.^2 An attested surface set (the case considered in this paper) is an attested set all of whose elements are competitors; i.e., the learner is sure of the underlying form but not the surface form.

Some computational treatments of OT place restrictions on the grammars that will be considered. The finite-state assumptions (Ellison, 1994; Eisner, 1997a; Frank and Satta, 1998; Karttunen, 1998; Wareham, 1998) are that

• candidates and underlying forms are represented as strings over some alphabet;
• Gen is a regular relation;^3
• each $C_j$ can be implemented as a weighted deterministic finite-state automaton (WDFA) (i.e., $C_j(x)$ is the total weight of the path accepting $x$ in the WDFA);
• $L$ and any attested sets are regular.

The bounded-violations assumption (Frank and Satta, 1998; Karttunen, 1998) is that the value of $C_j(x)$ cannot increase with $|x|$, but is bounded above by some $k$.

In this paper, we do not always impose these additional restrictions. However, when demonstrating that problems are hard, we usually adopt both restrictions to show that the problems are hard even for the restricted case.

Throughout this paper, we follow T&S in supposing that the learner already knows the correct set of constraints $\mathcal{C} = \{C_1, C_2, \ldots C_n\}$, but must learn their order $\vec{C} = \langle C_1, C_2, \ldots C_n \rangle$, known as a ranking of $\mathcal{C}$. The assumption follows from the OT philosophy that $\mathcal{C}$ is universal across languages, and only the order of constraints differs. The algorithms for learning a ranking, however, are designed to be general for any $\mathcal{C}$, so they take $\mathcal{C}$ as an input.^4

3 RCD as Topological Sort

T&S investigate the problem of ranking a constraint set $\mathcal{C}$ given a set of attested forms $x_1, \ldots x_m$ and corresponding competitors $y_1, \ldots y_m$. The problem is to determine a ranking $\vec{C}$ such that for each $i$, $\vec{C}(x_i) \le \vec{C}(y_i)$ lexicographically. Otherwise $x_i$ would be ungrammatical, as witnessed by $y_i$.

In this section we give a concise presentation and analysis of T&S's Recursive Constraint Demotion (RCD) algorithm for this problem. Our presentation exposes RCD's connection to topological sort, from which we borrow a simple bookkeeping trick that speeds it up.

3.1 Compiling into Boolean Formulas

The first half of the RCD algorithm extracts the relevant information from the $\{x_i\}$ and $\{y_i\}$, producing what T&S call mark-data pairs. We use a variant notation. For each constraint $C \in \mathcal{C}$, we construct a negation-free, conjunctive-normal form (CNF) Boolean formula $\phi(C)$ whose literals are other constraints:

$$\phi(C) = \bigwedge_{i \,:\, C(x_i) > C(y_i)} \;\; \bigvee_{C' \,:\, C'(x_i) < C'(y_i)} C'$$

^2 This is of course a simplification. Attested sets corresponding to laugh and laughed can represent the learner's uncertainty about the respective underlying forms, but not the knowledge that the underlying forms are related. In this case, we can solve the problem by packaging the entire morphological paradigm of laugh as a single candidate, whose attested set is constrained by the two surface observations and by the requirement of a shared underlying stem. (A $k$-member paradigm may be encoded in a form suitable to a finite-state system by interleaving symbols from $2k$ aligned tapes that describe the $k$ underlying and $k$ surface forms.) Alas, this scheme only works within disjoint finite paradigms: while it captures the shared underlying stem of laugh and laughed, it ignores the shared underlying suffix of laughed and frowned.

^3 Ellison (1994) makes only the weaker assumption that $\mathrm{Gen}(u)$ is a regular set for each $u$.

^4 That is, these methods are not tailored (as others might be) to exploit the structure of some specific, putatively universal $\mathcal{C}$. Hence they require time at least linear on $n = |\mathcal{C}|$, if only to read all the constraints. Given the variety of cross-linguistic constraints in the literature, one must worry: is $n$ huge? Most authors following Ellison (1994) allow as constraints all the regular languages over some alphabet $\Sigma$; then $n > s^{s(|\Sigma| - 1)}$ distinct constraints can be described by DFAs of size $s$, where $\Sigma$ (or $s$) must be large to accommodate all features and prosodic constituents. One solution: let each constraint constrain only a few symbols in $\Sigma$ (e.g., bound the number of non-default transitions per DFA). Indeed, Eisner (1997a; 1997b) proposes that $\mathcal{C}$ is the union of the two "primitive" constraint families. If each primitive constraint may mention at most $t$ of $T$ autosegmental tiers, then $n = O(T^t)$, which is manageable for small $t$.
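The formula $\phi(C)$ of Section 3.1 can be computed directly from violation counts. The following Python sketch is one possible encoding, not the paper's own implementation. It makes several assumptions for illustration: constraints are identified by name, each form is given as a dictionary from constraint name to violation count, and a CNF formula is stored as a list of clauses, each clause being a frozenset of constraint names. The function name `compile_formulas` and the sample data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

Violations = Dict[str, int]        # constraint name -> violation count
Clause = FrozenSet[str]            # disjunction of constraint names
CNF = List[Clause]                 # conjunction of clauses

def compile_formulas(
    constraints: Sequence[str],
    pairs: Sequence[Tuple[Violations, Violations]],
) -> Dict[str, CNF]:
    """Build phi(C) for every constraint C from the (x_i, y_i) pairs.

    For each pair i with C(x_i) > C(y_i), phi(C) receives one clause:
    the set of constraints C' with C'(x_i) < C'(y_i).
    """
    phi: Dict[str, CNF] = {c: [] for c in constraints}
    for x, y in pairs:
        # Constraints that favor the attested form x_i over its competitor y_i.
        favoring_x = frozenset(c for c in constraints if x[c] < y[c])
        for c in constraints:
            if x[c] > y[c]:        # C favors the competitor on this pair
                phi[c].append(favoring_x)
    return phi

# Hypothetical data: three constraints and two (attested, competitor) pairs.
constraints = ["A", "B", "C"]
pairs = [
    ({"A": 0, "B": 1, "C": 0}, {"A": 1, "B": 0, "C": 0}),
    ({"A": 0, "B": 0, "C": 2}, {"A": 0, "B": 1, "C": 1}),
]
phi = compile_formulas(constraints, pairs)
print(phi["A"])  # []
print(phi["B"])  # [frozenset({'A'})]
print(phi["C"])  # [frozenset({'B'})]
```

In this encoding an empty list is the empty conjunction (trivially true), while an empty clause inside a list is an empty disjunction (unsatisfiable). Building all formulas this way takes one pass over the pairs with work proportional to the number of constraints per pair.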

