Pattern Languages: Seminar Algorithmic Learning Theory, SS 2015

SLIDE 1

Pattern Languages

Seminar Algorithmic Learning Theory, SS 2015 Michael Krause

RWTH Aachen

07.05.2015

Michael Krause (RWTH Aachen) Pattern Languages 07.05.2015 1 / 44

SLIDE 2

1. Basic ideas
2. Finding Patterns Common to a Set of Strings
3. Other results
4. Conclusion

SLIDE 4

Basic ideas

What are pattern languages?

A type of formal language
Introduced by Dana Angluin in 1980

SLIDE 9

Basic ideas

Why do they interest us?

Gold ’67: Language Identification in the Limit
  • Learning from positive and negative data is more powerful than learning from positive data only
Angluin ’80: Inductive Inference of Formal Languages from Positive Data
  • Inductive inference: generalizing rules from examples
Finding Patterns Common to a Set of Strings
  • Pattern languages can be learned from positive data
  • Pattern languages are a natural model for inductive inference

SLIDE 14

Basic ideas

Example

Let p = x1y2x be a pattern
By substituting x := 10, y := 3 we get: 1013210
By substituting x := 0x, y := z3 we get: 0x1z320x
By substituting x := y, y := x we get: y1x2y
Many more substitutions are possible!
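The three substitutions above can be replayed mechanically. The following is a minimal sketch, assuming (for illustration only) that a pattern is written as a Python string in which letters are variables and digits are constants. The function name `substitute` is hypothetical and does not come from the slides.

```python
def substitute(pattern: str, mapping: dict[str, str]) -> str:
    """Apply a substitution to a pattern.

    Assumption of this sketch: letters are variables, digits are constants.
    All variables are replaced simultaneously, so x := y, y := x swaps them.
    """
    out = []
    for symbol in pattern:
        if symbol.isalpha():
            # A variable without an entry in the mapping is left unchanged.
            out.append(mapping.get(symbol, symbol))
        else:
            # Constants remain the same.
            out.append(symbol)
    return "".join(out)


p = "x1y2x"
print(substitute(p, {"x": "10", "y": "3"}))   # 1013210
print(substitute(p, {"x": "0x", "y": "z3"}))  # 0x1z320x
print(substitute(p, {"x": "y", "y": "x"}))    # y1x2y
```

Replacing symbol by symbol in a single pass is what makes the substitution simultaneous. Chained calls to `str.replace` would get the swap x := y, y := x wrong.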

SLIDE 21

Basic ideas

Formal definition

A pattern is any finite string of constants and variables
  • Σ: finite alphabet of constants, for us: Σ = {0, 1}
  • X: set of variables disjoint from Σ, for us: X = {x1, x2, ...}
A substitution replaces symbols in a pattern so that
  • constants remain the same
  • each variable is mapped to a non-empty string (the same string for all of its occurrences)
The language of a pattern is the set of all strings of constants we get through substitutions
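Whether a string of constants belongs to the language of a pattern can be decided by trying all ways of cutting the string into pieces. The sketch below is a plain backtracking search and makes no claim to efficiency; its worst-case running time is exponential. As an assumption of this sketch, patterns are strings in which letters are variables and all other characters are constants. The name `generates` is hypothetical.

```python
def generates(pattern: str, s: str) -> bool:
    """Return True if some substitution of non-empty constant strings
    for the variables turns `pattern` into `s`."""
    binding: dict[str, str] = {}  # current choice of value for each variable

    def match(i: int, pos: int) -> bool:
        # i: index into the pattern, pos: index into the string s
        if i == len(pattern):
            return pos == len(s)
        symbol = pattern[i]
        if not symbol.isalpha():
            # A constant has to appear literally at this position.
            return s.startswith(symbol, pos) and match(i + 1, pos + 1)
        if symbol in binding:
            # A repeated variable has to repeat the value chosen earlier.
            value = binding[symbol]
            return s.startswith(value, pos) and match(i + 1, pos + len(value))
        # First occurrence of this variable: try every non-empty value.
        for end in range(pos + 1, len(s) + 1):
            binding[symbol] = s[pos:end]
            if match(i + 1, end):
                return True
        binding.pop(symbol, None)  # undo the choice before backtracking
        return False

    return match(0, 0)


print(generates("x0x", "10010"))  # True, with x := 10
print(generates("x0x", "111"))    # False
```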

SLIDE 28

Basic ideas

Our main question

We call a set of strings of constants a sample, e.g. S = {101, 10010, 0110011}
Given a sample S, which pattern generates every string in S?
  • p = x works, but here q = x0x is more precise
We call a pattern p descriptive of S iff
  • it generates every string in S
  • no other pattern q that generates every string in S has a language that is a strict subset of the language of p
Given a sample S, which pattern is descriptive of S?
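Written in symbols, with L(p) denoting the language of a pattern p, the definition of a descriptive pattern can be stated as follows. This is only a restatement of the two conditions above.

```latex
p \text{ is descriptive of } S
\;\iff\;
S \subseteq L(p)
\;\text{ and there is no pattern } q \text{ with }\;
S \subseteq L(q) \subsetneq L(p)
```

In the example, both x and x0x generate every string in S, and L(x0x) is a strict subset of L(x), so x is not descriptive of S.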

SLIDE 29

Finding Patterns Common to a Set of Strings

1. Basic ideas
2. Finding Patterns Common to a Set of Strings
  • Learning pattern languages in the limit
  • Finding descriptive patterns
  • Properties of pattern languages
  • Finding descriptive one-variable patterns
3. Other results
4. Conclusion

SLIDE 34

Finding Patterns Common to a Set of Strings Learning pattern languages in the limit

Recap: Gold’s model

Objects: formal languages
Presentation: a sequence of strings from a language in which each string of the language appears at least once (a text)
The learner outputs a hypothesis after receiving each string
The learner learns the language if, after some finite amount of time, its hypotheses are correct and remain the same

SLIDE 39

Finding Patterns Common to a Set of Strings Learning pattern languages in the limit

In our case

Assume a learner is presented with a text s1, s2, s3, ... of some pattern language
The hypothesis space is the set of all patterns
The hypotheses are patterns descriptive of the strings seen so far
Assume there exists an algorithm to find descriptive patterns
Then Angluin’s paper shows: pattern languages can be learned in the limit from positive data

SLIDE 49

Finding Patterns Common to a Set of Strings Finding descriptive patterns

Our first attempt

Let S be a sample
Enumerate all patterns that are at most as long as the shortest string in S → exponential growth
Test for each pattern whether its language contains S → NP-complete
  • Theorem (3.6, Angluin): The membership problem for pattern languages is NP-complete
From all patterns that pass the test, select one which is minimal with regard to inclusion → !
  • Theorem (5.1, Jiang et al.): The inclusion problem for arbitrary pattern languages is undecidable
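The exponential growth in the enumeration step is easy to observe. The sketch below lists every pattern of a given length once, up to renaming of variables, by always introducing variables in the order x1, x2, ... of their first appearance. This canonical naming and the function name `patterns_of_length` are assumptions made for the illustration. The alphabet defaults to Σ = {0, 1}.

```python
from typing import Iterator


def patterns_of_length(n: int, alphabet: str = "01") -> Iterator[tuple[str, ...]]:
    """Yield each pattern of length n exactly once, up to variable renaming.

    A pattern is a tuple of symbols: constants from `alphabet` and
    variables named 'x1', 'x2', ... in order of first appearance.
    """

    def extend(prefix: list[str], used: int) -> Iterator[tuple[str, ...]]:
        if len(prefix) == n:
            yield tuple(prefix)
            return
        for constant in alphabet:                 # next symbol is a constant
            yield from extend(prefix + [constant], used)
        for v in range(1, used + 1):              # reuse a known variable
            yield from extend(prefix + [f"x{v}"], used)
        # or introduce the next fresh variable
        yield from extend(prefix + [f"x{used + 1}"], used + 1)

    yield from extend([], 0)


# Number of candidate patterns for lengths 1 to 6.
for n in range(1, 7):
    print(n, sum(1 for _ in patterns_of_length(n)))
```

Each candidate would still have to pass the membership test for every string in S, which is the NP-complete step named above.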

SLIDE 54

Finding Patterns Common to a Set of Strings Finding descriptive patterns

Our second attempt

Let S be a sample
Enumerate all patterns that are at most as long as the shortest string in S → exponential growth
Test for each pattern whether its language contains S → NP-complete
From all patterns that pass the test, select the longest
  • Corollary (3.4, Angluin): Let p, q be patterns with the same length. Then the language of q includes the language of p iff there is a substitution that turns q into p
From the resulting set of patterns, output any pattern from which no other pattern in the set can be obtained by substitution (other than by renaming variables) → NP-complete
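For patterns of equal length, the substitution test of Corollary 3.4 is simple: because every variable is replaced by a non-empty string and the length does not change, each variable of q has to be mapped to exactly one symbol of p. A minimal sketch, assuming patterns are strings in which letters are variables and digits are constants; the name `is_instance` is hypothetical.

```python
def is_instance(p: str, q: str) -> bool:
    """Return True if p results from q by a substitution.

    Only meant for patterns of equal length, where each variable of q
    has to be replaced by a single symbol (a constant or a variable).
    """
    if len(p) != len(q):
        raise ValueError("this shortcut only applies to patterns of equal length")
    image: dict[str, str] = {}  # symbol assigned to each variable of q
    for target, source in zip(p, q):
        if source.isalpha():
            # A variable has to be mapped to the same symbol every time.
            if image.setdefault(source, target) != target:
                return False
        elif source != target:
            # Constants remain the same under any substitution.
            return False
    return True


print(is_instance("x0x", "xyz"))  # True: L(xyz) includes L(x0x)
print(is_instance("xyz", "x0x"))  # False
```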

SLIDE 56

Finding Patterns Common to a Set of Strings Finding descriptive patterns

Results so far

Let S be a sample
Theorem (4.2): If P ≠ NP, then there is no polynomial-time algorithm to find a pattern of maximum possible length descriptive of S
We may still solve this efficiently in special cases!

SLIDE 61

Finding Patterns Common to a Set of Strings Properties of pattern languages

Comparison to other language types

The pattern language L(xx) is not context-free
The regular language L(0|1) = {0, 1} is not a pattern language
Theorem (3.4, Jiang): Every pattern language is context-sensitive

Language        Membership  Emptiness  Equivalence  Inclusion
Context-sens.   D           U          U            U
Context-free    D           D          U            U
Regular         D           D          D            D
Pattern lang.   D           D          D            U

Table: D = decidable, U = undecidable

SLIDE 66

Finding Patterns Common to a Set of Strings Finding descriptive one-variable patterns

Overview

1. Introduce necessary conditions for one-variable patterns that could generate a string
2. Bound the number of one-variable patterns that could generate every string in a sample
3. Construct automata that recognize exactly these patterns
4. Finally, select a specific automaton that recognizes descriptive one-variable patterns

SLIDE 75

Finding Patterns Common to a Set of Strings Finding descriptive one-variable patterns

Feasible triples

Let p be a one-variable pattern and s a string of constants
We define a mapping τ(p) = (i, j, k) where
  • i is the number of constants in p
  • j is the number of occurrences of x in p
  • k is the position of the first occurrence of x in p
A pattern p can only generate s if τ(p) is feasible for s
Let S = {s1, ..., sm} be a sample
Let F be the set of all triples feasible for every string in S
We can bound |F| = O(l^2 log l), where l is the length of the shortest string in S
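The mapping τ is straightforward to compute. The slides do not spell out when a triple is feasible, so the sketch below only checks a length condition that every generating pattern has to satisfy: if each of the j occurrences of x is replaced by the same string of length m ≥ 1, then |s| = i + j·m. Treating this as the feasibility test is an assumption of the sketch. The names `tau` and `length_compatible` are hypothetical, and the variable is written as the letter x.

```python
def tau(p: str) -> tuple[int, int, int]:
    """Return (i, j, k) for a one-variable pattern p over constants and 'x'."""
    j = p.count("x")              # number of occurrences of x
    if j == 0:
        raise ValueError("pattern contains no variable")
    i = len(p) - j                # number of constants
    k = p.index("x") + 1          # 1-based position of the first x
    return i, j, k


def length_compatible(triple: tuple[int, int, int], s: str) -> bool:
    """Necessary length condition: |s| = i + j*m for some integer m >= 1."""
    i, j, _k = triple
    rest = len(s) - i             # characters of s that have to come from x
    return rest >= j and rest % j == 0


print(tau("1xx11"))                             # (3, 2, 2)
print(length_compatible((3, 2, 2), "1101011"))  # True, with m = 2
print(length_compatible((2, 3, 1), "1101011"))  # False, 5 is not a multiple of 3
```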

SLIDE 83

Finding Patterns Common to a Set of Strings Finding descriptive one-variable patterns

Angluin’s algorithm for finding descriptive one-variable patterns

Let S be a sample
Construct F by enumerating all feasible triples
For each triple f ∈ F

  For each string s ∈ S

    Construct an automaton which recognizes patterns p that

      • fulfill τ(p) = f
      • generate s

  Intersect these automata

From the resulting set of automata: discard those whose language is empty

Lemma (6.3)
Any pattern accepted by an automaton built from a triple that maximizes i + j is descriptive of S among one-variable patterns

slide-86
SLIDE 86

Finding Patterns Common to a Set of Strings Finding descriptive one-variable patterns

Example

Let S = {s1, s2, s3} be a sample with s1 = 1101011, s2 = 10011, s3 = 11111
We construct F through enumeration
We get: F = {(1, 1, k), (1, 2, k), (2, 1, k), (3, 1, k), (3, 2, k), (4, 1, k)}, 1 ≤ k ≤ i + 1
We construct three automata per triple in F
In this example we do this for: (3, 2, 2) ∈ F
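
The (i, j) pairs listed in F can be reproduced with a small length check. The sketch below is a minimal illustration, not the enumeration from the paper. It assumes that i counts the constants and j the occurrences of x in a pattern, and that every occurrence of x is replaced by the same non-empty string of length m, so each sample string must satisfy |s| = i + j · m. The function name and the restriction i ≥ 1 are assumptions chosen to match the F shown on the slide.

```python
def length_compatible_pairs(sample):
    """Return all (i, j) with i >= 1 such that every string s in the sample
    satisfies len(s) == i + j * m for some integer m >= 1."""
    shortest = min(len(s) for s in sample)
    pairs = []
    for i in range(1, shortest):              # i >= 1: assumption taken from the slide's F
        for j in range(1, shortest - i + 1):  # m >= 1 forces i + j <= shortest length
            if all((len(s) - i) % j == 0 for s in sample):
                pairs.append((i, j))
    return pairs


sample = ["1101011", "10011", "11111"]
pairs = length_compatible_pairs(sample)
print(pairs)
```

Each pair is then combined with every k in 1 ≤ k ≤ i + 1 to give the triples (i, j, k).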

slide-98
SLIDE 98

Finding Patterns Common to a Set of Strings Finding descriptive one-variable patterns

Triple: (3, 2, 2), String: s1 = 1101011
Substring starts at position 2, length: (|s1| − 3) / 2 = 2
Substring: x = 10

[Automaton diagram. States: (0, 0), (1, 0), (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (1, 2), (2, 2), (3, 2). Edge labels: 1, x, x, 1, 1, 1, 1, 1]

slide-99
SLIDE 99

Finding Patterns Common to a Set of Strings Finding descriptive one-variable patterns

Triple: (3, 2, 2), String: s2 = 10011
Substring length: (|s2| − 3) / 2 = 1
Substring: x = 0

[Automaton diagram. States: (0, 0), (1, 0), (1, 1), (2, 1), (3, 1), (4, 1), (1, 2), (2, 2), (3, 2). Edge labels: 1, x, x, 1, 1, 1, 1]

slide-100
SLIDE 100

Finding Patterns Common to a Set of Strings Finding descriptive one-variable patterns

Triple: (3, 2, 2), String: s3 = 11111
Substring length: (|s3| − 3) / 2 = 1
Substring: x = 1

[Automaton diagram. States: (0, 0), (1, 0), (1, 1), (2, 1), (3, 1), (4, 1), (1, 2), (2, 2), (3, 2). Edge labels: 1, x, x, 1, 1, x, 1, x, 1, 1]

slide-102
SLIDE 102

Finding Patterns Common to a Set of Strings Finding descriptive one-variable patterns

Intersection of all three automata:

(0, 0) -1-> (1, 0) -x-> (1, 1) -x-> (1, 2) -1-> (2, 2) -1-> (3, 2)

Clearly the automaton recognizes the language {1xx11}
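
The intersection can be checked mechanically. The following is a minimal sketch of one possible reading of the state grid on the slides, not necessarily the construction used in the paper: state (a, b) means that a constants and b occurrences of x have been read, so the automaton for s is at position a + b · m in s. Unlike the diagrams, this sketch only keeps states with a ≤ i and b ≤ j, and it enforces k by allowing the first x only after exactly k − 1 constants. All function names are hypothetical.

```python
def build_automaton(s, triple):
    """Edges {((a, b), label): (a2, b2)} of the pattern automaton for string s."""
    i, j, k = triple
    m, rest = divmod(len(s) - i, j)        # forced length of the substitution for x
    if m < 1 or rest:
        return {}                          # triple does not fit this string
    x = s[k - 1:k - 1 + m]                 # the first x starts after k - 1 constants
    edges = {}
    for a in range(i + 1):
        for b in range(j + 1):
            pos = a + b * m                # position in s reached in state (a, b)
            before_first_x = (b == 0)
            # Read one constant, unless the first x is due at this position.
            if a < i and not (before_first_x and a >= k - 1):
                edges[((a, b), s[pos])] = (a + 1, b)
            # Read one occurrence of x if s continues with the substitution.
            if b < j and (not before_first_x or a == k - 1) and s[pos:pos + m] == x:
                edges[((a, b), "x")] = (a, b + 1)
    return edges


def accepted_patterns(edges, triple):
    """All label sequences leading from state (0, 0) to state (i, j)."""
    i, j, _ = triple
    found = set()

    def walk(state, prefix):
        if state == (i, j):
            found.add(prefix)
            return
        for (src, label), dst in edges.items():
            if src == state:
                walk(dst, prefix + label)

    walk((0, 0), "")
    return found


sample = ["1101011", "10011", "11111"]
triple = (3, 2, 2)
automata = [build_automaton(s, triple) for s in sample]
# All automata share the same state grid, so intersecting them
# amounts to keeping the edges that occur in every automaton.
common = dict(set.intersection(*(set(a.items()) for a in automata)))
language = accepted_patterns(common, triple)
print(language)
```

Because the state reached after a prefix depends only on how many constants and variables were read, keeping the shared edges is enough here; a general product construction is not needed for this sketch.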

slide-106
SLIDE 106

Finding Patterns Common to a Set of Strings Finding descriptive one-variable patterns

Recall: S = {s1, s2, s3} with s1 = 1101011, s2 = 10011, s3 = 11111,
F = {(1, 1, k), (1, 2, k), (2, 1, k), (3, 1, k), (3, 2, k), (4, 1, k)}, 1 ≤ k ≤ i + 1
Example automata for (3, 2, 2) ∈ F
Clearly 3 + 2 maximizes i + j in F
The language recognized by the automaton for (3, 2, 2) ∈ F is {1xx11} ≠ ∅
Thus, 1xx11 is descriptive of S among one-variable patterns
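
That 1xx11 generates every string in S can be confirmed directly. The sketch below assumes non-empty substitutions and a single variable written as the character x, which must not occur in the alphabet. Since the length of the substitution is forced by |s|, one comparison per string suffices. The function name is hypothetical.

```python
def generates(pattern, s, var="x"):
    """True if the one-variable pattern generates s with a non-empty substitution."""
    j = pattern.count(var)            # occurrences of the variable
    i = len(pattern) - j              # number of constants
    if j == 0:
        return pattern == s
    m, rest = divmod(len(s) - i, j)   # forced substitution length
    if m < 1 or rest:
        return False
    first = pattern.index(var)        # constants before the first x have length 1 each
    x = s[first:first + m]
    return pattern.replace(var, x) == s


sample = ["1101011", "10011", "11111"]
print(all(generates("1xx11", s) for s in sample))
```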

slide-115
SLIDE 115

Finding Patterns Common to a Set of Strings Finding descriptive one-variable patterns

Summary

Let S be a sample
Construct F by enumerating all feasible triples
For each triple f ∈ F

  For each string s ∈ S construct automaton
  Intersect these automata

Discard automata whose language is empty
Choose any pattern recognized by an automaton that was built from a triple maximizing i + j
We can bound the number of feasible triples and construct the automata in time polynomial in their sizes
The algorithm runs in time polynomial in the length of the input
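
The steps above can be sketched end to end. This is a brute-force illustration under stated assumptions, not Angluin's construction: it replaces the automata by explicit sets of candidate patterns (so it does not have the polynomial running time), it takes a triple to be a candidate when the lengths fit, and it assumes i ≥ 1 as in the example. All names are hypothetical.

```python
from itertools import combinations


def candidate_triples(sample):
    """Triples (i, j, k) whose lengths fit every string (assumes i >= 1)."""
    shortest = min(len(s) for s in sample)
    for i in range(1, shortest):
        for j in range(1, shortest - i + 1):
            if all((len(s) - i) % j == 0 for s in sample):
                for k in range(1, i + 2):
                    yield (i, j, k)


def patterns_for(s, triple):
    """All patterns p with tau(p) = triple that generate s (stand-in for the automaton)."""
    i, j, k = triple
    m = (len(s) - i) // j
    x = s[k - 1:k - 1 + m]
    found = set()
    # The first x sits at index k - 1; choose where the other j - 1 go.
    for later in combinations(range(k, i + j), j - 1):
        variable_at = {k - 1, *later}
        symbols, pos = [], 0
        for idx in range(i + j):
            if idx in variable_at:
                if s[pos:pos + m] != x:
                    break             # s does not continue with the substitution
                symbols.append("x")
                pos += m
            else:
                symbols.append(s[pos])
                pos += 1
        else:
            found.add("".join(symbols))
    return found


def descriptive_one_variable(sample):
    """Return (triple, patterns) for a non-empty intersection maximizing i + j."""
    best = None
    for triple in candidate_triples(sample):
        common = set.intersection(*(patterns_for(s, triple) for s in sample))
        if common and (best is None or triple[0] + triple[1] > best[0][0] + best[0][1]):
            best = (triple, common)
    return best


print(descriptive_one_variable(["1101011", "10011", "11111"]))
```

On the example sample the only non-empty intersection with i + j = 5 belongs to the triple (3, 2, 2), which agrees with the worked example.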

slide-116
SLIDE 116

Other results

1 Basic ideas
2 Finding Patterns Common to a Set of Strings
3 Other results
    Lange and Wiehagen’s algorithm
    Further work
    Practical applications
4 Conclusion

slide-120
SLIDE 120

Other results Lange and Wiehagen’s algorithm

What if we allowed wrong results?

Paper by Steffen Lange and Rolf Wiehagen published in 1991: Polynomial-time Inference of Arbitrary Pattern Languages
Presents an algorithm that identifies any pattern language in the limit
Each hypothesis is found in polynomial time

slide-126
SLIDE 126

Other results Lange and Wiehagen’s algorithm

Lange and Wiehagen’s algorithm

Idea: Only look at strings of minimal length (discard the others)
Output pattern descriptive of strings of minimal length
Result:

  • Will identify pattern language in the limit
  • Polynomial run time: finding a pattern descriptive of strings of the same length is easy
  • Algorithm will sometimes output wrong hypotheses
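
Why the equal-length case is easy can be seen column by column. The sketch below is a minimal batch illustration of the idea, not the incremental procedure from the paper: a position where all shortest strings agree becomes a constant, any other position becomes a variable, and positions with identical columns share a variable. The function names and the variable names x1, x2, ... are hypothetical.

```python
def pattern_for_equal_length(strings):
    """Column-wise pattern for strings of equal length, as a list of symbols."""
    variables = {}                     # column contents -> variable name
    pattern = []
    for column in zip(*strings):
        if len(set(column)) == 1:
            pattern.append(column[0])  # all strings agree: constant
        else:
            if column not in variables:
                variables[column] = f"x{len(variables) + 1}"
            pattern.append(variables[column])
    return pattern


def hypothesis(sample):
    """Keep only the strings of minimal length and describe those."""
    shortest = min(len(s) for s in sample)
    return pattern_for_equal_length([s for s in sample if len(s) == shortest])


print(hypothesis(["1101011", "10011", "11111"]))
```

On the sample from the earlier example only 10011 and 11111 are used, and the hypothesis has the same shape as 1xx11. A hypothesis built this way need not generate the discarded longer strings, which is the sense in which the algorithm can output wrong hypotheses.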

slide-133
SLIDE 133

Other results Further work

Possible extensions of pattern languages

In extended pattern languages, empty substitutions are allowed
A regular pattern contains each variable at most once

Language          Membership  Equivalence  Inclusion
Standard          NP          P            U
Regular           P           P            P
Extended          NP          Open         U
Extended Regular  P           P            P

Table: U = undecidable

Polynomial update time does not guarantee good learning time
One-variable patterns can be learned very efficiently (covered in the next talk!)
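
One way to see why membership is easy for regular patterns: since no variable repeats, each variable can be replaced by a wildcard independently, so the pattern language is a regular language. The sketch below illustrates this by translating a pattern into a Python regular expression. The representation (a list of symbols plus a set of variable names) and the function name are assumptions, and Python's backtracking matcher does not by itself give the polynomial bound from the table.

```python
import re


def regular_pattern_to_regex(pattern, variables, erasing=False):
    """Translate a regular pattern into a compiled regular expression.

    pattern   -- list of symbols, e.g. ["1", "x1", "0", "x2"]
    variables -- the symbols that are variables, e.g. {"x1", "x2"}
    erasing   -- True for extended pattern languages (empty substitutions allowed)
    """
    seen = set()
    parts = []
    for symbol in pattern:
        if symbol in variables:
            if symbol in seen:
                raise ValueError("variable occurs twice: not a regular pattern")
            seen.add(symbol)
            parts.append(".*" if erasing else ".+")
        else:
            parts.append(re.escape(symbol))
    return re.compile("".join(parts), re.DOTALL)


regex = regular_pattern_to_regex(["1", "x1", "0", "x2"], {"x1", "x2"})
print(bool(regex.fullmatch("11001")), bool(regex.fullmatch("100")))
```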

slide-138
SLIDE 138

Other results Practical applications

Shinohara ’82: Data entry systems
Nix ’83: Automatic text editing by examples
Arimura ’94: Finding patterns in amino acid sequences
Much work done in related fields!

slide-143
SLIDE 143

Conclusion

Summary

Pattern languages: model for inductive inference
Finding descriptive patterns: generally not efficiently possible
Special case: polynomial-time algorithm for one-variable patterns
Lange/Wiehagen algorithm: inconsistent algorithm turns out to be very effective

slide-144
SLIDE 144

Conclusion References

References I

◮ Dana Angluin. Finding patterns common to a set of strings. Journal of Computer and System Sciences, 21(1):46–62, 1980.

◮ Dana Angluin. Inductive inference of formal languages from positive data. Information and Control, 45(2):117–135, 1980.

◮ Hiroki Arimura, Ryoichi Fujino, Takeshi Shinohara, and Setsuo Arikawa. Protein motif discovery from positive examples by minimal multiple generalization over regular patterns. Genome Informatics, 5:39–48, 1994.

slide-145
SLIDE 145

Conclusion References

References II

◮ Thomas Erlebach, Peter Rossmanith, Hans Stadtherr, Angelika Steger, and Thomas Zeugmann. Learning one-variable pattern languages very efficiently on average, in parallel, and by asking queries. Theor. Comput. Sci., 261(1):119–156, June 2001.

◮ Dominik D. Freydenberger and Daniel Reidenbach. Bad news on decision problems for patterns. Information and Computation, 208(1):83–96, 2010.

◮ E. Mark Gold. Language identification in the limit. Information and Control, 10(5):447–474, 1967.

slide-146
SLIDE 146

Conclusion References

References III

◮ Tao Jiang, Ming Li, Bala Ravikumar, and Kenneth W. Regan. Formal grammars and languages. In Mikhail J. Atallah and Marina Blanton, editors, Algorithms and Theory of Computation Handbook, pages 20–20. Chapman & Hall/CRC, 2010.

◮ Tao Jiang, Arto Salomaa, Kai Salomaa, and Sheng Yu. Inclusion is undecidable for pattern languages. In Andrzej Lingas, Rolf Karlsson, and Svante Carlsson, editors, Automata, Languages and Programming, volume 700 of Lecture Notes in Computer Science, pages 301–312. Springer Berlin Heidelberg, 1993.

slide-147
SLIDE 147

Conclusion References

References IV

◮ Steffen Lange and Rolf Wiehagen. Polynomial-time inference of arbitrary pattern languages. New Generation Computing, 8(4):361–370, 1991.

◮ Yen Kaow Ng and Takeshi Shinohara. Developments from enquiries into the learnability of the pattern languages from positive data. Theoretical Computer Science, 397(1):150–165, 2008.

◮ Takeshi Shinohara. Polynomial time inference of pattern languages and its applications. In Proceedings of the 7th IBM Symposium on Mathematical Foundations of Computer Science, pages 191–209, 1982.

slide-148
SLIDE 148

Conclusion References

References V

◮ Takeshi Shinohara and Setsuo Arikawa. Pattern inference. In Algorithmic Learning for Knowledge-Based Systems, pages 259–291. Springer, 1995.

◮ Thomas Zeugmann. Lange and Wiehagen’s pattern language learning algorithm: An average-case analysis with respect to its total learning time. Annals of Mathematics and Artificial Intelligence, 23(1-2):117–145, January 1998.
