Grammatical inference: an introduction
Colin de la Higuera
University of Nantes

Colin de la Higuera, Nantes 2013
Acknowledgements
Pieter Adriaans, Hasan Ibne Akram, Anne-Muriel Arigon, Leo Becerra-Bonache, Cristina Bibire, Alex Clark, Rafael Carrasco, Paco Casacuberta, Pierre Dupont, Rémi Eyraud, Philippe Ezequel, Henning Fernau, Jeffrey Heinz, Jean-Christophe Janodet, Satoshi Kobayashi, Laurent Miclet, Thierry Murgue, Tim Oates, Jose Oncina, Frédéric Tantini, Franck Thollard, Sicco Verwer, Enrique Vidal, Menno van Zaanen,...
http://pagesperso.lina.univ-nantes.fr/~cdlh/
http://videolectures.net/colin_de_la_higuera/
Practical information
Grammatical Inference is module X9IT050, 18 hours.
http://pagesperso.lina.univ-nantes.fr/~cdlh/X9IT050.html
Exam: to be decided
Some useful links
The Grammatical Inference Software Repository: https://logiciels.lina.univ-nantes.fr/redmine/projects/gisr/wiki
Talks on http://videolectures.net
A book
Articles
Start here: http://pagesperso.lina.univ-nantes.fr/~cdlh/X9IT050.html
What I plan to talk about
1. 11/9/2013 An introduction to grammatical inference. About what learning a language means, how we can measure success
2. 18/9/2013 An introduction to grammatical inference. A motivating example
3. 25/9/2013 Learning: identifying or approximating?
4. 2/10/2013 Learning from text
5. 9/10/2013 Learning from text: the window languages
6. 16/10/2013 Learning from an informant: the RPNI algorithm and variants
7. 23/10/2013 Learning distributions: why? How should we measure success? About distances between distributions
8. 6/11/2013 Learning distributions: learning the weights given a structure. EM, Gibbs sampling and spectral methods
9. 13/11/2013 Learning distributions: state-merging techniques
10. 20/11/2013 Active learning 1: about active learning
11. 27/11/2013 Active learning 2: the MAT algorithm
12. 4/12/2013 Learning transducers
13. 11/12/2013 Learning probabilistic transducers
14. 18/12/2013 Exam
Outline (of this first talk)
1. What is grammatical inference about?
2. Why is it a difficult task?
3. Why is it a useful task?
4. Validation issues
5. Some criteria
1 Grammatical inference
Grammatical inference is about learning a grammar given information about a language.
The information consists of strings, trees or graphs, and can (typically) be:
Text: only positive information
Informant: labelled data
Actively sought (query learning, teaching)
These lists are not exhaustive.
The functions/goals
Languages and grammars from the Chomsky hierarchy
Probabilistic automata and context-free grammars
Hidden Markov models
Patterns
Transducers
The Chomsky hierarchy
Regular languages
Context-free languages
Context-sensitive languages
Recursively enumerable languages
The Chomsky hierarchy revisited
Regular languages: recognized by DFAs and NFAs, generated by regular grammars, described by regular expressions
Context-free languages: generated by context-free grammars, recognized by pushdown automata
Context-sensitive languages: generated by context-sensitive grammars (parsing is PSPACE-complete, hence believed not to be in P)
Recursively enumerable languages: recognized by Turing machines (parsing is undecidable)
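To make the first level of the hierarchy concrete, here is a minimal sketch (in Python, for illustration; the automaton and its language are an invented example, not one from the slides) of a DFA recognizing a regular language: strings over {a, b} containing an even number of a's.

```python
# Minimal DFA sketch: the regular language of strings over {a, b}
# with an even number of a's. The two states track the parity of a's seen.
TRANSITIONS = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}
START = "even"
ACCEPTING = {"even"}

def accepts(word: str) -> bool:
    """Run the DFA on word and report whether it ends in an accepting state."""
    state = START
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING

print(accepts("abba"))  # True: two a's
print(accepts("ab"))    # False: one a
```

Grammatical inference over regular languages essentially amounts to recovering such a transition table from data about the language.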
Other formalisms
Topological formalisms
Semilinear languages Hyperplanes Balls of strings
Distributions of strings
A probabilistic automaton defines a distribution over the strings.
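As a sketch of what this means (Python, with invented numbers; the automaton below is an illustrative assumption, not one from the slides): in a deterministic probabilistic automaton, each state's outgoing transition probabilities plus its stopping probability sum to 1, and the probability of a string is the product of the probabilities along its unique path.

```python
# Deterministic probabilistic finite automaton (DPFA) sketch.
# For each state, transition probabilities + stopping probability sum to 1,
# so the automaton defines a probability distribution over all strings.
TRANS = {  # state -> symbol -> (next_state, transition probability)
    0: {"a": (1, 0.5), "b": (0, 0.25)},   # plus stop: 0.25
    1: {"a": (1, 0.2), "b": (0, 0.4)},    # plus stop: 0.4
}
STOP = {0: 0.25, 1: 0.4}
START = 0

def prob(word: str) -> float:
    """Probability that the DPFA generates exactly `word`."""
    state, p = START, 1.0
    for symbol in word:
        state, t = TRANS[state][symbol]
        p *= t
    return p * STOP[state]

print(prob(""))   # 0.25: stop immediately in the start state
print(prob("a"))  # 0.5 * 0.4 = 0.2
```

Because every state's weights sum to 1, the automaton can also be run forwards as a generator, sampling one string at a time.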
Fuzzy automata
An automaton will say that string w belongs to the language with probability p.
The difference with probabilistic automata is that:
The total sum of probabilities may be different from 1 (it may even be infinite).
The fuzzy automaton cannot be used as a generator of strings.
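The contrast can be sketched in code (again with invented weights; this is one simple product-weight variant chosen for illustration, and fuzzy automata are also often defined with max-min operations): the machinery looks like the probabilistic automaton, but per-state weights need not sum to 1, so the computed values grade membership rather than define a distribution one could sample from.

```python
# Fuzzy automaton sketch: same shape as a weighted automaton, but the
# outgoing weights of a state are NOT required to sum to 1, so the values
# over all strings need not sum to 1 (the sum may even be infinite).
# Consequently the machine grades membership; it cannot generate strings.
TRANS = {  # state -> symbol -> (next_state, weight)
    0: {"a": (1, 0.9), "b": (0, 0.8)},  # weights sum to 1.7: allowed here
    1: {"a": (1, 0.7), "b": (0, 0.9)},
}
FINAL = {0: 0.5, 1: 1.0}  # degree to which each state accepts

def membership(word: str) -> float:
    """Degree p with which the automaton says `word` belongs to the language."""
    state, degree = 0, 1.0
    for symbol in word:
        state, weight = TRANS[state][symbol]
        degree *= weight
    return degree * FINAL[state]

print(membership("a"))  # 0.9
print(membership("b"))  # 0.8 * 0.5 = 0.4
```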
The data: examples of strings
A string in Gaelic and its translation to English:
Tha thu cho duaichnidh ri èarr àirde de a’ coisich deas damh
You are as ugly as the north end of a southward traveling ox
http://www.flickr.com/photos/popfossa/3992549630/
Time series pose the problem of the alphabet:
- An infinite alphabet?
- Discretizing?
- An ordered alphabet
Giorgio Bernardi, Regina Goursot, Edda Rayko, René Goursot, Baya Cherif-Zahar, and Roberta Melis, http://www.scopenvironment.org/downloadpubs/scope44/chapter05.html
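The second option above, discretizing, can be sketched as follows (Python; the thresholds and alphabet are illustrative assumptions, not a method from the slides): each real value is mapped to a letter of a small ordered alphabet by thresholding, turning the series into a string that grammatical inference methods can handle.

```python
# Discretization sketch: map a real-valued time series to a string over a
# small ordered alphabet by binning each value between fixed thresholds.
# The thresholds and alphabet below are illustrative choices.
def discretize(series, thresholds=(-0.5, 0.5), alphabet="abc"):
    """Return one letter per value: 'a' below -0.5, 'b' in between, 'c' above 0.5."""
    assert len(alphabet) == len(thresholds) + 1
    letters = []
    for x in series:
        i = sum(x > t for t in thresholds)  # index of the bin containing x
        letters.append(alphabet[i])
    return "".join(letters)

print(discretize([-1.2, 0.0, 0.9, 0.3]))  # "abcb"
```

Choosing the number of bins trades the infinite alphabet against the loss of precision, and the resulting alphabet is naturally ordered.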
>A BAC=41M14 LIBRARY=CITB_978_SKB
AAGCTTATTCAATAGTTTATTAAACAGCTTCTTAAATAGGATATAAGGCAGTGCCATGTA
GTGGATAAAAGTAATAATCATTATAATATTAAGAACTAATACATACTGAACACTTTCAAT
GGCACTTTACATGCACGGTCCCTTTAATCCTGAAAAAATGCTATTGCCATCTTTATTTCA
GAGACCAGGGTGCTAAGGCTTGAGAGTGAAGCCACTTTCCCCAAGCTCACACAGCAAAGA
CACGGGGACACCAGGACTCCATCTACTGCAGGTTGTCTGACTGGGAACCCCCATGCACCT
GGCAGGTGACAGAAATAGGAGGCATGTGCTGGGTTTGGAAGAGACACCTGGTGGGAGAGG
GCCCTGTGGAGCCAGATGGGGCTGAAAACAAATGTTGAATGCAAGAAAAGTCGAGTTCCA
GGGGCATTACATGCAGCAGGATATGCTTTTTAGAAAAAGTCCAAAAACACTAAACTTCAA
CAATATGTTCTTTTGGCTTGCATTTGTGTATAACCGTAATTAAAAAGCAAGGGGACAACA
CACAGTAGATTCAGGATAGGGGTCCCCTCTAGAAAGAAGGAGAAGGGGCAGGAGACAGGA
TGGGGAGGAGCACATAAGTAGATGTAAATTGCTGCTAATTTTTCTAGTCCTTGGTTTGAA
TGATAGGTTCATCAAGGGTCCATTACAAAAACATGTGTTAAGTTTTTTAAAAATATAATA
AAGGAGCCAGGTGTAGTTTGTCTTGAACCACAGTTATGAAAAAAATTCCAACTTTGTGCA
TCCAAGGACCAGATTTTTTTTAAAATAAAGGATAAAAGGAATAAGAAATGAACAGCCAAG
TATTCACTATCAAATTTGAGGAATAATAGCCTGGCCAACATGGTGAAACTCCATCTCTAC
TAAAAATACAAAAATTAGCCAGGTGTGGTGGCTCATGCCTGTAGTCCCAGCTACTTGCGA
GGCTGAGGCAGGCTGAGAATCTCTTGAACCCAGGAAGTAGAGGTTGCAGTAGGCCAAGAT
GGCGCCACTGCACTCCAGCCTGGGTGACAGAGCAAGACCCTATGTCCAAAAAAAAAAAAA
AAAAAAAGGAAAAGAAAAAGAAAGAAAACAGTGTATATATAGTATATAGCTGAAGCTCCC
TGTGTACCCATCCCCAATTCCATTTCCCTTTTTTGTCCCAGAGAACACCCCATTCCTGAC
TAGTGTTTTATGTTCCTTTGCTTCTCTTTTTAAAAACTTCAATGCACACATATGCATCCA
TGAACAACAGATAGTGGTTTTTGCATGACCTGAAACATTAATGAAATTGTATGATTCTAT
http://bandelestudio.com/tutoriel-mao-sur-la-creation-musicale/
http://fr.wikipedia.org/wiki/Philippe_VI_de_France
<book>
  <part>
    <chapter>
      <sect1/>
      <sect1>
        <orderedlist numeration="arabic">
          <listitem/>
          <f:fragbody/>
        </orderedlist>
      </sect1>
    </chapter>
  </part>
</book>
<?xml version="1.0"?>
<?xml-stylesheet href="carmen.xsl" type="text/xsl"?>
<?cocoon-process type="xslt"?>
<!DOCTYPE pagina [
  <!ELEMENT pagina (titulus?, poema)>
  <!ELEMENT titulus (#PCDATA)>
  <!ELEMENT auctor (praenomen, cognomen, nomen)>
  <!ELEMENT praenomen (#PCDATA)>
  <!ELEMENT nomen (#PCDATA)>
  <!ELEMENT cognomen (#PCDATA)>
  <!ELEMENT poema (versus+)>
  <!ELEMENT versus (#PCDATA)>
]>
<pagina>
  <titulus>Catullus II</titulus>
  <auctor>
    <praenomen>Gaius</praenomen>
    <nomen>Valerius</nomen>
    <cognomen>Catullus</cognomen>
  </auctor>
And also
Business processes Bird songs Images (contours and shapes) Robot moves Web services Malware …
2 What does learning mean?
Suppose we write a program that can learn grammars... are we done?
A first question is: "why bother?" If my program works, why do something more about it?
Why should we do something when other researchers in machine learning are not?
Motivating reflection #1
Is 17 a random number?
Is 0110110110110101011000111101 a random sequence?
(Is grammar G the correct grammar for a given sample S?)
Motivating reflection #2
In the case of languages, learning is an ongoing process.
Is there a moment where we can say we have learnt a language?
Motivating reflection #3
The statement "I have learnt" does not make sense.
The statement "I am learning" makes sense.
At least when learning over infinite spaces.
What usually is called "having learnt"
That the grammar / automaton is the smallest or the best (with respect to a score): a combinatorial characterisation
That some optimisation problem has been solved
That the "learning" algorithm has converged (EM)
What is not said
That, having solved some complex combinatorial question, we have an Occam / compression / MDL / Kolmogorov-complexity-style argument which gives us some guarantee with respect to the future.
Computational learning theory has such results.
Why should we bother and those working in statistical machine learning not?
Whether with numerical functions or with symbolic functions, we are all trying to do some sort of optimisation.
The difference is (perhaps) that numerical optimisation works much better than combinatorial optimisation!
[They actually do bother, only differently.]