Deep learning for natural language processing
Introduction to natural language processing
Benoit Favre <benoit.favre@univ-mrs.fr>
Aix-Marseille Université, LIF/CNRS
20 Feb 2017
Benoit Favre (AMU) DL4NLP: NLP intro 20 Feb 2017 1 / 24
▶ Class: intro to natural language processing ▶ Class: quick primer on deep learning ▶ Tutorial: neural networks with Keras
▶ Class: word embeddings ▶ Tutorial: word embeddings
▶ Class: convolutional neural networks, recurrent neural networks ▶ Tutorial: sentiment analysis
▶ Class: advanced neural network architectures ▶ Tutorial: language modeling
▶ Tutorial: Image and text representations ▶ Test
[Figure: dialog system pipeline from question to answer. Modules: automatic transcription, syntactic analysis, semantic analysis, dialog manager, syntactic generation, lexical generation, speech synthesis. Intermediate representations: words; syntactic tree; concepts, relations; logical representation; primitive syntax; words, prosody.]
▶ I don’t know! – I don’t - no!
▶ I live by the bank (river bank or financial institution)
▶ I met an Indian (from India or native American) ▶ I love American wine (from USA or from the Americas)
▶ He looks at the man with a telescope ▶ He gave her cat food
▶ She is gone. Who?
▶ Birth date: 08/01/05
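The date example can be made concrete with a small sketch. The three conventions below are assumptions chosen for illustration (the slide does not say which readings are intended), and the two-digit year is expanded using Python's own `%y` rule.

```python
from datetime import datetime

raw = "08/01/05"

# Three plausible conventions for the same string (assumed, not exhaustive).
formats = {
    "US (month/day/year)": "%m/%d/%y",
    "European (day/month/year)": "%d/%m/%y",
    "ISO-like (year/month/day)": "%y/%m/%d",
}

# Parse the same surface form under each convention.
readings = {
    name: datetime.strptime(raw, fmt).date() for name, fmt in formats.items()
}

for name, date in readings.items():
    print(f"{name}: {date.isoformat()}")
```

Each convention yields a different calendar date, so the string alone does not determine the birth date; only context (country, document type) can.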
▶ Word / sentence segmentation ▶ Morphological analysis ▶ Part-of-speech tagging ▶ Syntactic chunking ▶ Syntactic parsing
▶ Word sense disambiguation ▶ Semantic role labeling ▶ Logical form creation
▶ Coreference resolution ▶ Discourse parsing
▶ 男孩喜歡冰淇淋。→ 男孩 (the boy) 喜歡 (likes) 冰淇淋 (ice cream) 。
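One simple way to segment unspaced text is greedy forward maximum matching against a word list. This is a minimal sketch of that baseline, not necessarily the method intended in the course; the three-word `lexicon` is a toy assumption built from the example above.

```python
def max_match(text, lexicon):
    """Greedy forward maximum matching: take the longest known word at each position."""
    max_len = max(len(word) for word in lexicon)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Toy lexicon (assumption): only the words of the example sentence.
lexicon = {"男孩", "喜歡", "冰淇淋"}
tokens = max_match("男孩喜歡冰淇淋。", lexicon)
print(tokens)
```

The greedy strategy has no way to recover when the longest match is the wrong one, which is one reason segmentation is usually treated as a learned task.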
▶ flower, flower+s, floppy, flopp+ies
▶ parse, pars+ing, pars+ed
▶ geo+caching ▶ re+do, un+do, over+do ▶ pre+fix, suf+fix ▶ geo+local+ization
▶ pronouns are glued to the verb (Arabic, Spanish...)
▶ Turkish, Finnish
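The stem+suffix notation used above can be reproduced with a naive suffix-stripping rule. The suffix list and the minimum stem length are assumptions made for this sketch; real morphological analyzers rely on lexicons and richer rules.

```python
# Suffixes are checked in this order (illustrative list, not a real grammar).
SUFFIXES = ["ies", "ing", "ed", "s"]


def split_suffix(word, min_stem=3):
    """Return 'stem+suffix' for the first matching suffix, else the word unchanged."""
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and len(stem) >= min_stem:
            return f"{stem}+{suffix}"
    return word


words = ["flowers", "floppies", "parsing", "parsed", "parse"]
analyses = [split_suffix(w) for w in words]
print(analyses)
```

Such rules over-split quickly: the same function turns "glass" into "glas+s", which shows why surface patterns alone are not enough.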
▶ flies: verb or noun? ▶ like: preposition or verb?
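To see how quickly such ambiguities multiply, the sketch below enumerates every tag sequence allowed by a small lexicon. The sentence, the lexicon and the tag names are hypothetical, chosen only because they contain "flies" and "like".

```python
from itertools import product

# Hypothetical mini-lexicon: each word maps to the tags it may take.
LEXICON = {
    "time": ["NOUN"],
    "flies": ["NOUN", "VERB"],
    "like": ["ADP", "VERB"],
    "an": ["DET"],
    "arrow": ["NOUN"],
}

sentence = "time flies like an arrow".split()

# Cartesian product of the per-word tag options gives all candidate taggings.
candidates = list(product(*(LEXICON[word] for word in sentence)))

for tags in candidates:
    print(" ".join(f"{w}/{t}" for w, t in zip(sentence, tags)))
```

Two ambiguous words already give four candidate taggings; a tagger has to use context to pick one of them.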
Source: http://www.nltk.org/book/tree_images/ch08-tree-1.png
Source: http://www.nltk.org/images/depgraph0.png
▶ “Alexander Graham Bell (March 3, 1847 –August 2, 1922)[4] was a
▶ Can be used to infer new
▶ Automatic translation during the cold war
▶ SHRDLU “place the red box next to the blue circle”, ELIZA “the therapist”
▶ Prolog (logic-based language for NLP), dictionaries of semantic frames
▶ Transition “introspection” → “corpus” ▶ Evaluation campaigns ▶ Neural networks are “forgotten”
▶ Machine learning ▶ Applications: speech recognition, machine translation
▶ Deep learning
▶ Email ▶ Forums ▶ Chats ▶ Speech recordings ▶ Video
▶ Text → topic ▶ Sentence → parse tree ▶ Review → sentiment
▶ Raw text, audio... ▶ Sentences, contextualized words ▶ Output of another system
▶ n classes (ex: topics) ▶ Structure (ex: syntactic parse) ▶ Novel text (ex: translation, summary) ▶ Commands for a system (ex: chatbots)
▶ output = f(input) ▶ Deterministic vs random (evaluations need to be repeatable) ▶ Parametrisable: output = f(input, parameters)
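The view of a system as output = f(input, parameters) can be sketched with a tiny word-weight classifier. The labels, the weights and the helper `f_random` are all hypothetical; in practice the parameters would be learned from data rather than set by hand.

```python
import random


def f(text, parameters):
    """Score each class as the sum of its word weights and return the best one."""
    words = text.lower().split()
    scores = {
        label: sum(weights.get(word, 0.0) for word in words)
        for label, weights in parameters.items()
    }
    return max(scores, key=scores.get)


# Hypothetical hand-set parameters, standing in for learned ones.
parameters = {
    "positive": {"great": 1.0, "love": 1.0},
    "negative": {"boring": 1.0, "hate": 1.0},
}


def f_random(text, labels, seed):
    """A random system becomes repeatable once its seed is fixed."""
    rng = random.Random(seed)
    return rng.choice(labels)


print(f("I love this great film", parameters))
print(f_random("any text", ["positive", "negative"], seed=0))
```

Changing `parameters` changes the behaviour of `f` without touching its code, and fixing the seed makes the random variant give the same answer on every run, which is what repeatable evaluation requires.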
▶ Requires a lot of feature engineering ▶ Is affected by cascading errors ▶ Hard to account for unlabeled data ▶ Limited architectures and overly complex (ex: speech recognition...) ▶ The curse of annotated data (you need linguists)
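The cascading-error problem can be illustrated with a back-of-the-envelope calculation. It assumes that each stage is correct independently of the others and that one wrong stage ruins the final output; the stage names come from the task list earlier in the deck, while the 0.95 accuracies are made-up values for illustration, not measured results.

```python
def pipeline_accuracy(stage_accuracies):
    """Probability that every stage is right, assuming independent errors."""
    accuracy = 1.0
    for stage_accuracy in stage_accuracies:
        accuracy *= stage_accuracy
    return accuracy


# Made-up per-stage accuracies (assumption for illustration only).
stages = {
    "segmentation": 0.95,
    "morphological analysis": 0.95,
    "part-of-speech tagging": 0.95,
    "syntactic parsing": 0.95,
    "semantic role labeling": 0.95,
}

overall = pipeline_accuracy(stages.values())
print(f"{len(stages)} stages at 0.95 each: {overall:.3f} overall")
```

Under these assumptions, five stages that each look reliable in isolation leave the full pipeline correct only about 77% of the time.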
▶ Feature extraction is learned within the model ▶ End-to-end training ▶ Much more flexibility in model architectures ▶ Can use tons of data ▶ A step towards AI?