
Statistical Natural Language Processing, Çağrı Çöltekin



  1. Statistical Natural Language Processing
     Çağrı Çöltekin /tʃaːɾˈɯ tʃœltecˈɪn/
     ccoltekin@sfs.uni-tuebingen.de
     University of Tübingen, Seminar für Sprachwissenschaft
     Summer Semester 2017

  2. Why study (statistical) NLP
     • (Most of) you are studying in a ‘computational linguistics’ program
     • Many practical applications
     • Investigating basic questions in linguistics and cognitive science (and more)
     Ç. Çöltekin, SfS / University of Tübingen, Summer Semester 2017

  3. Application examples
     For profit (engineering):
     • Machine translation
     • Question answering
     • Information retrieval
     • Dialog systems
     • Summarization
     • Text classification
     • Text mining/analytics
     • Sentiment analysis
     • Speech recognition/synthesis
     • Automatic grading
     • Forensic linguistics
     For fun (research):
     • Modeling cognitive/social behavior
     • Authorship attribution
     • Investigating language change through time and space
     • (Automatic) corpus annotation for linguistic research

  4. Layers of linguistic analysis
     Layers: phonetics / phonology, morphology, syntax, semantics, discourse
     Analysis: Speech Recognition, Morphological Analysis, Parsing, Semantic analysis, Discourse analysis
     Generation: Sentence Planning, Sentence Generation, Word Generation, Speech Synthesis


  8. Annotation layers: example
     Tokens:      From   the   AP      comes     this       story   :
     POS tags:    ADP    DET   PROPN   VERB      DET        NOUN    PUNCT
     Morphology:  -      Def   Sing    3s,Pres   Sing,Dem   Sing    -
     Syntax:      case   det   obl     root      det        nsubj   punct
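
One way to hold these annotation layers in a program is as one record per token, with one field per layer. The sketch below is only an illustration: the `Token` type, its field names, and the `layer` helper are hypothetical, and the pairing of morphological features with tokens is a reconstruction from the slide text. Dependency heads are left out because the slide text gives only the relation labels.

```python
from typing import List, NamedTuple, Optional


class Token(NamedTuple):
    """One token with its annotations (hypothetical field names)."""
    form: str              # the token itself
    pos: str               # part-of-speech tag
    morph: Optional[str]   # morphological features, None if none are listed
    deprel: str            # dependency relation label (head not recorded here)


# The example sentence from the slide, one record per token.
SENTENCE = [
    Token("From", "ADP", None, "case"),
    Token("the", "DET", "Def", "det"),
    Token("AP", "PROPN", "Sing", "obl"),
    Token("comes", "VERB", "3s,Pres", "root"),
    Token("this", "DET", "Sing,Dem", "det"),
    Token("story", "NOUN", "Sing", "nsubj"),
    Token(":", "PUNCT", None, "punct"),
]


def layer(sentence: List[Token], name: str) -> list:
    """Return a single annotation layer as a list aligned with the tokens."""
    return [getattr(tok, name) for tok in sentence]


if __name__ == "__main__":
    for name in Token._fields:
        print(name, layer(SENTENCE, name))
```

Keeping every layer in the same record makes it cheap to read one layer across the sentence or all layers for one token.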

  9. Typical NLP pipeline
     • Text processing / normalization
     • Word/sentence tokenization
     • POS tagging
     • Morphological analysis
     • Syntactic parsing
     • Semantic parsing
     • Named entity recognition
     • Coreference resolution
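
The first two stages of such a pipeline can be sketched with the Python standard library alone. This is a deliberately naive illustration, not the method used in the course: the function names are hypothetical, and the regular expressions ignore abbreviations, numbers, clitics and many other cases that real tokenizers handle.

```python
from __future__ import annotations

import re
import unicodedata


def normalize(text: str) -> str:
    """Unicode-normalize the text and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def tokenize(sentence: str) -> list[str]:
    """Naive word tokenizer: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def pipeline(text: str) -> list[list[str]]:
    """Run the stages in order; each stage consumes the previous stage's output."""
    return [tokenize(s) for s in split_sentences(normalize(text))]


if __name__ == "__main__":
    for tokens in pipeline("Time   flies like an arrow.\nHe fed her cat food."):
        print(tokens)
```

Later stages such as POS tagging or parsing would take the token lists as input in the same way, which is what makes the architecture a pipeline.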

  10. Do we need a pipeline?
      • Most “traditional” NLP architectures are based on a pipeline approach: tasks are done individually, results are passed to the upper level
      • Joint learning (e.g., POS tagging and syntax) often improves the results
      • End-to-end learning (without intermediate layers) is another (recent/trending) approach

  11. On the word ‘statistical’
      “But it must be recognized that the notion ‘probability of a sentence’ is an entirely useless one, under any known interpretation of this term.” (Chomsky, 1968)
      • Some linguistic traditions emphasize(d) the use of ‘symbolic’, rule-based methods
      • Some NLP systems are rule-based (esp. those from the 80s and 90s)
      • Virtually all modern NLP systems include some sort of statistical component

  12. What is difficult with NLP?
      • Combinatorial problems - computational complexity
      • Ambiguity
      • Data sparseness

  13. NLP and computational complexity
      • How many possible parses may a sentence have?
      • How many ways can you align two (parallel) sentences?
      • How do we calculate the probability of a sentence based on the probabilities of the words in it?
      • Many similar questions we deal with have an exponential search space
      • Naive approaches are often computationally intractable
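
A rough feel for these search spaces can be had with a few lines of Python. The sketch rests on simplifying assumptions that are not in the slides: parses are counted as unlabeled binary bracketings (Catalan numbers), alignments are restricted to one-to-one alignments between sentences of equal length, and the sentence probability uses a unigram model that treats words as independent. The function names and the toy probabilities in the usage note are hypothetical.

```python
import math
from typing import Dict, Sequence


def num_binary_parses(n_words: int) -> int:
    """Number of unlabeled binary bracketings of n words: the Catalan number C(n-1)."""
    if n_words < 1:
        raise ValueError("need at least one word")
    k = n_words - 1
    return math.comb(2 * k, k) // (k + 1)


def num_one_to_one_alignments(n_words: int) -> int:
    """Number of one-to-one alignments between two sentences of n words each: n!."""
    return math.factorial(n_words)


def sentence_log_prob(tokens: Sequence[str], word_probs: Dict[str, float]) -> float:
    """Log-probability under a unigram model: the sum of log P(w) over the tokens.

    Summing logs instead of multiplying probabilities avoids numeric underflow
    on long sentences.
    """
    return sum(math.log(word_probs[w]) for w in tokens)


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, num_binary_parses(n), num_one_to_one_alignments(n))
```

For example, `num_binary_parses(10)` is 4862 and `num_binary_parses(20)` is 1767263190, so enumerating every bracketing stops being practical well before sentences reach ordinary newspaper length. With made-up probabilities, `sentence_log_prob(["a", "b"], {"a": 0.5, "b": 0.25})` equals `math.log(0.125)`.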


  15. NLP and ambiguity: fun with newspaper headlines
      • FARMER BILL DIES IN HOUSE
      • TEACHER STRIKES IDLE KIDS
      • SQUAD HELPS DOG BITE VICTIM
      • BAN ON NUDE DANCING ON GOVERNOR’S DESK
      • PROSTITUTES APPEAL TO POPE
      • KIDS MAKE NUTRITIOUS SNACKS
      • DRUNK GETS NINE MONTHS IN VIOLIN CASE
      • MINERS REFUSE TO WORK AFTER DEATH


  23. More ambiguities
      We do not recognize many of them at first read.
      • Time flies like an arrow
      • Outside of a dog, a book is a man’s best friend
      • One morning I shot an elephant in my pajamas
      • Don’t eat the pizza with knife and fork
      • Hearing voices? Then you’re not alone!
      • No parking on both sides.
      • They are canning peas.
      • My job was keeping him alive.
      • We watched another fly.
      • Double job pay.
      • He fed her cat food.
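
Structural ambiguity of this kind can be made concrete by counting parses. The sketch below counts the derivations of “He fed her cat food” with a CKY-style chart. The grammar and lexicon are a hypothetical toy written for this one sentence, and CKY is chosen here for illustration; the slides do not name a parsing method.

```python
from collections import defaultdict
from typing import Sequence

# Hypothetical toy lexicon: each word maps to the categories it may have.
LEXICON = {
    "he": {"NP"},
    "fed": {"V"},
    "her": {"NP", "Det"},   # pronoun ("fed her") or possessive ("her cat")
    "cat": {"N", "NP"},
    "food": {"N", "NP"},
}

# Hypothetical toy grammar in binary form: (parent, left child, right child).
RULES = [
    ("S", "NP", "VP"),
    ("VP", "VX", "NP"),   # verb + first object, then second object
    ("VX", "V", "NP"),
    ("NP", "Det", "N"),   # "her cat"
    ("NP", "N", "N"),     # "cat food"
]


def count_parses(tokens: Sequence[str], start: str = "S") -> int:
    """Count derivations of the whole sentence from `start` with a CKY chart."""
    n = len(tokens)
    # chart[(i, j)][symbol] = number of ways symbol derives tokens[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, tok in enumerate(tokens):
        for sym in LEXICON.get(tok.lower(), ()):
            chart[(i, i + 1)][sym] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):          # split point
                for parent, left, right in RULES:
                    n_left = chart[(i, k)].get(left, 0)
                    n_right = chart[(k, j)].get(right, 0)
                    if n_left and n_right:
                        chart[(i, j)][parent] += n_left * n_right
    return chart[(0, n)].get(start, 0)


if __name__ == "__main__":
    print(count_parses("He fed her cat food".split()))
```

Under this toy grammar the sentence has two parses, matching the two readings: feeding cat food to her, and feeding food to her cat. The chart stores counts per span, so the work grows polynomially with sentence length even when the number of parses does not.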
