Statistical Natural Language Processing

Çağrı Çöltekin / tʃaːɾˈɯ tʃœltecˈɪn /
ccoltekin@sfs.uni-tuebingen.de
University of Tübingen, Seminar für Sprachwissenschaft
Summer Semester 2018

Why study (statistical) NLP
• (Most of) you are studying in a ‘computational linguistics’ program
• Many practical applications
• Investigating basic questions in linguistics and cognitive science (and more)

Application examples
For profit (engineering):
• Machine translation
• Question answering
• Information retrieval
• Dialog systems
• Summarization
• Text classification
• Text mining/analytics
• Sentiment analysis
• Speech recognition/synthesis
• Automatic grading
• Forensic linguistics
For fun (research):
• Modeling cognitive/social behavior
• Authorship attribution
• Investigating language change through time and space
• (Automatic) corpus annotation for linguistic research

Layers of linguistic analysis
[Figure: the layers phonetics / phonology, morphology, syntax, semantics, discourse, traversed in two directions. Analysis: Speech Recognition, Morphological Analysis, Parsing, Semantic analysis, Discourse analysis. Generation: Sentence Planning, Sentence Generation, Word Generation, Speech Synthesis.]

Typical NLP pipeline
• Text processing / normalization
• Word/sentence tokenization → Tokens
• POS tagging → POS Tags
• Morphological analysis → Morphology
• Syntactic parsing → Syntax
• Semantic parsing
• Named entity recognition
• Coreference resolution

Annotation layers: example
Sentence: From the AP comes this story :
• From: ADP, case
• the: DET, Def, det
• AP: PROPN, Sing, obl
• comes: VERB, 3s,Pres, root
• this: DET, Sing,Dem, det
• story: NOUN, Sing, nsubj
• : PUNCT, punct

Do we need a pipeline?
• Most ”traditional” NLP architectures are based on a pipeline approach:
– tasks are done individually, results are passed to upper level
• Joint learning (e.g., POS tagging and syntax) often improves the results
• End-to-end learning (without intermediate layers) is another (recent/trending) approach

On the word ‘statistical’
But it must be recognized that the notion ’probability of a sentence’ is an entirely useless one, under any known interpretation of this term.
Chomsky (1968)
• Some linguistic traditions emphasize(d) use of ‘symbolic’, rule-based methods
• Some NLP systems are based on rule-based systems (esp. from 80’s 90’s)
• Virtually, all modern NLP systems include some sort of statistical component
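
The pipeline approach described under ‘Do we need a pipeline?’ can be sketched in a few lines. This is only a toy illustration, not the course's implementation: the function names (`tokenize`, `pos_tag`, `pipeline`) and the tiny lexicon `TOY_LEXICON` are hypothetical, and the tags are simply copied from the annotation example on the slides. A real tagger would be a statistical model, not a lookup table.

```python
import re

# Hypothetical toy lexicon; the tags are copied from the slide's
# annotation example. A real POS tagger would be a trained model.
TOY_LEXICON = {
    "from": "ADP", "the": "DET", "ap": "PROPN", "comes": "VERB",
    "this": "DET", "story": "NOUN", ":": "PUNCT",
}


def tokenize(text):
    """Stage 1: split raw text into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def pos_tag(tokens):
    """Stage 2: look each token up in the toy lexicon ('X' if unknown)."""
    return [(tok, TOY_LEXICON.get(tok.lower(), "X")) for tok in tokens]


def pipeline(text):
    """Run the stages in order, passing each result to the next level."""
    return pos_tag(tokenize(text))


tagged = pipeline("From the AP comes this story:")
for token, tag in tagged:
    print(token, tag)
```

Each stage only sees the output of the stage below it, so an error made by the tokenizer cannot be repaired by the tagger. That is the limitation that joint and end-to-end learning try to avoid.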

What is difficult with NLP?
• Combinatorial problems - computational complexity
• Ambiguity
• Data sparseness

NLP and computational complexity
• How many possible parses a sentence may have?
• How many ways can you align two (parallel) sentences?
• How to calculate probability of sentence based on the probabilities of words in it?
• Many similar questions we deal with have an exponential search space
• Naive approaches often are computationally intractable

NLP and ambiguity
fun with newspaper headlines
• FARMER BILL DIES IN HOUSE
• TEACHER STRIKES IDLE KIDS
• SQUAD HELPS DOG BITE VICTIM
• BAN ON NUDE DANCING ON GOVERNOR’S DESK
• PROSTITUTES APPEAL TO POPE
• KIDS MAKE NUTRITIOUS SNACKS
• DRUNK GETS NINE MONTHS IN VIOLIN CASE
• MINERS REFUSE TO WORK AFTER DEATH

More ambiguities
we do not recognize many of them at first read
• Time flies like an arrow; fruit flies like a banana.
• Outside of a dog, a book is a man’s best friend; inside it’s too hard to read.
• One morning I shot an elephant in my pajamas. How he got in my pajamas, I don’t know.
• Don’t eat the pizza with knife and fork; the one with anchovies is better.
• Hearing voices? Then you’re not alone!
• No parking on both sides.
• They are canning peas.
• My job was keeping him alive.
• We watched another fly.
• Double job pay.
• He fed her cat food.

Even more ambiguities
with pretty pictures
[Figure: cartoon from Cartoon Theories of Linguistics, SpecGram Vol CLIII, No 4, 2008. http://specgram.com/CLIII.4/school.gif]

Statistical methods and data sparsity
• Statistical methods (machine learning) are the best way we know to deal with ambiguities
• Even for rule-based approaches, a statistical disambiguation component is necessary
• Machine learning methods require (annotated) data
• But …

Languages are full of rare events
[Figure: word frequencies in a small corpus; x-axis: rank, y-axis: relative frequency; annotation: “a long tail follows …”]

What is in this course
• Quick introduction / refreshers on important prerequisites
• The computational linguist’s toolbox: basic methods and tools in NLP
• Some applications of NLP
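
To make the earlier question about the number of possible parses concrete, the sketch below counts the distinct binary bracketings of a sentence. It rests on a simplifying assumption that is not from the slides: every parse is an unlabeled binary tree over the words, with no grammar ruling any tree out. Under that assumption the count for n words is the Catalan number C(n-1). The function name `binary_parse_count` is hypothetical.

```python
from math import comb


def binary_parse_count(n_words):
    """Count unlabeled binary trees over n_words leaves: Catalan(n_words - 1).

    Assumption (for illustration only): any binary bracketing is a
    candidate parse, so this counts the unconstrained search space.
    """
    if n_words < 1:
        raise ValueError("a sentence needs at least one word")
    k = n_words - 1
    # Closed form of the k-th Catalan number; the division is exact.
    return comb(2 * k, k) // (k + 1)


counts = {n: binary_parse_count(n) for n in (2, 5, 10, 20)}
for n, c in counts.items():
    print(f"{n:2d} words: {c} binary bracketings")
```

A 20-word sentence already has more than a billion bracketings under this assumption, which is why enumerating candidate parses naively is intractable and why parsers rely on dynamic programming or pruning.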

