SLIDE 1

Statistical Natural Language Processing

Çağrı Çöltekin /tʃaːɾˈɯ tʃœltecˈɪn/ ccoltekin@sfs.uni-tuebingen.de

University of Tübingen Seminar für Sprachwissenschaft

Summer Semester 2019

SLIDE 2

Motivation Overview Practical matters Next

Why study (statistical) NLP

  • (Most of) you are studying in a ‘computational linguistics’ program
  • Many practical applications
  • Investigating basic questions in linguistics and cognitive science (and more)

Ç. Çöltekin, SfS / University of Tübingen Summer Semester 2019 1 / 29

SLIDE 3

Application examples

Just a few examples

For profit (engineering):

  • Machine translation
  • Question answering
  • Information retrieval
  • Dialog systems
  • Summarization
  • Text classification
  • Text mining/analytics
  • Sentiment analysis
  • Speech recognition and synthesis
  • Automatic grading
  • Forensic linguistics

For fun (research):

  • Modeling language processing and learning
  • Investigating language change through time and space
  • (Aiding) language documentation through text processing
  • (Automatic) corpus annotation for linguistic research
  • Stylometry, author identification

SLIDE 4

Layers of linguistic analysis

Layers: phonetics / phonology, morphology, syntax, semantics, discourse

Analysis: Speech Recognition, Morphological Analysis, Parsing, Semantic analysis, Discourse analysis

Generation: Sentence Planning, Sentence Generation, Word Generation, Speech Synthesis

SLIDE 8

Annotation layers: example

From the AP comes this story :

Tokens   POS Tags   Morphology   Syntax
From     ADP                     case
the      DET        Def          det
AP       PROPN      Sing         obl
comes    VERB       3s,Pres      root
this     DET        Sing,Dem     det
story    NOUN       Sing         nsubj
:        PUNCT                   punct
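The annotation layers of the example sentence can be held in one small data structure. The following is only a sketch: the `Token` class, the `layer` and `head_form` helpers, and the head indices are assumptions made for illustration (the slide shows the relation labels but not which token each one attaches to), and the pairing of morphological features with tokens is inferred from their order on the slide.

```python
from dataclasses import dataclass


@dataclass
class Token:
    form: str    # surface token
    pos: str     # POS tag from the slide
    morph: str   # morphological features from the slide ("" if none shown)
    head: int    # 1-based index of the head token, 0 for the root (assumed)
    deprel: str  # dependency relation label from the slide


SENTENCE = [
    Token("From", "ADP", "", 3, "case"),
    Token("the", "DET", "Def", 3, "det"),
    Token("AP", "PROPN", "Sing", 4, "obl"),
    Token("comes", "VERB", "3s,Pres", 0, "root"),
    Token("this", "DET", "Sing,Dem", 6, "det"),
    Token("story", "NOUN", "Sing", 4, "nsubj"),
    Token(":", "PUNCT", "", 4, "punct"),
]


def layer(sentence, name):
    """Return one annotation layer as a list, e.g. all POS tags."""
    return [getattr(token, name) for token in sentence]


def head_form(sentence, token):
    """Return the form of a token's head, or 'ROOT' for the root token."""
    return "ROOT" if token.head == 0 else sentence[token.head - 1].form


for token in SENTENCE:
    print(token.form, token.pos, token.morph, token.deprel,
          "<-", head_form(SENTENCE, token))
```

Keeping all layers on the same token objects means that a later layer (for example syntax) can look up the results of an earlier one (for example POS tags) directly.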

SLIDE 9

Typical NLP pipeline

  • Text processing / normalization
  • Word/sentence tokenization
  • POS tagging
  • Morphological analysis
  • Syntactic parsing
  • Semantic parsing
  • Named entity recognition
  • Coreference resolution
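The first two stages of the pipeline above can be sketched with the Python standard library alone. This is a deliberately naive illustration, not a recommended tokenizer: the function names and the regular expressions are assumptions made for this example, and real systems handle abbreviations, numbers, and clitics far more carefully.

```python
import re
import unicodedata


def normalize(text):
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text):
    """Naive sentence splitter: break at whitespace that follows '.', '!' or '?'."""
    return re.split(r"(?<=[.!?])\s+", text)


def tokenize(sentence):
    """Naive word tokenizer: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def pipeline(text):
    """Chain the stages: each stage consumes the output of the previous one."""
    return [tokenize(sentence) for sentence in split_sentences(normalize(text))]


print(pipeline("Time flies.   Fruit flies like a banana."))
```

Each stage only sees the output of the stage before it, so an error made early (for example a wrong sentence break) is passed on to every later stage.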

SLIDE 10

Do we need a pipeline?

  • Most “traditional” NLP architectures are based on a pipeline approach:
      – tasks are done individually, and the results are passed to the next level
  • Joint learning (e.g., POS tagging and syntax) often improves the results
  • End-to-end learning (without intermediate layers) is another (recent/trending) approach

SLIDE 11

On the word ‘statistical’

“But it must be recognized that the notion ‘probability of a sentence’ is an entirely useless one, under any known interpretation of this term.” (Chomsky 1968)

  • Some linguistic traditions emphasize(d) the use of ‘symbolic’, rule-based methods
  • Some NLP systems are rule-based (esp. from the ’80s and ’90s)
  • Virtually all modern NLP systems include some sort of statistical component

SLIDE 12

What is difficult with NLP?

  • Combinatorial problems - computational complexity
  • Ambiguity
  • Data sparseness

SLIDE 14

NLP and computational complexity

  • How many possible parses can a sentence have?
  • How many ways can you align two (parallel) sentences?
  • How do you calculate the probability of a sentence based on the probabilities of the words in it?
  • Many similar questions we deal with have an exponential search space
  • Naive approaches are often computationally intractable
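To get a feeling for the alignment question, the search space can be counted under two simplifying assumptions that are not stated on the slide: (a) both sentences have the same length and every word aligns to exactly one word on the other side, or (b) every target word aligns to exactly one source word or to an empty (NULL) word. The function names below are made up for this sketch.

```python
from math import factorial


def one_to_one_alignments(length):
    """Assumption (a): one-to-one alignments of two sentences of equal length."""
    return factorial(length)


def target_to_source_alignments(source_len, target_len):
    """Assumption (b): each target word picks one source word or NULL."""
    return (source_len + 1) ** target_len


for n in (5, 10, 20):
    print(n, one_to_one_alignments(n), target_to_source_alignments(n, n))
```

Under either assumption the count grows too fast to enumerate for ordinary sentence lengths, which is why naive enumeration is not an option.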

SLIDE 20

Combinatorial problems

A typical linguistic problem: parsing

How many different binary trees can span a sentence of N words?

[Figure: example binary trees over the five words a b c d e]

words   trees
2       1
3       2
4       5
5       14
10      4862
20      1 767 263 190
…       …
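The tree counts in the table are Catalan numbers: the number of binary trees over N words is the Catalan number with index N - 1. A short sketch (the function name is chosen for this example) reproduces the table:

```python
from math import comb


def count_binary_trees(n_words):
    """Number of binary trees with n_words leaves: Catalan number C(n_words - 1)."""
    n = n_words - 1
    return comb(2 * n, n) // (n + 1)


for n_words in (2, 3, 4, 5, 10, 20):
    print(n_words, count_binary_trees(n_words))
```

Since the count grows exponentially with sentence length, a parser cannot list all trees and score them one by one.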

SLIDE 21

NLP and ambiguity

fun with newspaper headlines

  • FARMER BILL DIES IN HOUSE
  • TEACHER STRIKES IDLE KIDS
  • SQUAD HELPS DOG BITE VICTIM
  • BAN ON NUDE DANCING ON GOVERNOR’S DESK
  • PROSTITUTES APPEAL TO POPE
  • KIDS MAKE NUTRITIOUS SNACKS
  • DRUNK GETS NINE MONTHS IN VIOLIN CASE
  • MINERS REFUSE TO WORK AFTER DEATH

SLIDE 34

More ambiguities

we do not recognize many of them at first read

  • Time flies like an arrow; fruit flies like a banana.
  • Outside of a dog, a book is a man’s best friend; inside it’s too hard to read.
  • One morning I shot an elephant in my pajamas. How he got in my pajamas, I don’t know.
  • Don’t eat the pizza with knife and fork; the one with anchovies is better.
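The elephant example is a case of attachment ambiguity: ‘in my pajamas’ can modify either the shooting or the elephant. A small CKY-style chart can count the analyses. Everything in this sketch is an assumption made for illustration: the toy grammar, the lexicon, and the function name are not from the course material, and the sentence is shortened to ‘I shot an elephant in my pajamas’.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),  # the PP modifies the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),  # the PP modifies the noun phrase
    ("PP", "P", "NP"),
]
LEXICON = {
    "I": ["NP"],
    "shot": ["V"],
    "an": ["Det"],
    "my": ["Det"],
    "elephant": ["N"],
    "pajamas": ["N"],
    "in": ["P"],
}


def count_parses(words, start="S"):
    """Count the distinct trees rooted in `start` that span all the words."""
    n = len(words)
    # chart[i, j, category] = number of trees of that category over words[i:j]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        for category in LEXICON[word]:
            chart[i, i + 1, category] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY_RULES:
                    chart[i, j, parent] += chart[i, k, left] * chart[k, j, right]
    return chart[0, n, start]


print(count_parses("I shot an elephant in my pajamas".split()))
```

The chart stores counts rather than trees, so the work stays polynomial in sentence length even when the number of analyses grows very quickly.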

SLIDE 35

Even more ambiguities

with pretty pictures

Cartoon Theories of Linguistics, SpecGram Vol CLIII, No 4, 2008. http://specgram.com/CLIII.4/school.gif

SLIDE 36

Statistical methods and data sparsity

  • Statistical methods (machine learning) are the best way we know to deal with ambiguities
  • Even for rule-based approaches, a statistical disambiguation component is often needed
  • Machine learning methods require (annotated) data
  • But …

SLIDE 37

Languages are full of rare events

word frequencies in a small corpus

[Figure: relative frequency (y-axis) plotted against frequency rank (x-axis); a long tail follows …]
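A rank/frequency table like the one behind this plot can be computed in a few lines. The sketch below uses a made-up toy sentence instead of the corpus used for the slide, and the function name is an assumption for this example.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, relative frequency) triples, most frequent first."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return [
        (rank, word, count / total)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


# Hypothetical toy corpus; any tokenized text can be used instead.
text = "the cat sat on the mat and the dog sat on the log"
table = rank_frequency(text.split())
for rank, word, rel_freq in table:
    print(rank, word, round(rel_freq, 3))

# Words that occur only once make up the long tail.
singletons = [word for word, count in Counter(text.split()).items() if count == 1]
print(len(singletons), "of", len(table), "word types occur only once")
```

Even in this tiny example most word types occur only once; in real corpora this long tail is what makes data sparse.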

SLIDE 38

What is difficult in CL?

and how can machine learning help?

  • Combinatorial problems - computational complexity

– Often we resort to approximate methods: the answer to ‘what is a good approximation?’ comes from ML.

  • Ambiguity

– The answer to ‘what is the best choice?’ comes from ML.

  • Data sparseness

– Even here, ML can help.

SLIDE 39

What is in this course

  • Quick introduction / refreshers on important prerequisites
  • The computational linguist’s toolbox: basic methods and tools in NLP
  • Some applications of NLP

SLIDE 40

What is in this course

Preliminaries

  • Linear algebra, some concepts from calculus
  • Probability theory
  • Information theory
  • Statistical inference
  • Some topics from machine learning
      – Regression & classification
      – Sequence learning (HMMs)
      – Neural networks and deep learning
      – Unsupervised learning

SLIDE 41

What is in this course

NLP Tools and techniques

  • Tokenization, normalization, segmentation
  • N-gram language models
  • Part of speech tagging
  • Statistical parsing
  • Distributed representations (of words, and other linguistic objects)

SLIDE 43

What is in this course

Applications

  • Text classification
      – sentiment analysis
      – language detection
      – authorship attribution
      – …

If time allows

  • Statistical machine translation
  • Named entity recognition
  • Text summarization
  • Dialog systems

SLIDE 44

What is not in this course

  • Cutting edge, latest methods & applications
  • In-depth treatment of particular topics
  • Introduction to terms / concepts from linguistics

SLIDE 45

Logistics

  • Lectures: Mon/Fri 12:15 at Hörsaal 0.02
  • Practical sessions: Wed 10:15 at Hörsaal 0.02
  • Office hours: Mon 14:00-15:00 (room 1.09), or by appointment (email ccoltekin@sfs.uni-tuebingen.de)
  • Course web page: http://sfs.uni-tuebingen.de/~ccoltekin/courses/snlp
  • We will use GitHub Classroom in this class (more on this soon)

SLIDE 46

Reading material

  • Daniel Jurafsky and James H. Martin (2009). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Second edition. Pearson Prentice Hall. isbn: 978-0-13-504196-3
      – Draft chapters of the third edition are available at http://web.stanford.edu/~jurafsky/slp3/
  • Trevor Hastie, Robert Tibshirani, and Jerome Friedman (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second edition. Springer series in statistics. Springer-Verlag New York. isbn: 9780387848587. url: http://web.stanford.edu/~hastie/ElemStatLearn/

  • Course notes for some lectures
  • Other online references

SLIDE 47

Grading / evaluation

  • 7 graded assignments (the best 6 count, 10 % each)
  • Final exam (40 %)
  • Attendance
      – 5 % (bonus) if you miss only one or two classes
      – you lose one bonus point for each additional class you miss
  • Up to 5 % additional bonus points for Easter eggs:
      – the first person to find an (intentional, trivial) mistake in the course material gets 1 %
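The grading scheme amounts to a small piece of arithmetic, sketched below. This is one possible reading, not an official formula: the function and parameter names are invented, scores are assumed to be fractions between 0 and 1, missing no class at all is assumed to earn the full attendance bonus, and the attendance bonus is assumed never to drop below zero.

```python
def final_score(assignments, exam, missed_classes=0, easter_eggs=0):
    """Return a percentage under the assumptions stated above.

    assignments: up to seven scores, each a fraction in [0, 1]
    exam: final exam score as a fraction in [0, 1]
    """
    best_six = sorted(assignments, reverse=True)[:6]
    assignment_part = 10 * sum(best_six)  # best 6 assignments, 10 % each
    exam_part = 40 * exam                 # final exam, 40 %
    # 5 % bonus for at most two missed classes, minus 1 point per extra miss.
    attendance_bonus = max(0, 5 - max(0, missed_classes - 2))
    easter_bonus = min(5, easter_eggs)    # 1 % per find, capped at 5 %
    return assignment_part + exam_part + attendance_bonus + easter_bonus


print(final_score([1, 1, 1, 1, 1, 1, 0], exam=0.5, missed_classes=4, easter_eggs=2))
```

With the bonuses included, the total can exceed 100 under this reading.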

SLIDE 48

Assignments

  • For distribution and submission of assignments, we will use GitHub Classroom
  • The amount of git usage required is low, but learning/using git well is strongly recommended
  • You are encouraged to work on the assignments in pairs, but you can work with the same person only once
  • Late assignments (up to one week) will be graded for up to half of the points indicated
  • The solutions will be discussed in the tutorial session one week after the deadline
  • Poll: a match-making system for working in random groups?

SLIDE 49

Assignment 0

  • Your first assignment is already posted on the web page
  • By completing assignment 0, you will
      – register for the course
      – have access to the non-public course material
      – practice the way later assignments will work
      – provide some data for future exercises
  • The repository created for assignment 0 is private, and can only be accessed by you and the instructors

SLIDE 50

Practical sessions

  • Tutors: Marko Lozajic & Maxim Korniyenko
  • You need to bring your own computer, make sure you have a working Python interpreter
  • You are encouraged to ask questions about the exercises during practical sessions

  • The solutions will be discussed during tutorial sessions
  • Poll: Python tutorial?

SLIDE 51

Further git/GitHub usage

  • Once you complete Assignment 0, you will be a member of the ‘organization’ snlp2019
  • You will get access to
      – private course material
      – assignment links
      – news and announcements
    through the repository at https://github.com/snlp2018/snlp2019
  • Make sure to watch this repository
  • You are also encouraged to use ‘issues’ in this repository as a place to discuss course topics and to ask questions about the material and assignments

SLIDE 52

Next

Mon  Mathematical preliminaries (some linear algebra and bits from calculus)
Fri  Probability theory

SLIDE 53

References / additional reading material

Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning. Springer. isbn: 978-0387-31073-2.

Chomsky, Noam (1968). “Quine’s empirical assumptions”. In: Synthese 19.1, pp. 53–68. doi: 10.1007/BF00568049.

Hastie, Trevor, Robert Tibshirani, and Jerome Friedman (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second edition. Springer series in statistics. Springer-Verlag New York. isbn: 9780387848587. url: http://web.stanford.edu/~hastie/ElemStatLearn/.

Jurafsky, Daniel and James H. Martin (2009). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Second edition. Pearson Prentice Hall. isbn: 978-0-13-504196-3.

Manning, Christopher D. and Hinrich Schütze (1999). Foundations of Statistical Natural Language Processing. MIT Press. isbn: 9780262133609.