Machine Learning for NLP: Introduction session (Aurélie Herbelot, 2020)

SLIDE 1

Machine Learning for NLP

Introduction session

Aurélie Herbelot 2020

Centre for Mind/Brain Sciences, University of Trento

SLIDE 2

Material, contact

All material will be posted at: http://aurelieherbelot.net/teaching/ml-for-nlp/

Any question, worry, complaint... write to: aurelie.herbelot@unitn.it

SLIDE 3

Course overview

SLIDE 4

Goals

  1. Understand core machine learning algorithms used in NLP:
     1.1 for science;
     1.2 for applications.
  2. Be able to read and criticise related literature.
  3. Acquire some fundamental computational skills to run ML code and interpret its output.

SLIDE 5

Session structure

  • An introductory week, followed by 9 topics, each associated with 3 classes:
    1. A lecture presenting the topic for that week.
    2. A reading group on one or two papers using the presented algorithm(s) / metric(s).
    3. A practical with a task and/or some code to play with. All code will be provided on GitHub. Some practicals will focus on linguistic questions, some on applications.

SLIDE 6

Syllabus

SLIDE 7

What for?

SLIDE 8

NLP for science

SLIDE 9

Computational tools for language sciences

Why?

SLIDE 10

Simulation of star formation.

http://burro.astr.cwru.edu/models/sfrmm009.gif

SLIDE 11

Right: Prusinkiewicz (2004), modelling plant growth with a grammar.

SLIDE 12

Current work in the CALM group: simulate the tension between linguistic creativity and communication needs. What shape can lexical meaning take without breaking alignment between speakers?

SLIDE 13

Modelling

  • A model is an approximation of reality.
  • Aim: observe behaviour of the model and ensure it does not produce states incompatible with reality.
  • Computational models and their implementation (simulations) allow for fast counter-checking of hypotheses.

SLIDE 14

Issues

  • Assumptions: a model rests on a number of simplifying assumptions. Q: What might be simplified in a model of language?
  • Evaluation: a model should be compatible with past / future observations (corpora, linguistic judgements, behavioural data, brain states...). Q: In which ways might linguistic data be biased / flawed?
  • Access to reality: it took 50 years to (partially) confirm the existence of the Higgs boson. Most of what makes language is invisible, including parts we take for granted. Q: Which ones?
  • Replicability: will others be able to reproduce your experiment? A: why shouldn’t they?

SLIDE 25

Example question

Can language be learned from scratch? Or does it need innate mechanisms?

Let’s build a hypothesis.
Then let’s choose a model to match the hypothesis.
And data to train the model under various conditions.
Finally, test the hypothesis.

SLIDE 26

A real model

Actual CHILDES data

“What kind of food did you buy?”

buy
  kind
    what
    of
      food
  did
  you

Catenae:
kind -[DET]→ what
kind -[PREP]→ of
of -[POBJ]→ food
of -[POBJ]→ NOUN
...

RNN-generated data

“What kind of little girl?”

kind
  what
  of
    girl
      little

Catenae:
kind -[DET]→ what
kind -[PREP]→ of
of -[POBJ]→ girl
of -[POBJ]→ NOUN
...

Given a) some training data and b) some generated output from an RNN, processed with the same formalism, we can investigate which structures the RNN can reproduce, and compare their distribution to the original data. (Ongoing work by Ludovica Pannitto.)
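
The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the ongoing work: the parses are written by hand rather than produced by a parser, the tuple encoding and the function name edge_catenae are hypothetical, and only single head-dependent pairs are extracted (with either side optionally abstracted to its part of speech). The relations DET, PREP and POBJ come from the slide; the other relation labels and all POS tags are assumptions. Arrows are printed in ASCII.

```python
from collections import Counter

# Each token: (word, POS tag, index of its head or None for the root, relation).
# DET, PREP and POBJ are taken from the slide; every other relation label
# and all POS tags are assumptions made for this illustration.
CHILDES = [
    ("what", "DET", 1, "DET"),
    ("kind", "NOUN", 6, "DOBJ"),
    ("of", "ADP", 1, "PREP"),
    ("food", "NOUN", 2, "POBJ"),
    ("did", "AUX", 6, "AUX"),
    ("you", "PRON", 6, "NSUBJ"),
    ("buy", "VERB", None, "ROOT"),
]
RNN = [
    ("what", "DET", 1, "DET"),
    ("kind", "NOUN", None, "ROOT"),
    ("of", "ADP", 1, "PREP"),
    ("little", "ADJ", 4, "AMOD"),
    ("girl", "NOUN", 2, "POBJ"),
]


def edge_catenae(parse):
    """Return head-dependent pairs: fully lexical, and with one side as a POS tag."""
    found = []
    for word, pos, head, rel in parse:
        if head is None:  # the root has no incoming edge
            continue
        head_word, head_pos = parse[head][0], parse[head][1]
        found.append(f"{head_word} -[{rel}]-> {word}")
        found.append(f"{head_word} -[{rel}]-> {pos}")
        found.append(f"{head_pos} -[{rel}]-> {word}")
    return found


childes_counts = Counter(edge_catenae(CHILDES))
rnn_counts = Counter(edge_catenae(RNN))

# Structures the generated sentence shares with the training sentence,
# and structures that only appear in the generated sentence.
shared = sorted(set(childes_counts) & set(rnn_counts))
only_rnn = sorted(set(rnn_counts) - set(childes_counts))

for catena in shared:
    print(catena)
```

On real data, the two counters would be filled from whole corpora, and their distributions compared rather than just their overlap.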

SLIDE 27

A real model

Actual CHILDES data

“Are you teasing me?”

teasing
  are
  you
  me

Catenae:
VERB -[AUX]→ are
...

“It’s a steering wheel.”

’s
  it
  wheel
    a
    steering

Catenae:
’s -[NSUBJ]→ it
NOUN -[DET]→ a
...

RNN-generated data

“It’s a jelly graving me.”

graving
  ’s
    it
  jelly
    a
  me

Catenae:
VERB -[AUX]→ are
NOUN -[DET]→ a
’s -[NSUBJ]→ it
graving -[DOBJ]→ me
graving -[ADVMOD]→ jelly
...

We can do this check even when the RNN produces partially nonsensical sentences.

SLIDE 28

NLP for applications

SLIDE 29

Computational tools for technology

How?

SLIDE 30

Software development

  • Requirement analysis: what should the software do? With how many resources?
  • Design: modelling (as in science), choice of language / hardware, etc.
  • Implementation: programming and documenting.
  • Evaluation: check requirements are satisfied, performance is acceptable, etc.

SLIDE 31

Issues

  • Model: as in science, some assumptions must be made and constraints respected (e.g. hardware / internet access available to end user).
  • Implementation: for whom are you writing? Will the code be open? How do you document? Do you need training to be reproducible?
  • Ethics: should you be doing this? If your app is 99% accurate, does the 1% matter? Where did the data come from? Etc.

SLIDE 41

An example task

Build a Web search engine.

The system’s architecture must be compatible with developer / user resources.
Representations must be adequate for the task.
The system must be evaluated.
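
To make the last two requirements concrete, here is a minimal sketch of one possible setup, not the design used in the course or in PeARS: pages and queries are represented as bags of word counts, ranked by cosine similarity, and evaluated with precision at rank 1 on a tiny gold standard. The page texts, queries and all function names are made up for the illustration.

```python
import math
from collections import Counter


def vectorise(text):
    """Represent a text as a bag of lowercase word counts."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(query, pages):
    """Return page titles ranked by similarity to the query, best first."""
    q = vectorise(query)
    scored = [(cosine(q, vectorise(text)), title) for title, text in pages.items()]
    return [title for score, title in sorted(scored, reverse=True) if score > 0]


def precision_at_1(gold, pages):
    """Fraction of queries whose top-ranked page is the expected one."""
    hits = sum(search(query, pages)[:1] == [title] for query, title in gold.items())
    return hits / len(gold)


# Toy, made-up collection and gold standard (query -> expected page).
PAGES = {
    "Higgs boson": "the higgs boson is an elementary particle in physics",
    "Star formation": "stars form when dense regions of molecular clouds collapse",
    "Plant growth": "plant growth can be modelled with a grammar",
}
GOLD = {
    "how do stars form": "Star formation",
    "grammar for plant growth": "Plant growth",
}

print(search("how do stars form", PAGES))
print(precision_at_1(GOLD, PAGES))
```

Word counts are cheap to compute and store, which matters for the resource constraint, but they cannot match a query to a page that uses different words for the same idea. This is the kind of trade-off the representation choice has to address.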

SLIDE 42

The PeARS search engine

More information on https://pearsearch.org/

SLIDE 43

What to expect in the course

SLIDE 44
  • ML for NLP: what’s it good for?
  • Basic principles of statistical language learning.
  • Practical: how unique is each person’s language? Can we model inter-speaker differences?

SLIDE 45

Q: in what sense can data be good or bad?

  • How to generate (good) data for training and evaluation.
  • Reading: do people agree on what the world is like?
  • Practical: produce data for your search engine. Pre-process Wikipedia, extract specific category pages, design page/query representations.

SLIDE 46

Q: how do machines build models?

  • Introduction to supervised learning and regression techniques.
  • Reading: predicting words from brain images (‘mind-reading’).
  • Practical: to what extent do people’s conceptual spaces align across languages?

SLIDE 47

Q: why does complexity matter?

  • Clustering and dimensionality reduction.
  • Reading: Locality-sensitive hashing: how do fruit flies implement unsupervised learning?
  • Practical: Document clustering for the backend of your search engine. First search attempts.

SLIDE 48

Q: what kind of decisions are binary?

  • Introduction to Support Vector Machines.
  • Reading: Detection of semantic errors in the prose of non-English speakers with SVMs.
  • Practical: Is there a correlation between a person’s writing and the onset of certain medical conditions?

SLIDE 49

Q: what is a function?

  • Basics of NNs and general AI concepts.
  • Reading: a historical view on NNs and linguistics.
  • Practical: Implement a Neural Net from scratch. Introduction to Deep Learning frameworks.

SLIDE 50

Q: what is the difference between speaking and seeing?

  • Sequence learning with neural networks.
  • Reading: comparing seq-to-seq and attention-based language models.
  • Practical: Using RNNs for document representation. Test a different representation for your search engine.

SLIDE 51

Q: do you read the news?

  • Special week. Important for the exam!
  • Take a neural network into your caring hands.
  • Brag to everybody about your protégé.

SLIDE 52

Q: how did you learn to ride a bike?

  • Basics of Reinforcement Learning.
  • Reading: Multi-agent emergence of natural language.
  • Practical: how do we learn linguistic politeness in an uncertain social environment? (a.k.a. asking for a coffee at the Rovereto train station.)

SLIDE 53

Q: will you be good?

  • Ethical issues with ML. Bias in distributional vectors.
  • Reading: Literature on bias and on de-biasing techniques.
  • Practical: Is your search engine ethically sound?

SLIDE 54

The exam

  • A list of topics will be posted on the website.
  • Bring your adopted network to the exam!
