Machine Learning for NLP: Introduction session (Aurélie Herbelot, 2020)

Machine Learning for NLP
Introduction session
Aurélie Herbelot, 2020
Centre for Mind/Brain Sciences, University of Trento

Material, contact
All material will be posted at: http://aurelieherbelot.net/teaching/ml-for-nlp/
Any question, worry, complaint... write to: aurelie.herbelot@unitn.it

Course overview

Goals
1. Understand core machine learning algorithms used in NLP:
   1.1 for science;
   1.2 for applications.
2. Be able to read and criticise related literature.
3. Acquire some fundamental computational skills to run ML code and interpret its output.

Session structure
• An introductory week, followed by 9 topics, each associated with 3 classes:
  1. A lecture presenting the topic for that week.
  2. A reading group on one or two papers using the presented algorithm(s) / metric(s).
  3. A practical with a task and/or some code to play with. All code will be provided on GitHub.
• Some practicals will focus on linguistic questions, some on applications.

Syllabus

What for?

NLP for science

Computational tools for language sciences
Why?

Simulation of star formation. http://burro.astr.cwru.edu/models/sfrmm009.gif

Prusinkiewicz (2004), modelling plant growth with a grammar.
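
To make "modelling growth with a grammar" concrete, here is a minimal sketch of a Lindenmayer system (L-system), the kind of parallel rewriting grammar associated with this line of work. The rule set below is Lindenmayer's textbook "algae" example, chosen only for illustration; it is not taken from the 2004 paper, and the function name `rewrite` is hypothetical.

```python
from __future__ import annotations


def rewrite(axiom: str, rules: dict[str, str], steps: int) -> str:
    """Apply the production rules to every symbol in parallel, `steps` times."""
    state = axiom
    for _ in range(steps):
        # Symbols that have no rule are copied unchanged.
        state = "".join(rules.get(symbol, symbol) for symbol in state)
    return state


# Textbook "algae" system (illustrative assumption, not from the slides).
ALGAE_RULES = {"A": "AB", "B": "A"}

if __name__ == "__main__":
    for n in range(5):
        print(n, rewrite("A", ALGAE_RULES, n))
```

Each generation is derived from the previous one by rewriting all symbols at once, which is what lets such a small grammar stand in for a growth process. For this particular rule set, the string lengths follow the Fibonacci sequence.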

Current work in the CALM group: simulate the tension between linguistic creativity and communication needs. What shape can lexical meaning take without breaking alignment between speakers?

Modelling
• A model is an approximation of reality.
• Aim: observe behaviour of the model and ensure it does not produce states incompatible with reality.
• Computational models and their implementation (simulations) allow for fast counter-checking of hypotheses.

Issues
• Assumptions: a model rests on a number of simplifying assumptions. Q: What might be simplified in a model of language?
• Evaluation: a model should be compatible with past / future observations (corpora, linguistic judgements, behavioural data, brain states...). Q: In which ways might linguistic data be biased / flawed?
• Access to reality: it took 50 years to (partially) confirm the existence of the Higgs boson. Most of what makes language is invisible, including parts we take for granted. Q: Which ones?
• Replicability: will others be able to reproduce your experiment? A: why shouldn’t they?

Example question
Can language be learned from scratch? Or does it need innate mechanisms?
Let’s build a hypothesis.
Then let’s choose a model to match the hypothesis.
And data to train the model under various conditions.
Finally, test the hypothesis.

A real model
Actual CHILDES data: “What kind of food did you buy?”
Catenae: kind -[DET]→ what; kind -[PREP]→ of; of -[POBJ]→ food; of -[POBJ]→ NOUN; ...
RNN-generated data: “What kind of little girl?”
Catenae: kind -[DET]→ what; kind -[PREP]→ of; of -[POBJ]→ girl; of -[POBJ]→ NOUN; ...
[Figure: dependency trees for the two sentences.]
Given a) some training data and b) some generated output from an RNN, processed with the same formalism, we can investigate which structures the RNN can reproduce, and compare their distribution to the original data. (Ongoing work by Ludovica Pannitto.)

A real model
Actual CHILDES data: “Are you teasing me?”
Catenae: VERB -[AUX]→ are; ...
Actual CHILDES data: “It’s a steering wheel.”
Catenae: ’s -[NSUBJ]→ it; NOUN -[DET]→ a; ...
RNN-generated data: “It’s a jelly graving me.”
Catenae: VERB -[AUX]→ are; NOUN -[DET]→ a; ’s -[NSUBJ]→ it; graving -[DOBJ]→ me; graving -[ADVMOD]→ jelly; ...
[Figure: dependency trees for the three sentences.]
We can do this check even when the RNN produces partially nonsensical sentences.
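
The comparison described on these two slides (which structures are reproduced, and how their distribution differs from the original data) can be sketched roughly as follows. This is only an illustration under stated assumptions: catenae are treated as plain strings, the two toy lists and their counts are invented, and total variation distance is one possible choice of measure, not necessarily the one used in the work cited.

```python
from collections import Counter


def relative_frequencies(structures):
    """Map each structure to its share of all observed structures."""
    counts = Counter(structures)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def compare(original, generated):
    """Return (reproduced, novel, distance) for two lists of structures."""
    p = relative_frequencies(original)
    q = relative_frequencies(generated)
    reproduced = sorted(set(p) & set(q))  # found in both data sets
    novel = sorted(set(q) - set(p))       # found only in the generated data
    # Total variation distance: 0 = identical distributions, 1 = disjoint.
    distance = 0.5 * sum(
        abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in set(p) | set(q)
    )
    return reproduced, novel, distance


# Invented toy data, loosely modelled on the catenae shown on the slides.
CHILDES = ["kind-[DET]->what", "kind-[PREP]->of",
           "of-[POBJ]->NOUN", "of-[POBJ]->NOUN"]
RNN = ["kind-[DET]->what", "of-[POBJ]->NOUN",
       "graving-[DOBJ]->me", "graving-[DOBJ]->me"]

if __name__ == "__main__":
    reproduced, novel, distance = compare(CHILDES, RNN)
    print("reproduced:", reproduced)
    print("novel:", novel)
    print("distance:", distance)
```

Because both data sets are reduced to the same kind of structure before counting, the check works even when individual generated sentences are nonsensical.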

NLP for applications

Computational tools for technology
How?

Software development
• Requirement analysis: what should the software do, and with what resources?
• Design: modelling (as in science), choice of language / hardware, etc.
• Implementation: programming and documenting.
• Evaluation: check that requirements are satisfied, performance is acceptable, etc.

Issues
• Model: as in science, some assumptions must be made and constraints respected (e.g. hardware / internet access available to end user).
• Implementation: for whom are you writing? Will the code be open? How do you document? Do you need training to be reproducible?
• Ethics: should you be doing this? If your app is 99% accurate, does the 1% matter? Where did the data come from? Etc.

An example task
Build a Web search engine.
The system’s architecture must be compatible with developer / user resources.
Representations must be adequate for the task.
The system must be evaluated.
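
As a rough illustration of the last two requirements (a representation for pages and queries, and an evaluation), here is a minimal sketch. It assumes bag-of-words vectors, cosine similarity for ranking and precision at k for evaluation. The page texts, the relevance judgement and all function names are invented for the example, and nothing here reflects how any real search engine is implemented.

```python
import math
from collections import Counter


def vectorize(text):
    """Represent a text as a bag of lowercased, whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(query, pages, k=2):
    """Return up to k page titles, ranked by similarity to the query."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(text)), title) for title, text in pages.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [title for score, title in ranked[:k] if score > 0]


def precision_at_k(returned, relevant, k):
    """Fraction of the top-k slots filled with relevant pages."""
    return sum(1 for title in returned[:k] if title in relevant) / k


# Invented toy collection; real pages would need proper pre-processing.
PAGES = {
    "star_formation": "simulation of star formation in a galaxy",
    "plant_growth": "modelling plant growth with a grammar",
    "language_learning": "can language be learned from scratch with a grammar",
}

if __name__ == "__main__":
    results = search("plant grammar", PAGES)
    print(results)
    print(precision_at_k(results, {"plant_growth"}, k=2))
```

Counting raw tokens is about the cheapest representation available, which speaks to the resource constraint; whether it is adequate for the task is exactly what the evaluation step is meant to reveal.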

The PeARS search engine
More information on https://pearsearch.org/

What to expect in the course

• ML for NLP: what’s it good for?
• Basic principles of statistical language learning.
• Practical: how unique is each person’s language? Can we model inter-speaker differences?

Q: in what sense can data be good or bad?
• How to generate (good) data for training and evaluation.
• Reading: do people agree on what the world is like?
• Practical: produce data for your search engine. Pre-process Wikipedia, extract specific category pages, design page/query representations.

Q: how do machines build models?
• Introduction to supervised learning and regression techniques.
• Reading: predicting words from brain images (‘mind-reading’).
• Practical: to what extent do people’s conceptual spaces align across languages?
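
One of the simplest answers to "how do machines build models?" is fitting a line to labelled examples. The sketch below uses the closed-form least-squares solution for a single input variable; the data points are invented, and the function name `fit_line` is hypothetical rather than anything from the course code.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented training data lying exactly on y = 2x + 1.
XS = [1.0, 2.0, 3.0, 4.0]
YS = [3.0, 5.0, 7.0, 9.0]

if __name__ == "__main__":
    slope, intercept = fit_line(XS, YS)
    print(f"y = {slope:.2f} * x + {intercept:.2f}")
    print("prediction for x = 5:", slope * 5 + intercept)
```

On these points the fit recovers slope 2 and intercept 1. With noisy data it returns the line that minimises the sum of squared errors instead.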

Q: why does complexity matter?
• Clustering and dimensionality reduction.
• Reading: Locality-sensitive hashing: how do fruit flies implement unsupervised learning?
• Practical: Document clustering for the backend of your search engine. First search attempts.
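
To give a flavour of document clustering, here is a minimal k-means sketch over toy two-dimensional "document vectors". It makes several simplifying assumptions: the vectors are invented, the first k points serve as initial centroids (a naive, deterministic choice), and the number of iterations is fixed. It is not the code used in the practical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Return (assignment, centroids) after a fixed number of k-means rounds."""
    centroids = [list(p) for p in points[:k]]  # naive deterministic initialisation
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids


# Invented 2-d vectors, e.g. relative weights of two topics per document.
DOCS = [(1.0, 0.0), (0.0, 1.0), (0.9, 0.1), (0.1, 0.9)]

if __name__ == "__main__":
    assignment, centroids = kmeans(DOCS, k=2)
    print(assignment)
    print(centroids)
```

Every round compares each point with each centroid, so the cost grows with the number of documents, clusters and dimensions, which is one reason dimensionality reduction appears in the same topic.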

Q: what kind of decisions are binary?
• Introduction to Support Vector Machines.
• Reading: Detection of semantic errors in the prose of non-English speakers with SVMs.
• Practical: Is there a correlation between a person’s writing and the onset of certain medical conditions?
