
Statistical Natural Language Processing

Artificial Neural Networks: an introduction

Çağrı Çöltekin

University of Tübingen, Seminar für Sprachwissenschaft

Summer Semester 2019


Artificial neural networks

  • Artificial neural networks (ANNs) are machine learning models inspired by biological neural networks
  • ANNs are powerful non-linear models
  • Power comes with a price: there are no guarantees of finding the global minimum of the error function
  • ANNs have been used in ML, AI, and cognitive science since the 1950s – with some ups and downs
  • Currently they are the driving force behind the popular 'deep learning' methods


The biological neuron

(showing a picture of a real neuron is mandatory in every ANN lecture)

[Figure: a biological neuron, with dendrites, soma, axon, and axon terminals labelled. Image source: Wikipedia]

Artificial and biological neural networks

  • ANNs are inspired by biological neural networks
  • Similar to biological networks, ANNs are made of many simple processing units
  • Despite the similarities, there are many differences: ANNs do not mimic biological networks
  • ANNs are practical statistical machine learning methods


Recap: the perceptron

$y = f\left(\sum_{j}^{m} w_j x_j\right)$ where $f(x) = \begin{cases} +1 & \text{if } \mathbf{wx} > 0 \\ -1 & \text{otherwise} \end{cases}$

In ANN-speak f(·) is called an activation function.

[Diagram: inputs x1, …, xm with weights w1, …, wm, plus a bias input x0 = 1 with weight w0, feeding the output y.]
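A minimal sketch of this forward computation in Python (NumPy assumed; the weight vector and input below are made-up examples, with the bias weight w0 folded into the weight vector):

```python
import numpy as np

def perceptron_predict(w, x):
    """Perceptron output: +1 if the weighted sum wx is positive, -1 otherwise.

    `w` includes the bias weight w0; a constant x0 = 1 is prepended to `x`.
    """
    x = np.concatenate(([1.0], x))
    return 1 if w @ x > 0 else -1

w = np.array([-0.5, 1.0, 1.0])                       # w0, w1, w2 (hypothetical)
print(perceptron_predict(w, np.array([0.0, 1.0])))   # -> 1
```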


Recap: logistic regression

$P(y) = f\left(\sum_{j}^{m} w_j x_j\right)$ where $f(x) = \frac{1}{1 + e^{-\mathbf{wx}}}$

[Diagram: the same network as the perceptron, but the output unit produces P(y).]
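The same sketch with the logistic activation (again NumPy, same made-up weight convention as above):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logreg_predict(w, x):
    """P(y=1 | x) under logistic regression; `w` includes the bias weight w0."""
    x = np.concatenate(([1.0], x))
    return sigmoid(w @ x)

w = np.array([-0.5, 1.0, 1.0])                    # hypothetical weights
print(logreg_predict(w, np.array([0.0, 1.0])))    # -> ~0.62
```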


Linear separability

  • A classification problem is said to be linearly separable if one can find a linear discriminator
  • A well-known counter-example is the logical XOR problem

[Figure: the four XOR points in the (x1, x2) plane, labelled + and −. There is no line that can separate the positive and negative classes.]


Can a linear classifier learn the XOR problem?

  • We can use non-linear basis functions: w0 + w1x1 + w2x2 + w3 φ(x1, x2) is still linear in w for any choice of φ(·)
  • For example, adding the product x1x2 as an additional feature would allow a solution like x1 + x2 − 2x1x2 (checked numerically in the sketch after this list):

      x1   x2   x1 + x2 − 2x1x2
       0    0          0
       0    1          1
       1    0          1
       1    1          0

  • Choosing proper basis functions like x1x2 is called feature engineering
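A quick numeric check of this feature-engineering trick, as a sketch (NumPy assumed; the 0.5 threshold matches the discriminant on the next slide):

```python
import numpy as np

# the four XOR inputs and their labels
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_xor = np.array([0, 1, 1, 0])

# augment the inputs with the hand-crafted basis function phi(x1, x2) = x1 * x2
phi = (X[:, 0] * X[:, 1]).reshape(-1, 1)
X_aug = np.hstack([X, phi])

# the weights from the slide: x1 + x2 - 2 * x1 * x2
w = np.array([1.0, 1.0, -2.0])
scores = X_aug @ w
print(scores)                        # [0. 1. 1. 0.]
print((scores > 0.5).astype(int))    # [0 1 1 0], i.e. XOR
```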




Non-linear basis functions

solution in the original input space

[Figure: the XOR points in the (x1, x2) plane. The solution to x1 + x2 − 2x1x2 − 0.5 = 0 is a (non-linear) discriminant that solves the problem.]


Non-linear basis functions

solution in the 3D input space

[Figure: the XOR points plotted in three dimensions, with axes x1, x2, and x1x2.]

  • The additional basis function maps the problem into 3D
  • In the new, mapped space, the points are linearly separable


Where do non-linearities come from?

non-linearities are abundant in nature; it is not only the XOR problem

In a linear model, y = w0 + w1x1 + … + wkxk

  • The outcome is linearly related to the predictors
  • The effects of the inputs are additive

This is not always the case:

  • Some predictors affect the outcome in a non-linear way
    – The effect may be strong or positive only in a certain range of the variable (e.g., reaction time change by age)
    – Some effects are periodic (e.g., many measures of time)
  • Some predictors interact
    – 'not bad' is not 'not' + 'bad' (e.g., for sentiment analysis)


Multi-layer perceptron

  • The simplest modern ANN architecture is called the multi-layer perceptron (MLP)
  • The MLP is a fully connected, feed-forward network consisting of perceptron-like units
  • Unlike the perceptron, the units in an MLP use a continuous activation function
  • The MLP can be trained using gradient-based methods
  • The MLP can represent many interesting machine learning problems
    – It can be used for both regression and classification


Multi-layer perceptron

the picture

[Figure: a network with inputs x1, …, x4, a hidden layer, and an output unit y.] Each unit takes a weighted sum of its inputs and applies a (non-linear) activation function.


Artificial neurons

[Diagram: a single unit that computes the weighted sum Σ of its inputs x1, …, xm (weights w1, …, wm, bias input x0 = 1 with weight w0), applies an activation function f(·), and outputs y.]

  • The unit calculates a weighted sum of the inputs: $\sum_{j}^{m} w_j x_j = \mathbf{wx}$
  • The result is a linear transformation
  • Then the unit applies a non-linear activation function f(·)
  • The output of the unit is y = f(wx)


Artificial neurons

an example

[Diagram: the same unit, with the logistic sigmoid as activation.]

  • A common activation function is the logistic sigmoid function $f(x) = \frac{1}{1 + e^{-x}}$
  • The output of the network becomes $y = \frac{1}{1 + e^{-\mathbf{wx}}}$


Activation functions in ANNs

hidden units

  • The activation functions in an MLP are typically continuous (differentiable) functions
  • For hidden units, common choices are (sketched in code below)
    Sigmoid (logistic): $\frac{1}{1 + e^{-x}}$
    Hyperbolic tangent (tanh): $\frac{e^{2x} - 1}{e^{2x} + 1}$
    Rectified linear unit (relu): max(0, x)
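A minimal sketch of these three activation functions in Python (NumPy assumed):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # same as np.tanh(x), written out to match the formula on the slide
    return (np.exp(2 * x) - 1) / (np.exp(2 * x) + 1)

def relu(x):
    return np.maximum(0.0, x)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), tanh(z), relu(z))
```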




Activation functions in ANNs

output units

  • The activation functions of the output units depend on the task. Common choices are
    – For regression, the identity function
    – For binary classification, the logistic sigmoid: $P(y = 1 \mid x) = \frac{1}{1 + e^{-\mathbf{wx}}} = \frac{e^{\mathbf{wx}}}{1 + e^{\mathbf{wx}}}$
    – For multi-class classification, softmax: $P(y = k \mid x) = \frac{e^{\mathbf{w}_k \mathbf{x}}}{\sum_j e^{\mathbf{w}_j \mathbf{x}}}$ (a softmax sketch follows below)
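A minimal softmax sketch (NumPy assumed; subtracting the maximum score is a standard numerical-stability trick, not something stated on the slide):

```python
import numpy as np

def softmax(scores):
    """Turn a vector of class scores w_k x into probabilities that sum to 1."""
    scores = scores - scores.max()      # for numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))
```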


MLP: a simple example

[Diagram: a 2–2–2 network with inputs x1, x2, hidden units h1, h2 (activation f(·)), and outputs y1, y2 (activation g(·)); weights w(1)_ij connect input i to hidden unit j, and weights w(2)_jk connect hidden unit j to output k.]

$h_j = f\left(\sum_i w^{(1)}_{ij} x_i\right)$

$y_k = g\left(\sum_j w^{(2)}_{jk} h_j\right)$

$y_k = g\left(\sum_j w^{(2)}_{jk} f\left(\sum_i w^{(1)}_{ij} x_i\right)\right)$


MLP: a simple example

[Diagram: the same 2–2–2 network as on the previous slide.]

  • Alternatively, we can write the computations in matrix form (see the NumPy sketch below):
    $h = f(W^{(1)} x)$
    $y = g(W^{(2)} h) = g\left(W^{(2)} f(W^{(1)} x)\right)$
  • This corresponds to a series of transformations followed by elementwise (non-linear) function applications
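A minimal sketch of this matrix-form forward pass (NumPy assumed; the weight matrices and input are made-up values, biases are omitted as on the slide, and both f and g are taken to be the sigmoid for the example):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = np.array([[0.5, -0.3],
               [0.8,  0.2]])    # W(1): hidden-layer weights
W2 = np.array([[1.0, -1.0],
               [0.5,  0.5]])    # W(2): output-layer weights
x = np.array([1.0, 2.0])

h = sigmoid(W1 @ x)             # h = f(W(1) x)
y = sigmoid(W2 @ h)             # y = g(W(2) h) = g(W(2) f(W(1) x))
print(h, y)
```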


Solving non-linear problems with ANNs

a solution to XOR problem

[Diagram: a 2–2–1 network with inputs x1, x2, hidden units h1, h2 using the activation f(z) = z², and an output unit using g(z) = 1/(1 + e^{−z}); the layers include bias weights w(1)_01, w(1)_02, and w(2)_01.]

Is this different from non-linear basis functions?

[The slide is repeated as animation steps, evaluating the network on the XOR inputs and showing the hidden and output values at each step; the output unit gives approximately 0.73 for one class and 0.27 for the other, so the network separates the XOR classes.]



Non-linear activation functions are necessary

Without non-linear activation functions, an ANN with any number of layers is equivalent to a linear model.

[Diagram: a 2–2–1 linear network with inputs x1, x2, hidden units h1, h2, output y, and weights a, b, c, d, e, f.]

$h_1 = a x_1 + c x_2$
$h_2 = b x_1 + d x_2$
$y = e h_1 + f h_2 = (ea + fb) x_1 + (ec + fd) x_2$

y is still a linear function of the xi.
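A minimal numeric illustration of this collapse (NumPy assumed; the weight values are random stand-ins):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 2))    # hidden-layer weights (a, b, c, d)
W2 = rng.normal(size=(1, 2))    # output-layer weights (e, f)
x = rng.normal(size=2)

# without a non-linearity, two layers are just one linear map W2 @ W1
y_two_layers = W2 @ (W1 @ x)
y_one_layer = (W2 @ W1) @ x
print(np.allclose(y_two_layers, y_one_layer))   # True
```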


Gradient descent: a refresher

  • The general idea is to approach a minimum of the error function in small steps:
    w ← w − η∇J(w)
    – ∇J is the gradient of the loss function; it points in the direction of maximum increase
    – η is the learning rate
  • The updates can be performed (a minimal loop is sketched below)
    batch: for the complete training set
    on-line: after every training instance – this is known as stochastic gradient descent (SGD)
    mini-batch: after small fixed-sized batches
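A minimal (batch) gradient-descent loop, as a sketch; `grad_J`, the toy loss, and the step count are made-up for illustration:

```python
import numpy as np

def gradient_descent(grad_J, w0, eta=0.1, n_steps=100):
    """Repeatedly apply w <- w - eta * grad_J(w)."""
    w = np.asarray(w0, dtype=float)
    for _ in range(n_steps):
        w = w - eta * grad_J(w)
    return w

# toy example: J(w) = ||w||^2 has gradient 2w and its minimum at w = 0
print(gradient_descent(lambda w: 2 * w, w0=[3.0, -4.0]))
```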


Gradient descent: the picture

[Figure: gradient-descent steps on an error surface.]

$\nabla f(x_1, \ldots, x_n) = \left(\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right)$

A function is convex if there is only one (global) minimum.


Global and local minima

[Figure: an error surface E(w) over weights w1 and w2, with the global minimum and a local minimum marked.]


Error functions in ANN training

depend on the task

  • For regression, a natural choice is minimizing the sum of squared errors
    $E(w) = \sum_i (y_i - \hat{y}_i)^2$
  • For binary classification, we use cross entropy
    $E(w) = -\sum_i \left[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \right]$
  • Similarly, for multi-class classification, also cross entropy
    $E(w) = -\sum_i \sum_k y_{i,k} \log \hat{y}_{i,k}$
    (both losses are sketched in code below)

In practice, the ANN loss functions will not be convex.
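Minimal sketches of the regression and binary-classification losses (NumPy assumed; the clipping constant is an added guard against log(0), not part of the slide):

```python
import numpy as np

def sum_squared_error(y, y_hat):
    """Regression loss: sum of squared errors."""
    return np.sum((y - y_hat) ** 2)

def binary_cross_entropy(y, y_hat, eps=1e-12):
    """Binary classification loss (cross entropy)."""
    y_hat = np.clip(y_hat, eps, 1 - eps)   # avoid log(0)
    return -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

y = np.array([1, 0, 1])
y_hat = np.array([0.9, 0.2, 0.7])
print(sum_squared_error(y, y_hat), binary_cross_entropy(y, y_hat))
```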


Learning in ANNs

  • ANNs implement complex functions: we need to use optimization methods (e.g., gradient descent) to train them
  • Typically, error functions for ANNs are not convex: gradient descent will find a local minimum
  • Optimization requires updating multiple layers of weights
  • Assigning credit (or blame) to each weight during learning is not trivial
  • An effective solution to the last problem is the backpropagation algorithm


Learning in multi-layer networks: the problem

[Diagram: a 2–2–2 network with weights w(1)_ij and w(2)_jk; the error at the output units is E(y) = (y − ŷ)², but the error to assign to the hidden units is unknown (E(y) = ?).]

We want a way to update non-final weights based on the final error.


Calculating gradient on a neural network

(with some simplification)

[Diagram: a 2–2–1 network with inputs x1, x2, hidden units h1, h2, output y, weights a, b, c, d, e, f, and error E(w; y).]

  • We need to calculate the gradient
    $\nabla E = \left(\frac{\partial E}{\partial a}, \frac{\partial E}{\partial b}, \frac{\partial E}{\partial c}, \frac{\partial E}{\partial d}, \frac{\partial E}{\partial e}, \frac{\partial E}{\partial f}\right)$
    so that we can use gradient descent directly
  • $\frac{\partial E}{\partial e}$ and $\frac{\partial E}{\partial f}$ are easy; they do not depend on other variables
  • We factor the others using the chain rule:
    $\frac{\partial E}{\partial a} = \frac{\partial h_1}{\partial a}\frac{\partial E}{\partial h_1}$ and $\frac{\partial E}{\partial c} = \frac{\partial h_1}{\partial c}\frac{\partial E}{\partial h_1}$




Backpropagation

[Diagram: the same 2–2–1 network with weights a, b, c, d, e, f.]

  • So far, it is just math:
    $\frac{\partial E}{\partial a} = \frac{\partial h_1}{\partial a}\frac{\partial E}{\partial h_1}$ and $\frac{\partial E}{\partial c} = \frac{\partial h_1}{\partial c}\frac{\partial E}{\partial h_1}$
  • But a naive implementation does many repeated calculations
  • Backpropagation is an efficient (dynamic programming) algorithm that avoids repeated calculations
  • Backpropagation works for any computation graph without cycles (a worked sketch for this tiny network follows)
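A worked sketch of one forward and backward pass through this 2–2–1 network, assuming sigmoid activations for both layers and the squared error from the earlier slide (the activation choice and the example numbers are mine, not the slide's); note how ∂E/∂h1 and ∂E/∂h2 are computed once and reused, which is exactly what backpropagation caches:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_backward(x1, x2, t, a, b, c, d, e, f_):
    """Gradients of E = (t - y)^2 w.r.t. the weights a..f of the tiny network.

    h1 = s(a*x1 + c*x2), h2 = s(b*x1 + d*x2), y = s(e*h1 + f*h2),
    with s the sigmoid. No bias terms, as in the slide's diagram.
    """
    # forward pass
    h1 = sigmoid(a * x1 + c * x2)
    h2 = sigmoid(b * x1 + d * x2)
    y = sigmoid(e * h1 + f_ * h2)

    # backward pass (chain rule); sigmoid'(z) = s(z) * (1 - s(z))
    dE_dz = -2 * (t - y) * y * (1 - y)   # error signal at the output unit

    dE_de = dE_dz * h1                   # output weights: no further chaining
    dE_df = dE_dz * h2

    dE_dh1 = dE_dz * e                   # computed once, reused for a and c
    dE_dh2 = dE_dz * f_                  # computed once, reused for b and d
    dE_da = dE_dh1 * h1 * (1 - h1) * x1
    dE_dc = dE_dh1 * h1 * (1 - h1) * x2
    dE_db = dE_dh2 * h2 * (1 - h2) * x1
    dE_dd = dE_dh2 * h2 * (1 - h2) * x2
    return dE_da, dE_db, dE_dc, dE_dd, dE_de, dE_df

print(forward_backward(1.0, 0.0, t=1.0, a=0.1, b=0.2, c=0.3, d=0.4, e=0.5, f_=0.6))
```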


Stochastic gradient descent

  • Standard (batch) gradient descent is computationally expensive: it updates the weights only once per epoch
  • Stochastic gradient descent (SGD) updates the weights for every training instance
  • SGD may take more steps, but converges to the same solution

[Figure: error contours over weights w1 and w2 with gradient-descent trajectories.]

  • In practice a mini-batch is more common
  • The right batch size is not only about efficiency; it also affects accuracy


Preventing overfitting in neural networks

  • As in linear models, we can use L1 and L2 regularization by adding a regularization term to the error function (known as weight decay). For example, J(w) = E(w) + ∥W∥
  • There are other ways to fight overfitting
    – With early stopping, one stops the training before it reaches the smallest training error
    – With dropout, random units (with all of their connections) are dropped during training
    – Injecting noise at the output is a way to (implicitly) model the noise in the target classes/values


Adapting the learning rate

  • The choice of the learning rate η is important
    too small: slow convergence
    too big: overshooting – it may fluctuate around the minimum, or even jump away
  • The idea is to adapt the learning rate during learning
  • A common trick is adding a momentum term: if we keep moving in the same direction for a long time, accelerate (a small sketch follows)
    $\Delta w_{ij}(t) = \eta \frac{\partial E}{\partial w_{ij}} + \alpha \Delta w_{ij}(t-1)$
  • There are many adaptive optimization algorithms: Adagrad, Adadelta, RMSprop, Adam, …
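A minimal momentum update in code, following the slide's rule (NumPy assumed; the toy gradient, η, and α values are made-up, and the velocity is subtracted from the weights since we are descending):

```python
import numpy as np

def momentum_step(w, grad, velocity, eta=0.01, alpha=0.9):
    """One update: velocity(t) = eta * grad + alpha * velocity(t-1)."""
    velocity = eta * grad + alpha * velocity
    return w - velocity, velocity

w = np.array([1.0, -2.0])
velocity = np.zeros_like(w)
for _ in range(3):
    grad = 2 * w                      # toy gradient of ||w||^2
    w, velocity = momentum_step(w, grad, velocity)
print(w)
```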


How many layers, how many units?

  • A network with a single hidden layer is said to be a universal approximator: it can approximate any continuous function with arbitrary precision
  • However, in practice multiple interconnected layers are useful and commonly used in modern ANN models
  • The choice of layers, and in general the architecture of the system, depends on the application


A bit of history

1950–60s  ANNs (the perceptron) became popular: lots of excitement in AI and cognitive science
1970s     Not much interest
          – criticism of the perceptron: linear separability
1980s     ANNs became popular again
          – backpropagation algorithm
          – multi-layer networks
1990s     ANNs had again fallen 'out of fashion'
          – Engineering: other algorithms (such as SVMs) generally performed better
          – From the cognitive science perspective: ANNs are difficult to interpret
present   ANNs (again) enjoy renewed popularity under the name 'deep learning'


Summary

  • ANNs are powerful non-linear learners
    – based on some inspiration from biological NNs
    – using many simple processing units
    – built on linear models (logistic regression)
  • For non-linear problems we need non-linear activation functions, and at least one hidden layer
  • ANNs can be used for both regression and classification
  • ANN loss functions are not convex; what we find is a local minimum
  • They are (typically) trained with the backpropagation algorithm

Next: Mon/Fri Unsupervised learning


Additional reading, references, credits

  • The third edition (draft) of Jurafsky and Martin has a new chapter on neural networks
  • Hastie, Tibshirani, and Friedman (2009, ch. 11) also includes an accessible introduction
  • For a review of the use of ANNs in NLP, including more advanced topics, see Goldberg 2016



Additional reading, references, credits (cont.)

Goldberg, Yoav (2016). "A primer on neural network models for natural language processing". In: Journal of Artificial Intelligence Research 57, pp. 345–420.

Hastie, Trevor, Robert Tibshirani, and Jerome Friedman (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second edition. Springer Series in Statistics. Springer-Verlag New York. ISBN: 9780387848587. URL: http://web.stanford.edu/~hastie/ElemStatLearn/.

Jurafsky, Daniel and James H. Martin (2009). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Second edition. Pearson Prentice Hall. ISBN: 978-0-13-504196-3.