
Statistical Natural Language Processing – Recap: logistic regression



  1. Non-linearity and MLP
     Çağrı Çöltekin, University of Tübingen, Seminar für Sprachwissenschaft, Summer Semester 2019
     Outline: Introduction · Non-linearity · MLP · Learning in ANNs

     Artificial neural networks
     • Artificial neural networks (ANNs) are machine learning models inspired by biological neural networks
     • ANNs are powerful non-linear models
     • Power comes with a price: there are no guarantees of finding the global minimum of the error function
     • ANNs have been used in ML, AI and cognitive science since the 1950's – with some ups and downs
     • Currently they are the driving force behind the popular 'deep learning' methods

     Artificial and biological neural networks
     [Figure: the biological neuron – dendrite, soma, axon, axon terminal; showing a picture of a real neuron is mandatory in every ANN lecture. Image source: Wikipedia]
     • ANNs are inspired by biological neural networks
     • Similar to biological networks, ANNs are made of many simple processing units
     • Despite the similarities, there are many differences: ANNs do not mimic biological networks
     • ANNs are practical statistical machine learning methods

     Recap: the perceptron
     • The perceptron computes y = f(∑ wj xj) with x0 = 1, where f is a step function: f(x) = +1 if wx > 0, −1 otherwise
     • In ANN-speak f(·) is called an activation function

     Recap: logistic regression
     • Logistic regression computes P(y) = f(∑ wj xj) with x0 = 1, where f(x) = 1 / (1 + e^(−wx))

     Linear separability
     • A classification problem is said to be linearly separable if one can find a linear discriminator that separates the classes
     • A well-known counter example is the logical XOR problem:
       x1  x2  x1 XOR x2
       0   0   0
       0   1   1
       1   0   1
       1   1   0
     • Can a linear classifier learn the XOR problem? There is no line that can separate the positive and negative classes.

     Non-linear basis functions
     • We can use non-linear basis functions: w0 + w1 x1 + w2 x2 + w3 φ(x1, x2) is still linear in w for any choice of φ(·)
     • For example, adding the product x1 x2 as an additional feature would allow a solution like x1 + x2 − 2 x1 x2
     • Choosing proper basis functions like x1 x2 is called feature engineering
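A minimal sketch of the argument above (my own numpy illustration, not code from the slides; the threshold 0.5 and the weight vector are chosen to match the discriminant x1 + x2 − 2 x1 x2 discussed in the deck): it checks that the extra basis feature x1·x2 makes the XOR data separable by a model that is still linear in its weights.

```python
# Illustrative check of the XOR / basis-function argument (assumes numpy).
import numpy as np

# XOR truth table: inputs and labels
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

# 1) The discriminant suggested by the slides, built from the extra basis
#    feature x1*x2:  g(x) = x1 + x2 - 2*x1*x2  separates the classes at 0.5.
g = X[:, 0] + X[:, 1] - 2 * X[:, 0] * X[:, 1]
print(g)                      # [0 1 1 0]
print((g > 0.5).astype(int))  # matches the XOR labels exactly

# 2) Adding x1*x2 as a feature keeps the model linear in the weights w,
#    so a logistic-regression-style unit can realise the same decision.
phi = np.column_stack([np.ones(4), X, X[:, 0] * X[:, 1]])  # [1, x1, x2, x1*x2]
w = np.array([-0.5, 1.0, 1.0, -2.0])                       # realises g(x) - 0.5

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

print(sigmoid(phi @ w) > 0.5)  # [False  True  True  False] -> XOR solved
```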

  2. Non-linear basis functions: the solution in the 3D input space
     • The additional basis function x1 x2 maps the problem into 3D:
       x1  x2  x1 x2
       0   0   0
       0   1   0
       1   0   0
       1   1   1
     • In the new, mapped space the points are linearly separable; a separating plane is x1 + x2 − 2 x1 x2 − 0.5 = 0

     Non-linear basis functions: the solution in the original input space
     • The solution to the problem in the original input space is a (non-linear) discriminant that solves the problem

     Where do non-linearities come from?
     Non-linearities are abundant in nature; it is not only the XOR problem.
     • In a linear model, y = w0 + w1 x1 + ... + wk xk
       – The outcome is linearly related to the predictors
       – The effects of the inputs are additive
     • This is not always the case:
       – Some predictors affect the outcome in a non-linear way: the effect may be strong or positive only in a certain range of the variable (e.g., reaction time change by age)
       – Some effects are periodic (e.g., many measures of time)
       – Some predictors interact: 'not bad' is not 'not' + 'bad' (e.g., for sentiment analysis)

     Multi-layer perceptron
     • The simplest modern ANN architecture is called the multi-layer perceptron (MLP), consisting of perceptron-like units
     • The MLP is a fully connected, feed-forward network
       [Figure: network with Input, Hidden and Output layers; each unit takes a weighted sum of its input and applies a (non-linear) activation function]
     • Unlike the perceptron, the units in an MLP use a continuous activation function
     • The MLP can be trained using gradient-based methods
     • The MLP can represent many interesting machine learning problems – it can be used for both regression and classification

     Artificial neurons
     • The unit calculates a weighted sum of the inputs, ∑ wj xj = wx (with x0 = 1); the result is a linear transformation
     • Then the unit applies a non-linear activation function f(·)
     • The output of the unit is y = f(wx)

     Artificial neurons: an example
     • A common activation function is the logistic sigmoid function
     • With this choice the output of the network becomes y = 1 / (1 + e^(−wx))

     Activation functions in ANNs
     • The activation functions in an MLP are typically continuous (differentiable) functions
     • For hidden units common choices are:
       – Sigmoid (logistic): f(x) = 1 / (1 + e^(−x))
       – Hyperbolic tangent (tanh): f(x) = (e^(2x) − 1) / (e^(2x) + 1)
       – Rectified linear unit (ReLU): f(x) = max(0, x)
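As a rough illustration of the forward computation described above (a numpy sketch of my own, not course code; the 2–3–1 architecture, random seed and weight values are invented), each unit takes the weighted sum wx of its inputs, including a bias via x0 = 1, and applies one of the activation functions listed on the last slide.

```python
# Forward pass of a tiny MLP, unit by unit, following y = f(wx).
# Illustrative sketch only: layer sizes and weights are made up.
import numpy as np

# The activation functions listed for hidden units
def sigmoid(x):   # logistic: 1 / (1 + e^-x)
    return 1 / (1 + np.exp(-x))

def tanh(x):      # hyperbolic tangent: (e^2x - 1) / (e^2x + 1)
    return (np.exp(2 * x) - 1) / (np.exp(2 * x) + 1)

def relu(x):      # rectified linear unit: max(0, x)
    return np.maximum(0, x)

rng = np.random.default_rng(0)
x = np.array([0.2, -1.0])                  # one input vector (x1, x2)

# Hidden layer: 3 units, each row holds [w0, w1, w2]; relu chosen here,
# tanh or sigmoid could be swapped in for the hidden activation.
W_hidden = rng.normal(size=(3, 3))
h = relu(W_hidden @ np.append(1.0, x))     # weighted sum wx, then activation

# Output layer: a single sigmoid unit, as in the logistic-regression recap
w_out = rng.normal(size=4)                 # [w0, w1, w2, w3]
y = sigmoid(w_out @ np.append(1.0, h))     # y = f(wx), interpretable as P(y)
print(y)
```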
