SLIDE 1

Machine Learning for Computational Linguistics

A refresher on linear algebra

Çağrı Çöltekin

University of Tübingen, Seminar für Sprachwissenschaft

April 14, 2016

SLIDE 2

Frequently asked questions

▶ The course is worth 9 ECTS.
▶ The term project/paper deadline extends into the semester break, but you should start working on your project during the semester.
▶ Please check the course web page (http://coltekin.net/cagri/courses/ml/) for reading material, slides, and assignments.

SLIDE 3

A few example (supervised) machine learning tasks

Input                       Output
Email messages              spam or not
Product reviews             positive/neutral/negative
Books/blog posts/tweets     age of the author
Images of digits            the digit
Images of scenes            objects/people in the image
Music (audio) files         genre of the music
People/companies            credit risk/reliability
Sentences                   syntactic representation
Questions                   answers

SLIDE 4

A few example (supervised) machine learning tasks

Input                        Output
 x1   x2     x3   …            y
 30    0   0.10   …  18        N
 60    1   1.20   …  45        P
 20    1  −1.20   …  65        N
 90    0   0.00   …  23        P
  …    …      …   …   …        …

SLIDE 6

Machine learning as function approximation

▶ We assume that the data we observe is generated by an unknown function:
  y = f(x1, x2, x3, …)
▶ During training we want to estimate the function f.
▶ Once we have an estimate of f, f̂, we use it to predict y given an input:
  ŷ = f̂(x1, x2, x3, …)

SLIDE 7

How do we approximate f?

▶ We assume that f comes from a class of functions F. For example, the class of linear functions f(x) = w1x1 + w2x2 + w3x3 + …, where w1, w2, w3, … are the parameters.
▶ The approximation, or learning, is finding an optimal set of weights (a minimal sketch in code follows below).
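The slides contain no code, but the idea is easy to make concrete. Below is a minimal sketch (assuming NumPy; the synthetic data and variable names are illustrative, not from the course) of learning the weights of a linear function class by least squares.

```python
# Hypothetical illustration, not from the slides: estimating the weights of a
# linear function class from data, using NumPy's least-squares solver.
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: y is generated by an "unknown" linear function plus noise.
X = rng.normal(size=(100, 3))            # 100 observations, 3 features
true_w = np.array([2.0, -1.0, 0.5])      # the f we pretend not to know
y = X @ true_w + rng.normal(scale=0.1, size=100)

# Learning = finding an optimal set of weights w.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w_hat)                             # close to [ 2.  -1.   0.5]

y_hat = X @ w_hat                        # predictions of the estimated f̂
```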

SLIDE 8

Linear algebra

Linear algebra is the field of mathematics that studies vectors and matrices.

▶ A vector is an ordered sequence of numbers, e.g.

  v = (6, 17)

▶ A matrix is a rectangular arrangement of numbers, e.g.

  A = [ 2  1 ]
      [ 1  4 ]

▶ The most common applications of linear algebra include solving systems of linear equations, such as

  2x1 + x2 = 6
  x1 + 4x2 = 17
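As a concrete aside (not from the slides): assuming NumPy is available, the vector and matrix above look like this in code.

```python
# A minimal sketch (assuming NumPy): the vector and matrix from this slide.
import numpy as np

v = np.array([6.0, 17.0])      # the vector v = (6, 17)
A = np.array([[2.0, 1.0],
              [1.0, 4.0]])     # the 2x2 matrix A
print(v.shape, A.shape)        # (2,) (2, 2)
```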

SLIDE 9

Why study linear algebra?

Remember our input matrix:

Input                        Output
 x1   x2     x3   …            y
 30    0   0.10   …  18        N
 60    1   1.20   …  45        P
 20    1  −1.20   …  65        N
 90    0   0.00   …  23        P
  …    …      …   …   …        …

You should now be seeing vectors and matrices here.

SLIDE 11

Why study linear algebra?

In machine learning,

▶ We typically represent inputs, outputs, and parameters as vectors or matrices.
▶ Some insights from linear algebra are helpful in understanding ML methods.
▶ It makes notation concise and manageable.
▶ In programming, many machine learning libraries make use of vectors and matrices explicitly.
▶ 'Vectorized' operations may run much faster on GPUs.

SLIDE 12

Vectors: some notation

▶ Typical notations for vectors include

  v = v⃗ = (v1, v2, v3) = ⟨v1, v2, v3⟩ = [ v1 ]
                                         [ v2 ]
                                         [ v3 ]

▶ A vector of n real numbers, v = (v1, v2, …, vn), is said to be in the vector space R^n (v ∈ R^n).
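A side note in code (an illustration assuming NumPy, not from the slides): the same vector can be kept as a plain 1-D array or reshaped into an explicit column or row.

```python
# A minimal sketch (assuming NumPy): one vector, three shapes.
import numpy as np

v = np.array([1.0, 2.0, 3.0])          # 1-D array, like (v1, v2, v3)
col = v.reshape(-1, 1)                 # explicit column vector, shape (3, 1)
row = v.reshape(1, -1)                 # explicit row vector, shape (1, 3)
print(v.shape, col.shape, row.shape)   # (3,) (3, 1) (1, 3)
```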

SLIDE 13

Geometric interpretation of vectors

▶ Vectors are objects with a magnitude and a direction.
▶ Geometrically, they are represented by arrows from the origin.

[Figure: arrows from the origin to the points (1, 1), (1, 3), and (−1, −3)]

SLIDE 15

Vector norms

▶ The Euclidean norm, or L2 (ℓ2) norm, is the most commonly used norm. For v = (v1, v2),

    ∥v∥2 = √(v1² + v2²)

    ∥(3, 1)∥2 = √(3² + 1²) ≈ 3.16

  The L2 norm is often written without a subscript: ∥v∥.

▶ Another norm often used in machine learning is the L1 norm:

    ∥v∥1 = |v1| + |v2|

    ∥(3, 1)∥1 = |3| + |1| = 4

[Figure: the vector (3, 1) drawn from the origin, with horizontal leg 3 and vertical leg 1]
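In code (a sketch assuming NumPy, not part of the slides), both norms are one call:

```python
# A minimal sketch (assuming NumPy): L2 and L1 norms of (3, 1).
import numpy as np

v = np.array([3.0, 1.0])
print(np.linalg.norm(v))          # L2: sqrt(3**2 + 1**2) ≈ 3.162
print(np.linalg.norm(v, ord=1))   # L1: |3| + |1| = 4.0
```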

SLIDE 16

Multiplying a vector with a scalar

▶ For a vector v = (v1, v2) and a scalar a,

  av = (av1, av2)

▶ Multiplying with a scalar 'scales' the vector.

[Figure: the vector v = (1, 2) together with its scaled versions 2v and −0.5v]

SLIDE 17

Vector addition and subtraction

▶ For vectors v = (v1, v2) and w = (w1, w2),

  v + w = (v1 + w1, v2 + w2)
  (1, 2) + (2, 1) = (3, 3)

▶ v − w = v + (−w)

[Figure: the vectors v and w and their sum v + w]

SLIDE 18

Dot product

▶ For vectors w = (w1, w2) and v = (v1, v2),

  w·v = w1v1 + w2v2

▶ Or, w·v = ∥w∥∥v∥ cos α
▶ The dot product of orthogonal vectors is 0.
▶ ∥w∥ = √(w·w)
▶ The dot product is often used as a similarity measure between two vectors.

[Figure: vectors v and w with angle α between them; the projection of v onto w has length ∥v∥ cos α]
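A quick numeric check (a sketch assuming NumPy, not from the slides):

```python
# A minimal sketch (assuming NumPy): dot products and the norm identity.
import numpy as np

w = np.array([2.0, 2.0])
v = np.array([2.0, -2.0])
print(np.dot(w, v))               # 0.0 — w and v are orthogonal
print(np.sqrt(w @ w))             # ∥w∥ = √(w·w) ≈ 2.828
print(np.linalg.norm(w))          # the same value via the norm function
```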

SLIDE 19

Cosine similarity

▶ The cosine of the angle between two vectors,

  cos α = (v·w) / (∥v∥∥w∥)

  is often used as another similarity metric, called cosine similarity.
▶ Cosine similarity is related to the dot product, but ignores the magnitudes of the vectors.
▶ For unit vectors (vectors of length 1), cosine similarity is equal to the dot product (see the sketch below).
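A possible implementation (assuming NumPy; the helper name cosine_similarity is ours, not a library function):

```python
# A minimal sketch (assuming NumPy): cosine similarity from its definition.
import numpy as np

def cosine_similarity(v, w):
    # cos α = (v·w) / (∥v∥ ∥w∥)
    return np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))

v = np.array([1.0, 2.0])
w = np.array([2.0, 4.0])          # same direction as v, twice the magnitude
print(cosine_similarity(v, w))    # 1.0 — magnitudes are ignored
```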

SLIDE 20

Matrices

      [ a1,1  a1,2  a1,3  …  a1,n ]
  A = [ a2,1  a2,2  a2,3  …  a2,n ]
      [  ⋮     ⋮     ⋮    ⋱   ⋮   ]
      [ am,1  am,2  am,3  …  am,n ]

▶ We can think of matrices as collections of row or column vectors.
▶ A matrix with n rows and m columns is in R^(n×m).

SLIDE 21

Transpose of a matrix

The transpose of an n × m matrix is an m × n matrix whose rows are the columns of the original matrix. The transpose of a matrix A is denoted Aᵀ. If

      [ a  b ]
  A = [ c  d ] ,   then   Aᵀ = [ a  c  e ]
      [ e  f ]                 [ b  d  f ]

SLIDE 22

Multiplying a matrix with a scalar

Similar to vectors, each element is multiplied by the scalar:

  2 [ 2  1 ]  =  [ 2×2  2×1 ]  =  [ 4  2 ]
    [ 1  4 ]     [ 2×1  2×4 ]     [ 2  8 ]

SLIDE 23

Matrix addition and subtraction

Each element is added to (or subtracted from) the corresponding element:

  [ 2  1 ]  +  [ 0  1 ]  =  [ 2  2 ]
  [ 1  4 ]     [ 1  0 ]     [ 2  4 ]
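Both operations are elementwise in code as well (a sketch assuming NumPy, not from the slides):

```python
# A minimal sketch (assuming NumPy): scalar multiplication and addition.
import numpy as np

A = np.array([[2, 1],
              [1, 4]])
B = np.array([[0, 1],
              [1, 0]])
print(2 * A)    # [[4 2]
                #  [2 8]]
print(A + B)    # [[2 2]
                #  [2 4]]
```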

SLIDE 24

Matrix multiplication

The product of an n × k matrix A and a k × m matrix B is an n × m matrix C:

  [ a11  a12  …  a1k ]   [ b11  b12  …  b1m ]   [ c11  c12  …  c1m ]
  [ a21  a22  …  a2k ] × [ b21  b22  …  b2m ] = [ c21  c22  …  c2m ]
  [  ⋮    ⋮   ⋱   ⋮  ]   [  ⋮    ⋮   ⋱   ⋮  ]   [  ⋮    ⋮   ⋱   ⋮  ]
  [ an1  an2  …  ank ]   [ bk1  bk2  …  bkm ]   [ cn1  cn2  …  cnm ]

Each element of C is the dot product of the corresponding row of A and column of B:

  cij = ai1 b1j + ai2 b2j + … + aik bkj

For example, c11 = a11 b11 + a12 b21 + … + a1k bk1, and cnm = an1 b1m + an2 b2m + … + ank bkm.
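In code (a sketch assuming NumPy, not part of the slides), the @ operator implements exactly this row-by-column rule:

```python
# A minimal sketch (assuming NumPy): C[i, j] is the dot product of
# row i of A and column j of B.
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])          # 2x3
B = np.array([[1, 0],
              [0, 1],
              [1, 1]])             # 3x2
C = A @ B                          # 2x2
print(C)                           # [[ 4  5]
                                   #  [10 11]]
print(A[0, :] @ B[:, 0])           # 4 — the c11 entry computed by hand
```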

SLIDE 34

Dot product as matrix multiplication

In the machine learning literature, the dot product of two vectors is often written as wᵀv. For example, for w = (2, 2) and v = (2, −2),

  wᵀv = [ 2  2 ] [  2 ] = 2 × 2 + 2 × (−2) = 4 − 4 = 0
                 [ −2 ]

This notation is somewhat sloppy, though, since the result of a matrix multiplication is in fact not a scalar (it is a 1 × 1 matrix).

SLIDE 35

Identity matrix

▶ A square matrix in which all elements of the principal diagonal are ones and all other elements are zeros is called the identity matrix, often denoted I.

      [ 1  0  0 ]
  I = [ 0  1  0 ]
      [ 0  0  1 ]

▶ Multiplying a matrix by the identity matrix does not change the original matrix: IA = A.

SLIDE 36

Matrix multiplication as transformation

▶ Multiplying a vector by a matrix transforms the vector.
▶ Some examples of transformations on R^2:

  ▶ Identity:
    [ 1  0 ]
    [ 0  1 ]

  ▶ 90-degree rotation:
    [ 0  −1 ]
    [ 1   0 ]

    In general, rotation by θ:
    [ cos θ  −sin θ ]
    [ sin θ   cos θ ]

  ▶ Shear:
    [ 1  k ]
    [ 0  1 ]

  ▶ Stretch along the y-axis:
    [ 1  0 ]
    [ 0  k ]
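A numeric illustration (assuming NumPy; not from the slides) of the general rotation matrix:

```python
# A minimal sketch (assuming NumPy): rotating a vector by 90 degrees.
import numpy as np

theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
v = np.array([1.0, 0.0])
print(R @ v)     # ≈ [0, 1] — the x-axis unit vector rotated onto the y-axis
```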

SLIDE 37

Matrix-vector representation of a set of linear equations

Our earlier example set of linear equations,

  2x1 + x2 = 6
  x1 + 4x2 = 17

can be written as:

  [ 2  1 ] [ x1 ]   [  6 ]
  [ 1  4 ] [ x2 ] = [ 17 ]
     W       x         b

One can solve the above equation using Gaussian elimination (we will not cover it today).
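In practice one rarely performs Gaussian elimination by hand; a sketch assuming NumPy (not part of the slides):

```python
# A minimal sketch (assuming NumPy): solving Wx = b directly.
import numpy as np

W = np.array([[2.0, 1.0],
              [1.0, 4.0]])
b = np.array([6.0, 17.0])
x = np.linalg.solve(W, b)    # internally uses a factorization, not an inverse
print(x)                     # [1. 4.] — so x1 = 1, x2 = 4
```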

SLIDE 38

Inverse of a matrix

The inverse of a square matrix W is denoted W⁻¹, and defined such that

  W⁻¹W = I

The inverse can be used to solve the equation from our previous example:

  Wx = b
  W⁻¹Wx = W⁻¹b
  Ix = W⁻¹b
  x = W⁻¹b
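The same derivation in code (a sketch assuming NumPy; numerically, np.linalg.solve is preferred over forming the inverse explicitly):

```python
# A minimal sketch (assuming NumPy): solving via the explicit inverse.
import numpy as np

W = np.array([[2.0, 1.0],
              [1.0, 4.0]])
b = np.array([6.0, 17.0])
W_inv = np.linalg.inv(W)
print(W_inv @ W)             # ≈ the identity matrix I
print(W_inv @ b)             # [1. 4.] — the same solution as before
```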

SLIDE 39

Determinant of a matrix

The determinant of a 2 × 2 matrix is

  | a  b |
  | c  d | = ad − bc

The above formula generalizes to higher-dimensional matrices through a recursive definition, but you are unlikely to calculate it by hand. Some properties:

▶ A matrix is invertible if it has a non-zero determinant.
▶ A system of linear equations has a unique solution if the coefficient matrix has a non-zero determinant.
▶ The geometric interpretation of the determinant is the (signed) change in the volume of a unit (hyper)cube under the transformation the matrix represents.
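A quick check of the 2 × 2 formula and the invertibility property (a sketch assuming NumPy, not from the slides):

```python
# A minimal sketch (assuming NumPy): determinants and invertibility.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 4.0]])
print(np.linalg.det(A))      # 2*4 - 1*1 = 7.0 — non-zero, so A is invertible

B = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # second row is 2x the first
print(np.linalg.det(B))      # 0.0 — singular, no inverse
```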

SLIDE 40

Eigenvalues and eigenvectors of a matrix

An eigenvector x of a matrix A is a vector such that

  Ax = λx

where λ is a scalar called the eigenvalue.

▶ Eigenvalues and eigenvectors have many applications, from communication theory to quantum mechanics.
▶ A better-known example (and closer to home) is Google's PageRank algorithm.
▶ We will return to them when discussing PCA (a small numeric example follows below).
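A small numeric example (assuming NumPy; not part of the slides) verifying Ax = λx:

```python
# A minimal sketch (assuming NumPy): eigenvalues and eigenvectors.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 4.0]])
eigvals, eigvecs = np.linalg.eig(A)     # eigenvectors are the columns
lam, x = eigvals[0], eigvecs[:, 0]
print(np.allclose(A @ x, lam * x))      # True: Ax = λx
```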

SLIDE 41

Summary & next week

▶ See the bibliography at the end of the slides if you want a 'more complete' refresher/introduction.
▶ Next week we will make a similar excursion into probability theory.

SLIDE 42

Further reading

A classic reference book in the field is Strang (2009). Shifrin and Adams (2011) and Farin and Hansford (2014) are textbooks with a more practical/graphical orientation. Cherney, Denton, and Waldron (2013) and Beezer (2014) are two textbooks that are freely available!

Beezer, Robert A. (2014). A First Course in Linear Algebra. Version 3.40. Congruent Press. ISBN: 9780984417551.
Cherney, David, Tom Denton, and Andrew Waldron (2013). Linear Algebra. math.ucdavis.edu. URL: https://www.math.ucdavis.edu/~linear/.
Farin, Gerald E. and Dianne Hansford (2014). Practical Linear Algebra: A Geometry Toolbox. Third edition. CRC Press. ISBN: 9781466579569.
Shifrin, Theodore and Malcolm R. Adams (2011). Linear Algebra: A Geometric Approach. Second edition. W. H. Freeman. ISBN: 9781429215213.
Strang, Gilbert (2009). Introduction to Linear Algebra. Fourth edition. Wellesley-Cambridge Press. ISBN: 9780980232714.