

SLIDE 1

Bilinear Text Regression and Applications

Vasileios Lampos
v.lampos@ucl.ac.uk
Department of Computer Science, University College London
May 2014

SLIDE 2

Outline

  • Linear Regression Methods
  • Bilinear Regression Methods
  • Applications
  • Conclusions

SLIDE 3

Recap on regression methods

SLIDE 4

Regression basics — Ordinary Least Squares (1/2)

  • observations: $\mathbf{x}_i \in \mathbb{R}^m$, $i \in \{1, ..., n\}$ — $\mathbf{X}$
  • responses: $y_i \in \mathbb{R}$, $i \in \{1, ..., n\}$ — $\mathbf{y}$
  • weights, bias: $w_j, \beta \in \mathbb{R}$, $j \in \{1, ..., m\}$ — $\mathbf{w}_* = [\mathbf{w}; \beta]$

Ordinary Least Squares (OLS):

$$\underset{\mathbf{w},\,\beta}{\operatorname{argmin}} \sum_{i=1}^{n} \Big( y_i - \beta - \sum_{j=1}^{m} x_{ij} w_j \Big)^2$$

In matrix form:

$$\underset{\mathbf{w}_*}{\operatorname{argmin}} \; \| \mathbf{X}_* \mathbf{w}_* - \mathbf{y} \|_{\ell_2}^2, \quad \text{where } \mathbf{X}_* = [\mathbf{X} \;\; \mathbf{1}] \;\Rightarrow\; \mathbf{w}_* = \big( \mathbf{X}_*^{\mathrm{T}} \mathbf{X}_* \big)^{-1} \mathbf{X}_*^{\mathrm{T}} \mathbf{y}$$

SLIDE 5

Regression basics — Ordinary Least Squares (2/2)

  • observations: $\mathbf{x}_i \in \mathbb{R}^m$, $i \in \{1, ..., n\}$ — $\mathbf{X}$
  • responses: $y_i \in \mathbb{R}$, $i \in \{1, ..., n\}$ — $\mathbf{y}$
  • weights, bias: $w_j, \beta \in \mathbb{R}$, $j \in \{1, ..., m\}$ — $\mathbf{w}_* = [\mathbf{w}; \beta]$

Ordinary Least Squares (OLS):

$$\underset{\mathbf{w}_*}{\operatorname{argmin}} \; \| \mathbf{X}_* \mathbf{w}_* - \mathbf{y} \|_{\ell_2}^2 \;\Rightarrow\; \mathbf{w}_* = \big( \mathbf{X}_*^{\mathrm{T}} \mathbf{X}_* \big)^{-1} \mathbf{X}_*^{\mathrm{T}} \mathbf{y}$$

Why not?
  − $\mathbf{X}_*^{\mathrm{T}} \mathbf{X}_*$ may be singular (thus difficult to invert)
  − high-dimensional models are difficult to interpret
  − unsatisfactory prediction accuracy (estimates have large variance)
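A minimal sketch of the closed-form OLS solution above, assuming a noise-free synthetic problem (the data, dimensions, and seed are illustrative, not from the talk): a bias column of ones is appended to the design matrix and the normal equations are solved.

```python
import numpy as np

# OLS sketch: append a bias column of ones to X and solve the
# normal equations w* = (X*^T X*)^{-1} X*^T y.
rng = np.random.default_rng(0)
n, m = 50, 3
X = rng.normal(size=(n, m))
true_w, true_beta = np.array([1.0, -2.0, 0.5]), 3.0
y = X @ true_w + true_beta                        # noise-free responses

X_star = np.hstack([X, np.ones((n, 1))])          # X* = [X 1]
w_star = np.linalg.solve(X_star.T @ X_star, X_star.T @ y)
w, beta = w_star[:m], w_star[m]
print(np.round(w, 3), round(beta, 3))             # recovers true_w and true_beta
```

With noise-free data and a full-rank design, the recovered weights match the generating ones exactly; the singularity and variance issues listed on the slide only bite once columns are collinear or $m$ approaches $n$.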

SLIDE 6

Regression basics — Ridge Regression (1/2)

  • observations: $\mathbf{x}_i \in \mathbb{R}^m$, $i \in \{1, ..., n\}$ — $\mathbf{X}$
  • responses: $y_i \in \mathbb{R}$, $i \in \{1, ..., n\}$ — $\mathbf{y}$
  • weights, bias: $w_j, \beta \in \mathbb{R}$, $j \in \{1, ..., m\}$ — $\mathbf{w}_* = [\mathbf{w}; \beta]$

Ridge Regression (RR) (Hoerl & Kennard, 1970):

$$\underset{\mathbf{w},\,\beta}{\operatorname{argmin}} \left\{ \sum_{i=1}^{n} \Big( y_i - \beta - \sum_{j=1}^{m} x_{ij} w_j \Big)^2 + \lambda \sum_{j=1}^{m} w_j^2 \right\}$$

In matrix form:

$$\underset{\mathbf{w}_*}{\operatorname{argmin}} \; \| \mathbf{X}_* \mathbf{w}_* - \mathbf{y} \|_{\ell_2}^2 + \lambda \| \mathbf{w} \|_{\ell_2}^2 \;\Rightarrow\; \mathbf{w}_* = \big( \underbrace{\mathbf{X}_*^{\mathrm{T}} \mathbf{X}_* + \lambda \mathbf{I}}_{\text{non-singular}} \big)^{-1} \mathbf{X}_*^{\mathrm{T}} \mathbf{y}$$

SLIDE 7

Regression basics — Ridge Regression (2/2)

  • observations: $\mathbf{x}_i \in \mathbb{R}^m$, $i \in \{1, ..., n\}$ — $\mathbf{X}$
  • responses: $y_i \in \mathbb{R}$, $i \in \{1, ..., n\}$ — $\mathbf{y}$
  • weights, bias: $w_j, \beta \in \mathbb{R}$, $j \in \{1, ..., m\}$ — $\mathbf{w}_* = [\mathbf{w}; \beta]$

Ridge Regression (RR):

$$\underset{\mathbf{w}_*}{\operatorname{argmin}} \; \| \mathbf{X}_* \mathbf{w}_* - \mathbf{y} \|_{\ell_2}^2 + \lambda \| \mathbf{w} \|_{\ell_2}^2$$

  + size constraint on the weight coefficients (regularisation) → resolves problems caused by collinear variables
  + fewer degrees of freedom, better predictive accuracy than OLS
  − does not perform feature selection (all coefficients remain nonzero)
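A sketch of the ridge closed form above, on a deliberately collinear design (a duplicated column, so $\mathbf{X}^{\mathrm{T}}\mathbf{X}$ is singular); the setup is illustrative, not from the talk. Note the slide's closed form penalises the whole of $\mathbf{w}_*$ including the bias, which this sketch follows for simplicity.

```python
import numpy as np

# Ridge sketch: adding lambda*I makes X*^T X* + lambda*I invertible
# even when columns of X are exactly collinear.
rng = np.random.default_rng(1)
n, lam = 40, 0.1
X = rng.normal(size=(n, 3))
X = np.hstack([X, X[:, :1]])                      # duplicate column 0: X^T X is singular
y = X @ np.array([1.0, 0.0, 2.0, 1.0]) + 0.5

X_star = np.hstack([X, np.ones((n, 1))])
A = X_star.T @ X_star + lam * np.eye(X_star.shape[1])
w_star = np.linalg.solve(A, X_star.T @ y)
# by symmetry, the l2 penalty splits weight equally across duplicated columns
print(np.round(w_star, 3))
```

OLS would fail here (the normal equations have no unique solution); the $\ell_2$ penalty resolves the ambiguity by sharing weight equally between the identical columns.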

SLIDE 8

Regression basics — Lasso

  • observations: $\mathbf{x}_i \in \mathbb{R}^m$, $i \in \{1, ..., n\}$ — $\mathbf{X}$
  • responses: $y_i \in \mathbb{R}$, $i \in \{1, ..., n\}$ — $\mathbf{y}$
  • weights, bias: $w_j, \beta \in \mathbb{R}$, $j \in \{1, ..., m\}$ — $\mathbf{w}_* = [\mathbf{w}; \beta]$

ℓ1-norm regularisation, or lasso (Tibshirani, 1996):

$$\underset{\mathbf{w},\,\beta}{\operatorname{argmin}} \left\{ \sum_{i=1}^{n} \Big( y_i - \beta - \sum_{j=1}^{m} x_{ij} w_j \Big)^2 + \lambda \sum_{j=1}^{m} |w_j| \right\}$$

In matrix form:

$$\underset{\mathbf{w}_*}{\operatorname{argmin}} \; \| \mathbf{X}_* \mathbf{w}_* - \mathbf{y} \|_{\ell_2}^2 + \lambda \| \mathbf{w} \|_{\ell_1}$$

  − no closed-form solution — a quadratic programming problem
  + Least Angle Regression explores the entire regularisation path (Efron et al., 2004)
  + sparse $\mathbf{w}$, interpretability, better performance (Hastie et al., 2009)
  − if m > n, at most n variables can be selected
  − strongly correlated predictors → model-inconsistent (Zhao & Yu, 2006)
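Since the lasso has no closed form, a common solver is proximal gradient descent (ISTA, the non-accelerated relative of the FISTA solver mentioned later in this talk). The sketch below is a minimal illustration on synthetic data, not the solver used in the talk's experiments; the bias term is omitted for brevity.

```python
import numpy as np

def soft_threshold(v, t):
    # proximal operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=2000):
    # ISTA for argmin ||Xw - y||_2^2 + lam * ||w||_1 (no bias term)
    w = np.zeros(X.shape[1])
    L = 2 * np.linalg.norm(X, 2) ** 2             # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = 2 * X.T @ (X @ w - y)
        w = soft_threshold(w - grad / L, lam / L)  # gradient step, then shrink
    return w

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 10))
w_true = np.zeros(10)
w_true[[0, 4]] = [2.0, -3.0]                       # only 2 of 10 features are active
y = X @ w_true
w_hat = lasso_ista(X, y, lam=0.5)
print(np.round(w_hat, 2))                          # sparse: inactive entries driven to 0
```

The soft-thresholding step is what produces exact zeros, i.e. the feature selection the slide contrasts with ridge regression.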

SLIDE 9

Regression basics — Lasso for Text Regression

  • n-gram frequencies: $\mathbf{x}_i \in \mathbb{R}^m$, $i \in \{1, ..., n\}$ — $\mathbf{X}$
  • flu rates: $y_i \in \mathbb{R}$, $i \in \{1, ..., n\}$ — $\mathbf{y}$
  • weights, bias: $w_j, \beta \in \mathbb{R}$, $j \in \{1, ..., m\}$ — $\mathbf{w}_* = [\mathbf{w}; \beta]$

ℓ1-norm regularisation, or lasso:

$$\underset{\mathbf{w}_*}{\operatorname{argmin}} \; \| \mathbf{X}_* \mathbf{w}_* - \mathbf{y} \|_{\ell_2}^2 + \lambda \| \mathbf{w} \|_{\ell_1}$$

Selected stemmed features: ‘unwel’, ‘temperatur’, ‘headach’, ‘appetit’, ‘symptom’, ‘diarrhoea’, ‘muscl’, ‘feel’, ...

[Figure 1: Flu rate predictions for the UK by applying lasso on Twitter data — HPA rates vs inferred rates over time (Lampos & Cristianini, 2010)]

SLIDE 10

Regression basics — Elastic Net

  • observations: $\mathbf{x}_i \in \mathbb{R}^m$, $i \in \{1, ..., n\}$ — $\mathbf{X}$
  • responses: $y_i \in \mathbb{R}$, $i \in \{1, ..., n\}$ — $\mathbf{y}$
  • weights, bias: $w_j, \beta \in \mathbb{R}$, $j \in \{1, ..., m\}$ — $\mathbf{w}_* = [\mathbf{w}; \beta]$

[Linear] Elastic Net (LEN) (Zou & Hastie, 2005):

$$\underset{\mathbf{w}_*}{\operatorname{argmin}} \Big\{ \underbrace{\| \mathbf{X}_* \mathbf{w}_* - \mathbf{y} \|_{\ell_2}^2}_{\text{OLS}} + \underbrace{\lambda_1 \| \mathbf{w} \|_{\ell_2}^2}_{\text{RR reg.}} + \underbrace{\lambda_2 \| \mathbf{w} \|_{\ell_1}}_{\text{lasso reg.}} \Big\}$$

  + a ‘compromise’ between ridge regression (handles collinear predictors) and lasso (favours sparsity)
  + the entire regularisation path can be explored by modifying LAR
  + if m > n, the number of selected variables is not limited to n
  − may select redundant variables!

SLIDE 11

Would a slightly different text regression approach be more suitable for Social Media content?

SLIDE 12

About Twitter (1/2)

Tweet Examples

@PaulLondon: I would strongly support a coalition government. It is the best thing for our country right now. #electionsUK2010
@JohnsonMP: Socialism is something forgotten in our country #supportLabour
@FarageNOT: Far-right ‘movements’ come along with crises in capitalism #UKIP
@JohnK 1999: RT @HannahB: Stop talking about politics and listen to Justin!! Bieber rules, peace and love ♥ ♥ ♥

The Twitter basics:

  • 140 characters per status (tweet)
  • users can follow and be followed
  • embedded usage of topics (#elections)
  • retweets (RT), @replies, @mentions, favourites
  • real-time nature
  • biased user demographics

SLIDE 13

About Twitter (2/2)

Tweet Examples

@PaulLondon: I would strongly support a coalition government. It is the best thing for our country right now. #electionsUK2010
@JohnsonMP: Socialism is something forgotten in our country #supportLabour
@FarageNOT: Far-right ‘movements’ come along with crises in capitalism #UKIP
@JohnK 1999: RT @HannahB: Stop talking about politics and listen to Justin!! Bieber rules, peace and love ♥ ♥ ♥

  • contains a vast amount of information about various topics
  • this information ($\mathbf{X}$) can be used to assist predictions ($\mathbf{y}$) (Lampos & Cristianini, 2012; Sakaki et al., 2010; Bollen et al., 2011)

  − $f: \mathbf{X} \rightarrow \mathbf{y}$, where f usually formulates a linear regression task
  − $\mathbf{X}$ represents word frequencies only...
  + is it possible to incorporate a user contribution somehow?

word selection + user selection

SLIDE 14

Bilinear Text Regression

SLIDE 15

Bilinear Text Regression — The general idea (1/2)

Linear regression: $f(\mathbf{x}_i) = \mathbf{x}_i^{\mathrm{T}} \mathbf{w} + \beta$

  • observations: $\mathbf{x}_i \in \mathbb{R}^m$, $i \in \{1, ..., n\}$ — $\mathbf{X}$
  • responses: $y_i \in \mathbb{R}$, $i \in \{1, ..., n\}$ — $\mathbf{y}$
  • weights, bias: $w_j, \beta \in \mathbb{R}$, $j \in \{1, ..., m\}$ — $\mathbf{w}_* = [\mathbf{w}; \beta]$

Bilinear regression: $f(\mathbf{Q}_i) = \mathbf{u}^{\mathrm{T}} \mathbf{Q}_i \mathbf{w} + \beta$

  • users: $p \in \mathbb{Z}^+$
  • observations: $\mathbf{Q}_i \in \mathbb{R}^{p \times m}$, $i \in \{1, ..., n\}$ — $\mathbf{X}$
  • responses: $y_i \in \mathbb{R}$, $i \in \{1, ..., n\}$ — $\mathbf{y}$
  • weights, bias: $u_k, w_j, \beta \in \mathbb{R}$, $k \in \{1, ..., p\}$, $j \in \{1, ..., m\}$ — $\mathbf{u}, \mathbf{w}, \beta$

SLIDE 16

Bilinear Text Regression — The general idea (2/2)

  • users: $p \in \mathbb{Z}^+$
  • observations: $\mathbf{Q}_i \in \mathbb{R}^{p \times m}$, $i \in \{1, ..., n\}$ — $\mathbf{X}$
  • responses: $y_i \in \mathbb{R}$, $i \in \{1, ..., n\}$ — $\mathbf{y}$
  • weights, bias: $u_k, w_j, \beta \in \mathbb{R}$, $k \in \{1, ..., p\}$, $j \in \{1, ..., m\}$ — $\mathbf{u}, \mathbf{w}, \beta$

$$f(\mathbf{Q}_i) = \mathbf{u}^{\mathrm{T}} \mathbf{Q}_i \mathbf{w} + \beta$$

[Diagram: $\mathbf{u}^{\mathrm{T}} \times \mathbf{Q}_i \times \mathbf{w} + \beta$]
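The bilinear prediction above can be sketched in a few lines; the dimensions and data below are illustrative. It also shows the key structural fact: $\mathbf{u}^{\mathrm{T}} \mathbf{Q}_i \mathbf{w}$ is equivalent to a linear model over all $p \times m$ entries of $\mathbf{Q}_i$ whose weight matrix is constrained to the rank-1 outer product $\mathbf{u}\mathbf{w}^{\mathrm{T}}$.

```python
import numpy as np

# Bilinear prediction: each observation is a user-by-word matrix Q_i
# (Q[k, j] = frequency of word j as used by user k), and the response is
# f(Q_i) = u^T Q_i w + beta: words weighted by w, users weighted by u.
rng = np.random.default_rng(3)
p, m = 4, 6                                        # users, words
Q = rng.random(size=(p, m))
u = rng.normal(size=p)                             # user weights
w = rng.normal(size=m)                             # word weights
beta = 0.1

y_hat = u @ Q @ w + beta
# equivalently: a linear model over Q's p*m entries with rank-1 weights outer(u, w)
y_lin = np.outer(u, w).ravel() @ Q.ravel() + beta
print(np.isclose(y_hat, y_lin))                    # True
```

The rank-1 constraint is what makes the model learn only $p + m$ weights instead of $p \times m$, which is the point of separating word selection from user selection.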

SLIDE 17

Bilinear Text Regression — Regularisation

  • users: $p \in \mathbb{Z}^+$
  • observations: $\mathbf{Q}_i \in \mathbb{R}^{p \times m}$, $i \in \{1, ..., n\}$ — $\mathbf{X}$
  • responses: $y_i \in \mathbb{R}$, $i \in \{1, ..., n\}$ — $\mathbf{y}$
  • weights, bias: $u_k, w_j, \beta \in \mathbb{R}$, $k \in \{1, ..., p\}$, $j \in \{1, ..., m\}$ — $\mathbf{u}, \mathbf{w}, \beta$

$$\underset{\mathbf{u},\,\mathbf{w},\,\beta}{\operatorname{argmin}} \sum_{i=1}^{n} \big( \mathbf{u}^{\mathrm{T}} \mathbf{Q}_i \mathbf{w} + \beta - y_i \big)^2 + \psi(\mathbf{u}, \theta_u) + \psi(\mathbf{w}, \theta_w)$$

  • $\psi(\cdot)$: regularisation function with a set of hyper-parameters $\theta$
  • if $\psi(\mathbf{v}, \lambda) = \lambda \| \mathbf{v} \|_{\ell_1}$: Bilinear Lasso
  • if $\psi(\mathbf{v}, \lambda_1, \lambda_2) = \lambda_1 \| \mathbf{v} \|_{\ell_2}^2 + \lambda_2 \| \mathbf{v} \|_{\ell_1}$: Bilinear Elastic Net (BEN) (Lampos et al., 2013)

SLIDE 18

Bilinear Elastic Net (BEN)

BEN's objective function:

$$\underset{\mathbf{u},\,\mathbf{w},\,\beta}{\operatorname{argmin}} \sum_{i=1}^{n} \big( \mathbf{u}^{\mathrm{T}} \mathbf{Q}_i \mathbf{w} + \beta - y_i \big)^2 + \lambda_{u_1} \| \mathbf{u} \|_{\ell_2}^2 + \lambda_{u_2} \| \mathbf{u} \|_{\ell_1} + \lambda_{w_1} \| \mathbf{w} \|_{\ell_2}^2 + \lambda_{w_2} \| \mathbf{w} \|_{\ell_1}$$

[Figure 2: Objective function value and RMSE (on held-out data) across the model's iterations]

  • bi-convexity: fix $\mathbf{u}$, learn $\mathbf{w}$, and vice versa
  • iterating through convex optimisation tasks: convergence (Al-Khayyal & Falk, 1983; Horst & Tuy, 1996)
  • FISTA (Beck & Teboulle, 2009), as implemented in SPAMS (Mairal et al., 2010): a large-scale optimisation solver with quick convergence
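The alternating scheme above can be sketched compactly. For simplicity this uses only the $\ell_2$ part of BEN's penalty, so each half-step is a ridge problem with a closed form; the actual method also carries the $\ell_1$ terms and solves each convex sub-problem with FISTA. Data, dimensions, and hyper-parameters are illustrative assumptions, and the bias term is omitted.

```python
import numpy as np

def fit_bilinear_ridge(Qs, y, lam_u=0.1, lam_w=0.1, n_iter=30):
    # Bi-convex alternation: with u fixed, the model is linear in w
    # (design rows u^T Q_i); with w fixed, it is linear in u (rows (Q_i w)^T).
    p, m = Qs[0].shape
    u, w = np.ones(p), np.ones(m)
    for _ in range(n_iter):
        Xw = np.stack([u @ Q for Q in Qs])                  # fix u, learn w
        w = np.linalg.solve(Xw.T @ Xw + lam_w * np.eye(m), Xw.T @ y)
        Xu = np.stack([Q @ w for Q in Qs])                  # fix w, learn u
        u = np.linalg.solve(Xu.T @ Xu + lam_u * np.eye(p), Xu.T @ y)
    return u, w

rng = np.random.default_rng(4)
Qs = [rng.random(size=(5, 8)) for _ in range(40)]
u_true, w_true = rng.normal(size=5), rng.normal(size=8)
y = np.array([u_true @ Q @ w_true for Q in Qs])             # noise-free bilinear data
u_hat, w_hat = fit_bilinear_ridge(Qs, y)
resid = np.array([u_hat @ Q @ w_hat for Q in Qs]) - y
print(round(float(np.mean(resid**2)), 4))                   # small residual
```

Each half-step can only decrease the (convex-in-one-block) objective, which is the monotone-convergence argument the slide cites; there is no guarantee of reaching a global optimum of the joint problem.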

SLIDE 19

Multi-Task Learning

SLIDE 20

Multi-Task Learning

What
  • instead of learning/optimising a single task (one target variable)...
  • ...optimise multiple tasks jointly

Why (Caruana, 1997)
  • improves generalisation performance by exploiting domain-specific information of related tasks
  • a good choice for under-sampled distributions — knowledge transfer
  • application-driven reasons (e.g. explore the interplay between political parties)

How
  • multi-task regularised regression

SLIDE 21

The ℓ2,1-norm regularisation

$$\| \mathbf{W} \|_{2,1} = \sum_{j=1}^{m} \| \mathbf{W}^j \|_{\ell_2}, \quad \text{where } \mathbf{W}^j \text{ denotes the } j\text{-th row of } \mathbf{W}$$

ℓ2,1-norm regularisation:

$$\underset{\mathbf{W},\,\boldsymbol{\beta}}{\operatorname{argmin}} \left\{ \| \mathbf{X} \mathbf{W} - \mathbf{Y} \|_{F}^2 + \lambda \sum_{j=1}^{m} \| \mathbf{W}^j \|_{\ell_2} \right\}$$

  • multi-task learning: instead of $\mathbf{w} \in \mathbb{R}^m$, learn $\mathbf{W} \in \mathbb{R}^{m \times \tau}$, where τ is the number of tasks
  • ℓ2,1-norm regularisation, i.e. the sum of $\mathbf{W}$'s row ℓ2-norms (Argyriou et al., 2008; Liu et al., 2009), extends the notion of group lasso (Yuan & Lin, 2006)
  • group lasso: instead of single variables, selects groups of variables
  • the ‘groups’ now become the τ-dimensional rows of $\mathbf{W}$
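A small sketch of the ℓ2,1 norm defined above, on an illustrative weight matrix: it is just the sum of row ℓ2-norms, and penalising it zeroes out entire rows, i.e. drops a feature from every task at once.

```python
import numpy as np

# l2,1-norm: sum of the l2 norms of W's rows. Penalising it drives whole
# rows (one feature across all tau tasks) to zero jointly.
rng = np.random.default_rng(5)
m, tau = 6, 3
W = rng.normal(size=(m, tau))
W[[1, 4], :] = 0.0                                 # two features unused by every task

l21 = np.sum(np.linalg.norm(W, axis=1))            # sum_j ||W^j||_2
print(round(float(l21), 3))

# unlike an elementwise l1 penalty, the row norms couple a feature's
# weights across tasks: a feature is either active somewhere or fully off
active_rows = np.linalg.norm(W, axis=1) > 0
print(int(active_rows.sum()))                      # 4 of 6 features active
```

This row-wise coupling is exactly the group-lasso behaviour the slide describes, with each row of $\mathbf{W}$ acting as one group.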

SLIDE 22

Bilinear + Multi-Task Learning

SLIDE 23

Bilinear Multi-Task Learning

  • tasks: $\tau \in \mathbb{Z}^+$
  • users: $p \in \mathbb{Z}^+$
  • observations: $\mathbf{Q}_i \in \mathbb{R}^{p \times m}$, $i \in \{1, ..., n\}$ — $\mathbf{X}$
  • responses: $\mathbf{y}_i \in \mathbb{R}^{\tau}$, $i \in \{1, ..., n\}$ — $\mathbf{Y}$
  • weights, bias: $\mathbf{u}_k, \mathbf{w}_j, \boldsymbol{\beta} \in \mathbb{R}^{\tau}$, $k \in \{1, ..., p\}$, $j \in \{1, ..., m\}$ — $\mathbf{U}, \mathbf{W}, \boldsymbol{\beta}$

$$f(\mathbf{Q}_i) = \operatorname{tr}\big( \mathbf{U}^{\mathrm{T}} \mathbf{Q}_i \mathbf{W} \big) + \boldsymbol{\beta}$$

[Diagram: $\mathbf{U}^{\mathrm{T}} \times \mathbf{Q}_i \times \mathbf{W}$]
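A sketch of the multi-task prediction, with illustrative dimensions: task $t$'s prediction is $\mathbf{u}_t^{\mathrm{T}} \mathbf{Q}_i \mathbf{w}_t + \beta_t$, i.e. the per-task predictions sit on the diagonal of $\mathbf{U}^{\mathrm{T}} \mathbf{Q}_i \mathbf{W}$ (the trace on the slide sums that diagonal over tasks).

```python
import numpy as np

# Multi-task bilinear prediction: column t of U holds task t's user
# weights, column t of W its word weights.
rng = np.random.default_rng(6)
tau, p, m = 3, 4, 5                                # tasks (e.g. parties), users, words
Q = rng.random(size=(p, m))
U = rng.normal(size=(p, tau))
W = rng.normal(size=(m, tau))
beta = rng.normal(size=tau)

y_hat = np.diag(U.T @ Q @ W) + beta                # one prediction per task
y_loop = np.array([U[:, t] @ Q @ W[:, t] + beta[t] for t in range(tau)])
print(np.allclose(y_hat, y_loop))                  # True
```

The off-diagonal entries of $\mathbf{U}^{\mathrm{T}} \mathbf{Q}_i \mathbf{W}$ (one task's users paired with another task's words) play no role in prediction; coupling between tasks comes only through the joint regulariser.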

SLIDE 24

Bilinear Group ℓ2,1 (BGL) (1/2)

  • tasks: $\tau \in \mathbb{Z}^+$
  • users: $p \in \mathbb{Z}^+$
  • observations: $\mathbf{Q}_i \in \mathbb{R}^{p \times m}$, $i \in \{1, ..., n\}$ — $\mathbf{X}$
  • responses: $\mathbf{y}_i \in \mathbb{R}^{\tau}$, $i \in \{1, ..., n\}$ — $\mathbf{Y}$
  • weights, bias: $\mathbf{u}_k, \mathbf{w}_j, \boldsymbol{\beta} \in \mathbb{R}^{\tau}$, $k \in \{1, ..., p\}$, $j \in \{1, ..., m\}$ — $\mathbf{U}, \mathbf{W}, \boldsymbol{\beta}$

$$\underset{\mathbf{U},\,\mathbf{W},\,\boldsymbol{\beta}}{\operatorname{argmin}} \sum_{t=1}^{\tau} \sum_{i=1}^{n} \big( \mathbf{u}_t^{\mathrm{T}} \mathbf{Q}_i \mathbf{w}_t + \beta_t - y_{ti} \big)^2 + \lambda_u \sum_{k=1}^{p} \| \mathbf{U}_k \|_2 + \lambda_w \sum_{j=1}^{m} \| \mathbf{W}^j \|_2$$

  • BGL can be broken into 2 convex tasks: first learn {$\mathbf{W}, \boldsymbol{\beta}$}, then {$\mathbf{U}, \boldsymbol{\beta}$} and vice versa, iterating through this process

SLIDE 25

Bilinear Group ℓ2,1 (BGL) (2/2)

$$\underset{\mathbf{U},\,\mathbf{W},\,\boldsymbol{\beta}}{\operatorname{argmin}} \sum_{t=1}^{\tau} \sum_{i=1}^{n} \big( \mathbf{u}_t^{\mathrm{T}} \mathbf{Q}_i \mathbf{w}_t + \beta_t - y_{ti} \big)^2 + \lambda_u \sum_{k=1}^{p} \| \mathbf{U}_k \|_2 + \lambda_w \sum_{j=1}^{m} \| \mathbf{W}^j \|_2$$

[Diagram: $\mathbf{U}^{\mathrm{T}} \times \mathbf{Q}_i \times \mathbf{W}$]

  • a feature (user/word) is selected for all tasks (not just one), but possibly with different weights
  • especially useful in the domain of politics (e.g. a user pro party A, against party B)

SLIDE 26

Voting Intention Modelling

(Lampos et al., 2013)

SLIDE 27

Political Opinion/Voting Intention Mining — Brief Recap

Primary papers
  • predict the result of an election via Twitter (Tumasjan et al., 2010)
  • model socio-political sentiment polls (O'Connor et al., 2010)
  • the above 2 failed on the 2009 US congressional elections (Gayo-Avello et al., 2011)
  • desired properties of such models (Metaxas et al., 2011)

Features — reviewed in (Gayo-Avello, 2013)
  • lexicon-based, e.g. using LIWC (Tausczik & Pennebaker, 2010)
  • task-specific keywords (names of parties, politicians)
  • tweet volume

  − political descriptors change in time and differ per country
  − personalised modelling (present in actual polls) is missing
  − multi-task learning?

SLIDE 28

Voting Intention Modelling — Data (United Kingdom)

  • 42K users, distributed proportionally to regional population figures
  • 60M tweets from 30/04/2010 to 13/02/2012
  • 80,976 unigrams (word features)
  • 240 voting intention polls (YouGov)
  • 3 parties: Conservatives (CON), Labour Party (LAB), Liberal Democrats (LIB)
  • main language: English

[Figure 3: Voting intention time series for the UK (YouGov) — CON, LAB, LIB]

SLIDE 29

Voting Intention Modelling — Data (Austria)

  • 1.1K users, manually selected by Austrian political analysts (SORA)
  • 800K tweets from 25/01/2012 to 01/12/2012
  • 22,917 unigrams (word features)
  • 98 voting intention polls from various pollsters
  • 4 parties: Social Democratic Party (SPÖ), People's Party (ÖVP), Freedom Party (FPÖ), Green Alternative Party (GRÜ)
  • main language: German

[Figure 4: Voting intention time series for Austria — SPÖ, ÖVP, FPÖ, GRÜ]

SLIDE 30

Voting Intention Modelling — Evaluation

  • 10-fold validation:
    − train a model using data based on a set of contiguous polls A
    − test on the next |D| = 5 polls
    − expand the training set to {A ∪ D}, test on the next |D′| = 5 polls
  • realistic scenario: train on past polls, predict future ones
  • overall, predictions are tested on 50 polls (in each case study)

Baselines
  • Bµ: constant prediction based on µ(y) in the training set
  • Blast: constant prediction based on last(y) in the training set
  • LEN: (linear) Elastic Net prediction (using word frequencies)
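The evaluation scheme above can be sketched as an expanding-window split. The index arithmetic below is an assumption chosen so that 10 folds of 5 polls cover the 50 test polls mentioned on the slide; the paper's exact fold boundaries may differ.

```python
# Expanding-window evaluation sketch: train on a contiguous block of polls,
# test on the next `step` polls, fold the test block into the training set,
# and repeat for n_folds rounds.
def expanding_window_folds(n_polls, n_initial, step=5, n_folds=10):
    folds = []
    train_end = n_initial
    for _ in range(n_folds):
        test = list(range(train_end, min(train_end + step, n_polls)))
        if not test:
            break
        folds.append((list(range(train_end)), test))
        train_end += step                          # expand training set by the test block
    return folds

# illustrative numbers: 240 UK polls, first 190 used for the initial training set
folds = expanding_window_folds(n_polls=240, n_initial=190)
print(len(folds), folds[0][1], folds[-1][1])       # 10 folds of 5 test polls each
```

The key property is that every test poll is strictly later than all of its training polls, which is what makes the setting "train on past, predict future".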

SLIDE 31

Voting Intention Modelling — Performance tables

Average RMSEs on the voting intention percentage predictions in the 10-step validation process (best result per column was highlighted in the original).

Table 1: UK case study

          CON     LAB     LIB     µ
  Bµ      2.272   1.663   1.136   1.690
  Blast   2.000   2.074   1.095   1.723
  LEN     3.845   2.912   2.445   3.067
  BEN     1.939   1.644   1.136   1.573
  BGL     1.785   1.595   1.054   1.478

Table 2: Austrian case study

          SPÖ     ÖVP     FPÖ     GRÜ     µ
  Bµ      1.535   1.373   3.300   1.197   1.851
  Blast   1.148   1.556   1.639   1.536   1.470
  LEN     1.291   1.286   2.039   1.152   1.442
  BEN     1.392   1.310   2.890   1.205   1.699
  BGL     1.619   1.005   1.757   1.374   1.439

SLIDE 32

Voting Intention Modelling — Prediction figures

[Figure 5: Performance figures for BEN and BGL against the polls in the UK (CON, LAB, LIB) and Austria (SPÖ, ÖVP, FPÖ, GRÜ) case studies]

SLIDE 33

Voting Intention Modelling — Qualitative evaluation

Party — Tweet — Score — Author

CON — "PM in friendly chat with top EU mate, Sweden's Fredrik Reinfeldt, before family photo" — 1.334 — Journalist
CON — "Have Liberal Democrats broken electoral rules? Blog on Labour complaint to cabinet secretary" — −0.991 — Journalist
LAB — "I am so pleased to hear Paul Savage who worked for the Labour group has been appointed the Marketing manager for the baths hall GREAT NEWS" — −0.552 — Politician (Labour)
LIB — "RT @user: Must be awful for TV bosses to keep getting knocked back by all the women they ask to host election night (via @user)" — 0.874 — LibDem MP
SPÖ — "Inflationsrate in Ö. im Juli leicht gesunken: von 2,2 auf 2,1%. Teurer wurde Wohnen, Wasser, Energie." (Translation: Inflation rate in Austria slightly down in July, from 2.2 to 2.1%. Accommodation, water, energy more expensive.) — 0.745 — Journalist
ÖVP — "kann das buch 'res publica' von johannes #voggenhuber wirklich empfehlen! so zum nachdenken und so... #europa #demokratie" (Translation: can really recommend the book 'res publica' by johannes #voggenhuber! Food for thought and so on #europe #democracy) — −2.323 — User
FPÖ — "Neue Kampagne der #Krone zur #Wehrpflicht: 'GIB BELLO EINE STIMME!'" (Translation: New campaign by the #Krone on #Conscription: 'GIVE WOOFY A VOICE!') — 7.44 — Political satire
GRÜ — "Protestsong gegen die Abschaffung des Bachelor-Studiums Internationale Entwicklung: <link> #IEbleibt #unibrennt #uniwut" (Translation: Protest song against the closing-down of the bachelor course of International Development: <link> #IDremains #uniburns #unirage) — 1.45 — Student Union

Table 3: Scored tweet examples from both case studies using BGL

SLIDE 34

Extracting Socioeconomic Patterns from the News

(Lampos et al., 2014)

SLIDE 35

Socioeconomic Patterns — Data

News Summaries
  • Open Europe Think Tank: summaries of news articles on the EU or member countries (focus on politics, perhaps right-wing biased!)
  • from February 2006 to mid-November 2013 — 1,913 days, or 94 months, or ~8 years
  • involving 435 international news outlets
  • extracted 8,413 unigrams and 19,045 bigrams

Socioeconomic Indicators
  • EU Economic Sentiment Indicator (ESI)
    → predictor for future economic developments (Gelper & Croux, 2010)
    → consists of 5 weighted confidence sub-indicators: industrial (40%), services (30%), consumer (20%), construction (5%), retail trade (5%)
  • EU Unemployment — seasonally adjusted ratio of the non-employed over the entire EU labour force

SLIDE 36

Socioeconomic Patterns — Task description

Qualitative differences to voting intention modelling:
  • the aim is NOT to predict socioeconomic indicators
  • instead, characterise news by conducting a supervised analysis on them, driven by socioeconomic factors
  • use predictive performance as an informal guarantee that the model is reasonable
  • the better the predictive performance, the more trustworthy the extracted patterns should be

Slightly modified BEN:

$$\underset{\mathbf{u} \succeq 0,\,\mathbf{w},\,\beta}{\operatorname{argmin}} \sum_{i=1}^{n} \big( \mathbf{u}^{\mathrm{T}} \mathbf{Q}_i \mathbf{w} + \beta - y_i \big)^2 + \lambda_{u_1} \| \mathbf{u} \|_{\ell_2}^2 + \lambda_{u_2} \| \mathbf{u} \|_{\ell_1} + \lambda_{w_1} \| \mathbf{w} \|_{\ell_2}^2 + \lambda_{w_2} \| \mathbf{w} \|_{\ell_1}$$

  • the constraint $\min(\mathbf{u}) \geq 0$ enhances weight interpretability for both news outlets and n-grams

SLIDE 37

Socioeconomic Patterns — Predictive performance

  • similar evaluation setting as in voting intention prediction
  • differences: the time frame is now a month; train using a moving window of 64 contiguous months, test on the next 3 months
  • predictions are made for a total of 30 months

[Figure 6: Monthly rates of EU-wide ESI and Unemployment, together with BEN's predictions for the last 30 months]

Table 4: 10-fold validation average RMSEs (and error rates) for LEN and BEN on ESI and unemployment rate prediction

          ESI             Unemployment
  LEN     9.253 (9.89%)   0.9275 (8.75%)
  BEN     8.209 (8.77%)   0.9047 (8.52%)

SLIDE 38

Socioeconomic Patterns — Qualitative analysis (ESI)

[Figure 7: Visualisation of BEN's outputs for the EU's ESI in the last fold (i.e. model trained on 64 months up to August 2013). The word cloud depicts the top-60 positively and top-60 negatively weighted n-grams (120 in total), together with the top-30 outlets; legend encodes frequency, weight, and polarity.]

SLIDE 39

Socioeconomic Patterns — Qualitative analysis (Unempl.)

[Figure 8: Visualisation of BEN's outputs for EU Unemployment in the last fold (i.e. model trained on 64 months up to August 2013). The word cloud depicts the top-60 positively and top-60 negatively weighted n-grams (120 in total), together with the top-30 outlets; legend encodes frequency, weight, and polarity.]

SLIDE 40

Conclusions

  + introduced a new class of methods for bilinear text regression
  + directly applicable to Social Media content, or other types of textual content such as news articles
  + better predictive performance than the linear alternative (in the investigated case studies)
  + extended to bilinear multi-task learning

To do
  − investigate finer-grained modelling settings by applying different regularisation functions (or different combinations of them)
  − further understand the properties of bilinear versus linear text regression, e.g. when and why it is a good choice, or how different combinations of regularisation settings affect performance
  − task-specific improvements

SLIDE 41

In collaboration with

Trevor Cohn, University of Melbourne
Daniel Preoţiuc-Pietro, University of Sheffield
Sina Samangooei, University of Southampton
Douwe Gelling, University of Sheffield

SLIDE 42

Thank you

Any questions?

Download the slides from http://www.lampos.net/research/talks-posters

SLIDE 43

References I

Al-Khayyal and Falk. Jointly Constrained Biconvex Programming. MOR, 1983.
Argyriou, Evgeniou and Pontil. Convex multi-task feature learning. Machine Learning, 2008.
Beck and Teboulle. A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM J. Imaging Sci., 2009.
Bermingham and Smeaton. On using Twitter to monitor political sentiment and predict election results. SAAIP, 2011.
Bollen, Mao and Zeng. Twitter mood predicts the stock market. JCS, 2011.
Caruana. Multitask Learning. Machine Learning, 1997.
Efron, Hastie, Johnstone and Tibshirani. Least Angle Regression. The Annals of Statistics, 2004.
Gayo-Avello. A Meta-Analysis of State-of-the-Art Electoral Prediction From Twitter Data. SSCR, 2013.
Gayo-Avello, Metaxas and Mustafaraj. Limits of Electoral Predictions using Twitter. ICWSM, 2011.
Gelper and Croux. On the construction of the European Economic Sentiment Indicator. OBES, 2010.
Hastie, Tibshirani and Friedman. The Elements of Statistical Learning. 2009.
Hoerl and Kennard. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 1970.

SLIDE 44

References II

Horst and Tuy. Global Optimization: Deterministic Approaches. 1996.
Lampos and Cristianini. Tracking the flu pandemic by monitoring the Social Web. CIP, 2010.
Lampos and Cristianini. Nowcasting Events from the Social Web with Statistical Learning. ACM TIST, 2012.
Lampos, Preoţiuc-Pietro and Cohn. A user-centric model of voting intention from Social Media. ACL, 2013.
Lampos, Preoţiuc-Pietro, Samangooei, Gelling and Cohn. Extracting Socioeconomic Patterns from the News: Modelling Text and Outlet Importance Jointly. ACL LACSS, 2014.
Liu, Ji and Ye. Multi-task feature learning via efficient ℓ2,1-norm minimization. UAI, 2009.
Mairal, Jenatton, Obozinski and Bach. Network Flow Algorithms for Structured Sparsity. NIPS, 2010.
Metaxas, Mustafaraj and Gayo-Avello. How (not) to predict elections. SocialCom, 2011.
O'Connor, Balasubramanyan, Routledge and Smith. From Tweets to polls: Linking text sentiment to public opinion time series. ICWSM, 2010.
Pirsiavash, Ramanan and Fowlkes. Bilinear classifiers for visual recognition. NIPS, 2009.
Quesada and Grossmann. A global optimization algorithm for linear fractional and bilinear programs. JGO, 1995.

SLIDE 45

References III

Sakaki, Okazaki and Matsuo. Earthquake shakes Twitter users: real-time event detection by social sensors. WWW, 2010.
Tausczik and Pennebaker. The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods. JLSP, 2010.
Tibshirani. Regression Shrinkage and Selection via the LASSO. JRSS, 1996.
Tumasjan, Sprenger, Sandner and Welpe. Predicting elections with Twitter: What 140 characters reveal about political sentiment. ICWSM, 2010.
Yuan and Lin. Model selection and estimation in regression with grouped variables. JRSS, 2006.
Zhao and Yu. On model selection consistency of LASSO. JMLR, 2006.
Zou and Hastie. Regularization and variable selection via the elastic net. JRSS, 2005.