Discovering the multifaceted information hidden within large - PowerPoint PPT Presentation

Discovering the multifaceted information hidden within large user-generated text streams Daniel Preotiuc-Pietro daniel@dcs.shef.ac.uk 23.04.2014

Context • vast increase in user-generated content • Online Social Networks: the most time-consuming activity on the Internet • multiple modalities: text, time, location, user info, images, etc. • social network structure • Challenges: • Engineering: data volume • Algorithmic: restricted information, grounded in context, streaming, noise

Motivation Assumption: Text has different use conditioned on factors such as time, location, etc. Aim: Build models which incorporate these factors Tasks: • Supervised prediction applications • internal, external • Study the effect of these factors in text use • Improve performance of downstream applications

Outline i. Introduction ii. Data processing iii. Temporal patterns iv. Text forecasting real-world outcomes v. Spatio-temporal clustering vi. User level properties

TrendMiner project • 'Large scale, cross-lingual trend mining and summarization of real time media streams' • 6+4 organisations; we work with University of Southampton and SORA on machine learning • application to predicting political polls and aiding political analysts to make sense of social media data www.trendminer-project.eu

Text Processing • new conventions • lack of context • creative spellings • shortenings • unorthodox capitalisation • OOV words
Example tweet: RT @MediaScotland greeeat!!!lvly speech by cameron on scott's indy :) #indyref

Processing Architecture • Fast: real time processing, Hadoop MapReduce (I/O bound), online and batch processing • Scalable: adding more machines • Modular: easy to add new modules • Pipeline: the user specifies their needs • Extensible: different sources of data (USMF format) • Data consistency: JSON format, append to 'analysis' • Reusable: open-source (ICWSM 2012)

Components

Gaussian Processes Task: Forecast hashtag frequency in Social Media - identify and categorise complex temporal patterns (EMNLP 2013) Non-parametric Bayesian framework • kernelised • probabilistic formulation • propagation of uncertainty • exact posterior inference for regression • Non-parametric extension of Bayesian regression • very good results, but hardly used in NLP

Gaussian Processes Define prior over functions Compute posterior (ACL 2014 Tutorial)
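The two steps on this slide (define a prior over functions, then compute the posterior) can be sketched for regression as follows. This is a minimal illustration, assuming a squared-exponential (SE) kernel with fixed, hand-picked hyperparameters and toy data; the function names and values are hypothetical and are not taken from the talk.

```python
import numpy as np

def se_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential (SE) covariance between two 1-D input arrays."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale ** 2)

def gp_posterior(x_train, y_train, x_test, noise=0.1, **kernel_args):
    """Exact GP regression: posterior mean and variance at x_test."""
    k_tt = se_kernel(x_train, x_train, **kernel_args)
    k_ts = se_kernel(x_train, x_test, **kernel_args)
    k_ss = se_kernel(x_test, x_test, **kernel_args)
    # Cholesky factor of the training covariance plus observation noise.
    chol = np.linalg.cholesky(k_tt + noise ** 2 * np.eye(len(x_train)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha
    v = np.linalg.solve(chol, k_ts)
    var = np.diag(k_ss) - np.sum(v ** 2, axis=0)
    return mean, var

# Toy data (assumption): noiseless samples of a sine wave.
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = np.sin(x_train)
x_test = np.array([2.0, 2.5, 10.0])
mean, var = gp_posterior(x_train, y_train, x_test, noise=0.05)
```

Far from the training inputs (here x = 10) the posterior falls back to the prior: the mean returns to zero and the variance to the prior variance, which is the propagation of uncertainty mentioned on the previous slide.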

Extrapolation

Examples of time series #FYI #SNOW SE #FAIL #RAW

Experimental results

Experimental results Compared to Mean prediction

Text classification Task: Assign the hashtag to a given tweet • Most frequent (MF) • Naive Bayes model (NB-E) • Naive Bayes with GP forecast as prior (NB-P)

Metric | MF | NB-E | NB-P
Match@1 | 7.28% | 16.04% | 17.39%
Match@5 | 19.90% | 29.51% | 31.91%
Match@50 | 44.92% | 59.17% | 60.85%
MRR | 0.144 | 0.237 | 0.252
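The difference between NB-E and NB-P lies only in the class prior, which a small sketch can make concrete. The code below assumes a multinomial Naive Bayes with add-one smoothing; the training tweets and both priors are made up for illustration, and in the talk the NB-P prior would come from the GP frequency forecast rather than from hand-set numbers.

```python
import math
from collections import Counter, defaultdict

def train_word_counts(tweets):
    """Count word occurrences per hashtag from (hashtag, tokens) pairs."""
    counts = defaultdict(Counter)
    for hashtag, tokens in tweets:
        counts[hashtag].update(tokens)
    return counts

def rank_hashtags(tokens, counts, prior, vocab_size, alpha=1.0):
    """Rank hashtags by log prior plus smoothed multinomial log likelihood."""
    scores = {}
    for hashtag, word_counts in counts.items():
        total = sum(word_counts.values())
        score = math.log(prior[hashtag])
        for tok in tokens:
            score += math.log((word_counts[tok] + alpha) / (total + alpha * vocab_size))
        scores[hashtag] = score
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical training data: one tweet per hashtag.
training = [
    ("#snow", ["cold", "white", "snow"]),
    ("#fail", ["broken", "late", "train"]),
]
counts = train_word_counts(training)
vocab_size = len({tok for _, toks in training for tok in toks})

# Hypothetical forecast-based priors (the NB-P idea): the text evidence for
# this tweet is balanced, so the prior decides the ranking.
ambiguous_tweet = ["train", "snow"]
snowy_day = rank_hashtags(ambiguous_tweet, counts, {"#snow": 0.9, "#fail": 0.1}, vocab_size)
normal_day = rank_hashtags(ambiguous_tweet, counts, {"#snow": 0.1, "#fail": 0.9}, vocab_size)
```

Match@k and MRR are then computed from the position of the true hashtag in such a ranked list.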

User behaviour Task: Predict venue check-in frequencies • Modelled using GPs • Compared to Mean [Figure: results by kernel: Linear, SE, PER, PS, Select]

Individual user behaviour Task: Predict venue type of user check-in • highly periodic • compared to standard Markov predictors (WebScience 2013)

Method | Accuracy
Random | 11.11%
M.Freq Categ. | 35.21%
Markov-1 | 36.13%
Markov-2 | 34.21%
Daily period | 38.92%
Weekly period | 40.65%
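To make the compared predictors concrete, here is a minimal sketch of a first-order Markov predictor and a periodic predictor. It assumes check-ins are given as venue categories with an hour index; the data, the function names, and the fallback rule are illustrative assumptions rather than the setup used in the talk.

```python
from collections import Counter, defaultdict

def fit_markov1(sequence):
    """Count transitions between consecutive venue categories."""
    transitions = defaultdict(Counter)
    for prev, nxt in zip(sequence, sequence[1:]):
        transitions[prev][nxt] += 1
    return transitions

def predict_markov1(transitions, current, fallback):
    """Most frequent successor of the current category, else the fallback."""
    if current not in transitions:
        return fallback
    return transitions[current].most_common(1)[0][0]

def fit_periodic(checkins, period):
    """Count categories per time slot; checkins are (hour_index, category)."""
    by_slot = defaultdict(Counter)
    for hour, category in checkins:
        by_slot[hour % period][category] += 1
    return by_slot

def predict_periodic(by_slot, hour, period, fallback):
    """Most frequent category seen in the same slot of earlier periods."""
    slot = hour % period
    if slot not in by_slot:
        return fallback
    return by_slot[slot].most_common(1)[0][0]

# Hypothetical check-in history for one user.
history = ["Home", "Work", "Food", "Work", "Home", "Work", "Food"]
markov = fit_markov1(history)
next_after_work = predict_markov1(markov, "Work", fallback="Home")

timed = [(9, "Work"), (13, "Food"), (33, "Work"), (37, "Food"), (57, "Work")]
daily = fit_periodic(timed, period=24)
at_nine = predict_periodic(daily, hour=81, period=24, fallback="Home")
```

Setting `period=168` instead of 24 gives the weekly variant when hours are indexed from the start of a week.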

Text based forecasting Task: predicting real world outcomes Aim: replace expensive polls with streaming text • predict political voting intention (not elections!) • based on social media (Twitter) text • strong baselines (last day, mean) • 2 different use cases (UK and Austria) • UK: 42k users, 60m tweets, 3 parties, 2 years (ACL 2013)

Linear regression $\mathbf{w}\mathbf{x}_t + \beta = y_t$

Linear regression $$\mathbf{w}, \beta = \operatorname{argmin} \sum_{i=1}^{n} \left(\mathbf{w}\mathbf{x}_i + \beta - y_i\right)^2$$

Linear regression $$\mathbf{w}, \beta = \operatorname{argmin} \sum_{i=1}^{n} \left(\mathbf{w}\mathbf{x}_i + \beta - y_i\right)^2 + \psi_{el}(\mathbf{w}, \rho)$$ LEN – Elastic Net
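A regularised objective of this kind can be minimised with proximal gradient descent. The sketch below is illustrative only: the slide does not spell out $\psi_{el}$, so the code assumes the common parameterisation $\lambda\left(\rho\lVert\mathbf{w}\rVert_1 + \tfrac{1-\rho}{2}\lVert\mathbf{w}\rVert_2^2\right)$ with a squared loss averaged over the $n$ examples. The solver, the names `lam` and `rho`, and the synthetic data are all assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def elastic_net(X, y, lam=0.1, rho=0.5, n_iter=2000):
    """Proximal gradient (ISTA) for squared loss with an elastic net penalty."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    # Step size from the Lipschitz constant of the smooth part of the objective.
    A = np.hstack([X, np.ones((n, 1))])
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 / n + lam * (1.0 - rho))
    for _ in range(n_iter):
        residual = X @ w + b - y
        grad_w = X.T @ residual / n + lam * (1.0 - rho) * w
        grad_b = residual.mean()
        # Gradient step on the smooth part, then soft-threshold for the L1 part.
        w = soft_threshold(w - step * grad_w, step * lam * rho)
        b = b - step * grad_b  # the intercept is not penalised
    return w, b

# Synthetic data (assumption): only features 0 and 3 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_w = np.zeros(10)
true_w[0], true_w[3] = 2.0, -1.0
y = X @ true_w + 0.5 + 0.01 * rng.normal(size=200)

w, b = elastic_net(X, y, lam=0.1, rho=0.9)
```

The L1 part drives the weights of uninformative features to zero, which is the sparsity property the later bilinear models rely on.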

Bilinear regression • main issue is noise: many non-informative users • we look for a model of sparse words & sparse users • bi-convex optimisation problem • solved by alternatively fixing each set of weights and iterating until convergence

Bilinear regression $\mathbf{u} X_t \mathbf{w}^{T} + \beta = y_t$

Bilinear regression $$\mathbf{w}, \mathbf{u}, \beta = \operatorname{argmin} \sum_{i=1}^{n} \left(\mathbf{u} X_i \mathbf{w}^{T} + \beta - y_i\right)^2$$

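Bilinear regression $$\mathbf{w}, \mathbf{u}, \beta = \operatorname{argmin} \sum_{i=1}^{n} \left(\mathbf{u} X_i \mathbf{w}^{T} + \beta - y_i\right)^2 + \psi_{el}(\mathbf{w}, \rho_1) + \psi_{el}(\mathbf{u}, \rho_2)$$ BEN – Bilinear Elastic Net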
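The alternating scheme (fix one set of weights, solve for the other, repeat) can be sketched as below. To keep the example short, each sub-problem is solved with closed-form ridge regression as a stand-in for the elastic net, so this is not the BEN model itself. The tensor layout (examples × users × words), the function names, and the synthetic data are assumptions.

```python
import numpy as np

def ridge(features, targets, penalty):
    """Closed-form ridge regression with an unpenalised intercept."""
    mean_f, mean_t = features.mean(axis=0), targets.mean()
    centred = features - mean_f
    gram = centred.T @ centred + penalty * np.eye(features.shape[1])
    weights = np.linalg.solve(gram, centred.T @ (targets - mean_t))
    return weights, mean_t - mean_f @ weights

def fit_bilinear(X, y, penalty=1e-3, n_rounds=20):
    """Alternating fit of y_i = u X_i w^T + beta; X has shape (n, users, words)."""
    n, n_users, n_words = X.shape
    u = np.full(n_users, 1.0 / n_users)  # start from uniform user weights
    w, beta = np.zeros(n_words), 0.0
    for _ in range(n_rounds):
        # Fix u: each X_i collapses to the word vector u X_i, linear in w.
        word_features = np.einsum("p,ipm->im", u, X)
        w, beta = ridge(word_features, y, penalty)
        # Fix w: each X_i collapses to the user vector X_i w^T, linear in u.
        user_features = np.einsum("ipm,m->ip", X, w)
        u, beta = ridge(user_features, y, penalty)
    return u, w, beta

def predict(X, u, w, beta):
    """Evaluate u X_i w^T + beta for every example i."""
    return np.einsum("p,ipm,m->i", u, X, w) + beta

# Synthetic data (assumption): 4 users, 5 words, two informative users.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4, 5))
true_u = np.array([1.0, 0.5, 0.0, 0.0])
true_w = np.array([2.0, -1.0, 0.0, 0.0, 1.0])
y = predict(X, true_u, true_w, 0.3) + 0.01 * rng.normal(size=300)

u, w, beta = fit_bilinear(X, y)
error = np.sqrt(np.mean((predict(X, u, w, beta) - y) ** 2))
```

Only the product of the two weight vectors is identified: scaling `u` by a constant and `w` by its inverse leaves every prediction unchanged, so the fitted vectors should be compared through `np.outer(u, w)`.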

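Bilinear regression $$\mathbf{w}_t, \mathbf{u}_t, \beta = \operatorname{argmin} \sum_{i=1}^{n} \left(\mathbf{u}_t X_i \mathbf{w}_t^{T} + \beta - y_{ti}\right)^2 + \psi_{el}(\mathbf{w}_t, \rho_1) + \psi_{el}(\mathbf{u}_t, \rho_2)$$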

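Bilinear regression $$\mathbf{w}, \mathbf{u}, \beta = \operatorname{argmin} \sum_{t=1}^{\tau} \sum_{i=1}^{n} \left(\mathbf{u}_t X_i \mathbf{w}_t^{T} + \beta - y_{ti}\right)^2 + \psi_{\ell_1 \ell_2}(\mathbf{w}, \rho_1) + \psi_{\ell_1 \ell_2}(\mathbf{u}, \rho_2)$$ BGL – Bilinear Group LASSO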
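The effect of an $\ell_1/\ell_2$ group penalty can be shown on a small weight matrix. The slide does not define $\psi_{\ell_1 \ell_2}$, so this sketch assumes the usual mixed norm: with one row per task $t$ and one column per feature, it sums the L2 norms of the columns. The proximal step then keeps or discards a feature jointly for all tasks. The matrix values and function names are illustrative.

```python
import numpy as np

def l1_l2_norm(W):
    """Sum over columns (features) of the L2 norm taken across rows (tasks)."""
    return np.sqrt((W ** 2).sum(axis=0)).sum()

def group_soft_threshold(W, threshold):
    """Proximal operator of threshold * l1_l2_norm: shrink each column jointly."""
    norms = np.sqrt((W ** 2).sum(axis=0))
    scale = np.maximum(1.0 - threshold / np.maximum(norms, 1e-12), 0.0)
    return W * scale

# Two tasks (rows) and three features (columns); values are made up.
W = np.array([[3.0, 0.1, 0.0],
              [4.0, -0.1, 0.0]])
penalty_value = l1_l2_norm(W)
shrunk = group_soft_threshold(W, 0.5)
```

The first feature is strong in both tasks and is only scaled down (norm 5 becomes 4.5), while the weak second feature is set to zero for both tasks at once, so the tasks end up sharing one sparsity pattern.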

Quantitative results Root Mean Squared Error (RMSE) forecasting results over 50 testing polls (in VI %) [Figure panels: Polls, BEN, BGL]
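For reference, RMSE and the two baselines named earlier in the talk (last day, mean) can be computed as in the sketch below. The poll values are invented for illustration and are not results from the talk; the function names are likewise hypothetical.

```python
import math

def rmse(predictions, targets):
    """Root mean squared error between two equally long sequences."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(targets))

def last_value_forecasts(history, targets):
    """Predict each test poll with the most recently observed poll."""
    predictions, previous = [], history[-1]
    for target in targets:
        predictions.append(previous)
        previous = target  # the true value becomes known before the next forecast
    return predictions

def mean_forecasts(history, targets):
    """Predict every test poll with the mean of the training polls."""
    mean = sum(history) / len(history)
    return [mean] * len(targets)

# Made-up voting intention percentages for one party.
train_polls = [38.0, 40.0, 42.0]
test_polls = [41.0, 39.0, 40.0]

rmse_last = rmse(last_value_forecasts(train_polls, test_polls), test_polls)
rmse_mean = rmse(mean_forecasts(train_polls, test_polls), test_polls)
```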

Quantitative results

Party | Tweet | Score | Author
CON | PM in friendly chat with top EU mate, Sweden’s Fredrik Reinfeldt, before family photo | 1.334 | Journalist
CON | Have Liberal Democrats broken electoral rules? Blog on Labour complaint to cabinet secretary | -0.991 | Journalist
LAB | Blog Post Liverpool: City of Radicals Website now Live <link> #liverpool #art | 1.954 | Art Fanzine
LAB | I am so pleased to head Paul Savage who worked for the Labour group has been Appointed the Marketing manager for the baths hall GREAT NEWS | -0.552 | Politician (Labour)
LBD | RT @user: Must be awful for TV bosses to keep getting knocked back by all the women they ask to host election night (via @user) | 0.874 | LibDem MP
LBD | Blog Post Liverpool: City of Radicals 2011 – More Details Announced #liverpool #art | -0.521 | Art Fanzine

User features • The real-world outcome and users share: i. region info: London (L), South England (S), Midlands & Wales (MW), North (N), Scotland (Sc) - observed ii. gender: Male (M), Female (F) - inferred using statistical text-based classifier iii. age: 18-24, 25-39, 40-59, 60+ - unknown

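Recap: Bilinear regression $$\mathbf{w}, \mathbf{u}, \beta = \operatorname{argmin} \sum_{t=1}^{\tau} \sum_{i=1}^{n} \left(\mathbf{u}_t X_i \mathbf{w}_t^{T} + \beta - y_{ti}\right)^2 + \psi_{\ell_1 \ell_2}(\mathbf{w}, \rho_1) + \psi_{\ell_1 \ell_2}(\mathbf{u}, \rho_2)$$ BGL – Bilinear Group LASSO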

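Region & Demographics $$\mathbf{w}, \mathbf{u}, \beta = \operatorname{argmin} \sum_{t=1}^{\tau} \sum_{r=1}^{R} \sum_{i=1}^{n} \left(\mathbf{u}_{tr} X_{ir} \mathbf{w}_{tr}^{T} + \beta_{tr} - y_{tir}\right)^2 + \sum_{r=1}^{R} \psi_{\ell_1 \ell_2}(\mathbf{w}_r, \rho_1) + \psi_{\ell_1 \ell_2}(\mathbf{w}_t, \rho_1) + \psi_{\ell_1 \ell_2}(\mathbf{u}_r, \rho_2)$$ BGGR, where $r$ indexes the regions (or demographic groups) and $R$ is their number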

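Region & Demographics

Regional model
Model | S | L | MW | N | Sc | $\mu$
$B_{\mu}$ | 2.9 | 3.9 | 3.2 | 3.2 | 3.8 | 3.4
$B_{last}$ | 3.0 | 4.9 | 4.3 | 4.0 | 5.3 | 4.3
BGGR | 2.6 | 3.9 | 3.2 | 3.0 | 3.7 | 3.3

Gender model
Model | M | F | $\mu$
$B_{\mu}$ | 2.6 | 2.1 | 2.4
$B_{last}$ | 2.6 | 2.4 | 2.5
BGGR | 2.1 | 2.1 | 2.1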

Region & Demographics London Predictions Female Predictions

Region & Demographics Conservatives, Positive London

NewsSummaries dataset Task: Predict socioeconomic EU indicators Dataset: • News summaries from Open Europe think tank • Daily summaries of EU and member states related news together with their news source • Feb 2006 – Nov 2013; 1,913 days; 94 months • 296 news outlets (with >10 summaries) • Features: unigrams + bigrams (LACSS 2014)

Predictions

Model | ESI (Economic Sentiment Indicator) | Unemployment
LEN | 9.253 (9.89%) | 0.9275 (8.75%)
BEN | 8.209 (8.77%) | 0.9047 (8.52%)

Economic Sentiment Indicator

Unemployment

Deep linguistic features • Unigrams (8,912) (cameron) • Bigrams (33,206) (david__cameron) • POS (10,277) : Unigrams together with their part-of-speech (cameron/NNP) • NE (1,013) : Entities - Location, Person or Organisation (Person:David_Cameron) • Annotations (3,392) : Link entities to DBpedia e.g. political party (Org:Conservative_Party), office held (Office:Prime_minister)
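The two surface feature types at the top of this list are simple to reproduce. The sketch below assumes whitespace tokenisation and lowercasing, and joins bigrams with a double underscore as in the slide's `david__cameron` example; the POS, entity and annotation features need external NLP tools and are not shown.

```python
from collections import Counter

def ngram_features(text):
    """Count lowercase unigrams and double-underscore-joined bigrams."""
    tokens = text.lower().split()
    features = Counter(tokens)
    features.update("__".join(pair) for pair in zip(tokens, tokens[1:]))
    return features

features = ngram_features("David Cameron meets David Cameron")
```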

Deep linguistic features

Features | ESI | Unempl.
Unigrams | 8.21 | 1.27
Bigrams | 9.66 | 1.61
Unigrams + Bigrams | 8.91 | 1.47
POS | 7.87 | 1.14
Entities | 9.59 | 1.45
POS + NE | 8.09 | 1.12
NE + Annotations | 12.67 | 1.62
POS + NE + Annotations | 10.50 | 1.31
Unigrams + NE + Annotations | 10.92 | 1.31
Unigrams + Bigrams + NE + Annotations | 10.81 | 1.53
