#TagSpace: Semantic Embeddings from Hashtags
Jason Weston, Sumit Chopra, Keith Adams (2014)
Presented by Jack Lanchantin
Motivation
- Word and document embeddings are difficult to learn
- Most current techniques use unsupervised methods
- word2vec learns word embeddings by trying to
predict each word in a doc based on surrounding text
- Hashtags are labels of text, such as sentiment (#happy)
or topic annotation (#nyc), written by the post author
- Hashtag prediction provides a better way to learn word
and document embeddings than unsupervised learning because hashtags provide stronger semantic guidance
Overview Model Hashtag Prediction Document Recommendation Conclusion
Overview
- #TagSpace: Convolutional Neural Network that
learns features (embeddings) of short textual posts using hashtags as the supervised signal
- Train the network to optimally predict hashtags
on held-out test posts
- The learned embedding of text (ignoring the
hashtag labels) is useful for other tasks such as document recommendation
Neural Net For Scoring a (doc, hashtag) Pair
(Model diagram, for a post of l words:)
- Lookup table assigning a d-dimensional vector to each of the l words
- Hidden network layers (convolution and pooling)
- Representation of the entire document
- Lookup table assigning a d-dimensional vector to the hashtag
- Scoring function: dot product of document and hashtag embeddings
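The diagram's components can be sketched as a minimal NumPy mock of the forward pass (not the authors' implementation; dimensions, initialization, and the exact layer layout here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

V, T = 1000, 50       # vocabulary size, number of hashtags (toy values)
d, K, H = 64, 3, 64   # embedding dim, conv window size, hidden dim

W_words = rng.normal(scale=0.1, size=(V, d))      # word lookup table
W_tags  = rng.normal(scale=0.1, size=(T, d))      # hashtag lookup table
W_conv  = rng.normal(scale=0.1, size=(K * d, H))  # convolution filters
W_lin   = rng.normal(scale=0.1, size=(H, d))      # final linear layer

def embed_doc(word_ids):
    """Map a post (list of word ids, length >= K) to a d-dim embedding."""
    E = W_words[word_ids]                           # (l, d) word vectors
    # slide a window of K words, concatenate, apply a tanh convolution
    windows = [E[i:i + K].reshape(-1) for i in range(len(word_ids) - K + 1)]
    C = np.tanh(np.stack(windows) @ W_conv)         # (l-K+1, H)
    pooled = C.max(axis=0)                          # max pooling over positions
    return np.tanh(pooled @ W_lin)                  # (d,) document embedding

def score(word_ids, tag_id):
    """f(doc, tag): dot product of document and hashtag embeddings."""
    return float(embed_doc(word_ids) @ W_tags[tag_id])

# rank all hashtags for one post, best first
post = [4, 17, 23, 99, 512]
ranking = np.argsort([-score(post, t) for t in range(T)])
```

Ranking every hashtag by this score is exactly the inference step the next bullet describes.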
- Given a document, rank all hashtags by score:
- A ranking loss is used to approximately optimize
the top of the ranked list – useful for precision and recall at k (P@k, R@k)
- More energy is spent on improving the ranking of
positive labels near the top of the ranked list
Training the Scoring Function
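A WSABIE-style WARP ranking loss matches this description (sample negatives until one violates the margin, then weight the violation by an estimate of the positive label's rank). A minimal sketch for one document; the margin, sampling cap, and rank estimate are the standard WARP recipe, not taken verbatim from the slides:

```python
import numpy as np

rng = np.random.default_rng(1)

def warp_rank_weight(k):
    """L(k) = sum_{i=1..k} 1/i: larger weight for positives ranked lower."""
    return sum(1.0 / i for i in range(1, k + 1))

def warp_loss(scores, positive_ids, margin=1.0):
    """WARP loss for one document.

    scores:       array of f(doc, tag) for every hashtag
    positive_ids: hashtags actually on the post
    Samples negatives until one violates the margin; the number of
    trials needed gives an estimate of the positive tag's rank.
    """
    T = len(scores)
    negatives = [t for t in range(T) if t not in set(positive_ids)]
    total = 0.0
    for pos in positive_ids:
        for trials in range(1, len(negatives) + 1):
            neg = negatives[rng.integers(len(negatives))]
            violation = margin - scores[pos] + scores[neg]
            if violation > 0:
                est_rank = len(negatives) // trials  # rank estimate
                total += warp_rank_weight(est_rank) * violation
                break
    return total
```

The rank weight is what concentrates effort near the top of the list: a positive ranked far down (a violating negative found on the first sample) contributes a much larger gradient than one already near rank 1.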
Hashtag Prediction
- Goal: Rank a post’s ground truth hashtags higher
than hashtags it does not contain
- Evaluated using precision@1, recall@10, and mean rank
over the hashtags of 50,000 test posts
- Compared to 4 other models:
– Frequency: always ranks hashtags by training frequency
– #words: “crazy commute this am” → #crazy, #commute, #this, #am
– Word2vec (unsupervised)
– WSABIE (supervised)
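The three evaluation metrics above can be sketched as follows (a minimal version; how the paper breaks ties or averages over multi-tag posts is an assumption here):

```python
import numpy as np

def evaluate(ranked_tags, true_tags):
    """Compute P@1, R@10 and mean rank for a batch of posts.

    ranked_tags: list of hashtag-id lists, best-first, one per post
    true_tags:   list of ground-truth hashtag sets, one per post
    """
    p1, r10, ranks = [], [], []
    for ranking, truth in zip(ranked_tags, true_tags):
        p1.append(1.0 if ranking[0] in truth else 0.0)   # precision@1
        r10.append(len(set(ranking[:10]) & truth) / len(truth))  # recall@10
        pos = {t: i + 1 for i, t in enumerate(ranking)}  # 1-indexed ranks
        ranks.extend(pos[t] for t in truth)              # rank of each true tag
    return np.mean(p1), np.mean(r10), np.mean(ranks)
```

For example, a post whose single true hashtag is ranked second scores 0 on P@1 but 1.0 on R@10, with rank 2.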
Data
- Posts from pages (run by a business, celebrity, brand, or product)
- Posts from individual users
#TagSpace Examples (256 dim)
Post | Predicted hashtags
Hashtag Prediction Results
Personalized Document Recommendation
- Goal: extend the learned representations from
predicting hashtags to do other tasks
- Document recommendation: recommending
documents to users based on interaction history
- Used day-long interaction histories for 34,000
people on Facebook
- Text of posts that they liked, clicked on, or replied to
- Given n-1 trailing posts, predict the nth post by
ranking it against 10,000 other posts
- The score of the nth post is the maximum embedding
similarity between it and each of the n-1 history posts
- Used cosine similarity between post embeddings
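The max-over-history cosine scoring described in the last three bullets can be sketched as (function names are illustrative; the embeddings would come from the trained #TagSpace model):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two post embeddings."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def score_candidate(history_embs, candidate_emb):
    """Score a candidate post by its maximum cosine similarity
    to any of the user's n-1 history post embeddings."""
    return max(cosine(h, candidate_emb) for h in history_embs)

def recommend(history_embs, candidate_embs):
    """Rank candidate posts (e.g. the held-out nth post against
    10,000 others) by score, best first."""
    scores = [score_candidate(history_embs, c) for c in candidate_embs]
    return np.argsort(scores)[::-1]
```

Taking the max (rather than, say, the mean) over the history means a candidate only needs to resemble one of the user's past interests to score well.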
Document Recommendation Results
- Baseline: TF-IDF weighted bag of words
- Best results come from summing the BOW scores with the TagSpace scores
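The baseline and the winning combination can be sketched as follows; the equal weighting of the two scores is an assumption, not a detail given in the slides:

```python
import numpy as np
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of token lists. Returns TF-IDF weighted BOW row vectors."""
    vocab = sorted({w for d in docs for w in d})
    idx = {w: i for i, w in enumerate(vocab)}
    df = Counter(w for d in docs for w in set(d))          # document frequency
    idf = {w: np.log(len(docs) / df[w]) for w in vocab}
    X = np.zeros((len(docs), len(vocab)))
    for r, d in enumerate(docs):
        for w, c in Counter(d).items():
            X[r, idx[w]] = c * idf[w]                      # tf * idf
    return X

def combined_score(bow_a, bow_b, emb_a, emb_b):
    """Sum of TF-IDF cosine and TagSpace embedding cosine
    (equal weighting is an illustrative assumption)."""
    def cos(a, b):
        na, nb = np.linalg.norm(a), np.linalg.norm(b)
        return float(a @ b / (na * nb)) if na and nb else 0.0
    return cos(bow_a, bow_b) + cos(emb_a, emb_b)
```

This matches the slide's finding: neither score alone wins, but their sum captures both exact word overlap and embedding-level semantic similarity.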
Conclusion
- Outperformed all comparison models in hashtag
prediction
- Model scales very well when considering a large
number (millions) of hashtags
- Logistic regression and SVMs do not
- Semantics of hashtags cause #TagSpace to learn
features that capture important aspects of text
- Able to port the learned embeddings to the task
of personalized document recommendation
with better accuracy than other models
References:
- #TagSpace: https://research.facebook.com/publications/279494668926031/tagspace-semantic-embeddings-from-hashtags/
- WSABIE: http://www.thespermwhale.com/jaseweston/papers/wsabie-ijcai.pdf
- Word2Vec: http://arxiv.org/abs/1301.3781