
SLIDE 1

Sentiment analysis

INTRODUCTION TO NATURAL LANGUAGE PROCESSING IN R

Kasey Jones

Research Data Scientist

SLIDE 2

Sentiment analysis

Assess subjective information from text.

Types of sentiment analysis:
- positive vs. negative
- words eliciting emotions

Each word is given a meaning and sometimes a score:
- abandon -> fear
- accomplish -> joy
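The word-to-emotion mapping above can be sketched with tidytext (a minimal sketch; assumes the nrc lexicon is available locally, since get_sentiments("nrc") may prompt for a one-time download):

```r
library(tidytext)
library(dplyr)

# Look up the emotions attached to the two example words in the nrc lexicon
get_sentiments("nrc") %>%
  filter(word %in% c("abandon", "accomplish"))
# abandon is tagged fear/negative/sadness; accomplish is tagged joy, among others
```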

SLIDE 3

Tidytext sentiments

library(tidytext)
sentiments

# A tibble: 27,314 x 4
   word      sentiment lexicon score
   <chr>     <chr>     <chr>   <int>
 1 abacus    trust     nrc        NA
 2 abandon   fear      nrc        NA
 3 abandon   negative  nrc        NA
 4 abandon   sadness   nrc        NA
 5 abandoned anger     nrc        NA

SLIDE 4

3 lexicons

- AFINN: scores words from -5 (extremely negative) to 5 (extremely positive)
- bing: positive/negative label for all words
- nrc: labels words as fear, joy, anger, etc.

library(tidytext)
get_sentiments("afinn")

# A tibble: 2,476 x 2
 1 abandon    -2
 2 abandoned  -2
 3 abandons   -2
 ...

SLIDE 5

Prepare your data.

# Read the data
animal_farm <- read.csv("animal_farm.csv", stringsAsFactors = FALSE)
animal_farm <- as_tibble(animal_farm)

# Tokenize and remove stop words
animal_farm_tokens <- animal_farm %>%
  unnest_tokens(output = "word", token = "words", input = text_column) %>%
  anti_join(stop_words)

SLIDE 6

The AFINN lexicon

animal_farm_tokens %>%
  inner_join(get_sentiments("afinn"))

# A tibble: 1,175 x 3
   chapter   word    score
   <chr>     <chr>   <int>
 1 Chapter 1 drunk      -2
 2 Chapter 1 strange    -1
 3 Chapter 1 dream       1
 4 Chapter 1 agreed      1
 5 Chapter 1 safely      1

SLIDE 7

AFINN continued

animal_farm_tokens %>%
  inner_join(get_sentiments("afinn")) %>%
  group_by(chapter) %>%
  summarise(sentiment = sum(score)) %>%
  arrange(sentiment)

# A tibble: 10 x 2
   chapter   sentiment
   <chr>         <int>
 1 Chapter 7      -166
 2 Chapter 8      -158
 3 Chapter 4       -84

SLIDE 8

The bing lexicon

word_totals <- animal_farm_tokens %>%
  group_by(chapter) %>%
  count()

animal_farm_tokens %>%
  inner_join(get_sentiments("bing")) %>%
  group_by(chapter) %>%
  count(sentiment) %>%
  filter(sentiment == 'negative') %>%
  transform(p = n / word_totals$n) %>%
  arrange(desc(p))

      chapter sentiment   n          p
1   Chapter 7  negative 154 0.11711027
2   Chapter 6  negative 106 0.10750507
3   Chapter 4  negative  68 0.10559006
4  Chapter 10  negative 117 0.10372340
5   Chapter 8  negative 155 0.10006456
6   Chapter 9  negative 121 0.09152799
7   Chapter 3  negative  65 0.08843537
8   Chapter 1  negative  77 0.08603352
9   Chapter 5  negative  93 0.08462238
10  Chapter 2  negative  67 0.07395143

SLIDE 9

The nrc lexicon

as.data.frame(table(get_sentiments("nrc")$sentiment)) %>%
  arrange(desc(Freq))

      Var1 Freq
1 negative 3324
2 positive 2312
3     fear 1476
4    anger 1247
5    trust 1231
6  sadness 1191
...

SLIDE 10

nrc continued

fear <- get_sentiments("nrc") %>%
  filter(sentiment == "fear")

animal_farm_tokens %>%
  inner_join(fear) %>%
  count(word, sort = TRUE)

# A tibble: 220 x 2
   word          n
   <chr>     <int>
 1 rebellion    29
 2 death        19
 3 gun          19
 4 terrible     15
 5 bad          14
 6 enemy        12
 7 broke        11
 ...

SLIDE 11

Sentiment time.


SLIDE 12

Word embeddings


SLIDE 13

The flaw in word counts

Two statements:
- Bob is the smartest person I know.
- Bob is the most brilliant person I know.

Without stop words:
- Bob smartest person
- Bob brilliant person
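A quick sketch of why plain counts miss the similarity: tokenizing the two statements with the same tidytext tools used earlier yields token sets that overlap only on "bob" and "person", so "smartest" and "brilliant" look completely unrelated to a count-based model:

```r
library(tidytext)
library(dplyr)

statements <- tibble(
  id = 1:2,
  text = c("Bob is the smartest person I know.",
           "Bob is the most brilliant person I know.")
)

statements %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  count(id, word)
# "smartest" and "brilliant" land in separate columns of any
# document-term matrix, even though they mean nearly the same thing
```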

SLIDE 14

Word meanings

Additional data:
- The smartest people ...
- He was the smartest ...
- Brilliant people ...
- His was so brilliant ...

SLIDE 15

word2vec

- represents words as a large vector space
- captures multiple similarities between words
- words of similar meaning are closer within the space

https://www.adityathakker.com/introduction-to-word2vec-how-it-works/

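"Closer within the space" comes down to vector distance, usually measured with cosine similarity. A minimal sketch with made-up 3-dimensional vectors (real word2vec embeddings typically have 100+ dimensions):

```r
# Toy vectors: imagine these are learned embeddings
smartest  <- c(0.9, 0.1, 0.30)
brilliant <- c(0.8, 0.2, 0.35)
banana    <- c(0.1, 0.9, 0.70)

# Cosine similarity: dot product divided by the product of vector lengths
cosine <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

cosine(smartest, brilliant)  # close to 1: similar meaning
cosine(smartest, banana)     # smaller: less related
```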

SLIDE 16

Preparing data

library(h2o)
h2o.init()

h2o_object <- as.h2o(animal_farm)

Tokenize using h2o:

words <- h2o.tokenize(h2o_object$text_column, "\\\\W+")
words <- h2o.tolower(words)
words <- words[is.na(words) || (!words %in% stop_words$word), ]

SLIDE 17

word2vec modeling

word2vec_model <- h2o.word2vec(words, min_word_freq = 5, epochs = 5)

- min_word_freq: removes words used fewer than 5 times
- epochs: number of training iterations to run
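Once trained, the model can also embed whole documents by averaging the vectors of their words. A sketch using h2o's transform with aggregate_method = "AVERAGE" (assumes the words frame from the previous slide; the exact call may vary across h2o versions):

```r
# One vector per document: average of its word vectors
doc_vectors <- h2o.transform(word2vec_model, words,
                             aggregate_method = "AVERAGE")

# doc_vectors can then serve as features for an h2o classifier,
# e.g. h2o.gbm() or h2o.randomForest()
```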

SLIDE 18

Word synonyms

h2o.findSynonyms(word2vec_model, "animal")

  synonym     score
1   drink 0.8209088
2     age 0.7952490
3 alcohol 0.7867004
4     act 0.7710537
5    hero 0.7658424

h2o.findSynonyms(word2vec_model, "jones")

     synonym     score
1     battle 0.7996588
2 discovered 0.7944554
3    cowshed 0.7823287
4    enemies 0.7766532
5      yards 0.7679787

SLIDE 19

Additional uses

- classification modeling
- sentiment analysis
- topic modeling

SLIDE 20

Apply word2vec


SLIDE 21

Additional NLP analysis


SLIDE 22

BERT and ERNIE

What is it:
- BERT: Bidirectional Encoder Representations from Transformers
- a model used in transfer learning for NLP tasks
- pre-trained on unlabeled data to create a language representation
- requires only small amounts of labeled data to train for a specific task

What is it used for:
- supervised tasks
- creating features for NLP models

ERNIE: Enhanced Representation through kNowledge IntEgration

SLIDE 23

Named Entity Recognition

What is it:
- classifies named entities within text
- examples: names, locations, organizations, values

What is it used for:
- extracting entities from tweets
- aiding recommendation engines
- search algorithms

SLIDE 24

Part-of-speech tagging

What is it:
- tagging words with their part of speech
- nouns, verbs, adjectives, etc.

How is it used:
- aids in sentiment analysis
- creates features for NLP models
- enhances what a model knows about each word in text
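In R, one common route to these tags is the udpipe package (not used in this course; a hedged sketch assuming its pretrained English model downloads successfully):

```r
library(udpipe)

# Download and load a pretrained English model (one-time download)
model_info <- udpipe_download_model(language = "english")
model <- udpipe_load_model(model_info$file_model)

# Annotate a sentence and inspect the universal part-of-speech tags
tags <- udpipe_annotate(model, x = "The animals rebelled against the farmer.")
as.data.frame(tags)[, c("token", "upos")]
# e.g. "animals" -> NOUN, "rebelled" -> VERB, "farmer" -> NOUN
```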

SLIDE 25

Let's recap.


SLIDE 26

Conclusion


SLIDE 27

Course recap

The pre-processing:
- tokenization
- stop-word removal
- data formats (tibbles, VCorpus, h2o frame)

The classics:
- sentiment analysis
- text classification
- topic modeling

SLIDE 28

Recap continued

The advanced techniques:
- word embeddings
- BERT/ERNIE

The next steps:
- practice
- master the basics

SLIDE 29

Course complete!
