Ranking pop songs through the years

SLIDE 1

DataCamp Sentiment Analysis in R: The Tidy Way

Ranking pop songs through the years

SENTIMENT ANALYSIS IN R: THE TIDY WAY

Julia Silge

Data Scientist at Stack Overflow

SLIDE 2

Lyrics of pop songs

song_lyrics

  • rank, the rank a song achieved on the Billboard Year-End Hot 100
  • song, the song's title
  • artist, the artist who recorded the song
  • year, the year the song reached the given rank on the Billboard chart
  • lyrics, the lyrics of the song

Lyrics from the Billboard Year-End Hot 100, sourced by Kaylin Walker

SLIDE 3

Tidying song lyrics

> tidy_lyrics %>%
+     count(word, sort = TRUE)
# A tibble: 42,156 x 2
   word      n
   <chr> <int>
 1 you   64606
 2 i     56472
 3 the   53451
 4 to    35752
 5 and   32555
 6 me    31170
 7 a     29282
 8 it    25688
 9 my    22821
10 in    18553
# ... with 42,146 more rows
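The slide shows tidy_lyrics already built. A minimal sketch of how such a one-word-per-row table could be produced from song_lyrics, assuming the tidytext function unnest_tokens() is used for tokenizing; the two-row song_lyrics below is made-up stand-in data, not the Billboard dataset.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for song_lyrics: same columns, invented rows
song_lyrics <- tibble(
  rank   = c(1L, 2L),
  song   = c("song a", "song b"),
  artist = c("artist a", "artist b"),
  year   = c(1990L, 1991L),
  lyrics = c("you and i love you", "the night is dark and you cry")
)

# Split the lyrics column into one lowercase word per row;
# the other columns (rank, song, artist, year) are carried along
tidy_lyrics <- song_lyrics %>%
  unnest_tokens(word, lyrics)

# Same call as on the slide: most frequent words first
word_counts <- tidy_lyrics %>%
  count(word, sort = TRUE)
```

As on the slide, the most frequent words in raw lyrics tend to be pronouns and articles, which is why a sentiment lexicon join is the usual next step.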

SLIDE 4

Sentiment analysis of pop songs

> lyric_sentiment %>%
+     count(song, sentiment, total_words)
# A tibble: 39,564 x 4
   song                  sentiment    total_words     n
   <chr>                 <chr>              <int> <int>
 1 0 to 100 the catch up anger                894    29
 2 0 to 100 the catch up anticipation         894    14
 3 0 to 100 the catch up disgust              894    33
 4 0 to 100 the catch up fear                 894     9
 5 0 to 100 the catch up joy                  894     5
 6 0 to 100 the catch up negative             894    47
 7 0 to 100 the catch up positive             894    34
 8 0 to 100 the catch up sadness              894    12
 9 0 to 100 the catch up surprise             894     8
10 0 to 100 the catch up trust                894    29
# ... with 39,554 more rows
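The slide does not show how lyric_sentiment was built. One plausible sketch, under the assumption that each song's word total is added first and the words are then joined to a lexicon with several sentiment categories per word. The toy_lexicon table and the tidy_lyrics rows here are invented for illustration; a real analysis would use a published lexicon instead.

```r
library(dplyr)

# Hypothetical tidy lyrics: one word per row, two invented songs
tidy_lyrics <- tibble(
  song = c(rep("song a", 5), rep("song b", 4)),
  word = c("you", "love", "happy", "cry", "the",
           "dark", "fear", "adore", "and")
)

# Hypothetical lexicon: a word may carry more than one sentiment
toy_lexicon <- tibble(
  word      = c("love", "love", "happy", "happy", "cry", "cry",
                "dark", "fear", "fear", "adore"),
  sentiment = c("positive", "joy", "positive", "joy", "sadness", "negative",
                "negative", "negative", "fear", "positive")
)

lyric_sentiment <- tidy_lyrics %>%
  group_by(song) %>%
  mutate(total_words = n()) %>%          # word total per song, before filtering
  ungroup() %>%
  inner_join(toy_lexicon, by = "word")   # keep only words found in the lexicon

# Same call as on the slide: matches per song and sentiment
sentiment_counts <- lyric_sentiment %>%
  count(song, sentiment, total_words)
```

Computing total_words before the join matters: after the join only lexicon words remain, so the denominator for any later proportion would otherwise be too small.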

SLIDE 5

Let's practice!

SENTIMENT ANALYSIS IN R: THE TIDY WAY

SLIDE 6

Connecting sentiment to other quantities

SENTIMENT ANALYSIS IN R: THE TIDY WAY

Julia Silge

Data Scientist at Stack Overflow

SLIDE 7

Sentiment and...

  • how far a song reached on the Billboard chart?
  • when the song was released?

SLIDE 8

Sentiment and rank

> lyric_sentiment %>%
+     filter(sentiment == "positive") %>%
+     count(song, rank, total_words)
# A tibble: 4,777 x 4
   song                   rank total_words     n
   <chr>                 <int>       <int> <int>
 1 0 to 100 the catch up    97         894    34
 2 1 2 3 4 sumpin new       40         670    18
 3 1 2 3 red light          48         145     9
 4 1 2 step                  5         437    20
 5 100 pure love            46         590    11
 6 100 pure love            82         590    11
 7 100 years                77         257     4
 8 123                      62         220    15
 9 18 and life              61         285     9
10 19 somethin              84         281     6
# ... with 4,767 more rows
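Raw counts like these depend on song length, so a natural next step is to turn them into proportions and group the ranks into bins. The sketch below is one way to do that, not necessarily the course's own code; positive_counts, percent, and rank_group are hypothetical names, the rows are invented, and the bin width of 10 is an assumption.

```r
library(dplyr)

# Hypothetical counts shaped like the output above (invented values)
positive_counts <- tibble(
  song        = c("song a", "song b", "song c", "song d"),
  rank        = c(5L, 48L, 40L, 97L),
  total_words = c(400L, 200L, 500L, 800L),
  n           = c(20L, 10L, 15L, 32L)
)

positive_by_rank <- positive_counts %>%
  mutate(
    percent    = n / total_words,        # share of words that are positive
    rank_group = 10 * floor(rank / 10)   # ranks 40-49 fall in group 40, etc.
  )
```

Binning keeps a later boxplot readable: one box per group of ten ranks rather than one per rank.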

SLIDE 9

Exploring with boxplots

SLIDE 10

Let's practice!

SENTIMENT ANALYSIS IN R: THE TIDY WAY

SLIDE 11

Moving from rank to year

SENTIMENT ANALYSIS IN R: THE TIDY WAY

Julia Silge

Data Scientist at Stack Overflow

SLIDE 12

Sentiment and rank

SLIDE 13

Pop songs over time

> lyric_sentiment %>%
+     filter(sentiment == "positive") %>%
+     count(song, year, total_words)
# A tibble: 4,772 x 4
   song                   year total_words     n
   <chr>                 <int>       <int> <int>
 1 0 to 100 the catch up  2014         894    34
 2 1 2 3 4 sumpin new     1996         670    18
 3 1 2 3 red light        1968         145     9
 4 1 2 step               2005         437    20
 5 100 pure love          1994         590    11
 6 100 pure love          1995         590    11
 7 100 years              2004         257     4
 8 123                    1988         220    15
 9 18 and life            1989         285     9
10 19 somethin            2003         281     6
# ... with 4,762 more rows

SLIDE 14

Pop songs over time

  • Define new columns using mutate()
  • Visualize using geom_boxplot()
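The two steps named on this slide can be sketched as follows. The data frame positive_by_song is invented, and the choice to bin years into decades is an assumption made here for illustration; the slide only states that mutate() and geom_boxplot() are used.

```r
library(dplyr)
library(ggplot2)

# Hypothetical per-song positive word counts (invented values)
positive_by_song <- tibble(
  song        = paste("song", 1:6),
  year        = c(1968L, 1969L, 1988L, 1989L, 2004L, 2005L),
  total_words = c(150L, 200L, 220L, 285L, 257L, 437L),
  n           = c(9L, 10L, 15L, 9L, 4L, 20L)
)

# Step 1: define new columns with mutate()
positive_by_decade <- positive_by_song %>%
  mutate(
    percent = n / total_words,         # proportion of positive words
    decade  = 10 * floor(year / 10)    # 1968 -> 1960, 2005 -> 2000
  )

# Step 2: one box per decade with geom_boxplot()
decade_plot <- ggplot(positive_by_decade,
                      aes(x = as.factor(decade), y = percent)) +
  geom_boxplot()
```

Converting decade with as.factor() makes ggplot2 draw a separate box for each decade instead of treating the axis as continuous.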

SLIDE 15

Modeling sentiment

> sentiment_model <- lm(percent ~ year, data = sentiment_by_year)
> summary(sentiment_model)
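To make the model call concrete, here is a small self-contained sketch. The slide does not show how sentiment_by_year is built, so the data frame below is a made-up stand-in with only the two columns the formula needs; its values are invented and say nothing about the real lyrics.

```r
# Hypothetical stand-in for sentiment_by_year (invented values)
sentiment_by_year <- data.frame(
  year    = c(1970, 1980, 1990, 2000, 2010),
  percent = c(0.060, 0.056, 0.049, 0.046, 0.040)
)

# Same call as on the slide: linear regression of percent on year
sentiment_model <- lm(percent ~ year, data = sentiment_by_year)

# The coefficient on year is the estimated change in percent per year
slope <- coef(sentiment_model)[["year"]]

# The coefficient table in the summary holds the estimate, standard error,
# t value, and p-value for each term
model_summary <- summary(sentiment_model)
year_p_value  <- model_summary$coefficients["year", "Pr(>|t|)"]
```

With this toy data the slope is negative by construction. In a real analysis, the sign of the slope gives the direction of the trend and the p-value indicates whether it is distinguishable from no trend.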

SLIDE 16

Let's practice!

SENTIMENT ANALYSIS IN R: THE TIDY WAY

SLIDE 17

You made it!

SENTIMENT ANALYSIS IN R: THE TIDY WAY

Julia Silge

Data Scientist at Stack Overflow

SLIDE 18

Positive sentiment and year

SLIDE 19

Diverse texts, powerful techniques

  • Social media text from Twitter
  • Classic narrative text by Shakespeare
  • TV news text sourced from closed captioning
  • Lyrics from pop songs

SLIDE 20

Tidy text

Tidy data principles make text mining easier and more effective

http://tidytextmining.com/

SLIDE 21

Thanks!

SENTIMENT ANALYSIS IN R: THE TIDY WAY