The Evolution of Spotify Home Architecture Emily Anil Staff - - PowerPoint PPT Presentation



slide-1
SLIDE 1

The Evolution of Spotify Home Architecture

slide-2
SLIDE 2

Emily, Staff Engineer (@emilymsa)

Anil, Data Engineer (@anilmuppallar)

slide-3
SLIDE 3

Our mission is to unlock the potential of human creativity by giving a million creative artists the opportunity to live off their art and billions of fans the opportunity to enjoy and be inspired by it.

slide-4
SLIDE 4
slide-5
SLIDE 5
slide-6
SLIDE 6

[Figure labels: shelf, shelf name, card]

slide-7
SLIDE 7

Overview

  • Started with a batch architecture
  • Used services to hide complexity and be more reactive
  • Leveraged GCP and added streaming pipelines to build a product based on user activity

slide-8
SLIDE 8

Batch

2016

slide-9
SLIDE 9

Batch

Songs Played Logs Word2Vec

slide-10
SLIDE 10

word2vec

A natural language processing model to learn vector representations of words (“embeddings”) from text.

https://www.tensorflow.org/tutorials/word2vec

slide-11
SLIDE 11

word2vec

Input: Playlists
Output: Vector representation of tracks

slide-12
SLIDE 12

word2vec

Input: Playlists
Output: Vector representation of tracks

[Figure labels: 2Pac, Bach, Mozart]
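
Applied to music, word2vec can treat each playlist like a sentence and each track like a word, so tracks that tend to sit near each other in playlists end up with nearby vectors. The slides do not show the training setup, so the sketch below only illustrates one plausible data-preparation step: turning playlists into skip-gram (target, context) pairs. The track IDs, the `window` parameter, and the function name are hypothetical.

```python
from typing import Iterator, List, Tuple


def skipgram_pairs(playlist: List[str], window: int = 2) -> Iterator[Tuple[str, str]]:
    """Yield (target, context) track pairs at most `window` positions apart."""
    for i, target in enumerate(playlist):
        lo = max(0, i - window)
        hi = min(len(playlist), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield target, playlist[j]


# Hypothetical playlists: each one plays the role of a "sentence" of track IDs.
playlists = [
    ["track_a", "track_b", "track_c"],
    ["track_b", "track_c", "track_d"],
]

# Training pairs that a skip-gram model would consume.
pairs = [pair for playlist in playlists for pair in skipgram_pairs(playlist, window=1)]
```

A model trained on such pairs learns one vector per track; the pair generation above is only the input side and does no learning itself.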

slide-13
SLIDE 13

Batch

Songs Played Logs Word2Vec

slide-14
SLIDE 14

Batch

Songs Played Logs Hadoop Jobs Word2Vec

slide-15
SLIDE 15

Batch

Songs Played Logs Hadoop Jobs Cassandra Word2Vec

slide-17
SLIDE 17

Batch

Songs Played Logs Hadoop Jobs Cassandra Word2Vec CMS

slide-18
SLIDE 18

Batch

Songs Played Logs Hadoop Jobs Fetch Shelf for Home Cassandra Word2Vec CMS

slide-19
SLIDE 19

Pros & Cons

+ Low latency to load Home
+ Fallback to old data if it fails to generate recommendations

  • Recommendations updated once every 24 hours
  • Calculate recommendations for every user, even if they aren’t active
  • Experimentation can be difficult
  • Operational overhead to maintain Cassandra and Hadoop
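
The trade-offs of the batch design can be sketched in a few lines. This is an illustration under assumptions, not the actual jobs: an in-memory dict stands in for Cassandra, and `run_daily_batch`, `fetch_shelf`, and the `recommend` callables are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List

# Stand-in for the Cassandra table that the Home read path queries.
shelf_store: Dict[str, List[str]] = {}


def run_daily_batch(all_users: Iterable[str], recommend: Callable[[str], List[str]]) -> None:
    """Precompute a shelf for every user, whether or not they are active."""
    for user in all_users:
        try:
            shelf_store[user] = recommend(user)
        except Exception:
            # The previous run's shelf stays in the store and acts as the fallback.
            continue


def fetch_shelf(user: str) -> List[str]:
    """Read path: a single key lookup, which keeps Home latency low."""
    return shelf_store.get(user, [])


def flaky_recommend(user: str) -> List[str]:
    """Hypothetical second-day job that fails for one user."""
    if user == "u2":
        raise RuntimeError("recommendation job failed")
    return [f"{user}-rec-day2"]


run_daily_batch(["u1", "u2"], lambda user: [f"{user}-rec-day1"])  # day 1
run_daily_batch(["u1", "u2"], flaky_recommend)                    # day 2
```

After the second run, `u1` sees the day-2 shelf while `u2` still sees the day-1 shelf: stale, but available.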

slide-20
SLIDE 20

Batch

Songs Played Logs Hadoop Jobs Fetch Shelf for Home Cassandra Word2Vec CMS

slide-22
SLIDE 22

Services

2017

slide-23
SLIDE 23

Services

Songs Played Service Word2Vec Service

slide-24
SLIDE 24

Services

Songs Played Service CMS Word2Vec Service

slide-25
SLIDE 25

Services

Create Shelf for Home CMS Songs Played Service Word2Vec Service

slide-26
SLIDE 26

Services

CMS Songs Played Service Word2Vec Service Create Shelf for Home

slide-27
SLIDE 27

Services

CMS Songs Played Service Word2Vec Service Create Shelf for Home Create Shelf for Home

slide-28
SLIDE 28

Services

CMS Songs Played Service Word2Vec Service Create Shelf for Home Create Shelf for Home Create Shelf for Home

slide-29
SLIDE 29

Pros & Cons

+ Updates recommendations at request time
+ Calculate recommendations for Home users only
+ Simplified stack
+ Easier to experiment
+ Google-managed infrastructure

  • High latency to load Home
  • No fallback if request fails
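
A minimal sketch of why this design is fresh but slow and fragile: the shelf is assembled inside the request by calling other services one after another. The two callables stand in for the Songs Played and Word2Vec services; their signatures, the `limit` parameter, and the sample data are assumptions for illustration.

```python
from typing import Callable, Dict, List


def create_shelf_for_home(
    user: str,
    songs_played: Callable[[str], List[str]],
    similar_tracks: Callable[[str], List[str]],
    limit: int = 5,
) -> List[str]:
    """Build a shelf at request time from the user's latest plays."""
    shelf: List[str] = []
    for track in songs_played(user):             # stand-in for a remote call
        for candidate in similar_tracks(track):  # one more call per seed track
            if candidate not in shelf:
                shelf.append(candidate)
    # Nothing is stored: if any call above raises, the request simply fails.
    return shelf[:limit]


# Hypothetical service data.
history: Dict[str, List[str]] = {"u1": ["t1", "t2"]}
neighbours: Dict[str, List[str]] = {"t1": ["t3", "t4"], "t2": ["t4", "t5"]}

shelf = create_shelf_for_home("u1", lambda u: history[u], lambda t: neighbours[t])
```

Latency grows with the number of downstream calls, and because no previous result is persisted there is nothing to fall back to when one of them errors.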
slide-30
SLIDE 30

Streaming ++ Services

2018 - Present

slide-31
SLIDE 31

Streaming Pipelines

  • Google Dataflow pipelines using Spotify Scio, a Scala wrapper on Apache Beam
  • Real-time data: unbounded stream of user events
    ○ All user events are available as Google Pub/Sub topics
  • Perform aggregation operations using time-based windows
    ○ groupBy, countBy, join...
  • Store the results
    ○ Pub/Sub, BigQuery, GCS, Bigtable
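
To make the idea of time-based windows concrete, here is a plain-Python sketch of a countBy over fixed, non-overlapping windows. It is not Scio or Beam code; the event tuple layout and the `count_by_window` name are assumptions, and a real streaming pipeline would also have to deal with late and out-of-order events.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Event = Tuple[float, str, str]  # (timestamp in seconds, user id, event type)


def count_by_window(events: Iterable[Event], window_seconds: int = 60) -> Dict[int, Counter]:
    """Count events per (user, event type) inside fixed windows."""
    windows: Dict[int, Counter] = defaultdict(Counter)
    for timestamp, user, event_type in events:
        # Each event belongs to exactly one window, keyed by its start time.
        window_start = int(timestamp // window_seconds) * window_seconds
        windows[window_start][(user, event_type)] += 1
    return dict(windows)


# Hypothetical events spanning two one-minute windows.
events = [
    (5.0, "u1", "play"),
    (42.0, "u1", "play"),
    (61.0, "u1", "follow"),
    (75.0, "u2", "play"),
]
counts = count_by_window(events, window_seconds=60)
```

The per-window counts are the kind of aggregate a pipeline could then write to one of the sinks listed above.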

slide-32
SLIDE 32

follow

Real time Signals

slide-33
SLIDE 33

Real time Signals

follow

pubsub pubsub pubsub

slide-34
SLIDE 34

Streaming Pipeline Real time Signals

follow

pubsub pubsub pubsub

slide-35
SLIDE 35

Streaming Pipeline Real time Signals

follow

pubsub pubsub pubsub pubsub

slide-36
SLIDE 36

pubsub

Streaming Pipeline Real time Signals

follow

Create Shelves

slide-38
SLIDE 38

pubsub

Streaming Pipeline Real time Signals

follow

Songs Played Service Word2Vec Service Create Shelves

slide-39
SLIDE 39

[Diagram labels: Real time Signals (follow), pubsub, Streaming Pipeline, Create Shelves, Songs Played Service, Word2Vec Service, Write Shelf, BT, Fetch Shelf]

slide-40
SLIDE 40

[Diagram labels: Real time Signals (follow), pubsub, Streaming Pipeline, Create Shelves, Songs Played Service, Word2Vec Service, CMS, Write Shelf, BT, Fetch Shelf]

slide-41
SLIDE 41

Pros & Cons

+ Updates recommendations based on user events
+ Computing recommendations out of request path
+ Fresher content, driven by user sessions
+ Fallback to previously generated recommendations
+ Easy to experiment

  • More complex stack
  • More tuning in the system
  • Event spikes
    + Guardrails
  • Debugging is more complicated
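
The slides do not say what the guardrails are. One plausible form, sketched here purely as an assumption, is to cap how often a single user's shelves are recomputed so that a burst of events does not overload downstream services. The `ShelfUpdater` class, its `min_interval` parameter, and the dict standing in for BT (presumably Bigtable) are all hypothetical.

```python
from typing import Callable, Dict, List


class ShelfUpdater:
    """Recompute a user's shelves on events, at most once per `min_interval` seconds."""

    def __init__(self, create_shelves: Callable[[str], List[str]], min_interval: float = 30.0):
        self.create_shelves = create_shelves
        self.min_interval = min_interval
        self.table: Dict[str, List[str]] = {}  # stand-in for the shelf store
        self.last_run: Dict[str, float] = {}

    def on_event(self, user: str, now: float) -> bool:
        """Return True if shelves were recomputed, False if the event was skipped."""
        last = self.last_run.get(user)
        if last is not None and now - last < self.min_interval:
            return False  # guardrail: absorb bursts of events for the same user
        self.last_run[user] = now  # a failed attempt also waits out the interval
        try:
            self.table[user] = self.create_shelves(user)
        except Exception:
            return False  # previously written shelves remain as the fallback
        return True

    def fetch_shelf(self, user: str) -> List[str]:
        """Read path: Home only reads precomputed shelves."""
        return self.table.get(user, [])


calls: List[str] = []


def create(user: str) -> List[str]:
    calls.append(user)
    return [f"shelf-v{len(calls)}"]


updater = ShelfUpdater(create, min_interval=30.0)
# Three events in quick succession, then one after the interval has passed.
results = [updater.on_event("u1", now=t) for t in (0.0, 1.0, 2.0, 31.0)]
```

Timestamps are passed in explicitly so the behavior is easy to reason about; the interval is the knob that trades freshness against downstream load.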
slide-42
SLIDE 42

Lessons Learned

Batch

+ Fallback to old recommendations
+ Low latency to load Home

  • Updates are slow

Services

+ Updates are fast

  • High latency to load Home
  • No fallback if request fails

Streaming ++ Services

+ Updates are frequent/fast
+ Low latency to load Home
+ Fallback to old recommendations

  • Balance computation frequency and downstream system load

slide-44
SLIDE 44

Takeaways

  • Less overhead with managed infrastructure. Focus more on product
  • If you care about timeliness, then adopt streaming pipelines
    ○ Beware of event spikes
  • Optimize for developer productivity and ease of experimentation
    ○ Creating a new shelf is as simple as writing a new function.
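
What "a new shelf is a new function" could look like is sketched below. The talk does not show the real framework, so the `shelf` decorator, the `SHELVES` registry, and the example shelf names are hypothetical.

```python
from typing import Callable, Dict, List

ShelfFn = Callable[[str], List[str]]
SHELVES: Dict[str, ShelfFn] = {}


def shelf(name: str) -> Callable[[ShelfFn], ShelfFn]:
    """Register a function as a named shelf."""
    def register(fn: ShelfFn) -> ShelfFn:
        SHELVES[name] = fn
        return fn
    return register


@shelf("Recently played")
def recently_played(user: str) -> List[str]:
    return [f"{user}:last-played-1", f"{user}:last-played-2"]


# Adding a shelf means adding one decorated function; nothing else changes.
@shelf("Because you followed")
def because_you_followed(user: str) -> List[str]:
    return [f"{user}:followed-artist-mix"]


def build_home(user: str) -> Dict[str, List[str]]:
    """Assemble Home by running every registered shelf function."""
    return {name: fn(user) for name, fn in SHELVES.items()}


home = build_home("u1")
```

Keeping each shelf as an isolated function also makes it easy to run one variant for an experiment without touching the others.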

slide-45
SLIDE 45

Hi! I’m Luna, Any questions?