DATA AT SWEDEN'S TELEVISION Ismail Elouafiq A wide spectrum of Apps - - PowerPoint PPT Presentation



slide-1
SLIDE 1

LESSONS & PITFALLS

DATA AT SWEDEN'S TELEVISION

Ismail Elouafiq

slide-2
SLIDE 2
slide-3
SLIDE 3
slide-4
SLIDE 4
slide-5
SLIDE 5
slide-6
SLIDE 6
slide-7
SLIDE 7
slide-8
SLIDE 8
slide-9
SLIDE 9

A wide spectrum of Apps

slide-10
SLIDE 10

A wide spectrum of Apps

Running on different platforms

slide-11
SLIDE 11

PRODUCT OWNERS

STRATEGY

ANALYSTS

A wide spectrum of Users

slide-12
SLIDE 12

PRODUCT OWNERS

STRATEGY

ANALYSTS

AUTHORS/EDITORS

DEVELOPERS

A wide spectrum of Users

slide-13
SLIDE 13

tl;dr:

Defining what to prioritise

slide-14
SLIDE 14

tl;dr:

Defining what to prioritise

“Data: I could be chasing an untamed ornithoid without cause.”

― Star Trek The Next Generation

slide-15
SLIDE 15

tl;dr:

Defining what to prioritise

“Data: I could be chasing an untamed ornithoid without cause.”

― Star Trek The Next Generation

slide-16
SLIDE 16

tl;dr:

Spoilers: how and why we now use protobuf, functional data engineering and ETL practices

Experimenting and iterating in small increments

Defining what to prioritise

slide-17
SLIDE 17

AI Deep reinforcement learning

tl;dr:

Spoilers: how and why we now use protobuf, functional data engineering and ETL practices

Experimenting and iterating in small increments

Defining what to prioritise

BLOCKCHAIN

slide-18
SLIDE 18

tl;dr:

Experimenting and iterating in small increments

Defining what to prioritise

slide-19
SLIDE 19

ismail.land/velocity

tl;dr:

Experimenting and iterating in small increments

Defining what to prioritise

slide-20
SLIDE 20

tl;dr:

Experimenting and iterating in small increments

Defining what to prioritise

ismail.land/velocity

slide-21
SLIDE 21

What events should you collect?

slide-22
SLIDE 22

What events should you collect?

slide-23
SLIDE 23

what we want to know

How many people read the article per day

slide-24
SLIDE 24

what we want to know

click scroll share

How many people read the article per day

what we can observe
slide-25
SLIDE 25

what we want to know

click scroll share

How many people read the article per day

what we can observe

events

slide-26
SLIDE 26

what we want to know

click scroll share

How many people read the article per day

what we can observe

explicit model events

slide-27
SLIDE 27

what we want to know

How many people read the article per day

click scroll share

what we can observe

explicit model events

slide-28
SLIDE 28
slide-29
SLIDE 29

let's start with views

slide-30
SLIDE 30

If you could do anything with data...

What would you actually use for decision making?

slide-31
SLIDE 31

If you could do anything with data...

What would you actually use for decision making?

A/B tests... Hell yeah!

slide-32
SLIDE 32

tl;dr:

Experimenting and iterating in small increments

Defining what to prioritise

ismail.land/velocity

slide-33
SLIDE 33

1

COLLECT

2

INGEST

SDK

First we need to collect data

slide-34
SLIDE 34

1

COLLECT

2

INGEST

SDK

events

Event API

slide-35
SLIDE 35

1

COLLECT

2

INGEST

SDK

events

Event API

publish

slide-36
SLIDE 36

1

COLLECT

2

INGEST

SDK

events

Event API

publish

pub/sub

slide-37
SLIDE 37

2

INGEST pub/sub

slide-38
SLIDE 38

2

INGEST pub/sub

3

STORE

slide-39
SLIDE 39

2

INGEST pub/sub

3

STORE Events table

slide-40
SLIDE 40

2

INGEST pub/sub

3

STORE Events table

subscribe

judge-judi

write

slide-41
SLIDE 41

2

INGEST pub/sub

3

STORE Events table

subscribe

judge-judi

write
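
The collect, ingest and store flow above can be sketched in a few lines. This is a toy illustration only: `InMemoryPubSub`, `event_api`, `write_to_events_table` and the `"events"` topic are hypothetical names, and the in-memory bus stands in for whatever managed pub/sub service is actually used.

```python
from collections import defaultdict
from typing import Callable


class InMemoryPubSub:
    """Toy stand-in for a pub/sub service (hypothetical, in-process only)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver the message to every handler subscribed to the topic.
        for handler in self._subscribers[topic]:
            handler(message)


events_table: list[dict] = []  # stand-in for the events table (STORE)


def event_api(bus: InMemoryPubSub, event: dict) -> None:
    """INGEST: the Event API receives an event from an SDK and publishes it."""
    bus.publish("events", event)


def write_to_events_table(event: dict) -> None:
    """Subscriber: writes each received event to the events table."""
    events_table.append(event)


bus = InMemoryPubSub()
bus.subscribe("events", write_to_events_table)
event_api(bus, {"event_type": "click", "article_id": "a1"})
```

Keeping the Event API and the table writer on opposite sides of the topic means either side can be changed or restarted without the other knowing about it.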

slide-42
SLIDE 42

3

STORE

2

INGEST

1

COLLECT

slide-43
SLIDE 43

3

STORE

2

INGEST

1

COLLECT

{event_type: click}
{ eventType: click}
{eventType: klick}

slide-44
SLIDE 44

3

STORE

2

INGEST

1

COLLECT

{event_type: click}
{ eventType: click}
{eventType: klick}
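
A minimal sketch of why the three payloads above hurt, assuming they arrive as schemaless dictionaries from three different clients. The function names are illustrative only.

```python
from collections import Counter

# The three payloads from the slide, as three clients might send them.
raw_events = [
    {"event_type": "click"},
    {"eventType": "click"},
    {"eventType": "klick"},
]


def count_clicks_naive(events: list[dict]) -> int:
    """Counts clicks the way a query written against one client's format would."""
    return sum(1 for event in events if event.get("event_type") == "click")


def observed_shapes(events: list[dict]) -> Counter:
    """Lists every (key, value) pair seen, exposing the drift between clients."""
    return Counter((key, value) for event in events for key, value in event.items())


naive_count = count_clicks_naive(raw_events)  # only one of the three clicks is counted
shapes = observed_shapes(raw_events)          # three different spellings of one event
```

Nothing fails loudly here: the naive count is simply too low, which is the kind of error that only shows up when someone questions the numbers.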

slide-45
SLIDE 45

3

STORE

2

INGEST

1

COLLECT

More Issues

Multiple teams/platforms => takes time to update the clients

The schema is sent with every event

Unclear types (arbitrary memory allocation)

slide-46
SLIDE 46

3

STORE

2

INGEST

1

COLLECT

More Issues

Multiple teams/platforms => takes time to update the clients

The schema is sent with every event

Unclear types (arbitrary memory allocation)

We know the schema on all levels and we have a common model for the data... How can we make use of that?
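
To make "the schema is sent with every event" and "unclear types" concrete, here is a rough sketch that compares a JSON event with a fixed, typed binary layout shared by sender and receiver. The field names, the `EVENT_TYPES` mapping and the layout are assumptions for illustration, and `struct` packing is not the protobuf wire format; it only shows the effect of agreeing on the schema up front.

```python
import json
import struct

event = {"event_type": "click", "article_id": 12345, "timestamp": 1700000000}

# JSON repeats every field name in every event, and the types are implicit.
json_bytes = json.dumps(event).encode("utf-8")

# With a schema known on both sides, only the values need to travel.
EVENT_TYPES = {"click": 0, "scroll": 1, "share": 2}
LAYOUT = struct.Struct("<BIQ")  # uint8 event type, uint32 article id, uint64 timestamp

packed = LAYOUT.pack(
    EVENT_TYPES[event["event_type"]],
    event["article_id"],
    event["timestamp"],
)

# The receiver decodes with the same layout and gets typed values back.
decoded = LAYOUT.unpack(packed)
```

The packed form is 13 bytes regardless of how long the field names are, and each field has an explicit width instead of whatever the client happened to send.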

slide-47
SLIDE 47

ENTER PROTOBUF

Keeping a centralized Event Schema

slide-48
SLIDE 48

ENTER PROTOBUF

Keeping a centralized Event Schema

person.proto

slide-49
SLIDE 49

ENTER PROTOBUF

Keeping a centralized Event Schema

person.proto → compiler → person.go, person.js, person.f

slide-50
SLIDE 50

ENTER PROTOBUF

Keeping a centralized Event Schema

Client: Person → serialize (person.js) → binary → deserialize (person.js) → Person: Server

slide-51
SLIDE 51

1 - Define the Schema As a .proto file

ENTER PROTOBUF

Keeping a centralized Event Schema

event.proto

slide-52
SLIDE 52

1 - Define the Schema As a .proto file

ENTER PROTOBUF

Keeping a centralized Event Schema

event.proto

2 - Publish libraries Publish using CI pipeline

go, js, java, swift

slide-53
SLIDE 53

1 - Define the Schema As a .proto file

ENTER PROTOBUF

Keeping a centralized Event Schema

event.proto

2 - Publish libraries

Publish using CI pipeline

3 - Fetch

Fetch in SDKs (serialization)

Fetch in Judy (deserialization)

Use to generate table

go, js, java, swift
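
The idea of one schema feeding the SDKs, the deserializer and the table definition can be sketched without protobuf tooling. The example below is a plain-Python analogy, not generated protobuf code: the `Event` fields, the `EventType` values and the `SQL_TYPES` mapping are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import get_type_hints


class EventType(Enum):
    CLICK = "click"
    SCROLL = "scroll"
    SHARE = "share"


@dataclass(frozen=True)
class Event:
    """Single definition of an event, shared by every consumer (hypothetical fields)."""
    event_type: EventType
    article_id: str
    timestamp_ms: int


# Hypothetical mapping from schema types to column types in the events table.
SQL_TYPES = {EventType: "STRING", str: "STRING", int: "INT64"}


def create_table_sql(table: str) -> str:
    """Generates the table definition from the schema instead of writing it by hand."""
    hints = get_type_hints(Event)
    columns = ", ".join(f"{name} {SQL_TYPES[kind]}" for name, kind in hints.items())
    return f"CREATE TABLE {table} ({columns})"


def parse_event(raw: dict) -> Event:
    """Deserializes a raw payload; unknown event types raise ValueError."""
    return Event(
        event_type=EventType(raw["event_type"]),
        article_id=str(raw["article_id"]),
        timestamp_ms=int(raw["timestamp_ms"]),
    )


ddl = create_table_sql("events")
ok = parse_event({"event_type": "click", "article_id": "a1", "timestamp_ms": 1700000000000})
```

With this shape, a payload such as `{"event_type": "klick", ...}` is rejected at parse time rather than silently stored, and adding a field to `Event` changes the generated table definition in the same place.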

slide-54
SLIDE 54

3

STORE

2

INGEST

1

COLLECT DEFINE

slide-55
SLIDE 55

My work here is done!

slide-56
SLIDE 56

Not really...

Backward and forward compatibility

Table changes

Language agnostic, but not really

Lack of support
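
Backward and forward compatibility can be illustrated with a toy decoder: a newer reader must cope with older data, and an older reader must cope with newer data. The field sets and the `decode` helper below are hypothetical, and this models the idea only, not protobuf's actual field-number and unknown-field rules.

```python
# Two versions of the same schema: field name -> default value.
FIELDS_V1 = {"event_type": "", "article_id": ""}
FIELDS_V2 = {"event_type": "", "article_id": "", "platform": "unknown"}


def decode(raw: dict, known_fields: dict) -> dict:
    """Keeps known fields, fills defaults for missing ones, drops unknown ones."""
    return {name: raw.get(name, default) for name, default in known_fields.items()}


old_event = {"event_type": "click", "article_id": "a1"}
new_event = {"event_type": "click", "article_id": "a1", "platform": "ios"}

# Backward compatibility: a v2 reader handles data written by a v1 client.
backward = decode(old_event, FIELDS_V2)

# Forward compatibility: a v1 reader handles data written by a v2 client.
forward = decode(new_event, FIELDS_V1)
```

Both directions matter when clients on several platforms are updated at different times, since old and new payloads coexist in the same stream.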

slide-57
SLIDE 57

The Data Pyramid

Storage, transformation, monitoring

Collection and ingestion

slide-58
SLIDE 58

The Data Pyramid

Storage, transformation, monitoring

Collection and ingestion

slide-59
SLIDE 59

The Data Pyramid

Learn, Optimise, Experiment

Metrics, aggregations, KPIs

Storage, transformation, monitoring

Collection and ingestion

slide-60
SLIDE 60

The Data Pyramid

Nirvana

AI, machine learning

Learn, Optimise, Experiment

Metrics, aggregations, KPIs

Storage, transformation, monitoring

Collection and ingestion

slide-61
SLIDE 61

The Data Pyramid

"The pyramids of Egypt could be explained as symbolic stairways to the stars, according to a British scientist" _ The Guardian

slide-62
SLIDE 62

The Data Pyramid

"The pyramids of Egypt could be explained as symbolic stairways to the stars, according to a British scientist" _ The Guardian

"The data pyramid could be explained as a symbolic stairway to the A.I., according to myself" _ Me

slide-63
SLIDE 63

Endorse me on LinkedIn

slide-64
SLIDE 64

3

STORE

2

INGEST

1

COLLECT DEFINE

We have the data

Now what?

slide-65
SLIDE 65

3

STORE

2

INGEST

1

COLLECT DEFINE

We have the data

Now what?

4

ANALYZE

Batch jobs (ETL), streaming

5

PRESENT

Service/API, dashboard, reports

slide-66
SLIDE 66

3

STORE

2

INGEST

1

COLLECT DEFINE

We have the data

Now what?

4

ANALYZE

Batch jobs (ETL), streaming

5

PRESENT

Service/API, dashboard, reports

slide-67
SLIDE 67

Everybody ETLs

slide-68
SLIDE 68

Everybody ETLs

slide-69
SLIDE 69

Inputs (some data to be aggregated): click events, article titles

Our mysterious job pipeline

Output: aggregated table (article reads per day)
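
A minimal sketch of what such a job might compute, assuming each click event carries an `article_id` and a `date`, and that one click counts as one read. The sample rows, field names and titles are made up for illustration.

```python
from collections import Counter

# Input 1: click events (illustrative sample data).
click_events = [
    {"article_id": "a1", "date": "2019-11-04"},
    {"article_id": "a1", "date": "2019-11-04"},
    {"article_id": "a2", "date": "2019-11-04"},
    {"article_id": "a1", "date": "2019-11-05"},
]

# Input 2: article titles, keyed by article id.
article_titles = {"a1": "First article", "a2": "Second article"}


def article_reads_per_day(events: list[dict], titles: dict) -> list[dict]:
    """Aggregates clicks into one row per (day, article), joined with the title."""
    counts = Counter((event["date"], event["article_id"]) for event in events)
    return [
        {"date": date, "title": titles.get(article_id, "(unknown)"), "reads": reads}
        for (date, article_id), reads in sorted(counts.items())
    ]


rows = article_reads_per_day(click_events, article_titles)
```

The function is pure: the same inputs always give the same rows, which is what makes the partition-level reruns discussed on the following slides safe.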

slide-70
SLIDE 70

today- partition magic job Append

slide-71
SLIDE 71

today- partition Failed magic job

slide-72
SLIDE 72

today- partition magic job Append

Immutable data partitions

Versioned logic

Principle: Ensuring reproducibility
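
One common reading of this principle is that a rerun of the job should replace a whole partition rather than append to a table. The sketch below contrasts the two, using hypothetical helper names and made-up rows; it is an illustration of the idea, not the pipeline described in the talk.

```python
def run_append(table: list, rows: list) -> None:
    """Appends the job output; a retry after a failure duplicates rows."""
    table.extend(rows)


def run_overwrite(partitions: dict, day: str, rows: list) -> None:
    """Replaces the whole partition for the day; a retry gives the same result."""
    partitions[day] = list(rows)


day_rows = [{"article_id": "a1", "reads": 2}]

# Appending: running the job twice for the same day doubles the data.
appended: list = []
run_append(appended, day_rows)
run_append(appended, day_rows)

# Overwriting a partition: running the job twice leaves one copy.
partitions: dict = {}
run_overwrite(partitions, "2019-11-04", day_rows)
run_overwrite(partitions, "2019-11-04", day_rows)
```

Because each partition is only ever written as a whole, a failed or buggy run can be repaired by rerunning that day with the corrected, versioned logic.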

slide-73
SLIDE 73

On ETL design

Ensure reproducibility

Practice failure in small increments

Define conventions in one place

slide-74
SLIDE 74

ISMAIL.LAND/VELOCITY

keeping a tidy pipeline

slide-75
SLIDE 75

Br3Ak 'em rULeS

¯\_(ツ)_/¯

slide-76
SLIDE 76

summary...

slide-77
SLIDE 77

summary...

slide-78
SLIDE 78

summary...

slide-79
SLIDE 79

summary... (what worked for us)

slide-80
SLIDE 80

summary...

DATA DATA DATA

(what worked for us)

slide-81
SLIDE 81

ismail.land/velocity

Thank You