Recommended For You: A First Look at Content Recommendation - - PowerPoint PPT Presentation



slide-1
SLIDE 1

“Recommended For You”: A First Look at Content Recommendation Networks

Muhammad Ahmad Bashir, Sajjad Arshad, Christo Wilson

slide-2
SLIDE 2

Content Recommendation Networks (CRNs)

2


slide-17
SLIDE 17

Some Terminology

3

Widgets

Widget parts: Headline, Disclosure

  • Link to the publisher (first-party link): Recommendation
  • Links to third-party websites: Ads

[Diagram labels: Links + $$$, Click, $$, Widgets, $$]
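
The first-party versus third-party distinction above can be sketched in a few lines. This is only an illustration, not the authors' tooling: the helper names `site_of` and `classify_link` are hypothetical, and treating subdomains and relative links as first-party is a simplifying assumption.

```python
from urllib.parse import urlparse


def site_of(url: str) -> str:
    """Return the hostname of a URL, lowercased and without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify_link(publisher_url: str, link_url: str) -> str:
    """Label a widget link 'recommendation' (first-party) or 'ad' (third-party)."""
    publisher, target = site_of(publisher_url), site_of(link_url)
    if not target:
        # Relative links have no hostname, so they stay on the publisher's site.
        return "recommendation"
    # Assumption: subdomains of the publisher count as first-party.
    same_site = target == publisher or target.endswith("." + publisher)
    return "recommendation" if same_site else "ad"
```

For example, `classify_link("https://www.publisher.example/a", "https://advertiser.example/offer")` returns `"ad"`, while a link back to `publisher.example` returns `"recommendation"`.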

slide-24
SLIDE 24

Why study CRNs?

4

Third party links

slide-28
SLIDE 28

Goal

5

Conduct a study to observe current practices of CRNs.

  • What kind of content is being advertised?
  • How are the widgets labeled and disclosed?
slide-36
SLIDE 36

Data Collection

6

  • 5 CRNs: Outbrain, Taboola, Gravity, ZergNet, Revcontent
  • 500 publishers: 289 from the Alexa Media and News category + 211 from a random sample of the Alexa top 1M
  • For each publisher we visited 40 pages, including the homepage.

XPath Queries

  • Developed 12 queries.
  • Covering 5 CRNs.

Examples

  • Outbrain:

//a[@class='ob-dynamic-rec-link']

  • ZergNet:

//div[@class='zergentity']

Extracted Data

  • Widgets from 5 CRNs.
  • For each widget:
      • Elements
      • Headline
      • Disclosure
      • Advertiser URLs
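
As a rough illustration of how the two example queries could pull links out of a page, here is a minimal sketch using Python's standard library. It is not the crawler used in the study: the page fragment and all URLs are made up, and `xml.etree.ElementTree` only handles well-formed markup and a subset of XPath (hence the `.//` prefix), so real pages would need a proper HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment (not taken from any real publisher).
PAGE = """
<html><body>
  <div class="widget">
    <a class="ob-dynamic-rec-link" href="https://advertiser.example/offer">Offer</a>
    <a class="ob-dynamic-rec-link" href="https://publisher.example/story">Story</a>
    <a class="nav" href="https://publisher.example/">Home</a>
  </div>
  <div class="zergentity"><a href="https://zergnet.example/x">Z</a></div>
</body></html>
"""

# The two example queries from the slide, adapted to ElementTree's XPath subset.
QUERIES = {
    "Outbrain": ".//a[@class='ob-dynamic-rec-link']",
    "ZergNet": ".//div[@class='zergentity']",
}


def extract_links(page: str) -> dict[str, list[str]]:
    """Return the link targets matched by each CRN's query."""
    root = ET.fromstring(page)
    found = {}
    for crn, query in QUERIES.items():
        links = []
        for node in root.findall(query):
            # A match is either a link itself or a container wrapping links.
            anchors = [node] if node.tag == "a" else node.findall(".//a")
            links.extend(a.get("href") for a in anchors if a.get("href"))
        found[crn] = links
    return found
```

Calling `extract_links(PAGE)` yields the two Outbrain-style links and the single ZergNet-style link, and skips the navigation link because its class does not match.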
slide-37
SLIDE 37

Final Dataset

7

  • 5 CRNs
  • 500 publishers
  • 53K recommendations (links to first-party).
  • 131K ads (links to third-party).
  • 2,689 unique advertised domains.
slide-41
SLIDE 41

High level Statistics

8

  • Outbrain and Taboola are the major players.
  • Outbrain (1.5x), Taboola and Revcontent (5x) display more ads than recommendations per page, on average.
  • Gravity shows 2.7x more recommendations than ads per page, on average.

CRN          #Publishers   Total Ads   Total Recs
Outbrain     147           57446       35476
Taboola      176           56860       15660
Revcontent   29            576         16
Gravity      13            744         2054
ZergNet      14            15375       -

slide-42
SLIDE 42

Headlines and Disclosures

9

Question: Are CRNs explicitly labeling sponsored links as advertisements?

  • We take a look at widget headlines and disclosures.

[Figure: widget with its Headline and Disclosure labeled]

slide-47
SLIDE 47

Headlines

10

  • 88% of widgets provide a headline.

Recommendations                   Ads
Headline                    %     Headline              %
you might also like         17    around the web        18
featured stories            12    promoted stories      15
you may like                7     you may like          15
we recommend                7     you might also like   6
more from this site         4     from around the web   2
you might be interested in  2     trending today        2
trending now                1     we recommend          2

  • Keywords used across all headlines for ad widgets:
      • ‘promoted’ (12%), ‘advertiser’ (<1%)
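
The keyword percentages above are shares of headlines containing a given word. A minimal sketch of that calculation follows; the function name `keyword_share` and the sample headlines are hypothetical and are not the study's data.

```python
def keyword_share(headlines: list[str], keyword: str) -> float:
    """Percentage of headlines that contain `keyword`, ignoring case."""
    if not headlines:
        return 0.0
    hits = sum(keyword.lower() in h.lower() for h in headlines)
    return 100.0 * hits / len(headlines)


# Hypothetical sample, not the study's data.
sample = ["Promoted Stories", "Around the Web", "You May Like", "promoted links"]
print(keyword_share(sample, "promoted"))  # 50.0
```

Substring matching is the simplest choice here; a stricter version could tokenize the headline so that a keyword only matches whole words.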
slide-51
SLIDE 51

Disclosures

11

Overall, 94% of widgets had disclosures.

CRN          Disclosed (%)
Outbrain     90.8
Taboola      97.1
Revcontent   100.0
Gravity      81.6
ZergNet      24.1

Verdict: ✔ ✔ ✔

slide-59
SLIDE 59

Disclosures

11

Overall, 94% of widgets had disclosures.

CRN          Disclosed (%)
Outbrain     90.8
Taboola      97.1
Revcontent   100.0
Gravity      81.6
ZergNet      24.1

Verdict: ✗ ✗

slide-67
SLIDE 67

Widgets with Mixed Content

12

Overall, 11.9% of widgets display mixed content.

CRN          Mixed (%)
Outbrain     16.9
Taboola      9.0
Revcontent   -
Gravity      25.5
ZergNet      -

Verdict: ✗ ✔ ✔ ✗ ✗
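
A mixed-content percentage like the one above could be computed along these lines. This is a sketch under a stated assumption: that "mixed" means a single widget holding both first-party links (recommendations) and third-party links (ads). The function names and the hostname comparison are hypothetical, not the study's code.

```python
from urllib.parse import urlparse


def is_mixed(publisher_host: str, widget_links: list[str]) -> bool:
    """True if the widget holds both first-party and third-party links."""
    kinds = set()
    for link in widget_links:
        # Relative links have no hostname; treat them as the publisher's own.
        host = (urlparse(link).hostname or publisher_host).lower()
        first_party = host == publisher_host or host.endswith("." + publisher_host)
        kinds.add("recommendation" if first_party else "ad")
    return len(kinds) == 2


def mixed_share(publisher_host: str, widgets: list[list[str]]) -> float:
    """Percentage of widgets that mix recommendations and ads."""
    if not widgets:
        return 0.0
    mixed = sum(is_mixed(publisher_host, w) for w in widgets)
    return 100.0 * mixed / len(widgets)
```

Here `publisher_host` is expected in lowercase (for example `"publisher.example"`), and each widget is simply the list of link targets it contains.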

slide-73
SLIDE 73

What is being Advertised?

13

We look at 131K third-party URLs from 2,689 unique advertisers.

  • We did not click on any ads.

Topic              Example Keywords               Landing pages (%)
Listicles          improve, scams, experience     18.46
Credit Cards       credit, card, interest         16.09
Celebrity Gossip   Kardashians, sexiest, caught   10.94
Mortgages          mortgage, HARP, loan           8.76
Movies             Hollywood, Batman, Marvel      5.90
Health & Diet      diabetes, fat, stomach         5.62
Investment         Dow, dividend, stocks          1.57
Penny Auctions     auction, bid, pennies          1.15

  • Almost 35% of articles are ‘content’.
  • More than 20% of articles are about mortgages or credit cards.
  • Articles about financial services and miracle diets.
slide-75
SLIDE 75

Summary

  • We present the first evaluation of Content Recommendation Networks (CRNs).
  • What can be done?
      • Stronger regulations.
      • CRNs should enforce clear labels.
      • Effective spam filtering.

14

Questions?

ahmad@ccs.neu.edu

personalization.ccs.neu.edu

[Summary table: CRNs (Outbrain, Taboola, Revcontent, Gravity, ZergNet) rated on Headlines, Disclosures, Mixed Content, Quality]

✔ ✔ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✔ ✔ ✗ ✗ ✗ ✗ ✗ ✗ ✗