TraffickStop: Detecting and Measuring Illicit Traffic Monetization Through Large-scale DNS Analysis




slide-1
SLIDE 1

Baojun Liu, Zhou Li, Peiyuan Zong, Chaoyi Lu, Haixin Duan, Ying Liu, Sumayah Alrwais, Xiaofeng Wang, Shuang Hao, Yaoqi Jia, Yiming Zhang, Kai Chen and Zaifeng Zhang

TraffickStop:

Detecting and Measuring Illicit Traffic Monetization Through Large-scale DNS Analysis

slide-2
SLIDE 2

Illicit Traffic Monetization

https://marketingland.com/study-how-pay-per-view-networks-cost-advertisers-180-million-a-year-in-impression-fraud-55484
https://www.forbes.com/sites/thomasbrewster/2016/12/20/methbot-biggest-ad-fraud-busted/#64ae66fe4899
https://adage.com/article/digital/search-ad-click-fraud-scheme-cost-business-2-3-million/307933

slide-3
SLIDE 3

Connects site owners and affiliates.

Traffic Network

Roles:

  • Site owner (needs traffic)
  • Affiliate (refers traffic): search engine, web publisher, PC software
  • Traffic network (finds affiliates)

Affiliates send traffic to the site owner and receive a reward in return.

slide-4
SLIDE 4

Connects site owners and affiliates.

Traffic Network

Three types of traffic network:

  • eCommerce network
  • Advertising network
  • Navigation network

slide-5
SLIDE 5

Cheaters earn profit from site owners using invalid traffic.

Cheating in Traffic Networks

Site owner (needs traffic)

Traffic network & affiliates: real users and cheaters

Both real users and cheaters deliver traffic to the site owner and collect rewards.

slide-6
SLIDE 6

Cheaters earn profit from site owners using invalid traffic.

Cheating in Traffic Networks


A fraudulent site (FS) redirects user traffic to a program site (PS) of a traffic network. The process violates rules of traffic networks.

slide-7
SLIDE 7

Cheating happens EVERYWHERE!

Client-side: Browser Hijacking

  • Install PUP / malware on client machines
  • Reroute user traffic to targeted sites
  • Caused $8M loss in 2013

https://blog.malwarebytes.com/detections/adware-yontoo/

slide-8
SLIDE 8

Cheating happens EVERYWHERE!

Transport-layer: ISP Injection

  • Inject extra ads into web responses
  • Mitigation: HTTPS (relies on adoption rate)

https://techscience.org/a/2015103003/
http://xahlee.info/w/china_ISP_ad_injection.html

slide-9
SLIDE 9

Cheating happens EVERYWHERE!

Server-side: Search Ad Impersonation

  • Publish fake ads in search engines
  • Impersonate popular brands to trap more users

slide-10
SLIDE 10

Cheating happens EVERYWHERE!

Client-side: Browser Hijacking

  • Install PUP / malware on client machines
  • Reroute user traffic to targeted sites

Transport-layer: ISP Injection

  • Inject extra ads into web responses
  • Mitigation: HTTPS (relies on adoption rate)

Server-side: Search Ad Impersonation

  • Publish fake ads in search engines
  • Impersonate popular brands to trap more users

slide-11
SLIDE 11

Previous Works

“Active” approaches:

  • Honey ads [Dave 2012]
  • Inspection JS [Reis 2008, Thomas 2015]
  • Network probe [Dagon 2008, Kuhrer 2015]

Limitations:

  • Require deep involvement of publisher websites
  • Work on only one type of traffic fraud

slide-12
SLIDE 12

Our approach: Passive Analysis

slide-13
SLIDE 13

Ground Truth Collection

Manually collect 151 FSes for empirical study.

  • Search Ad Impersonation: 57 FS, cases from four-month Baidu search results of popular brand products
  • Browser Hijacking: 50 FS, cases from online posts and tech forums
  • ISP Injection: 44 FS, collected by custom Flash advertisement

slide-14
SLIDE 14

Key Features of FS

Manually collect 151 FSes for empirical study.

Webpage of bd.114la6.com, a typical FS

[Screenshot labels: affiliate code, traffic network]

Key Feature 1: AUTOMATIC & IMMEDIATE redirection to program sites.

Result: Strong domain correlation

slide-15
SLIDE 15

Key Features of FS

Manually collect 151 FSes for empirical study.

Webpage of bd.114la6.com, a typical FS

Key Feature 2: The page only performs redirection, without anything else.

Result: Meaningless content

slide-16
SLIDE 16

TraffickStop: Passive Analysis

Pipeline:

  • Data Collection: passive DNS & DNS logs, URL, WHOIS
  • Association Finder: finds domains with strong correlation
  • Content Analyzer: examines suspicious behaviors between domains

slide-17
SLIDE 17

Association Finder

Find domain pairs {X, Y} with strong correlation.

Criteria and metrics (association analysis):

  • A. X and Y appear together with high frequency (metric: support)
  • B. When X is observed, Y can be observed with high probability (metric: confidence)
  • C. The visit interval between X and Y is small (metric: decay)
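The three criteria above can be sketched in a few lines of Python. The slides name the metrics (support, confidence, decay) but give no formulas, so everything below is an assumption made for illustration: the input layout, the `window` and `tau` parameters, the choice of exponential decay, and the function name `pair_metrics` are all hypothetical.

```python
import math
from collections import defaultdict

def pair_metrics(sessions, window=5.0, tau=1.0):
    """Compute (support, confidence, decay) for ordered domain pairs (X, Y).

    sessions: dict mapping a client ID to a time-sorted list of
              (timestamp_seconds, domain) tuples. This layout is assumed.
    window:   only count Y if it is seen within `window` seconds after X.
    tau:      time constant of the assumed exponential decay.
    """
    x_count = defaultdict(int)      # how often X is observed at all
    pair_count = defaultdict(int)   # how often Y follows X within the window
    decay_sum = defaultdict(float)  # sum of exp(-interval / tau) per pair

    for visits in sessions.values():
        for i, (t_x, x) in enumerate(visits):
            x_count[x] += 1
            seen = set()  # count each Y at most once per occurrence of X
            for t_y, y in visits[i + 1:]:
                dt = t_y - t_x
                if dt > window:
                    break
                if y == x or y in seen:
                    continue
                seen.add(y)
                pair_count[(x, y)] += 1
                decay_sum[(x, y)] += math.exp(-dt / tau)

    # support: raw co-occurrence count (criterion A)
    # confidence: share of X observations followed by Y (criterion B)
    # decay: mean exp(-interval / tau), near 1 for short intervals (criterion C)
    return {
        pair: (n, n / x_count[pair[0]], decay_sum[pair] / n)
        for pair, n in pair_count.items()
    }
```

A pair would then be kept when all three values clear thresholds chosen by the analyst; the slides do not state the thresholds, so none are suggested here.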

slide-18
SLIDE 18

Association Finder

Implementation: FP-Growth algorithm with MapReduce.

  • Map procedure: calculate the interval between two domain visits.
  • Reduce procedure: calculate the frequency of domain pairs to find those that are highly correlated.
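The map/reduce split described above might look roughly like the following single-process Python sketch. It is not FP-Growth and not a real MapReduce job; it only mirrors the two procedures named on the slide. The log layout, the `max_interval` and `min_frequency` parameters, and all function names are assumptions.

```python
from collections import defaultdict

def map_phase(client_log, max_interval=5.0):
    """Emit ((x, y), interval) for consecutive domain visits of one client.

    client_log: time-sorted list of (timestamp_seconds, domain) tuples.
    """
    for (t_x, x), (t_y, y) in zip(client_log, client_log[1:]):
        interval = t_y - t_x
        if x != y and interval <= max_interval:
            yield (x, y), interval

def reduce_phase(mapped, min_frequency=2):
    """Count each emitted pair and keep those seen at least min_frequency times."""
    counts = defaultdict(int)
    for pair, _interval in mapped:
        counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_frequency}

def find_frequent_pairs(logs, max_interval=5.0, min_frequency=2):
    """Run the map phase over every client log, then reduce the combined output."""
    mapped = []
    for client_log in logs:
        mapped.extend(map_phase(client_log, max_interval))
    return reduce_phase(mapped, min_frequency)
```

In a real deployment the map output would be shuffled by pair key across workers before the reduce step; here a single in-memory list stands in for that shuffle.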

slide-19
SLIDE 19

Content Analyzer

Examine Redirection + Meaningless content.

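[Diagram: a suspicious domain with strong correlation to a program site → top 10 URLs from the URL dataset → dynamic crawler → webpages. If a page redirects to an eCommerce, navigation, or advertising program site, the domain is a candidate FS. Content-based clustering is applied to the crawled pages.]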
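As a rough illustration of the "redirection + meaningless content" check, the sketch below flags a crawled page when it lands on a different host that belongs to a known program site and the original page carries almost no visible text. The program-site table, the `max_text_chars` threshold, and the function name are hypothetical; the slides do not describe how the real Content Analyzer makes this decision.

```python
from urllib.parse import urlparse

# Placeholder program sites and their traffic network type (assumed values).
PROGRAM_SITES = {
    "shop.example": "eCommerce",
    "nav.example": "Navigation",
    "ads.example": "Advertising",
}

def classify_redirect(start_url, final_url, visible_text,
                      program_sites=PROGRAM_SITES, max_text_chars=50):
    """Return the traffic network type if the page looks like an FS, else None.

    start_url:    URL handed to the crawler.
    final_url:    URL the crawler ended up on after following redirects.
    visible_text: text rendered by the starting page, if any.
    """
    start_host = urlparse(start_url).hostname or ""
    final_host = urlparse(final_url).hostname or ""
    if start_host == final_host:
        return None  # no cross-host redirection happened
    if len(visible_text.strip()) > max_text_chars:
        return None  # the page carries real content of its own
    for domain, category in program_sites.items():
        # Match the program site itself or any of its subdomains.
        if final_host == domain or final_host.endswith("." + domain):
            return category
    return None
```

The content-based clustering step mentioned on the slide is not covered by this sketch.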

slide-20
SLIDE 20

System Evaluation

Input: 2-week DNS logs (231 billion requests), processed by Association Finder and Content Analyzer.

Output: 2,465 fraud URLs flagged as FS.

72.7% accuracy (1,792/2,465)

Validation Rules:

  • A. Serving illegal or unreadable content
  • B. Forcing redirection
  • C. URL contains affiliate ID

Detect three types of fraud at a time.

Accuracy by traffic network type:

  • eCommerce: 89.4%
  • Navigation: 67.5%
  • Advertising: 74.8%
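Validation rule C (URL contains affiliate ID) lends itself to a small automated check. The sketch below uses Python's standard `urllib.parse`; the parameter names in `AFFILIATE_PARAMS` are made up for illustration, since the slides do not list the identifiers that real traffic networks use.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical query parameter names that carry an affiliate ID.
AFFILIATE_PARAMS = {"aff_id", "affiliate", "pid"}

def has_affiliate_id(url, params=AFFILIATE_PARAMS):
    """Return True if the URL's query string carries a non-empty affiliate parameter."""
    # parse_qs drops parameters with blank values by default,
    # so "aff_id=" alone does not count as an affiliate ID.
    query = parse_qs(urlparse(url).query)
    return any(name in query for name in params)
```

Rules A and B need page content and crawl behavior, so they are not covered here.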

slide-21
SLIDE 21

Measurement & Analysis

slide-22
SLIDE 22

Fraud Scale

1,457 FS SLDs are confirmed by TraffickStop.

1-year passive DNS data (May 2017 - Apr 2018, ~15% of DNS traffic in China)

  • 53 billion total DNS queries to these FSes
  • 85%+ of FSes are active for 300+ days
  • 96%+ of FSes receive 100K+ queries each

slide-23
SLIDE 23

Search Ad Impersonation

Buying ads on search engines to attract visits.

Of the 1,457 fraud SLDs, 23 are ad fraud SLDs (all redirecting to taobao.com).

slide-24
SLIDE 24

Search Ad Impersonation

23 Ad fraud SLDs redirecting to taobao.com.

  • 1M+ total visits
  • Hundreds of keywords bought under each domain

slide-25
SLIDE 25

Economic Loss

Loss = (Total Visits x Traffic Ratio) x Reward x Probability

Estimated loss by program site:

  • taobao.com: $53.8K
  • jd.com: $18.9K
  • Baidu: $13.3K
  • Hao123: $2.5K
  • 360 Navigation: $1.0K

Thousands of dollars are lost per day due to traffic fraud.
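The loss formula on this slide is a plain product, which the following Python sketch spells out. The slide does not define the four terms in detail, so the parameter descriptions in the docstring are one plausible reading, and the numbers in the usage example are invented inputs, not values from the study.

```python
def estimated_loss(total_visits, traffic_ratio, reward, probability):
    """Loss = (Total Visits x Traffic Ratio) x Reward x Probability.

    total_visits:  visits observed for the fraudulent sites.
    traffic_ratio: assumed to be the share of those visits sent to one program site.
    reward:        assumed to be the payout per rewarded visit, in dollars.
    probability:   assumed to be the chance that a referred visit earns the reward.
    """
    return (total_visits * traffic_ratio) * reward * probability

# Invented inputs for illustration only.
example = estimated_loss(total_visits=1_000_000, traffic_ratio=0.5,
                         reward=0.02, probability=0.1)
print(f"Estimated loss: ${example:,.2f}")
```

Because the formula is linear in every term, an error in any single estimate scales the final loss by the same factor.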

slide-26
SLIDE 26

New Strategy: Ad Reselling

Evading fraud detection of advertising platforms.

[Diagram labels: publisher, advertiser, other sites, load ads, revenue, check fraud, no relation]

slide-27
SLIDE 27

New Strategy: Ad Reselling

Evading fraud detection of advertising platforms.

[Diagram labels: FS, gray publisher, advertiser]

slide-28
SLIDE 28

Case Study: P2P Traffic Pal

A distributed platform that generates traffic from real users.

Example requests sent to clients running this software:

  • “Help me like this post at http://xxx!”
  • “Help me play this video: http://yyy!”

slide-29
SLIDE 29

Summary

  • A new passive approach to detect three kinds of illicit traffic monetization
  • 1,457 fraudulent sites detected
  • 72.7% overall accuracy
  • Measurement on scale, evasion and impact on legitimate parties
