The Ad Wars: Retrospective Measurement and Analysis of Anti-Adblock Filter Lists
Umar Iqbal*, Zubair Shafiq*, and Zhiyun Qian†
The University of Iowa*
University of California-Riverside†
Agenda
The Ad Wars: online ads, adblocking, anti-adblocking, anti-anti-adblocking
Anti-adblock filter list analysis
Retrospective coverage analysis
Detecting anti-adblock scripts
Advertising enables free content
Publishers show free content and earn revenue with ads
Problems with ads
Privacy, intrusiveness, malware, performance
Solution
Adblocking
Privacy Badger & Ghostery
Adblock Plus & Adblock
Brave browser & Cliqz browser
Apple Safari & Google Chrome
[Figure: a client sends HTTP requests to www.example.com and receives HTTP responses with 1st party content; 3rd party content and ads come from a 3rd party server and an ad server. Adblockers block HTTP requests and block HTML elements.]
Crowdsourced filter lists: EasyList, Disconnect.me
HTTP request filter rules: domain anchor (||), domain tag (domain=)
HTML element filter rules: with domain restriction, without domain restriction
Exception rules: HTTP exception rules, HTML exception rules
! Rule with domain anchor
||example.com
! Rule with domain tag
/example.js$script,domain=example.com
! Rule with domain restriction
example.com###examplebanner
! Rule without domain restriction
###examplebanner
! Exception rule for HTTP request
@@/example.js$script,domain=example1.com
! Exception rule for HTML element
example.com#@##examplebanner
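The rule categories above can be distinguished from syntax alone. The following is a minimal Python sketch, assuming Adblock Plus style syntax as in the examples. The function name `classify_rule` and the category labels are hypothetical, chosen for illustration, and the sketch ignores most options that real filter lists use.

```python
def classify_rule(rule: str) -> str:
    """Return a coarse, illustrative category label for one filter-list line."""
    rule = rule.strip()
    if not rule or rule.startswith("!"):
        return "comment"
    # HTML element exception rules use the '#@#' separator.
    if "#@#" in rule:
        return "html_exception"
    # HTML element hiding rules use the '##' separator.
    if "##" in rule:
        domains = rule.split("##", 1)[0]
        return "html_with_domain" if domains else "html_without_domain"
    # Everything else is treated as an HTTP request rule.
    if rule.startswith("@@"):
        return "http_exception"
    if rule.startswith("||"):
        return "http_domain_anchor"
    options = rule.split("$", 1)[1] if "$" in rule else ""
    if any(opt.strip().startswith("domain=") for opt in options.split(",")):
        return "http_domain_tag"
    return "http_other"


examples = [
    "||example.com",
    "/example.js$script,domain=example.com",
    "example.com###examplebanner",
    "###examplebanner",
    "@@/example.js$script,domain=example1.com",
    "example.com#@##examplebanner",
]
labels = [classify_rule(r) for r in examples]
```

A rule that combines a domain anchor with a domain tag is reported as `http_domain_anchor` here, since the anchor check runs first; a fuller classifier would report both properties.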
Anti-Adblock Killer (2014)
353 to 1,811 filter rules; 6.2 filter rules for every revision
EasyList (2011)
Anti-Adblock sections: 67 to 1,317 filter rules; 0.6 filter rules per day
Warning Removal List (2013)
4 to 167 filter rules; 0.2 filter rules per day
Domains: Anti-Adblock Killer 1,415; Combined EasyList 1,394; common domains 282
Combined EasyList 4:1; Anti-Adblock Killer 1:1
Domain categorization
Promptness in adding new rules
64% appear first in Combined EasyList
34% appear first in Anti-Adblock Killer
2% appear at the same time
The Wayback Machine archives web pages
279 billion webpages; archives webpage resources as well; used in prior literature [USENIX Security '16]; API to retrieve content
Alexa top 5K websites
5 years (2011–2016)
Wayback Machine is incomplete!
robots.txt permissions, partial snapshots, outdated URLs, not archived URLs
Missing snapshots
[Figure: crawling pipeline. Domains; request to the Wayback Machine JSON API; remove not archived domains; list of Wayback URLs with timestamps; request Wayback Machine URLs with Selenium; store requests/responses and HTML content (data repository); remove partial snapshots; match crawled content with anti-adblock filter lists (filter list matching).]
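The lookup step of the pipeline can be sketched as follows. This is a minimal Python sketch, assuming the public Wayback Machine availability endpoint (`https://archive.org/wayback/available`, with `url` and `timestamp` query parameters); whether the original crawl used exactly this endpoint is an assumption. The helper names and the `max_gap_days` threshold for treating a snapshot as outdated are hypothetical.

```python
import json
from datetime import datetime, timedelta
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(domain: str, when: datetime) -> str:
    """Build the query URL asking for the snapshot closest to `when`."""
    query = urlencode({"url": domain, "timestamp": when.strftime("%Y%m%d")})
    return f"{AVAILABILITY_ENDPOINT}?{query}"


def closest_snapshot(response_text: str):
    """Return (wayback_url, snapshot_time), or None when nothing is archived."""
    payload = json.loads(response_text)
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    taken = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    return closest["url"], taken


def is_outdated(requested: datetime, taken: datetime, max_gap_days: int = 180) -> bool:
    """Hypothetical rule: a snapshot too far from the requested date is outdated."""
    return abs(taken - requested) > timedelta(days=max_gap_days)


# Example response in the shape returned by the availability endpoint.
sample = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "status": "200",
            "timestamp": "20130919044612",
            "url": "http://web.archive.org/web/20130919044612/http://example.com/",
        }
    }
})
snapshot = closest_snapshot(sample)
```

In a real crawl the query URL would be fetched (for example with `urllib.request.urlopen`) and the returned Wayback URL handed to the browser automation step; the sketch keeps network access out so the parsing logic can be checked in isolation.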
[Figure: number of websites that trigger HTTP rules and number of websites that trigger HTML rules. Annotated values: 331 websites, 16 websites, 5 websites, 4 websites.]
82% Anti-Adblockers 32% of Anti-Adblockers
JS file → unpacked JS file → anti-adblocking JS / non anti-adblocking JS
Unpack packed JavaScript files with the V8 engine
Construct ASTs from the unpacked JavaScript code
Extract features from ASTs and filter features with low correlation
Train AdaBoost using SVM as base classifier
Classify anti-adblocking and non anti-adblocking JavaScripts
JavaScript code example
if (ad_element.clientHeight == 0) { BlockAdBlock = "abp"; }

Unpack eval() using the V8 engine
Packed code: eval("var BlockAdBlock = \"abp\";");
Unpacked code: var BlockAdBlock = "abp";

Construct Abstract Syntax Tree (AST)
[Figure: AST of the code example, with node types IfStatement, ExpressionStatement, and Identifier, and the texts ad_element, clientHeight, BlockAdBlock, and abp.]

Feature sets
All: AssignmentExpression:BlockAdBlock
Literal: Literal:abp
Keyword: Identifier:clientHeight

Feature mapping
$$\phi : x \rightarrow \left(\phi_s(x)\right)_{s \in S}$$
$$\phi_s(x) = \begin{cases} 1, & \text{if } x \text{ contains the feature } s \\ 0, & \text{otherwise} \end{cases}$$
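The mapping from a script to a binary feature vector can be illustrated with a short Python sketch. It assumes the AST is available as ESTree-style nested dictionaries (the shape produced by common JavaScript parsers); the AST below is hand-written for the code example rather than produced by a parser. Features are simplified to `type:text` pairs for Identifier and Literal nodes, which is narrower than the context:text pairing shown for the "all" feature set, and the function names and the feature set `S` are hypothetical.

```python
def extract_features(node, found=None):
    """Collect 'type:text' features from an ESTree-style AST (nested dicts/lists)."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        node_type = node.get("type")
        if node_type == "Identifier":
            found.add(f"Identifier:{node['name']}")
        elif node_type == "Literal":
            found.add(f"Literal:{node['value']}")
        for child in node.values():
            extract_features(child, found)
    elif isinstance(node, list):
        for child in node:
            extract_features(child, found)
    return found


def phi(script_features, feature_set):
    """Binary vector: 1 where the script contains feature s, else 0, for s in S."""
    return [1 if s in script_features else 0 for s in feature_set]


# Hand-written AST for: if (ad_element.clientHeight == 0) { BlockAdBlock = "abp"; }
example_ast = {
    "type": "IfStatement",
    "test": {
        "type": "BinaryExpression",
        "operator": "==",
        "left": {
            "type": "MemberExpression",
            "object": {"type": "Identifier", "name": "ad_element"},
            "property": {"type": "Identifier", "name": "clientHeight"},
        },
        "right": {"type": "Literal", "value": 0},
    },
    "consequent": {
        "type": "BlockStatement",
        "body": [{
            "type": "ExpressionStatement",
            "expression": {
                "type": "AssignmentExpression",
                "operator": "=",
                "left": {"type": "Identifier", "name": "BlockAdBlock"},
                "right": {"type": "Literal", "value": "abp"},
            },
        }],
    },
}

# Hypothetical feature set S; the last entry does not occur in the example.
S = ["Identifier:clientHeight", "Literal:abp", "Identifier:offsetWidth"]
vector = phi(extract_features(example_ast), S)
```

Each script becomes one row of a binary matrix with one column per feature in `S`, which is the input a classifier would be trained on.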
372 anti-adblocking scripts; 4,021 non anti-adblocking scripts
Filter using χ² correlation to reduce features
AdaBoost + SVM; 10-fold cross validation
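The χ² filtering step can be sketched for binary features and binary labels using the standard statistic for a 2×2 contingency table. This is a minimal Python sketch: the function names are hypothetical, and the threshold used in the example (3.84, the 5% critical value for one degree of freedom) is an illustrative choice, not the criterion used in the original work.

```python
def chi_square(feature_column, labels):
    """Chi-square statistic for one binary feature against binary labels."""
    a = b = c = d = 0
    for present, positive in zip(feature_column, labels):
        if present and positive:
            a += 1  # feature present, anti-adblocking
        elif present:
            b += 1  # feature present, non anti-adblocking
        elif positive:
            c += 1  # feature absent, anti-adblocking
        else:
            d += 1  # feature absent, non anti-adblocking
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # constant feature or single-class data carries no signal
    return n * (a * d - b * c) ** 2 / denom


def select_features(matrix, labels, names, threshold):
    """Keep the names of features whose chi-square score reaches the threshold."""
    keep = []
    for j, name in enumerate(names):
        column = [row[j] for row in matrix]
        if chi_square(column, labels) >= threshold:
            keep.append(name)
    return keep


# Toy data: rows are scripts, columns are features A, B, C.
names = ["A", "B", "C"]
matrix = [
    [1, 1, 1],
    [1, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
]
labels = [1, 1, 0, 0]
kept = select_features(matrix, labels, names, threshold=3.84)
```

Feature A tracks the label exactly and is kept; B is independent of the label and C is constant, so both are dropped.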
Feature set   Classifier       Number of features   TP rate (%)   FP rate (%)
all           AdaBoost + SVM   10K                  99.6          3.9
literal       AdaBoost + SVM   10K                  99.6          3.9
keyword       AdaBoost + SVM   1K                   99.7          3.2

Results in terms of
True Positive (TP) rate: anti-adblocking scripts correctly classified as anti-adblocking
False Positive (FP) rate: non anti-adblocking scripts incorrectly classified as anti-adblocking
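The two rates follow directly from confusion-matrix counts. A minimal Python sketch, with hypothetical counts that are not taken from the evaluation:

```python
def tp_rate(tp: int, fn: int) -> float:
    """Percentage of anti-adblocking scripts that are classified as anti-adblocking."""
    return 100.0 * tp / (tp + fn)


def fp_rate(fp: int, tn: int) -> float:
    """Percentage of non anti-adblocking scripts classified as anti-adblocking."""
    return 100.0 * fp / (fp + tn)


# Hypothetical counts for illustration only.
example_tp_rate = tp_rate(tp=90, fn=10)
example_fp_rate = fp_rate(fp=5, tn=95)
```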
Test in the wild on Alexa top 100K websites
2,701 detected anti-adblockers; TP rate of 92.5%
Complement manual analysis
Periodic crawl to expedite manual process; substantial reduction of manual effort
Retrospective analysis on Alexa top 5K websites from 2011 to 2016: effectiveness and evolution
Static analysis to detect anti-adblocking scripts: complements filter list rule creation
Can be used to study similar filter lists: malware, tracking, censorship
Study of Web Tracking from 1996 to 2016. In USENIX Security Symposium, 2016.
USENIX Security Symposium, 2011.
Detection of Trackers via One-class Learning. In Privacy Enhancing Technologies Symposium (PETS), 2017.