
Retrospective Measurement and Analysis of Anti-Adblock Filter Lists



  1. The Ad Wars: Retrospective Measurement and Analysis of Anti-Adblock Filter Lists. Umar Iqbal*, Zubair Shafiq*, and Zhiyun Qian†. *The University of Iowa; †University of California-Riverside.

  2. Agenda. The Ad Wars: online ads, adblocking, anti-adblocking, anti-anti-adblocking. Contributions: anti-adblock filter list analysis, retrospective coverage analysis, detecting anti-adblock scripts. Conclusion.

  3. Online Advertising. Advertising enables free content: publishers show free content and earn revenue with ads. Problems with ads: privacy, intrusiveness, malware, performance. Solution: adblocking.

  4. Ad/Tracker Blocking Solutions. Tracker-blocking extensions: Privacy Badger and Ghostery. Adblocking extensions: Adblock Plus and Adblock. Ad/tracker-blocking browsers: Brave browser and Cliqz browser. Mainstream browsers with ad/tracker blocking: Apple Safari and Google Chrome.

  5. How Do Adblockers Work? Adblockers use crowdsourced filter lists (EasyList, Disconnect.me) to block HTTP requests and block HTML elements. [Figure: a client sends an HTTP request to www.example.com and receives an HTTP response with 1st-party content; ads and other 3rd-party content come from an ad server and a 3rd-party server.]

  6. Publishers vs Adblockers. Acceptable Ads program: whitelisting fee, transparency concerns, enabled by default in major adblockers. Use of anti-adblockers: insert bait elements, detect adblockers, prompt users to disable the adblocker or whitelist the website.

  7. Anti-Anti-Adblocking. Block or allow bait HTTP requests; hide or allow bait HTML elements. Use anti-adblocking filter lists: Anti-Adblock Killer, EasyList.

  8. Agenda. The Ad Wars: online ads, adblocking, anti-adblocking, anti-anti-adblocking. Contributions: anti-adblock filter list analysis, retrospective coverage analysis, detecting anti-adblock scripts. Conclusion.

  9. Filter List Rules.
     HTTP request filter rules:
       ! Rule with domain anchor (||)
       ||example.com
       ! Rule with domain tag (domain=)
       /example.js$script,domain=example.com
     HTML element filter rules:
       ! Rule with domain restriction
       example.com###examplebanner
       ! Rule without domain restriction
       ###examplebanner
     Exception rules:
       ! Exception rule for HTTP request
       @@/example.js$script,domain=example1.com
       ! Exception rule for HTML element
       example.com#@##examplebanner
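
The rule types on this slide can be told apart by a few syntactic markers. The following is a minimal sketch, assuming only the syntax shown above (`!` comments, `@@` exceptions, `##` and `#@#` element rules, `$...domain=` options); `FilterRule` and `parse_rule` are hypothetical names, and real filter lists contain many more options than this handles.

```python
from dataclasses import dataclass, field


@dataclass
class FilterRule:
    kind: str                 # "http" (request rule) or "html" (element rule)
    exception: bool           # True if the rule allows instead of blocking/hiding
    pattern: str              # URL pattern or element selector
    domains: list = field(default_factory=list)  # empty list = applies everywhere


def parse_rule(line):
    """Classify one filter list line; returns None for comments and blanks."""
    line = line.strip()
    if not line or line.startswith("!"):
        return None
    # HTML element rules: "#@#" marks an exception, "##" a hiding rule.
    for separator, is_exception in (("#@#", True), ("##", False)):
        if separator in line:
            domains, selector = line.split(separator, 1)
            return FilterRule("html", is_exception, selector,
                              [d for d in domains.split(",") if d])
    # Everything else is treated as an HTTP request rule.
    is_exception = line.startswith("@@")
    if is_exception:
        line = line[2:]
    pattern, _, options = line.partition("$")
    domains = []
    for option in options.split(","):
        if option.startswith("domain="):
            domains = option[len("domain="):].split("|")
    return FilterRule("http", is_exception, pattern, domains)


if __name__ == "__main__":
    for text in ["||example.com", "example.com#@##examplebanner"]:
        print(parse_rule(text))
```

Checking for `#@#` before `##` matters: an exception rule such as `example.com#@##examplebanner` also contains `##`, so the more specific separator has to be tried first.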

  10. Popular Filter Lists. Anti-Adblock Killer (2014): 353 to 1,811 filter rules; 6.2 filter rules for every revision. Combined EasyList (EasyList + Warning Removal List): EasyList anti-adblock sections (2011), 67 to 1,317 filter rules, 0.6 filter rules per day; Warning Removal List (2013), 4 to 167 filter rules, 0.2 filter rules per day.

  11. Anti-Adblock Killer vs Combined EasyList: Different Strategies of Crafting Anti-Adblocking Rules. Number of domains: Anti-Adblock Killer 1,415; Combined EasyList 1,394; common domains 282. Domain categorization: similar distribution of Alexa ranking; similar distribution for categories. Exception vs non-exception domains: Combined EasyList 4:1; Anti-Adblock Killer 1:1.

  12. Anti-Adblock Killer vs Combined EasyList: Combined EasyList Is More Prompt in Adding New Rules. Of the 282 common domains, 34% appear first in Anti-Adblock Killer, 2% appear at the same time, and 64% appear first in Combined EasyList.
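
The comparison on this slide can be reproduced from per-domain first-appearance dates. The sketch below is illustrative only: `promptness_shares` and its inputs are hypothetical, and it assumes each list has already been reduced to a mapping from domain to the date its first rule was added.

```python
from datetime import date


def promptness_shares(first_seen_a, first_seen_b):
    """Percentage of common domains that appear first in A, in B, or in both at once.

    first_seen_a / first_seen_b: dict mapping domain -> date of first rule.
    """
    common = first_seen_a.keys() & first_seen_b.keys()
    counts = {"a_first": 0, "same_time": 0, "b_first": 0}
    for domain in common:
        if first_seen_a[domain] < first_seen_b[domain]:
            counts["a_first"] += 1
        elif first_seen_a[domain] > first_seen_b[domain]:
            counts["b_first"] += 1
        else:
            counts["same_time"] += 1
    if not common:
        return {key: 0.0 for key in counts}
    return {key: 100 * value / len(common) for key, value in counts.items()}


if __name__ == "__main__":
    # Made-up dates for illustration, not data from the study.
    list_a = {"x.com": date(2015, 1, 1), "y.com": date(2015, 6, 1)}
    list_b = {"x.com": date(2015, 3, 1), "y.com": date(2015, 2, 1)}
    print(promptness_shares(list_a, list_b))
```

Domains that occur in only one list are ignored, which matches restricting the comparison to the common domains.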

  13. Agenda. The Ad Wars: online ads, adblocking, anti-adblocking, anti-anti-adblocking. Contributions: anti-adblock filter list analysis, retrospective coverage analysis, detecting anti-adblock scripts. Conclusion.

  14. The Internet Archive's Wayback Machine. Archives web pages: 279 billion webpages, and archives webpage resources as well. Used in prior literature [USENIX Security '16]. API to retrieve content. Crawl scope: Alexa top 5K websites over 5 years (2011 to 2016). The Wayback Machine is incomplete: missing snapshots, robots.txt permissions, partial snapshots, outdated URLs, and URLs that were not archived.

  15. Analysis Workflow. [Figure: pipeline] Top 5K Alexa domains -> request to the Wayback Machine JSON API -> list of Wayback URLs with timestamps -> remove outdated URLs -> remove not archived domains -> request Wayback URLs with Selenium -> remove partial snapshots -> store requests/responses and HTML content (data repository) -> match crawled content with anti-adblock filter lists (filter list matching).
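
The first pipeline steps (query the Wayback Machine JSON API, then drop unarchived and outdated results) could be sketched as follows. This is a hedged illustration, not the authors' crawler: it assumes the public Wayback availability endpoint and its `archived_snapshots.closest` response shape, the helper names are hypothetical, and the 180-day drift limit is an assumed threshold for what counts as "outdated". The network call itself is left out so the parsing logic can be checked offline.

```python
import json
from datetime import datetime, timedelta
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(domain, when):
    """Build the availability API URL for the snapshot closest to `when`."""
    params = {"url": domain, "timestamp": when.strftime("%Y%m%d")}
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def usable_snapshot(response_text, when, max_drift_days=180):
    """Return the snapshot URL, or None if not archived or too far from `when`.

    max_drift_days is an assumption, not a value taken from the slides.
    """
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None  # domain not archived
    taken = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    if abs(taken - when) > timedelta(days=max_drift_days):
        return None  # outdated URL: closest snapshot is from another period
    return closest["url"]


if __name__ == "__main__":
    target = datetime(2014, 4, 1)
    print(availability_query("example.com", target))
    sample = json.dumps({"archived_snapshots": {"closest": {
        "available": True, "status": "200", "timestamp": "20140403000000",
        "url": "http://web.archive.org/web/20140403000000/http://example.com/"}}})
    print(usable_snapshot(sample, target))
```

The returned snapshot URL is what a browser automation step (Selenium in the slides) would then load to record requests, responses, and HTML.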

  16. Anti-Adblock Filter Lists Coverage: Anti-Adblock Killer Filter List Has Better Coverage. Websites use their respective filter lists for HTTP matching and HTML matching. Number of websites that trigger HTTP rules: Anti-Adblock Killer filter list, 331 websites; Combined EasyList filter list, 16 websites. Number of websites that trigger HTML rules: Anti-Adblock Killer filter list, 5 websites; Combined EasyList filter list, 4 websites.

  17. Anti-Adblock Filter Lists Coverage: Anti-Adblock Killer Filter List Has Better Coverage on the Live Web. Detection on the live web (Alexa top 100K): Anti-Adblock Killer, 4,942 websites; Combined EasyList, 195 websites.

  18. Anti-Adblock Filter Lists Lag. The lists are crowdsourced and manually maintained, so it is challenging to keep pace. New rules within 100 days: Combined EasyList, 82% of anti-adblockers; Anti-Adblock Killer, 32% of anti-adblockers. Combined EasyList is more prompt in adding new rules, while Anti-Adblock Killer has more coverage.

  19. Agenda. The Ad Wars: online ads, adblocking, anti-adblocking, anti-anti-adblocking. Contributions: anti-adblock filter list analysis, retrospective coverage analysis, detecting anti-adblock scripts. Conclusion.

  20. Static Code Analysis. Anti-adblocking code comes from 3rd-party vendors and has structural similarities. Static analysis captures code structure and is used to fingerprint anti-adblocking JavaScript. Prior work: Curtsinger [USENIX Security '11], Ikram [PETS '17].

  21. Anti-Adblock Detection Workflow. [Figure: pipeline] JS file -> unpack packed JavaScript files with the V8 engine -> unpacked JS file -> construct ASTs from unpacked JavaScript code -> extract features from ASTs and filter features with low correlation -> train AdaBoost using SVM as base classifier -> classify anti-adblocking and non anti-adblocking JavaScripts (anti-adblocking JS vs non anti-adblocking JS).

  22. Feature Extraction.
      Preprocessing: unpack eval() using the V8 engine, then construct an abstract syntax tree (AST).
      JavaScript code example:
        Packed code:   eval("var BlockAdBlock = \"abp\";");
        Unpacked code: var BlockAdBlock = "abp";
      Example script for AST construction:
        if (ad_element.clientHeight == 0) {
          BlockAdBlock = "abp";
        }
      AST node types: IfStatement, Identifier, ExpressionStatement, Literal. Leaves: clientHeight, ad_element, BlockAdBlock, abp.
      Features (context:text) per feature set:
        all:     (AssignmentExpression:BlockAdBlock)
        literal: (Literal:abp)
        keyword: (Identifier:clientHeight)
      Map scripts to a vector space:
        $\phi : x \to (\phi_s(x))_{s \in S}$, where $\phi_s(x) = \begin{cases} 1, & \text{if } x \text{ contains the feature } s \\ 0, & \text{otherwise} \end{cases}$
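
The mapping of scripts to binary vectors can be illustrated in a few lines. This is a minimal sketch that assumes the (context, text) features have already been extracted from each script's AST; `build_feature_space` and `to_vector` are hypothetical names, and the second script's feature is made up for illustration.

```python
def build_feature_space(scripts):
    """Feature set S: the sorted union of all (context, text) features seen."""
    return sorted(set().union(*scripts))


def to_vector(script_features, feature_space):
    """phi(x): one entry per feature s in S, 1 if x contains s, else 0."""
    return [1 if s in script_features else 0 for s in feature_space]


if __name__ == "__main__":
    # Features from the slide's example script.
    anti_adblock_script = {
        ("AssignmentExpression", "BlockAdBlock"),
        ("Literal", "abp"),
        ("Identifier", "clientHeight"),
    }
    # Hypothetical unrelated script.
    other_script = {("Identifier", "jQuery")}

    space = build_feature_space([anti_adblock_script, other_script])
    print(space)
    print(to_vector(anti_adblock_script, space))
    print(to_vector(other_script, space))
```

Sorting the feature space only fixes a stable column order so that vectors from different scripts line up.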

  23. Feature Selection & Training. Labeled data: 372 anti-adblocking scripts and 4,021 non anti-adblocking scripts. Feature selection: filter using χ² correlation to reduce the number of features. Classifier training: AdaBoost + SVM with 10-fold cross validation.
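
For a binary feature and a binary label, the χ² score can be computed from a 2×2 contingency table. The sketch below shows that standard statistic; it is not confirmed to be the exact procedure used in the talk, the function names are hypothetical, and the default threshold of 10.83 (the 0.1% significance level for one degree of freedom) is an assumption.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic of a 2x2 table.

    a: anti-adblocking scripts with the feature     b: anti-adblocking without it
    c: other scripts with the feature               d: other scripts without it
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # feature is present in all scripts or in none
    return n * (a * d - b * c) ** 2 / denominator


def select_features(labeled_scripts, threshold=10.83):
    """Keep features whose chi-square score reaches the (assumed) threshold.

    labeled_scripts: list of (feature_set, is_anti_adblocking) pairs.
    """
    positives = [f for f, is_aab in labeled_scripts if is_aab]
    negatives = [f for f, is_aab in labeled_scripts if not is_aab]
    vocabulary = set().union(*(f for f, _ in labeled_scripts))
    selected = []
    for s in sorted(vocabulary):
        a = sum(s in f for f in positives)
        c = sum(s in f for f in negatives)
        score = chi_square(a, len(positives) - a, c, len(negatives) - c)
        if score >= threshold:
            selected.append(s)
    return selected


if __name__ == "__main__":
    # Toy data: "bait" separates the classes, "common" occurs everywhere.
    data = [({"bait", "common"}, True)] * 10 + [({"common"}, False)] * 10
    print(select_features(data))
```

A feature that appears in every script carries no information about the label, which is why it scores zero and is filtered out.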

  24. Results & Evaluation. Results are reported in terms of true positive (TP) rate (correctly classified anti-adblocking scripts) and false positive (FP) rate (scripts incorrectly classified as anti-adblocking).

      Feature set | Classifier     | Number of features | TP rate (%) | FP rate (%)
      all         | AdaBoost + SVM | 10K                | 99.6        | 3.9
      literal     | AdaBoost + SVM | 10K                | 99.6        | 3.9
      keyword     | AdaBoost + SVM | 1K                 | 99.7        | 3.2

      Test in the wild on Alexa top 100K websites: 2,701 detected anti-adblockers, with a TP rate of 92.5%. Complement manual analysis: a periodic crawl can expedite the manual process, giving a substantial reduction of manual effort.
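
The two rates in the table follow from a confusion matrix. The sketch below uses the usual definitions (TP rate = TP / (TP + FN), FP rate = FP / (FP + TN)); the function name is hypothetical and the counts in the usage example are made up, not taken from the evaluation.

```python
def tp_fp_rates(tp, fn, fp, tn):
    """Return (TP rate, FP rate) as percentages.

    tp: anti-adblocking scripts classified as anti-adblocking
    fn: anti-adblocking scripts that were missed
    fp: other scripts wrongly classified as anti-adblocking
    tn: other scripts correctly classified
    """
    tp_rate = 100 * tp / (tp + fn) if (tp + fn) else 0.0
    fp_rate = 100 * fp / (fp + tn) if (fp + tn) else 0.0
    return tp_rate, fp_rate


if __name__ == "__main__":
    # Illustrative counts only.
    print(tp_fp_rates(tp=9, fn=1, fp=2, tn=18))
```

Because the labeled data is imbalanced (far more non anti-adblocking scripts), the FP rate is normalized by the negative class size rather than by the total number of scripts.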

  25. Key Takeaways. Comprehensive measurement study of anti-adblocking filter lists: retrospective analysis on Alexa top 5K websites from 2011 to 2016, covering effectiveness and evolution. Lightweight machine learning approach: static analysis to detect anti-adblocking scripts, complementing filter list rule creation. The Wayback Machine enables retrospective analysis: it can be used to study similar filter lists (malware, tracking, censorship).

  26. Questions? Umar Iqbal www.umariqbal.com @umaarr6

  27. References.
      A. Lerner, A. K. Simpson, T. Kohno, and F. Roesner. Internet Jones and the Raiders of the Lost Trackers: An Archaeological Study of Web Tracking from 1996 to 2016. In USENIX Security Symposium, 2016.
      C. Curtsinger, B. Livshits, B. Zorn, and C. Seifert. ZOZZLE: Fast and Precise In-Browser JavaScript Malware Detection. In USENIX Security Symposium, 2011.
      M. Ikram, H. J. Asghar, M. A. Kaafar, A. Mahanti, and B. Krishnamurthy. Towards Seamless Tracking-Free Web: Improved Detection of Trackers via One-class Learning. In Privacy Enhancing Technologies Symposium (PETS), 2017.
