deSEO: Combating Search-Result Poisoning
John P John, Fang Yu, Yinglian Xie, Arvind Krishnamurthy, Martin Abadi
University of Washington & MSR, Silicon Valley
The malware pipeline
find vulnerable web servers
compromise web servers and host malicious content
spread malicious links via email, IM, search results
bad stuff
networks, search results, etc.
malicious link in top results
year
[Diagram: search query; search engine, redirection server, exploit server, compromised Web server]
server - including executables
www.example.com/admin/file_manager.php/login.php?action=processuploads
www.example.com/images/page.php?page=kobayashi+arrested
kobayashi arrested
Check if search crawler
Generate page for keyword
Fetch: snippets from Google, images from Bing
Add links to other compromised sites
Cache page
crawling included links
[Timeline figure: τ0, τ1, τ2, τ3, τ0+T; intervals δ1, δ2, δ3]
[Plot: number of victim visits (0 to 8000) by date]
detected?
select domains where many new pages are set up, different from older pages
using K-means++
select groups where new pages are similar across domains
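The slides name K-means++ for the clustering step without giving details. As a reminder of what that algorithm does, here is a minimal pure-Python sketch: standard k-means whose initial centers are picked with probability proportional to the squared distance from the nearest center already chosen. The function names, the two-dimensional sample points, and the fixed iteration count are assumptions for illustration, not taken from deSEO.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_seeds(points, k, rng):
    """Pick k initial centers with K-means++ seeding.

    Assumes `points` holds at least k distinct points.
    """
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        # Far-away points are proportionally more likely to be picked next.
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

def kmeans(points, k, rng, iterations=20):
    """Lloyd's algorithm started from K-means++ seeds."""
    centers = kmeanspp_seeds(points, k, rng)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters

# Made-up numerical features, e.g. (number of arguments, length of keywords).
sample = [(1, 10), (1, 11), (2, 10), (5, 40), (5, 41), (6, 40)]
centers, clusters = kmeans(sample, 2, random.Random(0))
```

On this toy input the two well-separated groups end up in separate clusters. Real URL features mix strings, numbers, and word sets, so a distance function suited to those types would be needed in place of `squared_distance`.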
Sample web URLs with trendy keywords
http://www.askania-fachmaerkte.de/images/news.php?page=justin+bieber+breaks+neck
History based detection
Domain clustering
String features: keyword separators, arguments, filename, path
Numerical features: number of arguments, length of arguments, length of keywords
Bag of words: set of keywords
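The three feature families above could be pulled out of a URL roughly as follows. This is a minimal sketch, not the deSEO implementation: the function name `url_features`, the separator set `+-_`, and the choice to treat every query value as a keyword string are assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit

# Hypothetical set of characters treated as keyword separators.
SEPARATORS = "+-_"

def url_features(url):
    """Split a URL into string, numerical, and bag-of-words features."""
    parts = urlsplit(url)
    path, _, filename = parts.path.rpartition("/")
    # Split the raw query by hand so separators such as "+" stay visible.
    raw_args = [arg.partition("=") for arg in parts.query.split("&") if arg]
    names = [name for name, _, _ in raw_args]
    values = [value for _, _, value in raw_args]
    separators = sorted({ch for v in values for ch in v if ch in SEPARATORS})
    keywords = [w for v in values for w in re.split(r"[+\-_]", v) if w]
    return {
        "string": {
            "separators": separators,
            "arguments": names,
            "filename": filename,
            "path": path,
        },
        "numerical": {
            "num_arguments": len(names),
            "argument_length": sum(len(v) for v in values),
            "keyword_length": sum(len(w) for w in keywords),
        },
        "bag_of_words": set(keywords),
    }

features = url_features(
    "http://www.askania-fachmaerkte.de/images/news.php?page=justin+bieber+breaks+neck"
)
```

For the sample URL from the slide, the filename is `news.php`, the single argument is `page`, the separator is `+`, and the bag of words is the four keywords.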
Group analysis
[Figure: two plots of the fraction of pages; x-axis labels garbled]
Regular expressions
.*\/xmlrpc\.php\/\?showc=\w+(\+\w+)+$
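A signature of this form can be checked against candidate URLs with Python's `re` module. The sketch below copies the pattern from the slide; the helper name `matches_signature` and the sample URLs on example.com are invented for illustration.

```python
import re

# Signature copied from the slide; the sample URLs below are invented.
SIGNATURE = re.compile(r".*\/xmlrpc\.php\/\?showc=\w+(\+\w+)+$")

def matches_signature(url):
    """Return True if the URL fits the signature."""
    return SIGNATURE.match(url) is not None

urls = [
    "http://www.example.com/blog/xmlrpc.php/?showc=kobayashi+arrested",
    "http://www.example.com/blog/xmlrpc.php/?showc=single",
    "http://www.example.com/images/page.php?page=kobayashi+arrested",
]
flagged = [u for u in urls if matches_signature(u)]
```

Only the first URL is flagged. The `(\+\w+)+` group requires at least two keywords joined by `+`, so the single-keyword URL is rejected, and the third URL lacks the `xmlrpc.php/?showc=` portion.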
in January 2011
[Plot: number of malicious links (0 to 10) by search result page (1 to 9)]
and used them to build deSEO