Web crawler system for collecting malicious activities
FIRST TC Mauritius 2016
Hisao Nashiwa, Internet Initiative Japan Inc.
Who am I?
- Threat analyst at “Internet Initiative Japan Inc.” (“IIJ” for short).
– IIJ is a Japanese ISP (the first commercial ISP in Japan).
- Member of the CSIRT called “IIJ-SECT”
– Our jobs include…
- Malware analysis
- Forensic investigation
- Incident response and handling
- Developing and operating honeypot and web crawler systems
- Surveying trends in malware and attack techniques
- Writing for our quarterly report (called “IIR”) and blogs
Hisao Nashiwa, CISSP
Motivation
- Typical recent malware infection vectors are drive-by downloads and malware-attached email.
- We want to observe web-based threat trends, especially exploit kit activity.
- For that reason, we need to crawl a large number of websites aimed at Japanese users (more than 300,000 sites/day) and detect threats.
Issues
- Simple web crawling tools do not implement a DOM parser that behaves like Internet Explorer’s.
– wget, curl, and even SpiderMonkey and jsunpack-n can’t process JavaScript content written for Internet Explorer.
– Thug has a DOM parser, but it is not the same as Internet Explorer’s.
- Several sandbox products can parse the DOM with a real Internet Explorer on a Windows VM, but they take a long time to analyze each site (5-15 min/website).
Solution
- We built an in-house light-weight sandbox (in other words, a web client honeypot) that runs a real Internet Explorer on a Windows VM to crawl websites.
– It takes 15-60 sec/website for light-weight analysis.
- We use a customized proxy server to analyze the crawling traffic.
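As a rough back-of-envelope check on why the per-site analysis time matters, the sketch below estimates how many clients would have to run in parallel to cover a daily crawl target. The function name and the assumption that each client crawls back to back with no idle or VM-reset time are illustrative only; the slides do not state how many clients are actually used.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def clients_needed(sites_per_day: int, seconds_per_site: int) -> int:
    """Lower bound on concurrent clients, assuming every client crawls
    sites back to back with no idle time (a simplifying assumption)."""
    return math.ceil(sites_per_day * seconds_per_site / SECONDS_PER_DAY)


target = 300_000  # "more than 300,000 sites/day" from the Motivation slide
for label, seconds in [
    ("light-weight sandbox, 15 sec/site", 15),
    ("light-weight sandbox, 60 sec/site", 60),
    ("sandbox product, 5 min/site", 5 * 60),
    ("sandbox product, 15 min/site", 15 * 60),
]:
    print(f"{label}: at least {clients_needed(target, seconds)} clients")
```

Under these assumptions the light-weight sandbox needs on the order of 50 to 200 parallel clients, while sending every site through a 5-15 minute analysis would need more than 1,000.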
(1) The controller issues the target website.
(2) The client honeypot crawls the website through the proxy. The traffic and the VM activity are then analyzed (phase 1 analysis).
(3) If phase 1 analysis detects suspicious activity, a separate sandbox product crawls the same website for deep analysis (takes several minutes).
(4) The controller collects the sandbox report and the phase 1 analysis result, then runs phase 2 analysis to classify the threat.
Web crawler system: Flow and Components
[Figure: components connected by flow steps (1)-(4): Mgmt server, Controller/DB, Phase2 analyzer, Light weight sandbox (client honeypot), Honey Clients, Custom MITM Proxy, Redirection Detector/Analyzer, Phase1 analyzer, Windows VM, Sandbox product, the internet.]
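The four-step flow can be sketched as a small pipeline in which the expensive sandbox run and the phase 2 classification happen only when phase 1 raises something. This is a minimal illustration, not IIJ's implementation: `CrawlResult`, `crawl_site` and the three stub callables are hypothetical names standing in for the honey client, the sandbox product and the phase 2 analyzer.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CrawlResult:
    url: str
    phase1_findings: List[str]
    sandbox_report: Optional[str] = None
    classification: Optional[str] = None


def crawl_site(
    url: str,
    light_crawl: Callable[[str], List[str]],
    deep_crawl: Callable[[str], str],
    classify: Callable[[List[str], str], Optional[str]],
) -> CrawlResult:
    # (1)-(2): the controller hands the URL to the light-weight honey
    # client; phase 1 analysis returns a list of suspicious findings.
    findings = light_crawl(url)
    result = CrawlResult(url=url, phase1_findings=findings)
    if not findings:
        return result  # nothing suspicious: skip the slow steps
    # (3): only suspicious sites are re-crawled by the sandbox product.
    result.sandbox_report = deep_crawl(url)
    # (4): phase 2 combines both results to classify the threat.
    result.classification = classify(findings, result.sandbox_report)
    return result


# Stub components for demonstration only (all hypothetical).
def fake_light_crawl(url: str) -> List[str]:
    return ["pe-download"] if "bad" in url else []


def fake_deep_crawl(url: str) -> str:
    return f"sandbox report for {url}"


def fake_classify(findings: List[str], report: str) -> Optional[str]:
    return "example-kit" if "pe-download" in findings else None


print(crawl_site("http://bad.example/", fake_light_crawl, fake_deep_crawl, fake_classify))
print(crawl_site("http://good.example/", fake_light_crawl, fake_deep_crawl, fake_classify))
```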
Details of analysis
- Phase 1 analysis
– HTTP header info
- FQDN length, destination port number, domain transitions, User-Agent, Content-Type, header length, response size and so on
– Simple content analysis
- File magic checking (PE, ZIP (jar, xap) and PDF are suspicious)
- Decompiling the ActionScript of SWF content with an SWF dumping tool and grepping for suspicious strings (such as XOR instructions)
– Windows VM activity
- Sysmon log, RAM usage, Java-related processes, etc.
- Flash trace log
These analyses are applied to all sessions (takes 15-60 sec/website).
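A few of the phase 1 checks are simple enough to sketch. The example below computes some of the listed header features and does the file magic check on a response body. The function names, the feature dictionary and the choice of magic prefixes (`MZ` for PE, `PK\x03\x04` for ZIP-based formats such as jar and xap, `%PDF` for PDF) are assumptions made for illustration; the slides do not show the actual implementation.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit

# Leading bytes of the file types the slides call suspicious.
MAGIC_PREFIXES = {
    b"MZ": "PE",
    b"PK\x03\x04": "ZIP",  # jar and xap are ZIP containers
    b"%PDF": "PDF",
}
DEFAULT_PORTS = {"http": 80, "https": 443}


def file_type_by_magic(body: bytes) -> Optional[str]:
    """Return the suspicious file type the body starts with, if any."""
    for magic, name in MAGIC_PREFIXES.items():
        if body.startswith(magic):
            return name
    return None


def phase1_features(url: str, body: bytes) -> Dict[str, object]:
    """Collect a few per-request features (illustrative subset)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "fqdn_length": len(host),
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "response_size": len(body),
        "suspicious_file_type": file_type_by_magic(body),
    }


print(phase1_features("http://xc.rottencouchtomatoes.com/hlfvviewforumym.php", b"MZ\x90\x00"))
```

Checking the leading bytes rather than trusting the Content-Type header means a mislabeled executable is still caught.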
Details of analysis(cont.)
- Sandbox analysis
– We use a commercial sandbox product for automated analysis.
This analysis is applied to 0.1 percent of all sessions (takes 5-15 min/website).
Details of analysis(cont.)
- Phase 2 analysis
– Analyzes the results of phase 1 analysis and the report from the sandbox product.
– Classifies the threat and identifies the name of the exploit kit with our in-house pattern-matching signatures.
– The false positive rate of the pattern matching is fairly low because this analysis is applied only to sessions flagged by phase 1 analysis.
This analysis is applied to 0.1 percent of all sessions (takes several sec/website).
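Pattern-matching classification of a flagged session might look like the sketch below. The in-house signatures are not shown in the slides, so the `Signature` structure, the `example-kit` name, its URL regex and its required content types are all invented for illustration.

```python
import re
from typing import FrozenSet, List, NamedTuple, Optional, Tuple


class Signature(NamedTuple):
    name: str
    url_pattern: re.Pattern
    required_content_types: FrozenSet[str]


# Hypothetical signature: a long opaque query value on index.php, in a
# session that also delivered a Flash object and a Windows executable.
EXAMPLE_SIGNATURES = [
    Signature(
        name="example-kit",
        url_pattern=re.compile(r"/index\.php\?\w+=[\w-]{100,}"),
        required_content_types=frozenset(
            {"application/x-shockwave-flash", "application/x-msdownload"}
        ),
    ),
]


def classify(
    session: List[Tuple[str, str]], signatures: List[Signature]
) -> Optional[str]:
    """session is a list of (url, content_type) pairs from one crawl."""
    seen_types = {content_type for _, content_type in session}
    for sig in signatures:
        if not sig.required_content_types <= seen_types:
            continue  # session lacks a content type the signature needs
        if any(sig.url_pattern.search(url) for url, _ in session):
            return sig.name
    return None


token = "A" * 120  # stand-in for an opaque exploit-kit query value
session = [
    ("http://kit.example/index.php?k=" + token, "application/x-shockwave-flash"),
    ("http://kit.example/index.php?k=" + token, "application/x-msdownload"),
]
print(classify(session, EXAMPLE_SIGNATURES))
```

Because only sessions already flagged in phase 1 reach this step, fairly loose patterns can be used without producing many false positives, which matches the reasoning on the slide.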
Websites to crawl
- We’re crawling about 400,000 websites/day.
– .jp domain websites from the Alexa top 1 million (automatic collection)
– Public offices, local governments, listed companies, media companies… (manual collection)
- Googling for the lists, then parsing them… very tiring.
– Hot websites from search engine keywords (automatic collection)
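The automatic collection of .jp sites from a top-sites list can be sketched as a simple filter. The example assumes the list is a CSV with `rank,domain` rows; the function name and the sample data are hypothetical.

```python
import csv
import io
from typing import List


def jp_domains(csv_text: str) -> List[str]:
    """Return the domains ending in .jp from 'rank,domain' CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return [
        row[1].strip().lower()
        for row in reader
        if len(row) >= 2 and row[1].strip().lower().endswith(".jp")
    ]


sample = "1,example.com\n2,example.co.jp\n3,example.jp\n4,jp.example.net\n"
print(jp_domains(sample))
```

Matching on the suffix rather than on a substring avoids picking up names such as `jp.example.net`.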
Recent observations
Encounter rate of Exploit kit threat in Japan
[Chart: encounter rate (%) by date, 1-Jul-16 to 1-Sep-16, for Sundown, KaiXin, Rig and Neutrino; y-axis ticks 0.001 to 0.009.]
Recent activities of Rig exploit kit
[Chart: number of defaced websites redirecting to Rig EK, by date, 29-Sep-16 to 14-Oct-16; y-axis ticks 20 to 100.]
Rig Exploit kit
- The Rig EK epidemic is a worldwide trend.
- Heavily obfuscated landing page.
- Exploits Internet Explorer, Adobe Flash and Silverlight.
- We observed Locky, Cerber and Ursnif as exploit payloads.
Case: a certain blog was defaced and redirected visitors to Rig EK
http://blogs.XXXXX.com (Mar-2016)
- 1. GET http://blogs.XXXXX.com/ HTTP/1.1 200 0 25833 text/html
- 2. GET http://blogs.XXXXX.com/wp-includes/js/wp-emoji-release.min.js HTTP/1.1 200 0 6519 application/javascript
- 3. GET http://xc.rottencouchtomatoes.com/hlfvviewforumym.php HTTP/1.1 200 0 901 text/javascript
- 4. GET http://ef.scber.com/?wXqBcrWVLRbJCII=l3SKfPrfJxzFGMSUb-nJDa9BMEXCRQLPh4SGhKrXCJ-ofSih17OIFxzsmTu2KV_OpqxveN0SZFSOzQfZPVQlyZAdChoB_Oqki0vHjUnH1cmQ9laHYghP7ZWSELQy2AnyyuAUI5kvxhPU6WJVyO1MAwlB4AwSzqrJBKqE HTTP/1.1 200 0 5254 text/html
- 5. GET http://ef.scber.com/index.php?wXqBcrWVLRbJCII=l3SMfPrfJxzFGMSUb-nJDa9BMEXCRQLPh4SGhKrXCJ-ofSih17OIFxzsmTu2KV_OpqxveN0SZFSOzQfZPVQlyZAdChoB_Oqki0vHjUnH1cmQ9laHYghP7ZWSELQy2AnyyuAUI5kvxhPU6WJVyO1MAwlB4AwSzqrJBKqKp0N6RgBnEB_CbJQlqw-BF3H6PXl5gv2pHn4oieWX_P93mpMmmA HTTP/1.1 200 0 14779 application/x-shockwave-flash
- 6. GET http://ef.scber.com/index.php?wXqBcrWVLRbJCII=l3SMfPrfJxzFGMSUb-nJDa9BMEXCRQLPh4SGhKrXCJ-ofSih17OIFxzsmTu2KV_OpqxveN0SZFSOzQfZPVQlyZAdChoB_Oqki0vHjUnH1cmQ9laHYghP7ZWSELQy2AnyyuAUI5kvxhPU6WJVyO1MAwlB4AwSzqrJBKqKp0N6RgBnEB_CbJQlqw-KAWf6PXl5gv2pHn4oieWX_PR3lJImmA HTTP/1.1 200 0 13938 application/x-silverlight-app
- 7. GET http://ef.scber.com/index.php?wXqBcrWVLRbJCII=l3SMfPrfJxzFGMSUb-nJDa9BMEXCRQLPh4SGhKrXCJ-ofSih17OIFxzsmTu2KV_OpqxveN0SZFSOzQfZPVQlyZAdChoB_Oqki0vHjUnH1cmQ9laHYghP7ZWSELQy2AnyyuAUI5kvxhPU6WJVyO1MAwlB4AwSzqrJBKqKp0N6RgBnEB_CbJQlqw-fECT6PXl5gv2pHn4oieWX_PJwnJAmmA&dfsdf=11010 HTTP/1.1 200 0 323584 application/x-msdownload
Requests 1-2: compromised webpage. Request 3: redirector. Requests 4-7: infector.
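The "domain transitions" feature from phase 1 can be illustrated on this request log. The sketch parses lines in the layout shown in the log and lists the hosts in the order the session moved between them. The slides do not label the log columns, so treating the number before the content type as the response size (and ignoring the one before it) is an assumption, as are all names in the code. Request 4 is included with its query string dropped for brevity.

```python
import re
from typing import List, NamedTuple
from urllib.parse import urlsplit

# "<n>. GET <url> HTTP/1.1 <status> <unlabeled> <size?> <content-type>"
LOG_LINE = re.compile(r"GET (\S+) HTTP/\d\.\d (\d{3}) \d+ (\d+) (\S+)")


class Request(NamedTuple):
    url: str
    status: int
    size: int  # assumed meaning of the fourth column
    content_type: str


def parse_log(lines: List[str]) -> List[Request]:
    requests = []
    for line in lines:
        match = LOG_LINE.search(line)
        if match:
            url, status, size, content_type = match.groups()
            requests.append(Request(url, int(status), int(size), content_type))
    return requests


def domain_transitions(requests: List[Request]) -> List[str]:
    """Hosts in visit order, collapsing consecutive repeats."""
    hosts: List[str] = []
    for request in requests:
        host = urlsplit(request.url).netloc
        if not hosts or hosts[-1] != host:
            hosts.append(host)
    return hosts


log = [
    "1. GET http://blogs.XXXXX.com/ HTTP/1.1 200 0 25833 text/html",
    "2. GET http://blogs.XXXXX.com/wp-includes/js/wp-emoji-release.min.js HTTP/1.1 200 0 6519 application/javascript",
    "3. GET http://xc.rottencouchtomatoes.com/hlfvviewforumym.php HTTP/1.1 200 0 901 text/javascript",
    "4. GET http://ef.scber.com/ HTTP/1.1 200 0 5254 text/html",  # query string dropped
]
print(" -> ".join(domain_transitions(parse_log(log))))
```

A session that crosses two unrelated domains within a few requests, as this one does, is the kind of pattern the feature is meant to surface.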
The compromised webpage pulls in the redirector (request 3) with: SCRIPT src = "http://xc.rottencouchtomatoes.com/hlfvviewforumym.php"
Request 4: landing page. Request 5: Flash exploit. Request 6: Silverlight exploit. Request 7: payload (malware).
[Screenshots: landing page; Flash exploit; Silverlight exploit; Ursnif as encoded payload.]