Hunting malware using its fingerprints - Piotr Białczak


slide-1
SLIDE 1

Hunting malware using its fingerprints

Piotr Białczak

slide-2
SLIDE 2

About me

Piotr Białczak

Senior Researcher
CERT Polska / NASK / Warsaw University of Technology
piotr.bialczak@cert.pl
Twitter: @bialczakp

slide-3
SLIDE 3

Malware hunting

A cat-and-mouse game?

slide-4
SLIDE 4

Hunting malware

▷ Identification based on network infrastructure is not as effective as we would wish
▷ Constant change of IPs and domains
▷ Dedicated mechanisms for evasion (DGA, Fast Flux)
▷ Problem with unknown threats

slide-5
SLIDE 5

What if we could identify something constant and distinguishable?

slide-6
SLIDE 6

Malware fingerprinting

▷ Provide a mechanism to identify threats
▷ Pinpoint elements that rarely change
▷ Make fingerprints unique
▷ Give good generalization
▷ Should be easy to share

slide-7
SLIDE 7

Different notions of malware fingerprinting

▷ Identification of malware families
  ○ Directly providing a family name
  ○ More like a multiclass malware detector
▷ Generic feature extractor
  ○ Providing a representation of some features
  ○ The representation can be labeled with a malware family name
▷ I will focus on the second group

slide-8
SLIDE 8

Network traffic fingerprints

▷ Mainly extraction of some protocol fields
▷ Choosing fields that are hard to change
▷ Presentation in two forms: full and concise

Source: https://commons.wikimedia.org/wiki/File:Ipv4_header.svg

slide-9
SLIDE 9

Examples of usage scenarios

▷ Grouping activities
▷ Identification of malware families
▷ Identification of different operations within a single family
▷ Correlation of similar behavior between families
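The grouping and correlation scenarios can be sketched in a few lines of Python. This is only an illustration: the fingerprint strings and family labels below are made-up placeholders, and the function names are hypothetical, not part of any tool mentioned in this talk.

```python
from collections import defaultdict


def group_by_fingerprint(observations):
    """Map each fingerprint to the set of labels it was observed with.

    `observations` is an iterable of (fingerprint, label) pairs.
    """
    groups = defaultdict(set)
    for fingerprint, label in observations:
        groups[fingerprint].add(label)
    return dict(groups)


def shared_fingerprints(groups):
    """Keep only fingerprints seen with more than one label.

    Such entries point either at similar behavior between families
    or at a collision that needs a closer look.
    """
    return {fp: labels for fp, labels in groups.items() if len(labels) > 1}


# Toy input, invented for illustration only.
observations = [
    ("fp-a", "family-1"),
    ("fp-a", "family-2"),
    ("fp-b", "family-1"),
]
groups = group_by_fingerprint(observations)
```

Calling `shared_fingerprints(groups)` on the toy input returns only the `fp-a` entry, because it is the only fingerprint observed with two different labels.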

slide-10
SLIDE 10

Popular Tools

slide-11
SLIDE 11

JA3

▷ Fingerprinting TLS clients/servers
▷ Decimal values of chosen fields in Client Hello messages (then MD5 hash)
▷ Extensively supported
▷ Available: https://github.com/salesforce/ja3

Source: https://engineering.salesforce.com/open-sourcing-ja3-92c9e53c3c41
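The construction described above (decimal field values, then an MD5 hash) can be sketched as follows. This is a minimal illustration, not the reference implementation: it assumes the Client Hello has already been parsed into lists of integers, and the function names are hypothetical. The field order (version, ciphers, extensions, elliptic curves, point formats) and the skipping of GREASE values follow the public JA3 description.

```python
import hashlib


def is_grease(value: int) -> bool:
    """True for GREASE placeholders (0x0A0A, 0x1A1A, ..., 0xFAFA)."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Full (human-readable) form: fields joined by ',', values by '-'."""
    def join(values):
        return "-".join(str(v) for v in values if not is_grease(v))

    return ",".join([
        str(version),
        join(ciphers),
        join(extensions),
        join(curves),
        join(point_formats),
    ])


def ja3_hash(full: str) -> str:
    """Concise form: MD5 hex digest of the full string."""
    return hashlib.md5(full.encode("ascii")).hexdigest()


# Example values chosen for illustration; 0x0A0A is a GREASE value and is dropped.
full = ja3_string(769, [0x0A0A, 47, 53], [0, 10, 11], [23, 24], [0])
concise = ja3_hash(full)
```

Here `full` is the string `769,47-53,0-10-11,23-24,0` and `concise` is its 32-character MD5 digest, which matches the "full and concise" presentation forms mentioned earlier.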

slide-12
SLIDE 12

HASSH

▷ Fingerprinting SSH clients/servers
▷ Algorithm name lists from chosen fields in SSH_MSG_KEXINIT messages (then MD5 hash)
▷ Available here: https://github.com/salesforce/hassh
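A HASSH-style client fingerprint can be sketched in the same spirit. The sketch assumes the SSH_MSG_KEXINIT message has already been parsed into lists of algorithm names; the function names are hypothetical. The four fields and the `;` separator follow the public HASSH description (key exchange, encryption, MAC, and compression algorithms offered by the client).

```python
import hashlib


def hassh_string(kex, encryption, mac, compression):
    """Full form: four comma-separated name lists joined by ';'."""
    return ";".join(",".join(names) for names in (kex, encryption, mac, compression))


def hassh_hash(full: str) -> str:
    """Concise form: MD5 hex digest of the full string."""
    return hashlib.md5(full.encode("ascii")).hexdigest()


# Algorithm lists chosen for illustration only.
full = hassh_string(
    ["curve25519-sha256", "diffie-hellman-group14-sha1"],
    ["aes128-ctr"],
    ["hmac-sha2-256"],
    ["none"],
)
concise = hassh_hash(full)
```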

slide-13
SLIDE 13

FATT

▷ Fingerprint All The Things
▷ Supports: JA3, HASSH
▷ Also: RDFP, gQUIC, HTTP header fingerprint
▷ Live network and pcap files via pyshark
▷ Available here: https://github.com/0x4D31/fatt/

slide-14
SLIDE 14

Other great tools

▷ HTTP - https://lcamtuf.coredump.cx/p0f3/
▷ TLS, DTLS, SSH, HTTP, TCP - https://github.com/cisco/mercury
▷ Identification framework - https://github.com/rapid7/recog

slide-15
SLIDE 15

SMTP fingerprinting

▷ More complex approach
▷ Fingerprinting the SMTP implementation (dialect)
  ○ Exchanged messages (also their case)
  ○ SMTP extensions
  ○ IMF fields
▷ Detection of spambots
▷ Stringhini et al., "B@bel: Leveraging Email Delivery for Spam Mitigation"
▷ Also SISSDEN.eu
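To make the notion of a dialect more concrete, here is a toy sketch that turns the client side of an SMTP session into a signature that preserves command order, letter case, and spacing after the colon. It is not the B@bel or SISSDEN method; the signature layout and function names are assumptions made purely for illustration, and the input is assumed to contain command lines only (no message body).

```python
def command_token(line: str) -> str:
    """Reduce one client command line to its case-preserving keyword."""
    line = line.rstrip("\r\n")
    upper = line.upper()
    if upper.startswith(("MAIL FROM", "RCPT TO")) and ":" in line:
        head, rest = line.split(":", 1)
        # Mark whether the client puts a space after the colon.
        return head + ":" + ("_" if rest.startswith(" ") else "")
    # Other commands: keep only the verb, exactly as the client typed it.
    return line.split(" ", 1)[0]


def smtp_dialect(client_lines) -> str:
    """Join the per-line tokens into a single dialect signature."""
    return "|".join(command_token(line) for line in client_lines)


# Two hypothetical clients sending the same commands in different styles.
strict_client = [
    "EHLO mail.example.org\r\n",
    "MAIL FROM:<a@example.org>\r\n",
    "RCPT TO:<b@example.org>\r\n",
    "DATA\r\n",
    "QUIT\r\n",
]
sloppy_client = [
    "helo mail.example.org\r\n",
    "mail from: <a@example.org>\r\n",
    "rcpt to: <b@example.org>\r\n",
    "data\r\n",
    "quit\r\n",
]
```

The two clients produce different signatures even though they deliver the same message, which is the property that makes dialects useful for spotting spambots.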

slide-16
SLIDE 16

SMTP dialects

Source: https://sissden.eu/blog/analysis-of-smtp-dialects

Unfortunately, there is no open-source tool

slide-17
SLIDE 17

DEMO

slide-18
SLIDE 18

hfinger

slide-19
SLIDE 19

Problems of current HTTP fingerprinting

▷ Collisions between families
▷ Collisions with benign software
▷ Limited analysis of URL and payload

slide-20
SLIDE 20

hfinger

Payload

  • length
  • entropy
  • presence of non-ASCII characters

URL

  • URL length
  • number of directories, variables
  • file extension
  • URL parts lengths

Header structure

  • request method
  • protocol version
  • header order
  • popular headers' values
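The payload and URL features listed above can be computed with the Python standard library alone. The sketch below is not hfinger's actual code: the function names, the dictionary keys, and the rule for counting directories (every path segment except the last one) are assumptions made for illustration.

```python
import math
import posixpath
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def payload_features(payload: bytes) -> dict:
    return {
        "length": len(payload),
        "entropy": shannon_entropy(payload),
        "non_ascii": any(b > 0x7F for b in payload),
    }


def url_features(url: str) -> dict:
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    return {
        "url_length": len(url),
        "path_length": len(parts.path),
        "query_length": len(parts.query),
        # Assumption: the last path segment is a file name, the rest are directories.
        "directories": max(len(segments) - 1, 0),
        "variables": len(parse_qsl(parts.query, keep_blank_values=True)),
        "extension": posixpath.splitext(parts.path)[1].lstrip("."),
    }


url_info = url_features("/dir1/dir2?var1=val1")
payload_info = payload_features(b"misc=test")
```

For the URL `/dir1/dir2?var1=val1` this yields a length of 20, one directory, one variable, and an empty extension; other counting conventions are equally plausible, so the numbers should not be read as hfinger output.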

slide-21
SLIDE 21

Fingerprint creation

POST /dir1/dir2?var1=val1 HTTP/1.1
Host: 127.0.0.1:8000
Accept: */*
User-Agent: My UA
Content-Length: 9
Content-Type: application/x-www-form-urlencoded

misc=test

Direct representation of features - for easier interpretation
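A direct representation of the header-structure features for the request above can be sketched as follows. The `|`-separated layout and the function names are assumptions for illustration and do not reproduce hfinger's real fingerprint format; the point is only that each part of the output maps visibly to one feature of the request.

```python
RAW_REQUEST = (
    b"POST /dir1/dir2?var1=val1 HTTP/1.1\r\n"
    b"Host: 127.0.0.1:8000\r\n"
    b"Accept: */*\r\n"
    b"User-Agent: My UA\r\n"
    b"Content-Length: 9\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"\r\n"
    b"misc=test"
)


def parse_request(raw: bytes):
    """Split a raw HTTP/1.x request into its basic parts."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return method, target, version, headers, body


def direct_representation(raw: bytes) -> str:
    """Hypothetical layout: method|version|header order|payload length."""
    method, _target, version, headers, body = parse_request(raw)
    header_order = ",".join(name.lower() for name, _ in headers)
    return "|".join([method, version, header_order, str(len(body))])


representation = direct_representation(RAW_REQUEST)
```

For the sample request, `representation` is `POST|HTTP/1.1|host,accept,user-agent,content-length,content-type|9`, which can be read back feature by feature without a lookup table.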

slide-22
SLIDE 22

Encoding

Analyzed popular headers

  • Connection
  • Accept-Encoding
  • Content-Encoding
  • Cache-Control
  • TE
  • Accept-Charset
  • Content-Type
  • Accept
  • Accept-Language
  • User-Agent

"application/javascript": "ap-ja",
"application/json": "ap-js",
"application/octet-stream": "ap-os",
"application/pdf": "ap-pd",
"application/x-octet-stream": "ap-x-o-s",
"application/x-www-form-urlencoded": "ap-x-w-f-u",
"audio/mpeg": "au-mp",
"binary": "bi",
"image/gif": "im-gi",
"multipart/form-data": "mu-f-d",
"octet/binary": "oc-bi",
"octet-stream": "oc-st",
"text/csv": "te-cs",
"text/html": "te-ht",
"text/plain": "te-pl",
"text/xml": "te-xm"
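Applying such a mapping to a Content-Type header value could look like the sketch below. The codes are copied from the table above (a subset only); the function name, the stripping of parameters such as `charset`, and the short-hash fallback for unknown values are assumptions, not a description of how hfinger handles these cases.

```python
import hashlib

# Subset of the mapping shown on the slide.
CONTENT_TYPE_CODES = {
    "application/json": "ap-js",
    "application/octet-stream": "ap-os",
    "application/x-www-form-urlencoded": "ap-x-w-f-u",
    "multipart/form-data": "mu-f-d",
    "text/html": "te-ht",
    "text/plain": "te-pl",
}


def encode_content_type(value: str) -> str:
    """Return the short code for a Content-Type header value."""
    # Assumption: parameters after ';' are ignored and matching is case-insensitive.
    media_type = value.split(";", 1)[0].strip().lower()
    if media_type in CONTENT_TYPE_CODES:
        return CONTENT_TYPE_CODES[media_type]
    # Hypothetical fallback: a short hash keeps unknown values distinguishable.
    return hashlib.sha1(media_type.encode("utf-8")).hexdigest()[:8]
```

With this sketch, `encode_content_type("application/x-www-form-urlencoded")` returns `ap-x-w-f-u`, and a value with parameters such as `text/html; charset=utf-8` returns `te-ht`.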

slide-23
SLIDE 23

Different report modes

▷ Configurable report modes, depending on the desired level of information
▷ Default
  ○ Tries to optimize the trade-off between uniqueness and generalization
▷ Informative
  ○ More features presented
▷ All features
  ○ Full information, but also more fingerprints

slide-24
SLIDE 24

More info

▷ https://github.com/CERT-Polska/hfinger
▷ Working prototype stage - there will be some changes
▷ Written in Python, uses Tshark for pcap parsing and HTTP reassembly
▷ Comments, issues, and PRs are welcome

slide-25
SLIDE 25

hfinger demo

slide-26
SLIDE 26

Conclusion

▷ Fingerprints are helpful in hunting malware
▷ Until malware developers start to evade fingerprinters
▷ Problem of generalization vs collisions
▷ Some popular protocols are already covered, but not all of them
▷ Give hfinger a try :-)

slide-28
SLIDE 28

Hunting malware using its fingerprints

Piotr Białczak

Presentation template - slidescarnival.com