Hunting malware using its fingerprints
Piotr Białczak
About me
Piotr Białczak
Senior Researcher
CERT Polska/NASK / Warsaw University of Technology
piotr.bialczak@cert.pl
Twitter: @bialczakp
Malware hunting
A cat-and-mouse game?
Hunting malware
▷ Identification based on network infrastructure is not as effective as we would wish
▷ Constant change of IPs and domains
▷ Dedicated mechanisms for evasion (DGA, Fast Flux)
▷ Problem with unknown threats
What if we could identify something constant and distinguishable?
Malware fingerprinting
▷ Provide a mechanism to identify threats
▷ Pinpoint some rarely changed elements
▷ Make fingerprints unique
▷ Give good generalization
▷ Should be easy to share
Different notions of malware fingerprinting
▷ Identification of malware families
  ○ Directly providing family name
  ○ More like a multiclass malware detector
▷ Generic feature extractor
  ○ Providing representation of some features
  ○ It can be labeled with malware family name
▷ I will focus on the second group
Network traffic fingerprints
▷ Mainly extraction of some protocol fields
▷ Choosing fields that are hard to change
▷ Presentation in two forms: full and concise
Source: https://commons.wikimedia.org/wiki/File:Ipv4_header.svg
Examples of usage scenarios
▷ Grouping activities
▷ Identification of malware families
▷ Identification of different operations within a single family
▷ Correlation of similar behavior between families
Popular Tools
JA3
▷ Fingerprinting TLS clients/servers
▷ Decimal values of chosen fields in Client Hello messages (then MD5 hash)
▷ Extensively supported
▷ Available: https://github.com/salesforce/ja3
Source: https://engineering.salesforce.com/open-sourcing-ja3-92c9e53c3c41
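The JA3 construction described above can be sketched in a few lines. This is a minimal illustration, not the reference implementation: it assumes the Client Hello has already been parsed into lists of decimal values, and the function name `ja3_digest` and the sample values are hypothetical. The field order (TLS version, ciphers, extensions, elliptic curves, point formats) and the `,` / `-` separators follow the published JA3 description; GREASE filtering, which the reference tool performs, is left out here.

```python
import hashlib


def ja3_digest(version, ciphers, extensions, curves, point_formats):
    """Return (ja3_string, md5_hex) for already-parsed Client Hello values.

    Hypothetical helper: every argument is an int or a list of ints taken
    from the Client Hello. GREASE values are NOT filtered in this sketch.
    """
    fields = [
        str(version),
        "-".join(str(v) for v in ciphers),
        "-".join(str(v) for v in extensions),
        "-".join(str(v) for v in curves),
        "-".join(str(v) for v in point_formats),
    ]
    # Fields are comma-separated; values inside a field are dash-separated.
    ja3_string = ",".join(fields)
    return ja3_string, hashlib.md5(ja3_string.encode("ascii")).hexdigest()


# Illustrative values only, not taken from a real capture.
ja3_string, ja3_hash = ja3_digest(769, [47, 53, 5, 10], [0, 10, 11], [23, 24, 25], [0])
print(ja3_string)
print(ja3_hash)
```

The full string is the "full" form of the fingerprint and the MD5 digest is the "concise" form that is easy to share.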
HASSH
▷ Fingerprinting SSH clients/servers
▷ Algorithm name lists from chosen fields in SSH_MSG_KEXINIT messages (then MD5 hash)
▷ Available here: https://github.com/salesforce/hassh
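A similar sketch for HASSH, under the assumption that the SSH_MSG_KEXINIT message has already been parsed. As published, the client-side HASSH joins four algorithm name lists (key exchange, encryption, MAC, compression, all in the client-to-server direction) with `;` and hashes the result with MD5. The function name `hassh_digest` and the sample algorithm lists are hypothetical.

```python
import hashlib


def hassh_digest(kex, encryption, mac, compression):
    """Return (hassh_string, md5_hex) for already-parsed KEXINIT name lists.

    Hypothetical helper: each argument is a list of algorithm names in the
    order the client advertised them.
    """
    parts = [",".join(kex), ",".join(encryption), ",".join(mac), ",".join(compression)]
    # Lists are semicolon-separated; names inside a list stay comma-separated.
    hassh_string = ";".join(parts)
    return hassh_string, hashlib.md5(hassh_string.encode("ascii")).hexdigest()


# Illustrative values only, not taken from a real client.
hassh_string, hassh_hash = hassh_digest(
    ["curve25519-sha256", "diffie-hellman-group14-sha1"],
    ["aes128-ctr"],
    ["hmac-sha2-256"],
    ["none"],
)
print(hassh_string)
print(hassh_hash)
```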
FATT
▷ Fingerprint all the things
▷ Supports: JA3, HASSH
▷ Also: RDFP, gQUIC, HTTP header fingerprint
▷ Live network and pcap files via pyshark
▷ Available here: https://github.com/0x4D31/fatt/
Other great tools
▷ HTTP - https://lcamtuf.coredump.cx/p0f3/
▷ TLS, DTLS, SSH, HTTP, TCP - https://github.com/cisco/mercury
▷ Identification framework - https://github.com/rapid7/recog
SMTP fingerprinting
▷ More complex approach
▷ Fingerprinting SMTP implementation (dialect)
  ○ Exchanged messages (also their case)
  ○ SMTP extensions
  ○ IMF fields
▷ Detection of spambots
▷ Stringhini et al., B@bel: Leveraging Email Delivery for Spam Mitigation
▷ Also SISSDEN.eu
SMTP dialects
Source: https://sissden.eu/blog/analysis-of-smtp-dialects
Unfortunately no open-sourced tool
DEMO
hfinger
Problems of current HTTP fingerprinting
▷ Collisions between families
▷ Collisions with benign software
▷ Limited analysis of URL and payload
hfinger
Payload
- length
- entropy
- presence of non-ASCII characters
URL
- URL length
- number of directories, variables
- file extension
- URL parts lengths
Header structure
- request method
- protocol version
- header order
- popular headers' values
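The payload and URL features listed above can be computed with the standard library alone. The sketch below is only an illustration of the feature types and does not reproduce how hfinger calculates or encodes them. The function names, the dictionary keys, and the choice to count directories as the number of `/` characters in the path are all assumptions made for this example.

```python
import math
import posixpath
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())


def payload_features(payload: bytes) -> dict:
    """Length, entropy and non-ASCII presence of a request body."""
    return {
        "length": len(payload),
        "entropy": shannon_entropy(payload),
        "has_non_ascii": any(b > 0x7F for b in payload),
    }


def url_features(url: str) -> dict:
    """Simple structural features of a request URL (assumed definitions)."""
    parts = urlsplit(url)
    return {
        "url_length": len(url),
        # Assumption: a "directory" is counted per "/" in the path.
        "directories": parts.path.count("/"),
        "variables": len(parse_qsl(parts.query, keep_blank_values=True)),
        "extension": posixpath.splitext(parts.path)[1],
        "path_length": len(parts.path),
        "query_length": len(parts.query),
    }


print(payload_features(b"misc=test"))
print(url_features("/dir1/dir2?var1=val1"))
```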
Fingerprint creation
POST /dir1/dir2?var1=val1 HTTP/1.1
Host: 127.0.0.1:8000
Accept: */*
User-Agent: My UA
Content-Length: 9
Content-Type: application/x-www-form-urlencoded

misc=test
Direct representation of features - for easier interpretation
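To make the header-structure features concrete, the sketch below splits the example request above into the request method, protocol version, header order and body. It is a simplified, hypothetical parser written for this illustration: hfinger itself relies on Tshark for parsing and reassembly, and the function name `header_structure` and the returned keys are assumptions.

```python
def header_structure(raw_request: str) -> dict:
    """Split a raw HTTP/1.x request into structural features.

    Hypothetical helper: assumes CRLF line endings and a well-formed request.
    """
    head, _, body = raw_request.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    # Keep header names in the order sent; that order is itself a feature.
    header_order = [line.partition(":")[0].strip().lower() for line in lines[1:] if line]
    return {
        "method": method,
        "target": target,
        "version": version,
        "header_order": header_order,
        "body": body,
    }


REQUEST = (
    "POST /dir1/dir2?var1=val1 HTTP/1.1\r\n"
    "Host: 127.0.0.1:8000\r\n"
    "Accept: */*\r\n"
    "User-Agent: My UA\r\n"
    "Content-Length: 9\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "\r\n"
    "misc=test"
)

print(header_structure(REQUEST))
```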
Encoding
Analyzed popular headers
- Connection
- Accept-Encoding
- Content-Encoding
- Cache-Control
- TE
- Accept-Charset
- Content-Type
- Accept
- Accept-Language
- User-Agent
"application/javascript": "ap-ja",
"application/json": "ap-js",
"application/octet-stream": "ap-os",
"application/pdf": "ap-pd",
"application/x-octet-stream": "ap-x-o-s",
"application/x-www-form-urlencoded": "ap-x-w-f-u",
"audio/mpeg": "au-mp",
"binary": "bi",
"image/gif": "im-gi",
"multipart/form-data": "mu-f-d",
"octet/binary": "oc-bi",
"octet-stream": "oc-st",
"text/csv": "te-cs",
"text/html": "te-ht",
"text/plain": "te-pl",
"text/xml": "te-xm"
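The abbreviation table above looks like a lookup from media type values to short codes. A minimal sketch of applying such a lookup to a header value follows. The table entries are copied from the slide; the function name `encode_content_type`, the stripping of parameters such as `charset`, and the fallback of returning an unknown value unchanged are assumptions, since the slide does not say how hfinger treats those cases.

```python
# Entries copied from the slide; everything else in this block is illustrative.
CONTENT_TYPE_CODES = {
    "application/javascript": "ap-ja",
    "application/json": "ap-js",
    "application/octet-stream": "ap-os",
    "application/pdf": "ap-pd",
    "application/x-octet-stream": "ap-x-o-s",
    "application/x-www-form-urlencoded": "ap-x-w-f-u",
    "audio/mpeg": "au-mp",
    "binary": "bi",
    "image/gif": "im-gi",
    "multipart/form-data": "mu-f-d",
    "octet/binary": "oc-bi",
    "octet-stream": "oc-st",
    "text/csv": "te-cs",
    "text/html": "te-ht",
    "text/plain": "te-pl",
    "text/xml": "te-xm",
}


def encode_content_type(value: str) -> str:
    """Map a Content-Type header value to its short code.

    Assumptions: parameters after ';' are ignored, matching is
    case-insensitive, and unknown media types are returned unchanged.
    """
    media_type = value.split(";", 1)[0].strip().lower()
    return CONTENT_TYPE_CODES.get(media_type, media_type)


print(encode_content_type("application/x-www-form-urlencoded"))
print(encode_content_type("text/html; charset=utf-8"))
```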
Different report modes
▷ Configurable report modes, depending on the desired information level
▷ Default
  ○ Tries to optimize uniqueness vs generalization
▷ Informative
  ○ More features presented
▷ All features
  ○ Full information, but also more fingerprints
More info
▷ https://github.com/CERT-Polska/hfinger
▷ Working prototype stage - there will be some changes
▷ Written in Python, uses Tshark for pcap parsing and HTTP reassembly
▷ Comments, issues, PRs are welcome
hfinger demo
Conclusion
▷ Fingerprints are helpful in hunting malware
▷ Until malware developers start to evade fingerprinters
▷ Problem of generalization vs collisions
▷ Some popular protocols are already covered, but not all of them
▷ Give hfinger a try :-)
Hunting malware using its fingerprints
Piotr Białczak