Understanding Linux Malware. Emanuele Cozzi (1), Mariano Graziano (2), Yanick Fratantonio (1), Davide Balzarotti (1). (1) EURECOM, (2) Cisco Systems, Inc. IEEE Symposium on Security & Privacy, May 2018.


slide-1
SLIDE 1

Understanding Linux Malware

Emanuele Cozzi (1), Mariano Graziano (2), Yanick Fratantonio (1), Davide Balzarotti (1)

(1) EURECOM  (2) Cisco Systems, Inc.

IEEE Symposium on Security & Privacy, May 2018

slide-2
SLIDE 2

Malware and operating systems

slide-7
SLIDE 7

Linux malware on the rise

Mirai, Erebus, OutlawCountry

slide-11
SLIDE 11

Objectives

  • Develop a dynamic analysis sandbox for Linux binaries (and IoT devices)

◮ Previous studies only looked at the network behavior [1][2]

  • Identify challenges and limitations of porting traditional techniques to the new environment

  • Understand differences in malware characteristics (packing, obfuscation, VM detection, privilege escalation, persistence...) with respect to Windows malware

[1] Antonakakis et al., "Understanding the Mirai botnet," USENIX Security Symposium 2017.
[2] Yin Minn Pa et al., "IoTPOT: analysing the rise of IoT compromises," USENIX Workshop on Offensive Technologies 2015.

slide-12
SLIDE 12

Target devices

slide-22
SLIDE 22

Diversity

CPU: Intel, ARM, MIPS, Motorola, Sparc
OS: Linux, BSD, Android
Libraries: glibc, uclibc, libpcap, libopencl
Statically-linked ELF unportable
Unknown device

slide-23
SLIDE 23

Analysis infrastructure

  • Data collection
  • File & metadata analysis: file recognition, AVClass, ELF anomaly
  • Static analysis: code analysis, packing identification
  • Dynamic analysis: packer analysis, emulation, trace analysis, sandbox preparation

slide-26
SLIDE 26

Data collection

  • From November 2016 to November 2017
  • 200 candidate samples per day
  • Dataset of 10,548 Linux malware samples

slide-27
SLIDE 27

File & metadata analysis


slide-28
SLIDE 28

Dataset

Architecture            Samples   Percentage
X86-64                     3018       28.61%
MIPS I                     2120       20.10%
PowerPC                    1569       14.87%
Motorola 68000             1216       11.53%
Sparc                      1170       11.09%
Intel 80386                 720        6.83%
ARM 32-bit                  555        5.26%
Hitachi SH                  130        1.23%
AArch64 (ARM 64-bit)         47        0.45%
Others                        3        0.03%

Distribution of the 10,548 downloaded samples across architectures
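
The architecture column above corresponds to the e_machine field of the ELF header. As a minimal sketch (not the authors' tooling), the field can be read with a few lines of Python. The function name elf_architecture and the label table are assumptions made for illustration; the numeric e_machine codes are the standard ELF constants.

```python
import struct

# Standard e_machine codes, labeled with the names used in the table.
EM_NAMES = {
    2: "Sparc",
    3: "Intel 80386",
    4: "Motorola 68000",
    8: "MIPS I",
    20: "PowerPC",
    40: "ARM 32-bit",
    42: "Hitachi SH",
    62: "X86-64",
    183: "AArch64 (ARM 64-bit)",
}


def elf_architecture(header: bytes) -> str:
    """Return a label for the e_machine field of an ELF header."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # e_ident[5] (EI_DATA) gives the byte order: 1 = little, 2 = big endian.
    endian = {1: "<", 2: ">"}.get(header[5])
    if endian is None:
        raise ValueError("invalid EI_DATA value")
    # e_machine is a 16-bit field at offset 18, in both ELF32 and ELF64.
    (e_machine,) = struct.unpack_from(endian + "H", header, 18)
    return EM_NAMES.get(e_machine, f"other ({e_machine})")
```

Only the first 20 bytes of a file are needed, so a sample never has to be loaded or executed to be sorted by architecture.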

slide-34
SLIDE 34

ELF manipulation

ELF file layout: \x7fELF magic, ELF header, program header table, .text, .data, section header table

  • Anomalous ELF

◮ Sections table removed

  • Invalid ELF

◮ Segments table points beyond file
◮ Overlapping header/segment
◮ Sections table points beyond file

  • Problems with common analysis tools

✘ readelf 2.26.1
✘ GDB 7.11.1
✘ pyelftools 0.24
✔ IDA Pro 7
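
Some of the checks listed above can be approximated from the ELF header alone. The sketch below is an illustration under stated assumptions, not the analysis pipeline from the talk: the function name elf_table_anomalies is hypothetical, the overlap check is left out, and a zero section count is treated as "removed" even though very large section counts are also encoded that way.

```python
import struct

# Header fields after the 16-byte e_ident, for ELF32 and ELF64.
_FIELDS = {1: "HHIIIIIHHHHHH", 2: "HHIQQQIHHHHHH"}


def elf_table_anomalies(data: bytes) -> list:
    """Report header tables that are missing or extend beyond the file."""
    if len(data) < 16 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    fields = _FIELDS.get(data[4])            # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = {1: "<", 2: ">"}.get(data[5])   # EI_DATA: 1 = little, 2 = big
    if fields is None or endian is None:
        raise ValueError("invalid EI_CLASS or EI_DATA")
    fmt = endian + fields
    if len(data) < 16 + struct.calcsize(fmt):
        raise ValueError("truncated ELF header")
    (_, _, _, _, e_phoff, e_shoff, _, _, e_phentsize, e_phnum,
     e_shentsize, e_shnum, _) = struct.unpack_from(fmt, data, 16)

    findings = []
    if e_shoff == 0 or e_shnum == 0:
        findings.append("sections table removed")
    elif e_shoff + e_shentsize * e_shnum > len(data):
        findings.append("sections table points beyond file")
    if e_phnum and e_phoff + e_phentsize * e_phnum > len(data):
        findings.append("segments table points beyond file")
    return findings
```

Working directly on the raw header fields keeps the check independent of the parsers that fail on these files.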

slide-35
SLIDE 35

AVClass [3]

Pymadro Miner Ebolachan Golad Lady Connectback Mirai Elfpatch Pomedaj Liora Ddostf Cinarek Ztorg Elknot Shishiga Aidra Chinaz Fysbis Ganiw Scanner Roopre Mrblack Equation Logcleaner Sniff Tsunami Sshbrute Probe Znaich Erebus Xingyi Xaynnalc Gafgyt Flood Coinminer Bassobo Killdisk Eicar Remaiten Bossabot Midav Getshell Drobur Webshell Dcom Cloudatlas Luabot Iroffer Mayday Grip Darkkomet Prochider Ircbot Xhide Portscan Xunpes Diesel Setag Raas Shelma Shellshock Nixgi Wuscan Cleanlog Sshdoor Psybnc Themoon Rekoobe Intfour Pulse Sickabs Hajime Hijacker Mumblehard Darlloz Sotdas Ladvix Pnscan Ropys Lightaidra Moose Vmsplice Ddoser Spyeye

[3] Sebastián et al., "AVClass: A tool for massive malware labeling," International Symposium on Research in Attacks, Intrusions, and Defenses 2016.

slide-36
SLIDE 36

Static analysis


slide-41
SLIDE 41

Packing

[UPX ASCII-art banner: "The Ultimate Packer for eXecutables"]

  • Vanilla UPX and custom variants are the prevalent packers (almost 4% of the dataset)

◮ modified magic bytes
◮ modified strings
◮ junk bytes

  • At least one malware family is using a custom packer
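
One way to see why modified magic bytes and strings matter is a naive marker scan. The sketch below is a simple heuristic, not the packing identification used in the study: it looks for the "UPX!" magic and the informational string that stock UPX embeds, and treats a partial match as a hint of a tampered variant. The function name upx_hint and the returned labels are assumptions.

```python
UPX_MAGIC = b"UPX!"
UPX_INFO = b"This file is packed with the UPX executable packer"


def upx_hint(data: bytes) -> str:
    """Classify a binary by which stock UPX markers it still contains."""
    has_magic = UPX_MAGIC in data
    has_info = UPX_INFO in data
    if has_magic and has_info:
        return "vanilla UPX markers"
    if has_magic or has_info:
        return "partial UPX markers (possibly modified)"
    # A variant that altered both markers lands here, so this result
    # does not prove the file is unpacked.
    return "no UPX markers"
```

A sample that changes both markers defeats the scan entirely, which is why marker matching alone is not sufficient to identify custom variants.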
slide-42
SLIDE 42

Dynamic analysis


slide-43
SLIDE 43

Behaviors

  • Persistence
  • Required privileges
  • Process interaction
  • Process injection
  • Privilege escalation
  • Shell commands
  • Information gathering
  • Anti-debugging
  • Anti-execution
  • Sandbox detection
  • Deception
  • Process enumeration

slide-47
SLIDE 47

Deception

$ ps
PID  CMD
1234 d41d8cd98f00b204e9800998ecf8427e
$ ps
PID  CMD
1234 sh
$ ps
PID  CMD
1234 cron
$ ps
PID  CMD
1234 telnetd
$ ps
PID  CMD
1234 sshd
$ ps
PID  CMD
1234 my-tool
$ ps
PID  CMD
1234 a5ux38y
$ ps
PID  CMD
1234 (empty name)

  • Malicious processes assume new names to trick process listing tools
  • 52% of the samples renamed the process
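
From the defender's side, a renamed process can sometimes be spotted by comparing the name the kernel reports with the file name of the executable behind it. The following is a rough sketch under assumptions: the helper names name_mismatch and check_pid are hypothetical, only the comm name is checked (not argv[0]), and legitimate programs that rename themselves or are started through symlinks will also be flagged.

```python
import os

TASK_COMM_MAX = 15  # the kernel keeps at most 15 visible characters in comm


def name_mismatch(comm: str, exe_path: str) -> bool:
    """Return True when the reported name differs from the executable's file name."""
    suffix = " (deleted)"  # appended by the kernel when the binary was unlinked
    if exe_path.endswith(suffix):
        exe_path = exe_path[: -len(suffix)]
    return comm != os.path.basename(exe_path)[:TASK_COMM_MAX]


def check_pid(pid: int) -> bool:
    """Apply name_mismatch to a live process; needs permission to read its /proc entry."""
    with open(f"/proc/{pid}/comm") as fh:
        comm = fh.read().rstrip("\n")
    exe_path = os.readlink(f"/proc/{pid}/exe")
    return name_mismatch(comm, exe_path)
```

A mismatch is only a hint that deserves a closer look, since the comparison has false positives by design.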
slide-50
SLIDE 50

Evasion

/proc || /sys

  • Detect VMware, VirtualBox, QEMU, KVM but also OpenVZ, Xen or chroot jails
  • Malware may also check its own file name before real execution

if (!sandbox) {
    // do evil
} else {
    print("https://lmgtfy.com/q=how+to+@@@@@@@@@@@@@@@@@@@")
    rm -rf /
}
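
When preparing a sandbox, it helps to know which identifying strings the guest exposes under /sys. The sketch below is an assumption-laden illustration, not a list of the checks found in the samples: it reads two DMI files that Linux exposes on many x86 systems and looks for common hypervisor names. The helper names hypervisor_hints and read_dmi and the marker list are hypothetical.

```python
from pathlib import Path

# Substrings that commonly appear in DMI vendor or product strings of guests.
MARKERS = ("vmware", "virtualbox", "qemu", "kvm", "xen")
DMI_FILES = (
    "/sys/class/dmi/id/sys_vendor",
    "/sys/class/dmi/id/product_name",
)


def hypervisor_hints(texts) -> list:
    """Return the markers found in the given identification strings."""
    blob = " ".join(texts).lower()
    return [marker for marker in MARKERS if marker in blob]


def read_dmi() -> list:
    """Collect the DMI strings that are readable on this machine."""
    found = []
    for path in DMI_FILES:
        try:
            found.append(Path(path).read_text(errors="replace").strip())
        except OSError:
            pass  # file missing (for example on non-x86 boards) or unreadable
    return found
```

Running hypervisor_hints(read_dmi()) inside the analysis guest shows what a sample could learn from these two files alone.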

slide-55
SLIDE 55
  • OS/ABI field in ELF header is not used
  • Malware executed by root or user
  • Process enumeration
  • Unstripped symbols (?)
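
The OS/ABI field mentioned in the first bullet is a single byte (EI_OSABI, index 7 of e_ident). A minimal sketch of reading it follows; the function name elf_osabi and the shortened label table are assumptions, while the numeric codes are the standard ELF constants. The value 0 is also valid for Linux binaries, which is one reason the field says little about the intended target.

```python
# A few standard EI_OSABI codes.
OSABI_NAMES = {
    0: "System V (unspecified)",
    3: "Linux/GNU",
    9: "FreeBSD",
    12: "OpenBSD",
}


def elf_osabi(header: bytes) -> str:
    """Return a label for the EI_OSABI byte of an ELF header."""
    if len(header) < 8 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    return OSABI_NAMES.get(header[7], f"other ({header[7]})")
```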
slide-56
SLIDE 56

Conclusions

  • Linux malware still in its infancy
  • Already a broad range of behaviors and tricks
  • ELF binaries could run anywhere from a thermostat to a large server
  • New research needed to overcome the lack of information about the execution environment

slide-57
SLIDE 57

Thank you

https://padawan.s3.eurecom.fr/

Emanuele Cozzi cozzi@eurecom.fr @invano