LIEF: Library to Instrument Executable Formats


slide-1
SLIDE 1

RMLL 2017

Romain Thomas - rthomas@quarkslab.com

LIEF: Library to Instrument Executable Formats

slide-2
SLIDE 2

Table of Contents

Introduction
Project Overview
Demo
Conclusion

slide-3
SLIDE 3

About

Romain Thomas (rthomas@quarkslab.com) - Security engineer
Working on obfuscation, software protection and reverse engineering
Contributor to the Triton project, a dynamic binary analysis framework

slide-4
SLIDE 4

Layers of information

Layer     Examples                             Tools
Format    ELF, PE, Mach-O, COFF, XCOFF...      pefile, readelf, otool, LLVM...
Content   x86, ARM, MIPS, AArch64...           LLVM, IDA, capstone...
Behavior  DBI, emulator, sandbox, debugger...  Frida, Intel Pin, Triton, Qemu...

slide-11
SLIDE 11

Howto?

Get assembly code?
Get symbols?
Get imported functions?
Get entry point?

slide-12
SLIDE 12

Executable Formats

What is an executable format?

slide-16
SLIDE 16

Executable File Formats in a Nutshell

Executable file format gives information such as:

First instruction address to execute
Libraries used
Target architecture (x86, ARM...)
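To make this concrete, here is a minimal standard-library sketch (not LIEF's implementation) that reads the entry point and the target architecture from the fixed-size header of a 64-bit little-endian ELF file. The function name `read_elf64_header` and the small machine table are assumptions made for this example. The list of libraries used is stored elsewhere in the file and is not covered here.

```python
import struct

# A few e_machine values from the ELF specification (not exhaustive).
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def read_elf64_header(data: bytes) -> dict:
    """Extract the entry point and target architecture from an ELF64 header.

    Hypothetical helper for illustration; only 64-bit little-endian ELF
    is handled.
    """
    if len(data) < 64 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] is the class (2 = 64-bit), e_ident[5] the data encoding
    # (1 = little-endian).
    if data[4] != 2 or data[5] != 1:
        raise ValueError("only 64-bit little-endian ELF is handled here")
    # After the 16-byte e_ident: e_type, e_machine, e_version, e_entry.
    _e_type, e_machine, _e_version, e_entry = struct.unpack_from("<HHIQ", data, 16)
    return {
        "entry_point": e_entry,
        "architecture": ELF_MACHINES.get(e_machine, f"unknown ({e_machine})"),
    }
```

On a real system this could be called as `read_elf64_header(open("/bin/ls", "rb").read())`, assuming that file is a 64-bit little-endian ELF.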

slide-17
SLIDE 17

Executable File Formats in a Nutshell

The three mainstream formats:

ELF: Linux, Android...
PE: Windows
Mach-O: OS X, iOS...
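The three formats can usually be told apart from the first bytes of the file. The sketch below is a simplified illustration using only well-known magic values; `detect_format` is a name chosen for this example, and a robust detector would perform more checks than this (for instance, following the DOS header to the PE signature).

```python
def detect_format(data: bytes) -> str:
    """Guess the executable format from the first bytes of a file."""
    if data[:4] == b"\x7fELF":
        return "ELF"
    if data[:2] == b"MZ":
        # DOS header signature found at the start of PE files.
        return "PE"
    macho_magics = {
        b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",  # 32-bit, either byte order
        b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",  # 64-bit, either byte order
        b"\xca\xfe\xba\xbe",  # fat (universal) binary; Java class files share this magic
    }
    if data[:4] in macho_magics:
        return "Mach-O"
    return "unknown"
```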

slide-22
SLIDE 22

Purpose of LIEF

Provide a cross-platform library to parse ELF, PE and Mach-O formats
Abstract common features from the different formats (section, header, entry point, symbols...)
Enable format modifications
Provide an API for different languages (Python, C++, C...)
Provide an all-in-one library to deal with executable formats

slide-23
SLIDE 23

Table of Contents

Introduction
Project Overview
    Architecture
    Abstract Layer
    Tests and CI
Demo
Conclusion

slide-24
SLIDE 24

Architecture

slide-25
SLIDE 25

Architecture

[Diagram: LIEF is a C++ library with ELF, PE and Mach-O modules (each split into Binary, Parser and Builder) and an abstract layer; Python and C bindings are provided.]

slide-26
SLIDE 26

Architecture

Each format is split into three components: Binary, Parser and Builder.

Binary: format modeling (Header, Sections...)
Files:
    ELF/{Binary Header Section ...}.cpp
    PE/{Binary DosHeader Section ...}.cpp
    MachO/{Binary Header LoadCommand ...}.cpp

Parser: parse the format and create a Binary object
Files:
    ELF/Parser.{tcc,cpp}
    PE/Parser.{tcc,cpp}
    MachO/BinaryParser.{tcc,cpp}

Builder: take the Binary object and reconstruct an executable
Files:
    ELF/Builder.{tcc,cpp}
    PE/Builder.{tcc,cpp}
    MachO/Builder.{tcc,cpp}
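The Binary / Parser / Builder split can be illustrated on a made-up format. The sketch below is not LIEF code: the `TOY1` layout, `ToyBinary`, `ToySection`, `parse` and `build` are all hypothetical and exist only to show the round trip from bytes to an object model and back.

```python
import struct
from dataclasses import dataclass, field

HEADER = struct.Struct("<4sIH")     # magic, entry point, section count
SECTION_HEADER = struct.Struct("<8sI")  # name (NUL padded), content size


@dataclass
class ToySection:
    name: str
    content: bytes


@dataclass
class ToyBinary:
    """Object model of the toy format (the 'Binary' role)."""
    entry_point: int
    sections: list = field(default_factory=list)


def parse(data: bytes) -> ToyBinary:
    """The 'Parser' role: turn raw bytes into a ToyBinary."""
    magic, entry_point, count = HEADER.unpack_from(data, 0)
    if magic != b"TOY1":
        raise ValueError("bad magic")
    offset = HEADER.size
    binary = ToyBinary(entry_point)
    for _ in range(count):
        raw_name, size = SECTION_HEADER.unpack_from(data, offset)
        offset += SECTION_HEADER.size
        content = data[offset:offset + size]
        offset += size
        binary.sections.append(ToySection(raw_name.rstrip(b"\0").decode(), content))
    return binary


def build(binary: ToyBinary) -> bytes:
    """The 'Builder' role: serialize a (possibly modified) ToyBinary."""
    out = HEADER.pack(b"TOY1", binary.entry_point, len(binary.sections))
    for section in binary.sections:
        out += SECTION_HEADER.pack(section.name.encode(), len(section.content))
        out += section.content
    return out
```

Because the builder works from the object model rather than from the original bytes, any modification made to the model (a new entry point, a resized section) is reflected in the rebuilt file.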

slide-33
SLIDE 33

Abstract Layer

slide-34
SLIDE 34

Abstract Layer

[Diagram: ELF, PE and Mach-O share an abstract layer covering entry point, sections, symbols, libraries and relocations.]

slide-35
SLIDE 35

What is abstracted - Binary

Binary level

Imported functions
Exported functions
Patch value(s) from a given address
Retrieve value(s) from a given address
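The idea of a format-independent Binary can be sketched with a small class hierarchy. Everything below is hypothetical (`AbstractBinary`, `ToyELF`, `ToyPE`, and the `read` / `patch` method names are not LIEF's API), and the address translation is deliberately naive: it assumes the whole file is mapped contiguously at `image_base`.

```python
from abc import ABC, abstractmethod


class AbstractBinary(ABC):
    """Operations every format is expected to support."""

    def __init__(self, image_base: int, data: bytes):
        self.image_base = image_base
        self._data = bytearray(data)

    @property
    @abstractmethod
    def imported_functions(self) -> list:
        """Names of the functions imported by the binary."""

    def _offset(self, address: int, size: int) -> int:
        # Naive translation: assumes one flat mapping starting at image_base.
        offset = address - self.image_base
        if offset < 0 or offset + size > len(self._data):
            raise ValueError("address out of range")
        return offset

    def read(self, address: int, size: int) -> bytes:
        offset = self._offset(address, size)
        return bytes(self._data[offset:offset + size])

    def patch(self, address: int, values: bytes) -> None:
        offset = self._offset(address, len(values))
        self._data[offset:offset + len(values)] = values


class ToyELF(AbstractBinary):
    def __init__(self, image_base, data, undefined_symbols):
        super().__init__(image_base, data)
        self._undefined_symbols = undefined_symbols

    @property
    def imported_functions(self):
        return list(self._undefined_symbols)


class ToyPE(AbstractBinary):
    def __init__(self, image_base, data, import_table):
        super().__init__(image_base, data)
        self._import_table = import_table  # {DLL name: [function names]}

    @property
    def imported_functions(self):
        return [name for names in self._import_table.values() for name in names]
```

Code written against `AbstractBinary` can then loop over binaries of different formats without caring how each one stores its imports.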

slide-36
SLIDE 36

What is abstracted - Header

Header:

Type
    LIEF::OBJECT_TYPES::TYPE_EXECUTABLE
    LIEF::OBJECT_TYPES::TYPE_LIBRARY
    ...
Entry point
Architecture
    LIEF::ARCHITECTURES::ARCH_ARM
    LIEF::ARCHITECTURES::ARCH_X86
    LIEF::ARCHITECTURES::ARCH_ARM64
    ...
Modes
    LIEF::MODES::MODE_64
    LIEF::MODES::MODE_THUMB
    LIEF::MODES::MODE_V9
    ...
Endianness
    LIEF::ENDIANNESS::ENDIAN_BIG
    LIEF::ENDIANNESS::ENDIAN_LITTLE

slide-41
SLIDE 41

What is abstracted - Section

Section:

Name
Offset
Size
Virtual Address
Raw content
Entropy
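The slides do not say how the entropy attribute is computed. Assuming it is the usual Shannon entropy of the section's raw bytes, it could be sketched as follows (`shannon_entropy` is a name chosen for this example):

```python
import math
from collections import Counter


def shannon_entropy(content: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not content:
        return 0.0
    total = len(content)
    # Sum p * log2(p) over the frequency of each distinct byte value.
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(content).values()
    )
```

Values close to 8 mean the bytes look almost uniformly random, which is often a hint that a section holds compressed or encrypted data.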

slide-42
SLIDE 42

What is abstracted - Symbol

Symbol:

Name

slide-43
SLIDE 43

Architecture

slide-44
SLIDE 44

Abstract Layer

import lief

def get_abstract_binary(binary):
    return super(binary.__class__, binary)

pe_exe    = get_abstract_binary(lief.parse("PE64_x86-64_HelloWorld.exe"))
macho_exe = get_abstract_binary(lief.parse("MachO64_x86-64_ls.bin"))
elf_exe   = get_abstract_binary(lief.parse("ELF64_x86-64_ls.bin"))

binaries = [pe_exe, macho_exe, elf_exe]

# Type
assert(all(
    binary.header.object_type == lief.OBJECT_TYPES.EXECUTABLE
    for binary in binaries))

# Architecture
assert(all(
    binary.header.architecture == lief.ARCHITECTURES.X86
    for binary in binaries))

# Modes
assert(all(
    lief.MODES.M64 in binary.header.modes
    for binary in binaries))

# Endianness
assert(all(
    binary.header.endianness == lief.ENDIANNESS.LITTLE
    for binary in binaries))

slide-48
SLIDE 48

nm

nm utility: GNU nm lists the symbols from object files...

slide-49
SLIDE 49

nm

Binutils/BFD Version:

slide-50
SLIDE 50

nm

/* Print a single symbol.  */
static void
print_symbol (bfd *abfd, asymbol *sym, bfd_vma ssize, bfd *archive_bfd)
{
  symbol_info syminfo;
  struct extended_symbol_info info;

  PROGRESS (1);

  format->print_symbol_filename (archive_bfd, abfd);

  bfd_get_symbol_info (abfd, sym, &syminfo);
  info.sinfo = &syminfo;
  info.ssize = ssize;
  if (bfd_get_flavour (abfd) == bfd_target_elf_flavour)
    info.elfinfo = (elf_symbol_type *) sym;
  else
    info.elfinfo = NULL;
  format->print_symbol_info (&info, abfd);

  if (line_numbers)
    {
      static asymbol **syms;
      static long symcount;
      const char *filename, *functionname;
      unsigned int lineno;
      ...

slide-51
SLIDE 51

nm

LIEF Version:

slide-52
SLIDE 52

nm

import lief
import sys

binary = lief.parse(sys.argv[1])
for symbol in binary.symbols:
    print(symbol)

slide-53
SLIDE 53

nm

$ python nm.py winhello64-mingw.exe
__mingw_invalidPa...  0    1  NULL  FUNCTION  STATIC
pre_c_init            10   1  NULL  FUNCTION  STATIC
.rdata$.refptr.mi...  470  3  NULL  NULL      STATIC
...

slide-54
SLIDE 54

nm

$ python nm.py FAT_libc++abi.dylib
___bzero            EXT  100  0
___maskrune         EXT  100  0
___stack_chk_fail   EXT  100  0
___stack_chk_guard  EXT  100  0
___stderrp          EXT  100  0
_fputc              EXT  100  0
_free               EXT  100  0
_fwrite             EXT  100  0
_malloc             EXT  100  0
_memcmp             EXT  100  0
_memcpy             EXT  100  0
_memmove            EXT  100  0
...

slide-55
SLIDE 55

nm

$ python nm.py /bin/ls
getenv         FUNC  GLOBAL  0  0  GLIBC_2.2.5(3)
cap_to_text    FUNC  GLOBAL  0  0  * Local *
sigprocmask    FUNC  GLOBAL  0  0  GLIBC_2.2.5(3)
raise          FUNC  GLOBAL  0  0  GLIBC_2.2.5(3)
localtime      FUNC  GLOBAL  0  0  GLIBC_2.2.5(3)
__mempcpy_chk  FUNC  GLOBAL  0  0  GLIBC_2.3.4(4)
...

slide-56
SLIDE 56

nm

Sectionless binary

slide-57
SLIDE 57

nm

With LIEF, we removed the sections from the ls binary.

$ readelf -S ls_no_sections
There are no sections in this file.
$ nm ls_no_sections
nm: ls_no_sections: File format not recognized

slide-58
SLIDE 58

nm

$ python nm.py ls_no_sections
getenv         FUNC  GLOBAL  0  0  GLIBC_2.2.5(3)
cap_to_text    FUNC  GLOBAL  0  0  * Local *
sigprocmask    FUNC  GLOBAL  0  0  GLIBC_2.2.5(3)
raise          FUNC  GLOBAL  0  0  GLIBC_2.2.5(3)
localtime      FUNC  GLOBAL  0  0  GLIBC_2.2.5(3)
__mempcpy_chk  FUNC  GLOBAL  0  0  GLIBC_2.3.4(4)
...
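An ELF file keeps two tables: the section header table and the program header (segment) table, which is the one the loader uses to run the program. The slides do not detail how the symbols are recovered here, but one plausible reading is that a parser working from the segments does not need the section table at all. The standard-library sketch below only shows that the two counts are independent fields of the ELF64 header; `table_counts` is a hypothetical helper, not part of LIEF.

```python
import struct

# Full ELF64 header: e_ident, e_type, e_machine, e_version, e_entry, e_phoff,
# e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
# e_shstrndx (64 bytes in total).
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def table_counts(data: bytes) -> dict:
    """Return how many segments and sections a 64-bit LE ELF header declares."""
    fields = ELF64_HEADER.unpack_from(data, 0)
    ident = fields[0]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("only 64-bit little-endian ELF is handled here")
    return {"segments": fields[10], "sections": fields[12]}
```

For a file like `ls_no_sections`, one would expect `sections` to be 0 while `segments` stays non-zero, though this was not checked against the binary used in the talk.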

slide-60
SLIDE 60

Howto? (answers)

Get assembly code? Howto #1

import lief

binary = lief.parse("C:\\Windows\\explorer.exe")  # PE
asm = binary.get_section(".text")

slide-62
SLIDE 62

Howto? (answers)

Get symbols? Howto #2

import lief

binary = lief.parse("/bin/ls")  # ELF
for symbol in binary.symbols:
    print(symbol)

slide-64
SLIDE 64

Howto? (answers)

Get imported functions? Howto #3

import lief

binary = lief.parse("/usr/lib/libc++abi.dylib")  # Mach-O
for function in binary.imported_functions:
    print(function)

slide-65
SLIDE 65

Tests and CI

slide-68
SLIDE 68

Test suite

Unit tests
ELF parser is fuzzed with Melkor
Builder tests: we run the (reconstructed) binary and check that it doesn't crash
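A builder test of this kind could look like the sketch below. It is an assumption about the shape of such a check, not the project's actual test code: `runs_without_crashing` is a hypothetical helper, and "doesn't crash" is interpreted here as "starts and exits with status 0 within a timeout".

```python
import subprocess
import sys


def runs_without_crashing(command: list, timeout: float = 10.0) -> bool:
    """Run a command and report whether it exited normally with status 0."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The file could not be executed at all, or it hung.
        return False
    # On POSIX, a negative return code means the process was killed by a
    # signal (for example SIGSEGV), so it also counts as a failure here.
    return result.returncode == 0


if __name__ == "__main__":
    # Stand-in for a rebuilt binary: the current Python interpreter.
    print(runs_without_crashing([sys.executable, "-c", "pass"]))
```

In a real test, the command would be the path of the binary written by the Builder, possibly with a harmless argument such as `--version`.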

slide-69
SLIDE 69

Continuous Integration

Every commit is tested on Linux, OS X and Windows.

slide-71
SLIDE 71

Release

For each tagged version, we provide a prebuilt SDK and Python packages.

slide-72
SLIDE 72

Table of Contents

Introduction
Project Overview
Demo
    PE Hooking
    Petya signature
    ELF obfuscation
Conclusion

slide-73
SLIDE 73

PE Hooking

slide-74
SLIDE 74

Petya signature

slide-75
SLIDE 75

ELF obfuscation

slide-76
SLIDE 76

Table of Contents

Introduction
Project Overview
Demo
Conclusion
    Format modifications
    Documentation
    Version 0.7

slide-77
SLIDE 77

Format modifications

slide-78
SLIDE 78

Modifications

Format modifications can be a starting point to:

Packing
Watermarking
Hooking: perform interposition on functions
Persistent code injection
Malware analysis (static unpacking...)

slide-79
SLIDE 79

Documentation

slide-80
SLIDE 80

Documentation

LIEF documentation includes:

Tutorials
API: Python, C++ and C
References: existing projects that deal with executable formats
Installation and compilation guide

slide-81
SLIDE 81

Documentation

See: https://lief.quarkslab.com/doc

slide-82
SLIDE 82

Version 0.7

slide-83
SLIDE 83

Version 0.7

What's new?

Function hooking through the IAT
Icons, Manifest... modification with the ResourcesManager
Serialize PE object into JSON
Parse Rich Header

slide-84
SLIDE 84

Version 0.7

What's new?

Fully handle section-less binaries
Parse notes: .note.ABI-tag, .note.gnu.build-id...
Parse SYSV hash table
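For context, the SYSV hash table lets the dynamic linker find a symbol by name without scanning the whole symbol table. The sketch below implements the classic hash function described in the System V ABI; the names `sysv_hash` and `bucket_index` are chosen for this example and are not LIEF functions.

```python
def sysv_hash(name: str) -> int:
    """Classic ELF (System V ABI) symbol hash, kept within 32 bits."""
    h = 0
    for byte in name.encode():
        h = ((h << 4) + byte) & 0xFFFFFFFF
        g = h & 0xF0000000
        if g:
            h ^= g >> 24
        # Clear the top four bits so the next shift cannot overflow them.
        h &= ~g & 0xFFFFFFFF
    return h


def bucket_index(name: str, nbucket: int) -> int:
    """Bucket in which the lookup for `name` starts."""
    return sysv_hash(name) % nbucket
```

For example, `sysv_hash("printf")` evaluates to `0x077905A6`; the lookup then walks the chain starting at bucket `sysv_hash(name) % nbucket` until the names match.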

slide-85
SLIDE 85

Version 0.7

Full changelog: https://lief.quarkslab.com/doc/changelog.html#july-3-2017

slide-88
SLIDE 88

Source code is available on GitHub:

https://github.com/lief-project (Apache 2.0 license)

Website: https://lief.quarkslab.com

Missing feature or bug? Contact lief@quarkslab.com or open an issue / pull request.

slide-89
SLIDE 89

Thank you!

Twitter: @rh0main