slide-1
SLIDE 1

The Geometry of Innocent Flesh on the Bone

Return-into-libc without Function Calls (on the x86)

Hovav Shacham hovav@cs.ucsd.edu CCS ‘07

slide-2
SLIDE 2

Technical Background

  • Gadget: a short instruction sequence (e.g. pop %edx; ret)
  • Return-Oriented Programming (ROP): a technique by which an attacker can induce arbitrary behavior in a program whose control flow he has diverted, without injecting any code
  • Return-into-libc: a buffer overflow attack that makes the vulnerable program jump to existing code already loaded into memory

slide-3
SLIDE 3

Background: Attacker Model

  • Must find some way to subvert the program’s control flow

○ Overwriting a return address on the stack

  • Must cause the program to act in the manner of his choosing

○ Injecting code into the process image

slide-4
SLIDE 4
Background: Defenses

  • Non-executable stack
  • W⊕X: ensures that no location in a process’s address space is both writable (“W”) and executable (“X”)
  • Deployment: Linux (via PaX patches); OpenBSD; Windows; OS X
  • Hardware support: Intel “XD” bit, AMD “NX” bit

slide-5
SLIDE 5
Background: Existing Problem

  • Return-into-libc is considered a limited attack

○ the attacker can execute only straight-line code
■ calling one libc function after another
○ the attacker can be restricted
■ removing certain functions from libc

slide-6
SLIDE 6
Building Blocks: Traditional vs New return-into-libc

  • Traditional return-into-libc building blocks are functions

○ can be removed by the maintainers of libc

  • Our building blocks are short code sequences

○ very difficult to eliminate

slide-7
SLIDE 7

Building Blocks

We rely on the following:

  • x86 instructions are not aligned

○ decoding can start at any byte offset, so we can make out more words on the page

  • The x86 ISA is extremely dense

○ a random byte stream can be interpreted as a series of valid instructions with high probability

➔ Our goal
◆ Find sequences that end in a return instruction (c3)

  • libc contains many such sequences

slide-8
SLIDE 8

Building Blocks: Example

f7 c7 07 00 00 00    test $0x00000007, %edi
0f 95 45 c3          setnzb -61(%ebp)

Starting one byte later:

c7 07 00 00 00 0f    movl $0x0f000000, (%edi)
95                   xchg %ebp, %eax
45                   inc %ebp
c3                   ret

slide-9
SLIDE 9
Building Blocks: Defense

  • Apply an instruction alignment scheme (like the one MIPS uses)

➔ Cons
◆ Code compiled for this scheme cannot call libraries that were not compiled for it
◆ May introduce slowdowns

slide-10
SLIDE 10
New return-into-libc Techniques

  • Short code sequences
  • Calls no functions at all
  • The code sequences have haphazard interfaces, unlike the function-call interface, which is standardized
  • The code sequences called weren’t placed in libc by its authors, so they are not easily removed

slide-11
SLIDE 11
Finding Sequences: Useful Instruction Sequences

  • Valid instruction sequences that:

○ could be used in our gadgets
○ end in a ret instruction

➔ None of the instructions should cause the processor to transfer execution away (so that the ret is never reached)
➔ The ret causes the processor to continue to the next step

slide-12
SLIDE 12
Finding Sequences: Recording our Findings

  • Any suffix of a useful instruction sequence is also a useful instruction sequence

e.g. if we find "a; b; c; ret" then "b; c; ret" also exists

  • We care whether a sequence occurs, but not how often it does

➔ We chose to record sequences in a trie
➔ The root of the trie is the ret
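
The trie idea on this slide can be sketched as follows. This is a minimal illustration, not the original tool: the class and method names are hypothetical, and instructions are plain strings. Sequences are stored backwards from the ret, so every suffix of a recorded sequence is found automatically and repeated occurrences add no nodes.

```python
# Toy trie of instruction sequences, keyed backwards from the final ret.
# GadgetTrie and its methods are illustrative names, not from the original work.

class GadgetTrie:
    def __init__(self):
        # The root stands for the ret; its children are the instructions
        # that can appear immediately before that ret.
        self.root = {}

    def add(self, sequence):
        """Record a sequence such as ["a", "b", "c", "ret"]."""
        assert sequence and sequence[-1] == "ret"
        node = self.root
        for insn in reversed(sequence[:-1]):
            node = node.setdefault(insn, {})

    def contains(self, sequence):
        """True if the sequence (ending in ret) is a path from the root."""
        if not sequence or sequence[-1] != "ret":
            return False
        node = self.root
        for insn in reversed(sequence[:-1]):
            if insn not in node:
                return False
            node = node[insn]
        return True

    def node_count(self):
        """Number of nodes, counting the root."""
        def count(node):
            return 1 + sum(count(child) for child in node.values())
        return count(self.root)


trie = GadgetTrie()
trie.add(["a", "b", "c", "ret"])
trie.add(["d", "c", "ret"])
trie.add(["a", "b", "c", "ret"])  # a repeat changes nothing
```

After these calls, `trie.contains(["b", "c", "ret"])` holds even though that suffix was never added explicitly.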

slide-13
SLIDE 13

Finding Sequences: Producing the Trie

The Idea

  • Scan backwards from an already found sequence for valid instructions

➔ Some sequences ending with ret are ignored
◆ leave; ret
◆ pop %ebp; ret
◆ a ret or an unconditional jump (anything placed before it would never reach the final ret)
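
A rough sketch of the backward scan, under loud assumptions: `toy_decode` is a made-up stand-in that recognises only a handful of encodings (a real scanner needs a full x86 decoder), `MAX_INSN_LEN` is a toy bound, and `is_boring` covers only the ignored cases the toy decoder can express. The input bytes are the ones from the "Building Blocks: Example" slide.

```python
# Backward scan from each c3 byte, recording what it finds in a nested-dict trie.
# toy_decode is NOT a real x86 decoder; it knows only the encodings listed below.

MAX_INSN_LEN = 6  # toy bound, long enough for the movl encoding used here


def toy_decode(chunk):
    """Return a mnemonic if chunk is exactly one known instruction, else None."""
    if chunk == b"\xc3":
        return "ret"
    if chunk == b"\x45":
        return "inc %ebp"
    if chunk == b"\x95":
        return "xchg %ebp, %eax"
    if chunk == b"\x5d":
        return "pop %ebp"
    if len(chunk) == 6 and chunk[:2] == b"\xc7\x07":
        imm = int.from_bytes(chunk[2:], "little")
        return f"movl $0x{imm:08x}, (%edi)"
    return None


def is_boring(insn, parent_insn):
    # Simplified version of the ignored cases: a ret, or pop %ebp right before ret.
    # (leave and unconditional jumps are not modelled by the toy decoder.)
    return insn == "ret" or (insn == "pop %ebp" and parent_insn == "ret")


def build_from(code, pos, parent_insn, parent_node):
    """Try every instruction length that ends exactly at pos, then recurse."""
    for step in range(1, MAX_INSN_LEN + 1):
        if pos - step < 0:
            break
        insn = toy_decode(code[pos - step:pos])
        if insn is None:
            continue
        child = parent_node.setdefault(insn, {})
        if not is_boring(insn, parent_insn):
            build_from(code, pos - step, insn, child)


def scan(code):
    root = {}  # the root represents the ret itself
    for pos, byte in enumerate(code):
        if byte == 0xC3:
            build_from(code, pos, "ret", root)
    return root


code = bytes.fromhex("f7c7070000000f9545c3")
trie = scan(code)
```

On these ten bytes the scan recovers the unintended sequence movl; xchg; inc; ret by walking back from the final c3, which is the same sequence the example slide reads off by starting one byte late.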

slide-14
SLIDE 14

Finding Sequences: Implementation

slide-15
SLIDE 15

Implementation & Performance

➔ Results
◆ Analyzed 1,189,501 bytes of libc's executable segment

  • yielded a trie with 15,121 nodes
  • took 1.6 sec on a 1.33 GHz PowerPC G4 with 1 GB RAM

slide-16
SLIDE 16
Return-Oriented Programming (ROP)

  • The stack pointer (%esp) determines which instruction sequence to fetch & execute
  • The processor doesn’t automatically increment %esp, but the “ret” at the end of each instruction sequence does
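
To make the role of %esp concrete, here is a toy simulation. Everything in it is assumed for illustration: the addresses, the three-entry gadget table, and the stack contents are invented, and only the listed instructions are modelled. Each ret pops the next gadget address from the stack, so the stack layout alone decides what runs.

```python
# Toy model of return-oriented execution: %esp plays the role of a program counter.
# Addresses and gadgets below are made up; they do not come from any real libc.

GADGETS = {
    0x1000: ["pop %edx", "ret"],
    0x2000: ["pop %eax", "ret"],
    0x3000: ["add %edx, %eax", "ret"],  # AT&T syntax: %eax += %edx
}


def run(stack):
    regs = {"eax": 0, "edx": 0}
    esp = 0  # index into stack, one 32-bit word per slot

    def pop():
        nonlocal esp
        word = stack[esp]
        esp += 1
        return word

    target = pop()  # the hijacked function's own ret starts the chain
    while target in GADGETS:
        next_target = None
        for insn in GADGETS[target]:
            if insn == "pop %edx":
                regs["edx"] = pop()  # loads a constant placed on the stack
            elif insn == "pop %eax":
                regs["eax"] = pop()
            elif insn == "add %edx, %eax":
                regs["eax"] = (regs["eax"] + regs["edx"]) & 0xFFFFFFFF
            elif insn == "ret":
                # ret advances %esp and selects the next sequence
                next_target = pop() if esp < len(stack) else None
        target = next_target
    return regs


# Stack layout: gadget address, then the constant that gadget pops, and so on.
result = run([0x1000, 5, 0x2000, 7, 0x3000])
```

The first two gadgets show the "load a constant" pattern (the constant sits on the stack right after the gadget address), and the third combines the loaded values.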

slide-17
SLIDE 17

Operations

  • Load/Store

○ Loading a Constant
○ Loading from Memory
○ Storing to Memory

  • Control Flow

○ Unconditional Jump
○ Conditional Jumps

  • Arithmetic & Logic

○ Add
○ Exclusive OR (XOR)
○ And, Or, Not
○ Shift and Rotate

  • System Calls
slide-18
SLIDE 18

Load a constant into a register

slide-19
SLIDE 19

Add Operation

slide-20
SLIDE 20

Control Flow: Unconditional Jump

slide-21
SLIDE 21

Control Flow: Conditional Jump

Strategy:

  • Check whether a value is equal to zero by using neg

○ neg clears CF if the value is zero and sets CF otherwise

  • Turn the flag into a word that contains either esp_delta (if the flag is 1) or 0 (if the flag is 0)

○ esp_delta is the amount we’d like to perturb %esp by

  • Perturb %esp by the computed amount (esp_delta or 0)
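
The arithmetic behind this strategy can be sketched in a few lines. The function names are hypothetical, and the masking step (negating the 0/1 flag to get an all-zeros or all-ones word, then ANDing with esp_delta) is one branch-free way to obtain "esp_delta or 0"; it is assumed here rather than taken from the slides.

```python
# Branch-free "conditional jump" arithmetic on 32-bit words.
# Function names are illustrative; the mask trick is an assumed realisation.

MASK32 = 0xFFFFFFFF


def carry_after_neg(value):
    """x86 neg clears CF when the operand is zero and sets it otherwise."""
    return 0 if (value & MASK32) == 0 else 1


def perturb_esp(esp, value, esp_delta):
    flag = carry_after_neg(value)       # 0 or 1
    mask = (-flag) & MASK32             # 0x00000000 or 0xFFFFFFFF
    delta = esp_delta & mask            # 0 or esp_delta
    return (esp + delta) & MASK32       # wraps like a 32-bit register


not_taken = perturb_esp(0x1000, 0, 0x20)
taken = perturb_esp(0x1000, 5, 0x20)
```

A backward jump works the same way if esp_delta is given as a two's-complement word, for example `(-8) & MASK32`.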

slide-22
SLIDE 22

System Calls (1/2)

  • Syscalls have simple wrappers in libc with the following behavior:

○ move the arguments from the stack to the registers (arguments are loaded into %ebx, %ecx, %edx, %esi, %edi, %ebp, in this order)
○ set the syscall number in %eax
○ trap into the kernel
○ check for errors and translate the return value

➔ We can invoke any syscall we want by:
◆ setting up the parameters ourselves
◆ jumping into a wrapper at the point immediately before the lcall

slide-23
SLIDE 23

System Calls (2/2)

slide-24
SLIDE 24

Return-Oriented Shellcode (1/2)

➢ Invokes the execve system call to run a shell.

Requirements:
➔ Set the system call index in %eax to 0xb (execve)
➔ Point %ebx at the string “/bin/sh”, the path of the program to run
➔ Set the argument vector argv in %ecx
➔ Set the environment vector envp in %edx
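
The register requirements above can be checked with a small layout calculation. This is only a sketch of the data the registers must point at: the function name, the base address, and the particular layout (string, then argv, then envp) are assumptions for illustration, and no gadgets are involved.

```python
import struct

# Lay out the execve arguments in a hypothetical data area and compute the
# register values listed on the slide. Layout and names are assumptions.


def execve_setup(base):
    path = b"/bin/sh\x00"                    # NUL-terminated, 8 bytes
    argv_addr = base + len(path)             # argv follows the string
    envp_addr = argv_addr + 8                # argv holds two 32-bit words
    data = path
    data += struct.pack("<II", base, 0)      # argv = [path, NULL]
    data += struct.pack("<I", 0)             # envp = [NULL]
    regs = {
        "eax": 0xB,        # execve system call index
        "ebx": base,       # address of "/bin/sh"
        "ecx": argv_addr,  # address of argv
        "edx": envp_addr,  # address of envp
    }
    return data, regs


data, regs = execve_setup(0x0804A000)  # made-up base address
```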

slide-25
SLIDE 25

Return-Oriented Shellcode (2/2)

slide-26
SLIDE 26

Return-Oriented Shellcode (2/2)

lcall is invoked with arguments:

  • %ebx = “/bin/sh”
  • %ecx = addr of argv
  • %edx = addr of envp
slide-27
SLIDE 27

Catalog of rets: Origin of c3 byte

  • Check whether each c3 byte:

○ belongs to a function exported in libc’s SYMTAB section
■ disassemble the function until we discover which instruction includes the c3

➔ Out of 975,626 covered bytes, 5,483 are c3 bytes (one in every 178)
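
The "one in every 178" figure follows directly from the two counts on this slide. A short sketch, with a hypothetical helper name, that computes the same ratio for any byte buffer:

```python
# Count c3 bytes in a buffer and report how many bytes there are per c3.
# c3_stats is an illustrative helper, not part of the original analysis.


def c3_stats(code):
    count = code.count(b"\xc3")
    bytes_per_c3 = len(code) / count if count else None
    return count, bytes_per_c3


# The slide's own numbers: covered bytes divided by c3 bytes.
slide_ratio = 975_626 / 5_483

sample = bytes.fromhex("f7c7070000000f9545c3")
sample_stats = c3_stats(sample)
```

Note that this counts every c3 byte, whether it is an intended ret or part of a longer instruction; attributing each one to its origin is the disassembly step the slide describes.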

slide-28
SLIDE 28

Catalog of rets: avoid spurious rets

  • each procedure could have a single exit point (early exits jump to this point)
  • %ebx could be avoided as an accumulator for adds
  • moves from %eax to %ebx could be avoided (or written using an instruction other than mov)
  • instruction placements could be adjusted (to avoid offsets containing c3)

➔ Drawbacks
◆ the compiler would be less transparent and more complicated
◆ loss of efficiency in the use of registers

slide-29
SLIDE 29

Catalog of rets: kinds of returns

  • c3: near return
  • c2 imm16: near return with stack unwind
  • cb: far return
  • ca imm16: far return with stack unwind

➔ The last three variants are more difficult to use in the exploits we described

slide-30
SLIDE 30

Conclusion

  • A new way of organising return-into-libc exploits on x86
  • Discovered short instruction sequences
  • Showed how to combine such sequences into gadgets
slide-31
SLIDE 31

Thank you! Questions?

Παπαδόπουλος Παναγιώτης-Ηλίας Παπαδογιαννάκη Ευαγγελία Κλεφτογιώργος Κωνσταντίνος