Run-DMA: Michael Rushanan, Stephen Checkoway (Johns Hopkins University, University of Illinois at Chicago)


SLIDE 1

Run-DMA

Michael Rushanan, Stephen Checkoway Johns Hopkins University, University of Illinois at Chicago

SLIDE 2

Introduction

  • Arbitrary computation using Direct Memory Access engine
  • Access all resources of the device
  • Implement the following as examples:
      • BrainFuck
      • Rootkit


Glorified memcpy

SLIDE 3

Direct Memory Access

  • Offload the task of copying memory to/from auxiliary processors (e.g., NIC, GPU)
  • Free the CPU to do more interesting work

[Figure: CPU, auxiliary processor, main memory, and DMA engine]

SLIDE 4

DMA Engine

  • CPU configures DMA transfer by setting control registers
  • Control registers specify transfer operation

Control block structure: src | dest | length | next_cb
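The control block can be modeled in software. What follows is a minimal sketch rather than the BCM2836 register interface: it assumes a flat simulated memory, a 16-byte control block holding only the four fields named on the slide, and a next_cb of 0 to end a transfer. The names mem, rd32, wr32, put_cb, dma_run, and demo_single_copy are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Simulated physical memory (assumption: addresses are offsets into mem). */
enum { MEM_SIZE = 0x1000 };
uint8_t mem[MEM_SIZE];

/* Little-endian 32-bit load and store. */
uint32_t rd32(uint32_t a) {
    return (uint32_t)mem[a] | (uint32_t)mem[a + 1] << 8 |
           (uint32_t)mem[a + 2] << 16 | (uint32_t)mem[a + 3] << 24;
}
void wr32(uint32_t a, uint32_t v) {
    for (int i = 0; i < 4; i++) mem[a + i] = (uint8_t)(v >> (8 * i));
}

/* Assumed layout: src at +0, dest at +4, length at +8, next_cb at +12. */
void put_cb(uint32_t cb, uint32_t src, uint32_t dest,
            uint32_t len, uint32_t next) {
    wr32(cb, src); wr32(cb + 4, dest); wr32(cb + 8, len); wr32(cb + 12, next);
}

/* Software stand-in for the DMA engine: load a block, copy, follow next_cb. */
void dma_run(uint32_t cb) {
    while (cb != 0) {
        uint32_t src = rd32(cb), dest = rd32(cb + 4);
        uint32_t len = rd32(cb + 8), next = rd32(cb + 12);
        memmove(&mem[dest], &mem[src], len);
        cb = next;
    }
}

/* One control block copies five bytes; returns 1 if the copy arrived. */
int demo_single_copy(void) {
    memcpy(&mem[0x200], "hello", 5);
    put_cb(0x100, 0x200, 0x300, 5, 0);
    dma_run(0x100);
    return memcmp(&mem[0x300], "hello", 5) == 0;
}
```

The simulator reads all four fields before it copies, so a block cannot redirect itself; only blocks that run later can be patched.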

SLIDE 5

Control Block Chaining

  • Scatter/gather DMA can transfer to/from multiple memory areas in a single transaction
  • Configure a sequence of control blocks

[Figure: chain of three control blocks, each with src, dest, length, and next_cb fields]
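A gather transfer built from chained control blocks might look like the sketch below. It runs on a simulated flat memory, not on real hardware, and assumes a 16-byte four-field control block and a next_cb of 0 as the end of the chain. All names (mem, rd32, wr32, put_cb, dma_run, demo_gather) are hypothetical.

```c
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 0x1000 };
uint8_t mem[MEM_SIZE];              /* simulated memory (assumption) */

uint32_t rd32(uint32_t a) {
    return (uint32_t)mem[a] | (uint32_t)mem[a + 1] << 8 |
           (uint32_t)mem[a + 2] << 16 | (uint32_t)mem[a + 3] << 24;
}
void wr32(uint32_t a, uint32_t v) {
    for (int i = 0; i < 4; i++) mem[a + i] = (uint8_t)(v >> (8 * i));
}
void put_cb(uint32_t cb, uint32_t src, uint32_t dest,
            uint32_t len, uint32_t next) {
    wr32(cb, src); wr32(cb + 4, dest); wr32(cb + 8, len); wr32(cb + 12, next);
}
void dma_run(uint32_t cb) {
    while (cb != 0) {               /* next_cb == 0 ends the chain */
        uint32_t src = rd32(cb), dest = rd32(cb + 4);
        uint32_t len = rd32(cb + 8), next = rd32(cb + 12);
        memmove(&mem[dest], &mem[src], len);
        cb = next;
    }
}

/* Gather three scattered fragments into one contiguous buffer. */
int demo_gather(void) {
    memcpy(&mem[0x200], "Run", 3);
    memcpy(&mem[0x280], "-", 1);
    memcpy(&mem[0x2c0], "DMA", 3);
    put_cb(0x100, 0x200, 0x300, 3, 0x110);  /* first block links to second */
    put_cb(0x110, 0x280, 0x303, 1, 0x120);  /* second links to third */
    put_cb(0x120, 0x2c0, 0x304, 3, 0);      /* third ends the chain */
    dma_run(0x100);                         /* one transaction, three copies */
    return memcmp(&mem[0x300], "Run-DMA", 7) == 0;
}
```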

SLIDE 6

Required DMA Properties

  • Perform memory-to-memory copies
  • Programmed by loading address of control blocks
  • Supports scatter/gather mode

SLIDE 7

Target Device

  • Raspberry Pi 2 single-board computer (BCM2836)
  • Other potential DMA engines:
      • Intel 8237 (e.g., legacy IBM PC/ATs)
      • Cell multi-core microprocessor (e.g., PS3)

SLIDE 8

DMA Gadgets

  • DMA “programs” require self-modifying constructs
  • Overwrite members of later control blocks

[Figure: control blocks cb0 and cb1, showing a src member]

SLIDE 9

Table Lookups

[Figure: table lookup with control blocks cb0 and cb1 and the table sqr_tbl]
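A table lookup can be sketched as a two-block chain in which the first block patches the second. The sketch assumes a simulated flat memory, a 16-byte four-field control block, little-endian addresses, and a table aligned to 256 bytes so that overwriting the low byte of src selects an entry. It takes sqr_tbl to be a table of squares modulo 256, which the name suggests but the slide does not spell out. All function and constant names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 0x1000 };
uint8_t mem[MEM_SIZE];              /* simulated memory (assumption) */

uint32_t rd32(uint32_t a) {
    return (uint32_t)mem[a] | (uint32_t)mem[a + 1] << 8 |
           (uint32_t)mem[a + 2] << 16 | (uint32_t)mem[a + 3] << 24;
}
void wr32(uint32_t a, uint32_t v) {
    for (int i = 0; i < 4; i++) mem[a + i] = (uint8_t)(v >> (8 * i));
}
void put_cb(uint32_t cb, uint32_t src, uint32_t dest,
            uint32_t len, uint32_t next) {
    wr32(cb, src); wr32(cb + 4, dest); wr32(cb + 8, len); wr32(cb + 12, next);
}
void dma_run(uint32_t cb) {
    while (cb != 0) {
        uint32_t src = rd32(cb), dest = rd32(cb + 4);
        uint32_t len = rd32(cb + 8), next = rd32(cb + 12);
        memmove(&mem[dest], &mem[src], len);
        cb = next;
    }
}

enum { CB0 = 0x100, CB1 = 0x110, X = 0x200, Y = 0x201, SQR_TBL = 0x300 };

/* y = sqr_tbl[x], computed only by copies. */
uint8_t dma_square(uint8_t x) {
    for (int i = 0; i < 256; i++) mem[SQR_TBL + i] = (uint8_t)(i * i);
    mem[X] = x;
    /* cb0 copies x over the low byte of cb1.src, so cb1.src = SQR_TBL + x. */
    put_cb(CB0, X, CB1 + 0, 1, CB1);
    /* cb1 copies the selected table entry to y. */
    put_cb(CB1, SQR_TBL, Y, 1, 0);
    dma_run(CB0);
    return mem[Y];
}
```

Because SQR_TBL ends in a zero byte, replacing that byte with x is the same as adding x to the table base; no arithmetic unit is involved.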

SLIDE 10

Basic Building Blocks

  • Unary Functions: y = f(x)
      • Look up value in table and store to memory
  • Variable Dereferencing: *x
      • Copy value pointed to into src/dest of subsequent control block
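Variable dereferencing can be sketched the same way as a lookup, except that a whole 4-byte pointer is copied into the src field of the next control block. The sketch assumes a simulated flat memory and a 16-byte four-field control block. The names mem, rd32, wr32, put_cb, dma_run, and dma_load are hypothetical.

```c
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 0x1000 };
uint8_t mem[MEM_SIZE];              /* simulated memory (assumption) */

uint32_t rd32(uint32_t a) {
    return (uint32_t)mem[a] | (uint32_t)mem[a + 1] << 8 |
           (uint32_t)mem[a + 2] << 16 | (uint32_t)mem[a + 3] << 24;
}
void wr32(uint32_t a, uint32_t v) {
    for (int i = 0; i < 4; i++) mem[a + i] = (uint8_t)(v >> (8 * i));
}
void put_cb(uint32_t cb, uint32_t src, uint32_t dest,
            uint32_t len, uint32_t next) {
    wr32(cb, src); wr32(cb + 4, dest); wr32(cb + 8, len); wr32(cb + 12, next);
}
void dma_run(uint32_t cb) {
    while (cb != 0) {
        uint32_t src = rd32(cb), dest = rd32(cb + 4);
        uint32_t len = rd32(cb + 8), next = rd32(cb + 12);
        memmove(&mem[dest], &mem[src], len);
        cb = next;
    }
}

enum { CB0 = 0x100, CB1 = 0x110, P = 0x200, Y = 0x204 };

/* y = *p for a one-byte object: p holds the address `target`. */
uint8_t dma_load(uint32_t target) {
    wr32(P, target);
    /* cb0 copies the pointer value stored at P into cb1.src. */
    put_cb(CB0, P, CB1 + 0, 4, CB1);
    /* cb1.src is a placeholder (0) until cb0 patches it. */
    put_cb(CB1, 0, Y, 1, 0);
    dma_run(CB0);
    return mem[Y];
}
```

Patching dest instead of src (offset +4 in this layout) would give a store through the pointer rather than a load.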

SLIDE 11

Basic Building Blocks

  • Conditional Goto
      • Address of a control block written to the next_cb member of a trampoline
  • Switch
      • Offset table with entries that are offsets into an address table
  • Memory-mapped I/O Registers
      • Loop over memory-mapped flag or status register
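The conditional goto and switch can be combined in one sketch: a byte is mapped through an offset table, the offset selects an entry in an address table, and that address is written into the next_cb of a trampoline. The sketch assumes a simulated flat memory, a 16-byte four-field control block, 256-byte-aligned tables, and a trampoline that performs a zero-length copy; whether a real engine accepts a zero-length transfer is not checked here. All names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 0x1000 };
uint8_t mem[MEM_SIZE];              /* simulated memory (assumption) */

uint32_t rd32(uint32_t a) {
    return (uint32_t)mem[a] | (uint32_t)mem[a + 1] << 8 |
           (uint32_t)mem[a + 2] << 16 | (uint32_t)mem[a + 3] << 24;
}
void wr32(uint32_t a, uint32_t v) {
    for (int i = 0; i < 4; i++) mem[a + i] = (uint8_t)(v >> (8 * i));
}
void put_cb(uint32_t cb, uint32_t src, uint32_t dest,
            uint32_t len, uint32_t next) {
    wr32(cb, src); wr32(cb + 4, dest); wr32(cb + 8, len); wr32(cb + 12, next);
}
void dma_run(uint32_t cb) {
    while (cb != 0) {
        uint32_t src = rd32(cb), dest = rd32(cb + 4);
        uint32_t len = rd32(cb + 8), next = rd32(cb + 12);
        memmove(&mem[dest], &mem[src], len);
        cb = next;
    }
}

enum { CB0 = 0x100, CB1 = 0x110, CB2 = 0x120, TRAMP = 0x130,
       ON_ZERO = 0x140, ON_NONZERO = 0x150,
       COND = 0x200, RESULT = 0x201, MARK_Z = 0x202, MARK_NZ = 0x203,
       OFF_TBL = 0x300, ADDR_TBL = 0x400 };

/* Returns 'Z' if cond == 0 and 'N' otherwise, decided only by copies. */
uint8_t dma_branch(uint8_t cond) {
    /* Offset table: 0 selects entry 0, any other value selects entry 1. */
    mem[OFF_TBL] = 0;
    for (int i = 1; i < 256; i++) mem[OFF_TBL + i] = 4;
    wr32(ADDR_TBL + 0, ON_ZERO);
    wr32(ADDR_TBL + 4, ON_NONZERO);
    mem[COND] = cond; mem[MARK_Z] = 'Z'; mem[MARK_NZ] = 'N';

    put_cb(CB0, COND, CB1 + 0, 1, CB1);          /* cb1.src = OFF_TBL + cond  */
    put_cb(CB1, OFF_TBL, CB2 + 0, 1, CB2);       /* cb2.src = ADDR_TBL + off  */
    put_cb(CB2, ADDR_TBL, TRAMP + 12, 4, TRAMP); /* trampoline.next_cb = addr */
    put_cb(TRAMP, 0, 0, 0, 0);                   /* empty copy, then jump     */
    put_cb(ON_ZERO, MARK_Z, RESULT, 1, 0);       /* branch target 0           */
    put_cb(ON_NONZERO, MARK_NZ, RESULT, 1, 0);   /* branch target 1           */
    dma_run(CB0);
    return mem[RESULT];
}
```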

SLIDE 12

BrainFuck

SLIDE 13

BrainFuck

  • +   increment the cell pointed to by head: ++*ptr;
  • -   decrement the cell pointed to by head: --*ptr;
  • >   increment head to point to the next cell: ++ptr;
  • <   decrement head to point to the previous cell: --ptr;

SLIDE 14

BrainFuck

  • [   if the cell pointed to by head is nonzero, execute next instruction; otherwise, jump to the instruction following ]: while (*ptr) {
  • ]   if the cell pointed to by head is zero, execute next instruction; otherwise, jump to the instruction following [: }
  • ,   store the input to the cell pointed to by head: *ptr=getchar();
  • .   output the cell pointed to by head: putchar(*ptr);
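For reference, the eight instructions can be written as an ordinary CPU-side interpreter. This is a plain C sketch of the language semantics listed on the slides, not the DMA implementation. It assumes balanced brackets, a 30000-cell tape that wraps around, and input and output passed as buffers instead of getchar and putchar. The names bf_run and TAPE_LEN are hypothetical.

```c
#include <stddef.h>

enum { TAPE_LEN = 30000 };  /* assumed tape size */

/* Runs prog, reading from the NUL-terminated string input and writing up to
 * out_cap bytes to out. Returns the number of bytes written. */
size_t bf_run(const char *prog, const char *input,
              unsigned char *out, size_t out_cap) {
    unsigned char tape[TAPE_LEN] = {0};
    size_t head = 0, n_out = 0;

    for (size_t pc = 0; prog[pc] != '\0'; pc++) {
        switch (prog[pc]) {
        case '+': ++tape[head]; break;
        case '-': --tape[head]; break;
        case '>': head = (head + 1) % TAPE_LEN; break;
        case '<': head = (head + TAPE_LEN - 1) % TAPE_LEN; break;
        case ',': tape[head] = *input ? (unsigned char)*input++ : 0; break;
        case '.': if (n_out < out_cap) out[n_out++] = tape[head]; break;
        case '[':
            if (tape[head] == 0) {          /* skip forward to matching ] */
                int depth = 1;
                while (depth > 0 && prog[pc + 1] != '\0') {
                    pc++;
                    if (prog[pc] == '[') depth++;
                    else if (prog[pc] == ']') depth--;
                }
            }
            break;
        case ']':
            if (tape[head] != 0) {          /* jump back to matching [ */
                int depth = 1;
                while (depth > 0 && pc > 0) {
                    pc--;
                    if (prog[pc] == ']') depth++;
                    else if (prog[pc] == '[') depth--;
                }
            }
            break;
        default: break;                     /* other characters are ignored */
        }
    }
    return n_out;
}
```

After either bracket case the loop's pc++ moves to the instruction following the matching bracket, which is the behavior the slide describes.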

SLIDE 15

Interpreter Implementation

  • 8 gadgets corresponding to BrainFuck instructions
  • Dispatch
  • Increment word and decrement word
  • Fetch Next instruction (i.e., increment PC and dispatch)

SLIDE 16

Increment

[Figure: increment gadget with control blocks cb0 to cb3 and the lookup table inc_tbl]

SLIDE 17

Increment

[Figure: increment gadget (cb0 to cb3, inc_tbl), labeled "Variable Dereference"]

SLIDE 18

Increment

[Figure: increment gadget (cb0 to cb3, inc_tbl), labeled "Unary Function"]

SLIDE 19

Increment

[Figure: increment gadget with control blocks cb0 to cb3 and the lookup table inc_tbl, final state]
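One arrangement of four control blocks that is consistent with the transfer lengths in the figure (4, 4, 1, 1) is sketched below: two blocks dereference head, one feeds the cell value into a table lookup, and the last stores the result back. This is a reconstruction under assumptions, not a transcription of the authors' control blocks. It uses a simulated flat memory, a 16-byte four-field control block, and a 256-byte-aligned inc_tbl; all names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 0x1000 };
uint8_t mem[MEM_SIZE];              /* simulated memory (assumption) */

uint32_t rd32(uint32_t a) {
    return (uint32_t)mem[a] | (uint32_t)mem[a + 1] << 8 |
           (uint32_t)mem[a + 2] << 16 | (uint32_t)mem[a + 3] << 24;
}
void wr32(uint32_t a, uint32_t v) {
    for (int i = 0; i < 4; i++) mem[a + i] = (uint8_t)(v >> (8 * i));
}
void put_cb(uint32_t cb, uint32_t src, uint32_t dest,
            uint32_t len, uint32_t next) {
    wr32(cb, src); wr32(cb + 4, dest); wr32(cb + 8, len); wr32(cb + 12, next);
}
void dma_run(uint32_t cb) {
    while (cb != 0) {
        uint32_t src = rd32(cb), dest = rd32(cb + 4);
        uint32_t len = rd32(cb + 8), next = rd32(cb + 12);
        memmove(&mem[dest], &mem[src], len);
        cb = next;
    }
}

enum { CB0 = 0x100, CB1 = 0x110, CB2 = 0x120, CB3 = 0x130,
       HEAD = 0x200, INC_TBL = 0x300, TAPE = 0x400 };

/* ++*head, where HEAD holds the address of the current cell. */
void dma_increment(void) {
    put_cb(CB0, HEAD, CB2 + 0, 4, CB1);  /* cb2.src  = head (dereference) */
    put_cb(CB1, HEAD, CB3 + 4, 4, CB2);  /* cb3.dest = head (dereference) */
    put_cb(CB2, 0, CB3 + 0, 1, CB3);     /* low byte of cb3.src = *head   */
    put_cb(CB3, INC_TBL, 0, 1, 0);       /* *head = inc_tbl[*head]        */
    dma_run(CB0);
}

/* Places `start` in cell 5 of the tape, increments it, returns the cell. */
uint8_t demo_increment(uint8_t start) {
    for (int i = 0; i < 256; i++) mem[INC_TBL + i] = (uint8_t)(i + 1);
    wr32(HEAD, TAPE + 5);
    mem[TAPE + 5] = start;
    dma_increment();
    return mem[TAPE + 5];
}
```

The table entry for 0xff is 0x00, so the increment wraps around without any carry logic.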

SLIDE 20

Dispatch

[Figure: dispatch gadget with control blocks cb0 to cb3, a trampoline, and the tables insn_tbl (quit, nop, inc, dec, …) and dispatch_tbl]

SLIDE 21

Dispatch

[Figure: dispatch gadget (cb0 to cb3, trampoline, insn_tbl, dispatch_tbl), labeled "Variable Dereference"]

SLIDE 22

Dispatch

[Figure: dispatch gadget (cb0 to cb3, trampoline, insn_tbl, dispatch_tbl), labeled "Switch"]

SLIDE 23

Dispatch

[Figure: dispatch gadget with control blocks cb0 to cb3, a trampoline, and the tables insn_tbl and dispatch_tbl, final state]
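The dispatch step can be sketched as a dereference of the program counter followed by a switch, matching the transfer lengths in the final figure (4, 1, 1, 4). The sketch is a reconstruction under assumptions: simulated flat memory, a 16-byte four-field control block, 256-byte-aligned tables, a zero-length trampoline, and only three table entries. The instruction gadgets are stand-ins that write a marker byte. All names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 0x1000 };
uint8_t mem[MEM_SIZE];              /* simulated memory (assumption) */

uint32_t rd32(uint32_t a) {
    return (uint32_t)mem[a] | (uint32_t)mem[a + 1] << 8 |
           (uint32_t)mem[a + 2] << 16 | (uint32_t)mem[a + 3] << 24;
}
void wr32(uint32_t a, uint32_t v) {
    for (int i = 0; i < 4; i++) mem[a + i] = (uint8_t)(v >> (8 * i));
}
void put_cb(uint32_t cb, uint32_t src, uint32_t dest,
            uint32_t len, uint32_t next) {
    wr32(cb, src); wr32(cb + 4, dest); wr32(cb + 8, len); wr32(cb + 12, next);
}
void dma_run(uint32_t cb) {
    while (cb != 0) {
        uint32_t src = rd32(cb), dest = rd32(cb + 4);
        uint32_t len = rd32(cb + 8), next = rd32(cb + 12);
        memmove(&mem[dest], &mem[src], len);
        cb = next;
    }
}

enum { CB0 = 0x100, CB1 = 0x110, CB2 = 0x120, CB3 = 0x130, TRAMP = 0x140,
       G_NOP = 0x150, G_INC = 0x160, G_DEC = 0x170,   /* stand-in gadgets */
       PC = 0x200, RESULT = 0x204, MARKS = 0x208,
       DISPATCH_TBL = 0x300, INSN_TBL = 0x400, PROGRAM = 0x500 };

/* Dispatches on program[pc_index]; returns the marker of the gadget run.
 * Assumes strlen(program) fits in the simulated memory after PROGRAM. */
uint8_t dma_dispatch(const char *program, uint32_t pc_index) {
    memset(&mem[DISPATCH_TBL], 0, 256);  /* unknown bytes map to nop */
    mem[DISPATCH_TBL + '+'] = 4;         /* offsets into insn_tbl    */
    mem[DISPATCH_TBL + '-'] = 8;
    wr32(INSN_TBL + 0, G_NOP);
    wr32(INSN_TBL + 4, G_INC);
    wr32(INSN_TBL + 8, G_DEC);
    memcpy(&mem[PROGRAM], program, strlen(program));
    wr32(PC, PROGRAM + pc_index);
    memcpy(&mem[MARKS], "?+-", 3);

    put_cb(CB0, PC, CB1 + 0, 4, CB1);            /* cb1.src = pc             */
    put_cb(CB1, 0, CB2 + 0, 1, CB2);             /* cb2.src low byte = *pc   */
    put_cb(CB2, DISPATCH_TBL, CB3 + 0, 1, CB3);  /* cb3.src low byte = off   */
    put_cb(CB3, INSN_TBL, TRAMP + 12, 4, TRAMP); /* trampoline.next_cb       */
    put_cb(TRAMP, 0, 0, 0, 0);
    put_cb(G_NOP, MARKS + 0, RESULT, 1, 0);
    put_cb(G_INC, MARKS + 1, RESULT, 1, 0);
    put_cb(G_DEC, MARKS + 2, RESULT, 1, 0);
    dma_run(CB0);
    return mem[RESULT];
}
```

In a full interpreter each gadget would end by chaining to a fetch step instead of 0, so that the chain loops back into dispatch.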

SLIDE 24

Turing-Complete

  • BrainFuck is Turing-complete
  • We implemented BrainFuck with DMA gadgets
  • Thus DMA gadgets are Turing-complete

Simulate any other computational device/language

SLIDE 25

Resource-Complete

  • DMA has access to memory-mapped IO registers
  • Thus DMA gadgets are resource-complete

Access all resources of the system from within the language

SLIDE 26

Hello World

https://github.com/stevecheckoway/rundma

SLIDE 27

More Gadgets

  • Binary functions
      • f : {0,1}^8 × {0,1}^8 → {0,1}^8
  • Relational operators
      • Equality (e.g., =)
      • Inequality (e.g., <)
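One straightforward way to get a binary function from copies alone is a 256 × 256 table indexed by both operands. The sketch below is an assumption about how such a gadget could be built, not the authors' construction: it uses a simulated flat memory, a 16-byte four-field control block, and a 64 KiB table aligned to 64 KiB so that the two low bytes of src can be overwritten with x and y. Addition modulo 256 is the example function; a relational operator would use the same chain with a table of 0 and 1 entries. All names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 0x20000,
       CB0 = 0x100, CB1 = 0x110, CB2 = 0x120,
       X = 0x200, Y = 0x201, Z = 0x202,
       ADD_TBL = 0x10000 };         /* 64 KiB table, 64 KiB aligned */
uint8_t mem[MEM_SIZE];              /* simulated memory (assumption) */

uint32_t rd32(uint32_t a) {
    return (uint32_t)mem[a] | (uint32_t)mem[a + 1] << 8 |
           (uint32_t)mem[a + 2] << 16 | (uint32_t)mem[a + 3] << 24;
}
void wr32(uint32_t a, uint32_t v) {
    for (int i = 0; i < 4; i++) mem[a + i] = (uint8_t)(v >> (8 * i));
}
void put_cb(uint32_t cb, uint32_t src, uint32_t dest,
            uint32_t len, uint32_t next) {
    wr32(cb, src); wr32(cb + 4, dest); wr32(cb + 8, len); wr32(cb + 12, next);
}
void dma_run(uint32_t cb) {
    while (cb != 0) {
        uint32_t src = rd32(cb), dest = rd32(cb + 4);
        uint32_t len = rd32(cb + 8), next = rd32(cb + 12);
        memmove(&mem[dest], &mem[src], len);
        cb = next;
    }
}

/* z = (x + y) mod 256, computed by two patches and one lookup. */
uint8_t dma_add(uint8_t x, uint8_t y) {
    /* Entry at index (x << 8) | y holds f(x, y); filled in by the CPU. */
    for (uint32_t i = 0; i < 0x10000; i++)
        mem[ADD_TBL + i] = (uint8_t)((i >> 8) + (i & 0xff));
    mem[X] = x;
    mem[Y] = y;
    put_cb(CB0, X, CB2 + 1, 1, CB1);  /* byte 1 of cb2.src = x */
    put_cb(CB1, Y, CB2 + 0, 1, CB2);  /* byte 0 of cb2.src = y */
    put_cb(CB2, ADD_TBL, Z, 1, 0);    /* z = add_tbl[x][y]     */
    dma_run(CB0);
    return mem[Z];
}
```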

SLIDE 28

Raspbian Rootkit

  • Raspbian Linux
  • task_structs hold information about a process
      • pointer to cred structure (e.g., UID of process)
      • pointer to next structure

[Figure: linked list of task_structs: init_task, task 1, …, task n]

SLIDE 29

DMA Performance

Gadget                  Control Blocks
inc/dec                 4
inc/dec word            4 + 2 trampolines
dispatch                33
right/left              26
left/right condition    2
I/O                     5

SLIDE 30

Total DMA Transfers

Program         Control Blocks
Interpreter     148
Hello World     36356
Rootkit         20

SLIDE 31

DMA Malware

  • DMA malware
      • Code running on an auxiliary processor/external device with DMA access
      • Examples: FireWire, Thunderbolt, NIC, GPU
  • Main difference of our work:
      • DMA gadgets run entirely on the DMA engine
      • No additional processors

SLIDE 32

Countermeasures

  • Input/output memory management (Duflot, 2011)
  • Peripheral firmware load-time integrity (Stewin, 2012)
  • Anomaly detection systems (Duflot, 2011)
  • Bus agent runtime monitors (Stewin, 2013)

SLIDE 33

Conclusion

  • Everything non-trivial ends up being Turing-complete
  • Parsing file formats
  • Page Tables
  • DMA Engine is yet another example
  • We need to consider specialized hardware

SLIDE 34

Questions?
