

SLIDE 1

Efficient Memory Integrity Verification and Encryption for Secure Processors

G. Edward Suh, Dwaine Clarke, Blaise Gassend, Marten van Dijk, Srinivas Devadas
Massachusetts Institute of Technology

SLIDE 2

MICRO36 — December 3-5, 2003

  • G. Edward Suh — MIT Computer Science and Artificial Intelligence Laboratory

New Security Challenges

  • Current computer systems have a large Trusted Computing Base (TCB)
    – Trusted hardware: processor, memory, etc.
    – Trusted operating systems, device drivers
  • Future computers should have a much smaller TCB
    – Untrusted OS
    – Physical attacks
    – Without additional protection, components cannot be trusted
  • Why a smaller TCB?
    – Easier to verify and trust
    – Enables new applications

SLIDE 3

Applications

  • Emerging applications require TCBs that are secure even from an owner
  • Distributed computation on Internet/Grid computing
    – SETI@home, distributed.net, and more
    – Interact with a random computer on the net: how can we trust the result?
  • Software licensing
    – The owner of a system is an attacker
  • Mobile agents
    – Software agents on the Internet perform a task on your behalf
    – Perform sensitive transactions on a remote (untrusted) host

SLIDE 4

Single-Chip AEGIS Secure Processors

[Figure labels: Trusted Environment; Memory; I/O; Check Integrity, Encrypt; Identify or Protect against Untrusted OS]

  • Only trust a single chip: tamper-resistant
    – Off-chip memory: verify the integrity and encrypt
    – Untrusted OS: identify a core part or protect against OS attacks
  • Cheap, Flexible, High Performance

SLIDE 5

Secure Execution Environments

  • Tamper-Evident (TE) environment
    – Guarantees a valid execution and the identity of a program; no privacy
    – Any software or physical tampering to alter the program behavior should be detected
    – Requires: integrity verification
  • Private Tamper-Resistant (PTR) environment
    – TE environment + privacy
    – Assume programs do not leak information via memory access patterns
    – Requires: encryption + integrity verification

SLIDE 6

Other Trusted Computing Platforms

  • IBM 4758 cryptographic coprocessor
    – Entire system (processor, memory, and trusted software) in a tamper-proof package
    – Expensive, requires continuous power
  • XOM (eXecution Only Memory): David Lie et al.
    – Stated goal: protect integrity and privacy of code and data
    – Memory integrity checking does not prevent replay attacks
    – Always encrypt off-chip memory
  • Palladium/NGSCB: Microsoft
    – Stated goal: protect from software attacks
    – Memory integrity and privacy are assumed (only software attacks)

SLIDE 7

Memory Encryption

SLIDE 8

Memory Encryption

[Figure: Processor and L2 cache connected to untrusted RAM; writes pass through ENCRYPT, reads pass through DECRYPT.]

  • Encrypt on an L2 cache block granularity
    – Use symmetric key algorithms (AES, 16-byte chunks)
    – Should be randomized to prevent comparing two blocks
    – Adds decryption latency to each memory access

SLIDE 9

Direct Encryption (CBC mode): encrypt

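[Figure: in the processor, an L2 block is split into chunks B[1]..B[4]; a random number supplies the random vector RV; the chunks are encrypted with AES_K in CBC mode to give EB[1]..EB[4]. EB[1]..EB[4] and RV are stored in memory.]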

SLIDE 10

Direct Encryption (CBC mode): decrypt

[Figure: on an L2 miss, the processor issues a memory read request; EB[1]..EB[4] and RV arrive from memory, and each chunk passes through AES_K^-1 and an XOR to recover B[1]..B[4].]

  • Off-chip access latency = latency for the last chunk of an L2 block + AES + XOR
  • Decryption directly impacts off-chip latency
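The CBC data flow on the two slides above can be sketched in a few lines. This is a minimal illustration, not the authors' hardware design: the Python standard library has no AES, so `toy_encrypt` and `toy_decrypt` are a hypothetical 4-round Feistel stand-in for AES_K and AES_K^-1, and the random vector is a full 16-byte chunk rather than the 32-bit vector used in the evaluation. All function names are assumptions.

```python
import hashlib
import hmac
import os

CHUNK = 16          # bytes per chunk, the AES block size named on the slides
HALF = CHUNK // 2

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (a stand-in, NOT AES).
    return hmac.new(key, bytes([rnd]) + half, hashlib.sha256).digest()[:HALF]

def toy_encrypt(key: bytes, chunk: bytes) -> bytes:
    left, right = chunk[:HALF], chunk[HALF:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right

def toy_decrypt(key: bytes, chunk: bytes) -> bytes:
    left, right = chunk[:HALF], chunk[HALF:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(key, rnd, left)), left
    return left + right

def cbc_encrypt(key: bytes, block: bytes):
    """Encrypt one L2 block; returns (RV, EB[1..n]) to store off-chip."""
    rv = os.urandom(CHUNK)            # fresh random vector for every write
    prev, out = rv, []
    for i in range(0, len(block), CHUNK):
        prev = toy_encrypt(key, xor(block[i:i + CHUNK], prev))
        out.append(prev)
    return rv, b"".join(out)

def cbc_decrypt(key: bytes, rv: bytes, enc: bytes) -> bytes:
    """Decrypt one L2 block; each chunk needs its ciphertext before the cipher can run."""
    prev, out = rv, []
    for i in range(0, len(enc), CHUNK):
        chunk = enc[i:i + CHUNK]
        out.append(xor(toy_decrypt(key, chunk), prev))
        prev = chunk
    return b"".join(out)
```

Because `toy_decrypt` takes the ciphertext chunk as input, decryption cannot start until the data has arrived from memory, which is why the cipher latency adds directly to the off-chip latency.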

SLIDE 11

One-Time-Pad Encryption (OTP): encrypt

[Figure: in the processor, a counter supplies the time stamp TS for an L2 block B[1]..B[4] at address Addr. For each chunk i, AES_K^-1 is applied to (Addr, TS, i) to produce a one-time pad (OTP), which is XORed with B[i] to give EB[i]. EB[1]..EB[4] and TS are written to memory.]

SLIDE 12

One-Time-Pad Encryption (OTP): decrypt

[Figure: on an L2 miss, the processor issues a memory read request. Once TS has been read, AES_K^-1 is applied to (Addr, TS, 1) .. (Addr, TS, 4) to regenerate the pads, which are XORed with EB[1]..EB[4] as they arrive to recover B[1]..B[4].]

  • Off-chip access latency = MAX( latency for the time stamp + AES, latency for an L2 block ) + XOR
  • Overlap the decryption with memory accesses
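A minimal sketch of the one-time-pad scheme follows, under stated assumptions: HMAC-SHA256 truncated to 16 bytes stands in for AES_K^-1 (the standard library has no AES), and the byte layout of the (Addr, TS, i) seed is invented for illustration. The function names are hypothetical.

```python
import hashlib
import hmac

CHUNK = 16  # bytes per chunk

def pad(key: bytes, addr: int, ts: int, i: int) -> bytes:
    """One-time pad for chunk i of the block at addr, written at time stamp ts."""
    seed = addr.to_bytes(8, "big") + ts.to_bytes(4, "big") + i.to_bytes(1, "big")
    # Stand-in for AES_K^-1(Addr, TS, i); any keyed pseudorandom function works here.
    return hmac.new(key, seed, hashlib.sha256).digest()[:CHUNK]

def otp_encrypt(key: bytes, addr: int, ts: int, block: bytes) -> bytes:
    """XOR each chunk with its pad. The caller must use a fresh ts on every write."""
    out = bytearray()
    for i in range(len(block) // CHUNK):
        chunk = block[i * CHUNK:(i + 1) * CHUNK]
        out += bytes(a ^ b for a, b in zip(chunk, pad(key, addr, ts, i + 1)))
    return bytes(out)

def otp_decrypt(key: bytes, addr: int, ts: int, enc: bytes) -> bytes:
    """Decryption is the same XOR; the pads depend only on (addr, ts, i)."""
    return otp_encrypt(key, addr, ts, enc)
```

The pad never depends on the ciphertext, so it can be computed while the encrypted block is still in flight from memory. Reusing a (address, time stamp) pair for two different writes would reuse a pad, which is why the time stamp has to change on every write.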

SLIDE 13

Effects of Encryption on Performance

  • Simulations based on the SimpleScalar tool set
    – 9 SPEC CPU2000 benchmarks
    – 256-KB, 1-MB, 4-MB L2 caches with 64-B blocks
    – 32-bit time stamps and random vectors (no caching)
    – Memory latency: 80/5, decryption latency: 40
  • Performance degradation by encryption

                  One-Time-Pad    Direct (CBC)
    Average            8%             13%
    Worst Case        18%             25%
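The two latency formulas from the decryption slides can be compared with a small calculation. The numbers below are assumptions made only for illustration: "80/5" is read as 80 cycles for the first chunk and 5 for each later chunk, a block is taken to arrive in 4 chunks, the XOR is taken to cost 1 cycle, and the time stamp is assumed to arrive after one 80-cycle access. The results are per-miss latencies and do not map directly onto the percentages in the table.

```python
def block_arrival(first: int, inter: int, chunks: int) -> int:
    """Cycles until the last chunk of an L2 block has arrived."""
    return first + (chunks - 1) * inter

def cbc_latency(first: int, inter: int, chunks: int, aes: int, xor: int) -> int:
    # Direct (CBC): latency for the last chunk of an L2 block + AES + XOR
    return block_arrival(first, inter, chunks) + aes + xor

def otp_latency(first: int, inter: int, chunks: int, aes: int, xor: int,
                ts_latency: int) -> int:
    # OTP: MAX(latency for the time stamp + AES, latency for an L2 block) + XOR
    return max(ts_latency + aes, block_arrival(first, inter, chunks)) + xor

print(cbc_latency(80, 5, 4, aes=40, xor=1))                 # 136
print(otp_latency(80, 5, 4, aes=40, xor=1, ts_latency=80))  # 121
```

If the time stamp were already on-chip (cached or speculated), `ts_latency` would drop toward zero and the OTP latency would approach the plain block latency plus the XOR.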

SLIDE 14

Security and Optimizations

  • The security of the OTP scheme is at least as good as that of the conventional CBC scheme
    – OTP is essentially a counter-mode (CTR) encryption
  • Further optimizations are possible
    – For static data such as instructions, time stamps are not required, so the AES computations can be completely overlapped with memory accesses
    – Cache time stamps on-chip, or speculate the value
  • Will be used for instruction encryption of Philips media processors

SLIDE 15

Integrity Verification

SLIDE 16

Difficulty of Integrity Verification

[Figure: a program runs on a processor with trusted state and ENCRYPT, DECRYPT, and VERIFY units in front of untrusted RAM. Address 0x45 holds E(124), MAC(0x45, 124); a write of E(120), MAC(0x45, 120) is marked IGNORE.]

  • Cannot simply MAC on writes and check the MAC on reads: replay attacks
  • Use hash trees for integrity verification
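The replay problem can be reproduced in a few lines. This is a sketch under assumptions: HMAC-SHA256 stands in for the MAC, encryption is left out, and the order of the two writes (124 first, then 120) is a guess at what the figure shows. The `memory` dictionary plays the role of untrusted RAM that an attacker can rewrite freely.

```python
import hashlib
import hmac

KEY = b"on-chip secret"   # known only to the processor
memory = {}               # untrusted RAM: addr -> (value, tag)

def mac(addr: int, value: int) -> bytes:
    return hmac.new(KEY, f"{addr}:{value}".encode(), hashlib.sha256).digest()

def write(addr: int, value: int) -> None:
    """Store the value together with a MAC over (address, value)."""
    memory[addr] = (value, mac(addr, value))

def read(addr: int) -> int:
    """Check the MAC on every read; reject anything that fails."""
    value, tag = memory[addr]
    if not hmac.compare_digest(tag, mac(addr, value)):
        raise ValueError("MAC check failed")
    return value

write(0x45, 124)
stale = memory[0x45]      # attacker records the old (value, tag) pair
write(0x45, 120)
memory[0x45] = stale      # attacker replays it, as if the second write were ignored
print(read(0x45))         # the stale value passes the check
```

The MAC binds the value to the address but not to time, so an old pair stays valid forever. Closing that gap needs some trusted state that changes with every write, which is what the hash tree root provides.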

SLIDE 17

Hash Trees

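[Figure: a hash tree over data values V1..V4 (each an L2 block) in untrusted memory, with h1 = h(V1.V2), h2 = h(V3.V4), and root = h(h1.h2) held in the processor. On a miss for V2, the processor reads V2, then verifies h1 and the root.]

  • Logarithmic overhead for every cache miss
  • Low performance (10x slowdown)
  • Improvement: cached hash trees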
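The four-value tree in the figure can be sketched as follows. This is a minimal illustration that assumes SHA-256 for h and fixed-size values (so plain concatenation is unambiguous); the class and method names are hypothetical. Only `root` is trusted; `mem` and `hashes` model untrusted memory.

```python
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

class HashTree:
    """Four fixed-size values V1..V4 (indices 0..3) under one trusted root."""

    def __init__(self, values):
        self.mem = list(values)                            # untrusted data values
        self.hashes = [h(*values[0:2]), h(*values[2:4])]   # untrusted h1, h2
        self.root = h(*self.hashes)                        # trusted, kept on-chip

    def read(self, i: int) -> bytes:
        """Verify the path from value i up to the root, then return the value."""
        pair = i // 2
        if h(self.mem[2 * pair], self.mem[2 * pair + 1]) != self.hashes[pair]:
            raise ValueError("leaf hash mismatch")
        if h(*self.hashes) != self.root:
            raise ValueError("root mismatch")
        return self.mem[i]

    def write(self, i: int, value: bytes) -> None:
        """Verify the old path first, then update the value, its hash, and the root."""
        self.read(i)
        self.mem[i] = value
        pair = i // 2
        self.hashes[pair] = h(self.mem[2 * pair], self.mem[2 * pair + 1])
        self.root = h(*self.hashes)
```

Every read recomputes one hash per tree level, which is the logarithmic overhead the slide refers to. A replayed value together with its old hash still fails, because the trusted root has moved on.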

SLIDE 18

Cached Hash Trees (HPCA’03)

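[Figure: the same hash tree over V1..V4 in untrusted memory, with h1 = h(V1.V2), h2 = h(V3.V4), and root = h(h1.h2); some nodes are marked "In L2". On a miss, verification moves up the tree and is done as soon as it reaches a hash that is in L2.]

  • Cache hashes in L2; L2 is trusted
  • Stop checking earlier
  • Less overhead (22% average, 51% worst case)
  • Still expensive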
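The early-exit idea can be sketched on the same four-value tree. This is a simplified illustration, not the HPCA'03 design: the `l2` dictionary models hashes held in the trusted L2, there is no eviction and no write path, and `read` returns the number of hash computations so the saving is visible. All names are hypothetical.

```python
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

class CachedHashTree:
    def __init__(self, values):
        self.mem = list(values)                            # untrusted data values
        self.hashes = [h(*values[0:2]), h(*values[2:4])]   # untrusted h1, h2
        self.root = h(*self.hashes)                        # trusted root
        self.l2 = {}                                       # trusted: pair index -> verified hash

    def read(self, i: int):
        """Return (value, number of hash computations spent on verification)."""
        pair = i // 2
        leaf_hash = h(self.mem[2 * pair], self.mem[2 * pair + 1])
        if pair in self.l2:
            # The parent hash is already trusted: stop checking here.
            if leaf_hash != self.l2[pair]:
                raise ValueError("mismatch against cached hash")
            return self.mem[i], 1
        # Otherwise walk all the way to the root, then cache the verified hash.
        if leaf_hash != self.hashes[pair] or h(*self.hashes) != self.root:
            raise ValueError("hash tree verification failed")
        self.l2[pair] = self.hashes[pair]
        return self.mem[i], 2
```

In a deeper tree the saving grows with the number of levels skipped, but each miss still costs at least one hash, which fits the slide's "still expensive".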

SLIDE 19

Can we do better?

  • Some applications only need to verify memory accesses after a long execution
    – Distributed computation
    – No need to check after each memory access
  • Can we just check a sequence of accesses?

[Figure: a job dispatcher, holding the processor's public key, sends a program and data to the secure processor, which holds the processor's private key. The secure processor runs enter_aegis, executes, and returns the result with H(Prog) and a signature; the dispatcher gets the results and verifies them.]

SLIDE 20

Log Hash Integrity Verification: Idea

  • At run-time, maintain a log of reads and writes
    – Reads: make a ‘read’ note with (address, value) in the log
    – Writes: make a ‘write’ note with (address, value) in the log
  • Check: go through the log and check that each read has the most recent value written to the address
  • Problem: the log grows, so use cryptographic hashes

[Figure: a checker between the processor and untrusted memory. The program writes 1 at 0x40 and 2 at 0x50, then reads 2 from 0x50 and 1 from 0x40. The resulting log is: Write (0x40, 1), Write (0x50, 2), Read (0x50, 2), Read (0x40, 1).]
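The uncompressed log check described above is simple enough to write out directly. This is a minimal sketch; the tuple layout `(op, address, value)` and the function name are assumptions, and a read of an address that was never written is treated as a failure.

```python
def check_log(log) -> bool:
    """Return True if every read returns the most recent value written to its address."""
    latest = {}                      # address -> most recent value written
    for op, addr, value in log:
        if op == "write":
            latest[addr] = value
        elif latest.get(addr) != value:
            return False             # read saw something other than the last write
    return True

# The sequence from the slide.
log = [
    ("write", 0x40, 1),
    ("write", 0x50, 2),
    ("read", 0x50, 2),
    ("read", 0x40, 1),
]
print(check_log(log))  # True
```

The check itself is cheap, but the log has one entry per memory access, so it cannot be kept on-chip. That is the motivation for replacing it with fixed-size hashes.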

SLIDE 21

Log Hash Algorithms: Run-Time

[Figure: run-time example against untrusted memory. Initialize (all zero): WriteHash holds (0x40, 0, 0) and (0x50, 0, 0), Timer: 0. Cache miss, read 0 from 0x40: ReadHash holds (0x40, 0, 0), Timer: 1. Cache eviction, write 10 at 0x40: WriteHash adds (0x40, 10, 1). The figure also carries the label "Read 2 from 0x50".]

  • Use set hashes as compressed logs
    – Set hash: maps a set to a fixed-length string
    – ReadHash: a set of read entries (addr, val, time) in the log
    – WriteHash: a set of write entries (addr, val, time) in the log
  • Use Timer (time stamp) to keep the ordering of entries
  • Only one additional time stamp access for each memory access
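One way to build a set hash is to add up keyed hashes of the entries modulo a fixed power of two. The sketch below assumes that construction (HMAC-SHA256 summed mod 2^256); it is one possible choice and not necessarily the one the authors use. The function names and the entry encoding are hypothetical.

```python
import hashlib
import hmac

MOD = 1 << 256

def entry_digest(key: bytes, addr: int, val: int, ts: int) -> int:
    """Keyed hash of one (addr, val, time) log entry, as an integer."""
    msg = f"{addr}:{val}:{ts}".encode()
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest(), "big")

def set_hash(key: bytes, entries) -> int:
    """Fixed-length summary of a collection of entries, independent of their order."""
    acc = 0
    for addr, val, ts in entries:
        acc = (acc + entry_digest(key, addr, val, ts)) % MOD
    return acc

key = b"on-chip secret"
writes = [(0x40, 0, 0), (0x50, 0, 0), (0x40, 10, 1)]
reads = [(0x40, 0, 0), (0x40, 10, 1), (0x50, 0, 0)]
print(set_hash(key, writes) == set_hash(key, reads))  # True
```

Addition is commutative, so the hash can be updated one entry at a time in whatever order the accesses happen, and only the running total has to stay on-chip.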

SLIDE 22

Log Hash Algorithms: Integrity Check

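  • Read all the addresses that are not in a cache
  • Compare ReadHash and WriteHash (same set?)

[Figure: integrity check example. The processor reads all of untrusted memory. WriteHash holds (0x40, 0, 0), (0x50, 0, 0), (0x40, 10, 1); ReadHash holds (0x40, 0, 0), (0x40, 10, 1), (0x50, 0, 0); the two are compared for equality. The figure also carries the labels "Timer: 0", "Timer: 1", and "Read 2 from 0x50".]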
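The run-time operations and the final check can be put together in one small model. This is a sketch under assumptions: the set hash is an additive HMAC-SHA256 sum, the rule `Timer = max(Timer, ts + 1)` on a read is inferred from the slide example (Timer goes from 0 to 1 after reading an entry with time stamp 0), and the class and method names are hypothetical. Every `evict` after initialization is expected to follow a `miss` on the same address.

```python
import hashlib
import hmac

MOD = 1 << 256

class SetHash:
    """Additive set hash: sum of keyed entry hashes mod 2^256 (an assumed construction)."""
    def __init__(self, key: bytes):
        self.key, self.acc = key, 0

    def add(self, addr: int, val: int, ts: int) -> None:
        digest = hmac.new(self.key, f"{addr}:{val}:{ts}".encode(), hashlib.sha256).digest()
        self.acc = (self.acc + int.from_bytes(digest, "big")) % MOD

class LogHashChecker:
    def __init__(self, key: bytes, memory: dict, addresses):
        self.mem = memory                    # untrusted: addr -> (value, time stamp)
        self.addresses = list(addresses)
        self.read_hash, self.write_hash = SetHash(key), SetHash(key)
        self.timer = 0
        self.cached = set()                  # addresses currently held on-chip
        for addr in self.addresses:          # initialize every location to zero
            self.evict(addr, 0)

    def evict(self, addr: int, val: int) -> None:
        """Cache eviction: log the write and store (value, Timer) off-chip."""
        self.write_hash.add(addr, val, self.timer)
        self.mem[addr] = (val, self.timer)
        self.cached.discard(addr)

    def miss(self, addr: int) -> int:
        """Cache miss: log the read and move the Timer past the entry's time stamp."""
        val, ts = self.mem[addr]
        self.read_hash.add(addr, val, ts)
        self.timer = max(self.timer, ts + 1)
        self.cached.add(addr)
        return val

    def check(self) -> bool:
        """Read everything not in the cache, then compare the two set hashes."""
        for addr in self.addresses:
            if addr not in self.cached:
                self.miss(addr)
        return self.read_hash.acc == self.write_hash.acc
```

A short usage example that follows the slides: initialize 0x40 and 0x50, miss on 0x40, evict 10 to 0x40, then check.

```python
mem = {}
checker = LogHashChecker(b"on-chip secret", mem, [0x40, 0x50])
checker.miss(0x40)        # reads (0, 0); Timer becomes 1
checker.evict(0x40, 10)   # logs (0x40, 10, 1)
print(checker.check())    # True when memory behaved honestly
```

If untrusted memory returns a stale pair such as (0, 0) for 0x40 during the check, ReadHash picks up a second copy of (0x40, 0, 0) that WriteHash never saw, and the comparison fails.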

SLIDE 23

Checking Overhead of Log Hash Scheme

[Figure: IPC (0.1 to 0.6) versus number of off-chip accesses (1.E+00 to 1.E+10) for LHash, LHash-RT, and CHTree. SWIM, 1MB L2, uses 192MB.]

  • Integrity check requires reading the entire memory space being used
    – Cost depends on the size and the length of an application
  • For long programs, the checking overhead is negligible
    – Amortized over a long execution time
  • Better than Hash Trees for programs with more than 10 million accesses
  • Check overhead is negligible for programs with more than a billion accesses
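A rough back-of-the-envelope shows why the check amortizes. It assumes 64-B blocks (the block size used in the encryption experiments) and counts one extra off-chip read per block during the check, ignoring the time stamp traffic; the function names are hypothetical and the ratios are only an order-of-magnitude guide, not the measured IPC.

```python
def check_reads(memory_bytes: int, block_bytes: int = 64) -> int:
    """Number of block reads needed to sweep the memory in use once."""
    return memory_bytes // block_bytes

def relative_check_cost(memory_bytes: int, accesses: int, block_bytes: int = 64) -> float:
    """Check reads as a fraction of the off-chip accesses the program makes itself."""
    return check_reads(memory_bytes, block_bytes) / accesses

used = 192 * 2**20                            # 192 MB, as in the SWIM example
print(check_reads(used))                      # 3145728 block reads per check
print(relative_check_cost(used, 10**7))       # about 0.31
print(relative_check_cost(used, 10**9))       # about 0.003
```

Under these assumptions the sweep is a fixed cost of about three million reads, so its share shrinks in proportion to the length of the run.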

SLIDE 24

Performance Comparisons

  • Overhead for TE environments
    – Integrity verification

                  LHash    CHTree
    Average         4%       22%
    Worst Case     15%       52%

  • Overhead for PTR environments
    – Integrity verification + encryption

                  LHash + OTP    CHTree + CBC
    Average           10%            31%
    Worst Case        23%            59%

SLIDE 25

Summary

  • Untrusted owners are becoming more prevalent
    – Untrusted OS and physical attacks require a small TCB
  • Single-chip secure processors require off-chip protection mechanisms: integrity verification and encryption
  • The OTP encryption scheme reduces the overhead of encryption in all cases
    – Allows decryption to be overlapped with memory accesses
    – Cache or speculate time stamps to further hide decryption latency
  • The Log Hash scheme significantly reduces the overhead of integrity verification for certified execution when programs are long enough

SLIDE 26

Questions?

More Information at www.csg.lcs.mit.edu