Non-transient Side Channels Mengjia Yan Fall 2020 6.888 - PowerPoint PPT Presentation

Non-transient Side Channels Mengjia Yan Fall 2020 6.888 L5-Non-transient Side Channels 1

Lab Assignment • Handout on course website • Each (regular) student will receive an email • Solo or 2-person group • Individual GitHub repo • Info about accessing a server machine • Listeners can send us an email if you want to try the lab • Advice: • Start early. The first step is not to implement the attack, but to reverse engineer the machine. 6.888 L5-Non-transient Side Channels 2

Recap: Prime+Probe Sender line Sender Receiver Receiver line # ways Time Cache Set Prime Shared Cache 6.888 L5-Non-transient Side Channels 3

Recap: Prime+Probe Sender line Sender Receiver Receiver line # ways Access Time Cache Set Prime Wait Shared Cache 6.888 L5-Non-transient Side Channels 4

Recap: Prime+Probe Sender line Sender Receiver Receiver line # ways Access Time Cache Set Probe Prime Wait Shared Cache Receive “1” = 8 accesses à 1 miss 6.888 L5-Non-transient Side Channels 5

Analogy: Bucket/Ball How many cache lines in total in the system? How to find the bucket used by the sender? Sender Receiver Receiver’s address Sender’s address # ways Cache Set Shared Cache Each cache set is a bucket that can hold 8 balls 6.888 L5-Non-transient Side Channels 6

Practical Cache Side Channels 6.888 L5-Non-transient Side Channels 7

Cache Mapping – Directly Mapped Cache • Can think cache mapping as a hash table with limited size • Linear cache set mapping using modular arithmetic index Tag Data (64 bytes) 31 0 Physical 0 32bit Address: 1 2 3 Set Index = (Addr / Block Size) % Number of Sets 4 5 6 7 6.888 L5-Non-transient Side Channels 8

Cache Mapping – Directly Mapped Cache • Can think cache mapping as a hash table with limited size • Linear cache set mapping using modular arithmetic Assuming byte-addressable index Tag Data (64 bytes) 31 9 31 9 8 6 8 6 5 0 5 0 Physical 0 Tag 32bit Set Index Line offset Address: (high order bits) (3 bits) (6 bits) 1 2 3 To distinguish addresses Number of bits for set index = in the same set log 2 (Number of sets) 4 5 Question: Given an 1MB L2 with 1024 sets, how 6 many bits are used for set index? 7 6.888 L5-Non-transient Side Channels 9

Cache Mapping – Set Associative Cache • Can think cache mapping as a hash table with limited size • Linear cache set mapping using modular arithmetic 2-way cache Tag Data index Tag Data 31 9 8 6 5 0 31 9 8 6 5 0 0 Physical Tag Tag Set Index Index Line offset Line offset Address: 1 (high order bits) (high order bits) (3 bits) (3 bits) (6 bits) (6 bits) 2 Find eviction set 3 == 4 Find addresses with the same set index bits 5 6 Question: How to decide which way to use? 7 Answer: Cache replacement policy. 6.888 L5-Non-transient Side Channels 10

Address Translation (4KB page) 48 12 11 0 Programmer’s view Page offset Virtual page number Virtual Address (48bit): (12 bits) Page Copy Table page offset 31 12 11 0 system’s view Page offset physical page number Physical Address (32bit): (12 bits) 6.888 L5-Non-transient Side Channels 11

Find Eviction Set Using Virtual Addresses 48 12 11 0 Virtual Address (48bit): Virtual page number Page offset 31 12 11 0 Physical Address (32bit): Page offset physical page number 4KB page (12 bits) Cache mapping: Tag Index Line offset (8 sets) (3 bits) (6 bits) Cache mapping: 2 Tag Set Index Line offset bit (8 bits) (6 bits) (256 sets) Not controllable via virtual address. 6.888 L5-Non-transient Side Channels 12

Huge Pages • Huge page size: 2MB or 1GB • Number of bits for page offset? 48 12 11 0 Virtual Address : Page offset Virtual page number 4KB page (12 bits) 48 21 20 0 Virtual Address : Page offset 2MB page Virtual page number (21 bits) Cache mapping: Tag Set Index Line offset (256 sets) (8 bits) (6 bits) 6.888 L5-Non-transient Side Channels 13

Multi-level Caches core core D-L1 D-L1 I-L1 I-L1 … • Motivation: L2 L2 • A memory cannot be large and fast. Add level of cache to reduce miss penalty LLC A typical configuration of Intel Ivy Bridge. Configurations are different with processor types. L1-I/D cache L2 cache L3 cache (LLC) DRAM Size 32KB 256KB 1MB/core 16GB Associativity 4 or 8 8 16 N/A (# ways) Latency 1-5 12 ~40 ~150 (cycles) 6.888 L5-Non-transient Side Channels 14

Multi-level Caches core core D-L1 D-L1 I-L1 I-L1 … • Motivation: L2 L2 • A memory cannot be large and fast. Add level of cache to reduce miss penalty LLC • LLC is generally divided into multiple slices • Conflict happens if addresses map to the same slice and the same set Tag Set Index Line offset An undocumented secret hash function Slice ID = Hash(bits) 6.888 L5-Non-transient Side Channels 15

Eviction Set Construction Algorithm Sender line Sender Receiver Receiver line Time Access Candidate Addresses Vila et al. Theory and Practice of Finding Eviction Sets. S&P’19 Shared Cache 6.888 L5-Non-transient Side Channels 16

Eviction Set Construction Algorithm Sender line Sender Receiver Receiver line Access Target Time Address Access Candidate Wait Addresses Vila et al. Theory and Practice of Finding Eviction Sets. S&P’19 6.888 L5-Non-transient Side Channels 17

Eviction Set Construction Algorithm Sender line Sender Receiver Receiver line Access Target Address Time Access Candidate Measure Latency of Wait Addresses Each Candidate Address Vila et al. Theory and Practice of Finding Eviction Sets. S&P’19 6.888 L5-Non-transient Side Channels 18

Problems Due to Replacement Policy • Self-eviction due to replacement policy • An LRU (least recently used) example Initial: • A small trick: 1 2 3 4 5 6 7 8 Prime: • Access addresses in reverse order Victim access: 9 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 Probe: Which to evict? 6.888 L5-Non-transient Side Channels 19

Measure Latency of Multiple Accesses • HW Prefetcher + Out-of-order execution What we expect: Ld A1 Ld A2 …… Ld A7 Ld A8 T1 = rdtsc() Time Dummy1=Ld(Addr1) What actually will happen: …… Dummy8=Ld(Addr8) Ld A1 T2 = rdtsc() Ld A2 …… Latency = T2-T1 Ld A7 Ld A8 Time 6.888 L5-Non-transient Side Channels 20

Out-of-Order Processor Writeback Fetch Decode RegRead Execute (Commit) Check whether the register to read is ready. Ld A1 Ld A2 Question: How to serialize …… Ld A7 data accesses? Ld A8 Time 6.888 L5-Non-transient Side Channels 21

Serialize Data Accesses • A special instruction “mfence” https://www.felixcloutier.com/x86/mfence • Add data dependency by creating a linked list Pointer to the Dummy1 = Ld(Addr1) content next node dummy A1 dummy A2 dummy A3 …… Addr2 = Ld(Addr1) • Double linked list to access addresses in reverse order A1 A2 A3 …… A1 A2 6.888 L5-Non-transient Side Channels 22

Handle Noise A real-world example: Square-and-Multiply Exponentiation • What you generally see in papers: for i = n-1 to 0 do r = sqr(r) mod n if e i == 1 then r = mul(r, b) mod n end end 6.888 L5-Non-transient Side Channels 23

The Multiply Function 6.888 L5-Non-transient Side Channels 24

Raw Trace Access latencies measured in the probe operation in Prime+Probe. A sequence of “01010111011001” can be deduced as part of the exponent. 6.888 L5-Non-transient Side Channels 25

There may exist other problems • Tips for lab assignment • Build the attack step-by-step • Recommend to read “Last-Level Cache Side-Channel Attacks are Practical” • Ask questions via Piazza 6.888 L5-Non-transient Side Channels 26

Defenses 6.888 L5-Non-transient Side Channels 27

Micro-architecture Side Channels secret-dependent execution A Channel (a micro-architecture structure) Victim Attacker X {Cache, DRAM, TLB, NoC, etc.} {Transient, Non-transient} Kiriansky et al. DAWG: a defense against cache timing attacks in speculative execution processors. MICRO’18 6.888 L5-Non-transient Side Channels 28

Micro-architecture Side Channels secret-dependent execution A Channel (a micro-architecture structure) Victim Attacker Defenses: Block creation of signals: Block detection of signals: Close the channel: Oblivious execution, Randomization, etc. Isolation, etc. speculative execution defenses, etc. Kiriansky et al. DAWG: a defense against cache timing attacks in speculative execution processors. MICRO’18 6.888 L5-Non-transient Side Channels 29

Non-transient Side Channels Mengjia Yan Fall 2020 6.888 - PowerPoint PPT Presentation

Non-transient Side Channels Mengjia Yan Fall 2020 6.888 L5-Non-transient Side Channels 1 Lab Assignment Handout on course website Each (regular) student will receive an email Solo or 2-person group Individual GitHub repo

Non-transient Side Channels Mengjia Yan Fall 2020 6.888 L5-Non-transient Side Channels 1 Lab

Outline Side and covert channels Transient execution CSci 5271 Introduction to Computer

Side Channels and Covert Channels Daniel Bosk Department of Information and Communication Systems

Multiple Input and Output Channels Multiple Input and Output Channels Multiple Input Channels In

FUCHSIA: Data-Driven Debugging for Functional Side Channels Saeid Tizpaz-Niari* , Pavol Cerny,

Transient Fault Detection and Reducing Transient Error Rate Jose Lugo-Martinez CSE 240C:

Transient Test Reactors Dr. Daniel M. Wachs National Technical Lead for Transient Testing Idaho

SCHISM numerical formulation Joseph Zhang Horizontal grid: hybrid 2 3 Side 2 4 3 Side 1

Screaming Ch Channels When Electromagnetic Side Channels Meet Radio Transceivers Giovanni

Transient Side Channels Mengjia Yan Fall 2020 Based on slides from Christopher W. Fletcher

Transient Side Channels Mengjia Yan Fall 2020 Based on slides from Christopher W. Fletcher

Outline Transient execution covert channels (contd) OS trust and assurance CSci 5271

Lecture 7 Space- Multiplexed Channel Electrical Channels-2 Types of Electrical Channels

Other Communication Channels Select channels of communication that will reach your audiences.

Chapter 6 Marks and Channels Vis/Visual Analytics, Chap 6 Marks/Channels 1 CGGM Lab., CS

Covert and Side Channels Robert Brotzman-Smith Covert Channels Any communication channel that

1 Trace Cache Summary of Reducing Cache Hit Time Trace: a dynamic sequence of Small and simple

Lecture 15: Query Processing & Indexes Monday, March 23, 2015 Where we are Annotated

Instruc=on Set Architecture 2 Schedule Today Closer look

multi-byte values in memory Store across contiguous byte locations. 64-bit Words Bytes Address

Embedded C for Zynq C r i s t i a n S i s t e r n a U n i v e r s i d a d N a c i o n a l

CS 126 Lecture A2: TOY Programming Outline Review and Introduction Data representation

Multi-Indexed Files : Outline ! Introduction ! Inverted Files ! Multilist Files rasitjutrakul

Operating Systems Operating Systems CMPSC 473 CMPSC 473 File System Implementation File System

Non-transient Side Channels Mengjia Yan Fall 2020 6.888 - PowerPoint PPT Presentation

Non-transient Side Channels Mengjia Yan Fall 2020 6.888 L5-Non-transient Side Channels 1 Lab Assignment Handout on course website Each (regular) student will receive an email Solo or 2-person group Individual GitHub repo

Non-transient Side Channels Mengjia Yan Fall 2020 6.888 L5-Non-transient Side Channels 1 Lab

Outline Side and covert channels Transient execution CSci 5271 Introduction to Computer

Side Channels and Covert Channels Daniel Bosk Department of Information and Communication Systems

Multiple Input and Output Channels Multiple Input and Output Channels Multiple Input Channels In

FUCHSIA: Data-Driven Debugging for Functional Side Channels Saeid Tizpaz-Niari* , Pavol Cerny,

Transient Fault Detection and Reducing Transient Error Rate Jose Lugo-Martinez CSE 240C:

Transient Test Reactors Dr. Daniel M. Wachs National Technical Lead for Transient Testing Idaho

SCHISM numerical formulation Joseph Zhang Horizontal grid: hybrid 2 3 Side 2 4 3 Side 1

Screaming Ch Channels When Electromagnetic Side Channels Meet Radio Transceivers Giovanni

Transient Side Channels Mengjia Yan Fall 2020 Based on slides from Christopher W. Fletcher

Transient Side Channels Mengjia Yan Fall 2020 Based on slides from Christopher W. Fletcher

Outline Transient execution covert channels (contd) OS trust and assurance CSci 5271

Lecture 7 Space- Multiplexed Channel Electrical Channels-2 Types of Electrical Channels

Other Communication Channels Select channels of communication that will reach your audiences.

Chapter 6 Marks and Channels Vis/Visual Analytics, Chap 6 Marks/Channels 1 CGGM Lab., CS

Covert and Side Channels Robert Brotzman-Smith Covert Channels Any communication channel that

1 Trace Cache Summary of Reducing Cache Hit Time Trace: a dynamic sequence of Small and simple

Lecture 15: Query Processing &amp; Indexes Monday, March 23, 2015 Where we are Annotated

Instruc=on Set Architecture 2 Schedule Today Closer look

multi-byte values in memory Store across contiguous byte locations. 64-bit Words Bytes Address

Embedded C for Zynq C r i s t i a n S i s t e r n a U n i v e r s i d a d N a c i o n a l

CS 126 Lecture A2: TOY Programming Outline Review and Introduction Data representation

Multi-Indexed Files : Outline ! Introduction ! Inverted Files ! Multilist Files rasitjutrakul

Operating Systems Operating Systems CMPSC 473 CMPSC 473 File System Implementation File System

Lecture 15: Query Processing & Indexes Monday, March 23, 2015 Where we are Annotated