Breaking Kernel Address Space Layout Randomization (KASLR) with Intel TSX
Yeongjin Jang, Sangho Lee, and Taesoo Kim
Georgia Institute of Technology


  1. Timing Side Channel (M/U)
     • For Mapped / Unmapped addresses
     • Measured performance counters (over 1,000,000 probes)

     Perf. Counter      Mapped Page   Unmapped Page   Description
     dTLB-loads         3,021,847     3,020,243
     dTLB-load-misses   84            2,000,086       TLB miss on U
     Observed Timing    209 (fast)    240 (slow)

     • The dTLB hits on mapped pages, but not on unmapped pages.
     • The timing channel is generated by dTLB hit/miss.
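The gap between the two timings is what an attacker would threshold on. The sketch below is only an illustration of that idea, not the authors' code: the function name, the midpoint threshold, and the use of a median are assumptions, and the cycle counts are the slide's example values for one machine.

```python
from statistics import median

# Cycle counts reported on the slide (hardware-specific; treat as examples).
MAPPED_CYCLES = 209
UNMAPPED_CYCLES = 240
# Hypothetical decision threshold: midpoint between the two reported timings.
THRESHOLD = (MAPPED_CYCLES + UNMAPPED_CYCLES) / 2


def classify_mapping(samples):
    """Label an address 'mapped' or 'unmapped' from repeated timing samples."""
    # The median damps outliers such as an interrupt landing during a probe.
    return "mapped" if median(samples) < THRESHOLD else "unmapped"


print(classify_mapping([211, 208, 209, 350, 210]))
print(classify_mapping([241, 239, 240, 238, 600]))
```

On real hardware the threshold would have to be calibrated per machine rather than taken from these numbers.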

  7. Path for an Unmapped Page
     Probing an unmapped page took 240 cycles.
     [Diagram: dTLB beside the page table hierarchy PML4 -> PML3 -> PML2 -> PML1 -> PTE]
     • Kernel address access
     • TLB miss
     • Page fault!
     • Always does a page table walk (slow)

  12. Path for a Mapped Page
     On the first access, 240 cycles.
     [Diagram: dTLB beside the page table hierarchy PML4 -> PML3 -> PML2 -> PML1 -> PTE]
     • Kernel address access
     • TLB miss
     • Cache TLB entry! (the PTE is stored in the dTLB)
     • Page fault!

  16. Path for a Mapped Page
     On the second access, 209 cycles.
     [Diagram: dTLB (now holding the PTE) beside the page table hierarchy PML4 -> PML3 -> PML2 -> PML1 -> PTE]
     • Kernel address access
     • dTLB hit
     • Page fault!
     • No page table walk on the second access (fast)
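The two mapped-page paths and the unmapped path can be summarized in a toy model. This is a simplified sketch under stated assumptions: `ToyDTLB` and its fields are hypothetical names, the model ignores eviction and other cache levels, and the cycle costs are simply the slide's measurements.

```python
class ToyDTLB:
    """Toy model: a translation is cached only if the page is mapped."""

    WALK_CYCLES = 240  # slide value for a probe that needs a page table walk
    HIT_CYCLES = 209   # slide value for a probe served by the dTLB

    def __init__(self, mapped_pages):
        self.mapped_pages = set(mapped_pages)
        self.cached = set()

    def probe(self, page):
        """Return the modeled cycle cost of one faulting probe of `page`."""
        if page in self.cached:
            return self.HIT_CYCLES
        if page in self.mapped_pages:
            self.cached.add(page)  # the walk found a PTE, so it gets cached
        return self.WALK_CYCLES


tlb = ToyDTLB(mapped_pages={0x1000})
print([tlb.probe(0x1000) for _ in range(3)])  # mapped page
print([tlb.probe(0x2000) for _ in range(3)])  # unmapped page
```

In the model, only the mapped page ever becomes fast, which is the asymmetry the probe exploits.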

  17. Timing Side Channel (X/NX)
     • For Executable / Non-executable addresses
     • Measured performance counters (over 1,000,000 probes)

     Perf. Counter      Exec Page    Non-exec Page   Unmapped Page
     iTLB-loads (hit)   590          1,000,247       272
     iTLB-load-misses   31           12              1,000,175
     Observed Timing    181 (fast)   226 (slow)      226 (slow)

     • Point #1: the iTLB hits on Non-exec pages, yet the probe is slow (226). Why?
     • The iTLB is not the origin of the side channel.

  19. Timing Side Channel (X/NX)
     • For Executable / Non-executable addresses
     • Measured performance counters (over 1,000,000 probes)

     Perf. Counter      Exec Page    Non-exec Page   Unmapped Page
     iTLB-loads (hit)   590          1,000,247       272
     iTLB-load-misses   31           12              1,000,175
     Observed Timing    181 (fast)   226 (slow)      226 (slow)

     • Point #2: the iTLB does not even hit on Exec pages, while NX pages do hit the iTLB.
     • The iTLB is not involved in the fast path.
     • Is there any cache that does not require address translation?

  20. Intel Cache Architecture
     [Figure from the patent US 20100138608 A1, registered by Intel Corporation]

  21. Intel Cache Architecture
     • L1 instruction cache
        • Virtually-indexed, physically-tagged cache (requires TLB access)
        • Caches actual x86/x64 opcodes
     [Figure from the patent US 20100138608 A1, registered by Intel Corporation]

  22. Intel Cache Architecture
     • Decoded i-cache
        • An instruction is decoded into micro-ops (RISC-like instructions)
        • The decoded i-cache stores micro-ops
        • Virtually-indexed, virtually-tagged cache (no TLB access)
     [Figure from the patent US 20100138608 A1, registered by Intel Corporation]

  27. Path for an Unmapped Page
     On the second access, 226 cycles.
     [Diagram: iTLB beside the page table hierarchy PML4 -> PML3 -> PML2 -> PML1 -> PTE]
     • Kernel address access
     • TLB miss
     • Page fault!
     • Always does a page table walk (slow)

  34. Path for an Executable Page
     On the first access.
     [Diagram: decoded i-cache and iTLB beside the page table hierarchy PML4 -> PML3 -> PML2 -> PML1 -> PTE]
     • Kernel address access
     • Decoded i-cache miss
     • TLB miss
     • Cache TLB entry (PTE)
     • Cache decoded instructions (uops)
     • Insufficient privilege, fault!

  38. Path for an Executable Page
     On the second access, 181 cycles.
     [Diagram: decoded i-cache (holding uops) and iTLB (holding the PTE) beside the page table hierarchy PML4 -> PML3 -> PML2 -> PML1 -> PTE]
     • Kernel address access
     • Decoded i-cache hit!
     • Insufficient privilege, fault!
     • No TLB access, no page table walk (fast)

  43. Path for a Non-executable, but Mapped Page
     On the second access, 226 cycles.
     [Diagram: decoded i-cache and iTLB (holding the PTE) beside the page table hierarchy PML4 -> PML3 -> PML2 -> PML1 -> PTE]
     • Kernel address access
     • Decoded i-cache miss
     • TLB hit
     • Page fault!
     • With no page table walk, this should be faster than the unmapped case (but it is not!)

  48. Cache Coherence and TLB
     • The TLB is not a coherent cache in the Intel architecture.
     Core 1 TLB: 0xff01->0x0010, NX
     Core 2 TLB: 0xff01->0x0010, X
     1. Core 1 sets 0xff01 as non-executable memory.
     2. Core 2 sets 0xff01 as executable memory.
        No coherency: the TLB in Core 1 is not updated or invalidated.
     3. Core 1 tries to execute at 0xff01 -> fault by NX.
     4. Core 1 must walk through the page table.
        The page table entry is X: update the TLB, then execute!
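The four steps above can be mimicked with a toy model of a private, non-coherent TLB. This is a sketch under assumptions, not a description of real hardware behavior in detail: the `Core` class, the dict-based page table, and the walk counter are all hypothetical, and the model only captures the point that a cached NX entry forces a fresh page table walk.

```python
class Core:
    """Toy core with a private TLB that caches permission bits per page."""

    def __init__(self, page_table):
        self.page_table = page_table  # shared dict: page -> "X" or "NX"
        self.tlb = {}                 # private; other cores never update it
        self.walks = 0

    def _walk(self, page):
        """Read the page table and refresh this core's TLB entry."""
        self.walks += 1
        self.tlb[page] = self.page_table[page]
        return self.tlb[page]

    def execute(self, page):
        """Return True if execution is allowed at `page`."""
        perm = self.tlb.get(page)
        if perm is None or perm == "NX":
            # A cached NX entry may be stale, so it is confirmed with a walk.
            perm = self._walk(page)
        return perm == "X"


page_table = {0xFF01: "NX"}
core1 = Core(page_table)
print(core1.execute(0xFF01))  # walk #1: page is NX, execution refused
page_table[0xFF01] = "X"      # "core 2" flips the PTE; core 1's TLB is stale
print(core1.tlb[0xFF01])      # still the stale NX entry
print(core1.execute(0xFF01))  # walk #2 finds X and refreshes the TLB
print(core1.walks)
```

The extra walk on every NX hit is what makes the non-executable path as slow as the unmapped one in the measurements.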

  54. Path for a Non-executable, but Mapped Page
     On the second access, 226 cycles.
     [Diagram: decoded i-cache and iTLB (holding the PTE) beside the page table hierarchy PML4 -> PML3 -> PML2 -> PML1 -> PTE]
     • Kernel address access
     • Decoded i-cache miss
     • TLB hit
     • NX, cannot execute!
     • Cache TLB entry (PTE)
     • NX, page fault!

  55. Root Cause of the Timing Side Channel (X/NX)
     • For executable / non-executable addresses

     Fast Path (X)                   Slow Path (NX)                  Slow Path (U)
     1. Jmp into the kernel addr     1. Jmp into the kernel addr     1. Jmp into the kernel addr
     2. Decoded i-cache hits         2. iTLB hit                     2. iTLB miss
     3. Page fault!                  3. Protection check fails,      3. Walks through page table
                                        page table walk
                                     4. Page fault!                  4. Page fault!
     Cycles: 181                     Cycles: 226                     Cycles: 226

     • The decoded i-cache generates the timing side channel.
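The three paths in the table can be written down as a small lookup plus a threshold test. This is an illustrative sketch only: the function names and the midpoint threshold are assumptions, and the cycle counts are the slide's example values.

```python
CYCLES = {"X": 181, "NX": 226, "U": 226}  # second-access timings from the slide


def fault_path(kind):
    """Return the modeled steps for a second jump to a kernel address."""
    if kind == "X":
        return ["decoded i-cache hit", "page fault"]
    if kind == "NX":
        return ["iTLB hit", "protection check fails", "page table walk", "page fault"]
    if kind == "U":
        return ["iTLB miss", "page table walk", "page fault"]
    raise ValueError(f"unknown page kind: {kind!r}")


def looks_executable(cycles, threshold=(181 + 226) / 2):
    """Hypothetical classifier: only X pages fall below the threshold."""
    return cycles < threshold


for kind in ("X", "NX", "U"):
    print(kind, CYCLES[kind], looks_executable(CYCLES[kind]), fault_path(kind))
```

Note that this probe separates X from the other two cases but cannot tell NX from U; the mapped/unmapped probe covers that distinction.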

  56. Countermeasures?
     • Modifying the CPU to eliminate timing channels
        • Difficult to realize
     • Turning off TSX
        • Cannot be turned off in software (neither via an MSR nor via the BIOS)
     • Coarse-grained timer?
        • A workaround is to have another thread measure the timing indirectly (e.g., counting i++;)
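The counting-thread workaround mentioned in the last bullet can be sketched as follows. This is only an illustration of the idea: `CountingTimer` and `measure` are hypothetical names, and Python's interpreter lock makes the counter far coarser than a native thread spinning on `i++` would be.

```python
import threading
import time


class CountingTimer:
    """Background thread that increments a counter as a makeshift clock."""

    def __init__(self):
        self.ticks = 0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            self.ticks += 1  # the "i++" loop from the slide

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()


def measure(timer, operation):
    """Return how many ticks elapsed while `operation` ran."""
    start = timer.ticks
    operation()
    return timer.ticks - start


with CountingTimer() as timer:
    short_ticks = measure(timer, lambda: time.sleep(0.01))
    long_ticks = measure(timer, lambda: time.sleep(0.1))
print(short_ticks, long_ticks)
```

The longer operation accumulates more ticks, so relative durations remain observable even without access to a precise timer.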

  57. Countermeasures?
     • Using separate page tables for kernel and user processes
        • High performance overhead (~30%) due to frequent TLB flushes
        • TLB flush on every copy_to_user()
     • Fine-grained randomization
        • Compatibility issues with memory alignment, etc.
     • Inserting fake mapped / executable pages between the maps
        • Adds some false positives to the DrK attack
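To see why fake mappings only add false positives, consider a toy scan over candidate slots. The 8-slot layout, the slot indices, and the function name below are made up for illustration; a real scan would cover many more candidate addresses.

```python
def scan_for_kernel(slots):
    """Return indices of slots that probe as mapped (True)."""
    return [i for i, mapped in enumerate(slots) if mapped]


# Hypothetical 8-slot region; the real kernel image occupies slots 5 and 6.
real = [False, False, False, False, False, True, True, False]
# Same region after a defender inserts decoy mappings at slots 1 and 3.
with_decoys = [False, True, False, True, False, True, True, False]

print(scan_for_kernel(real))         # [5, 6]
print(scan_for_kernel(with_decoys))  # [1, 3, 5, 6]
```

The real slots still show up in the second scan, so the decoys enlarge the candidate set without hiding the kernel.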
