SLIDE 1
A COMPARISON OF SOFTWARE AND HARDWARE TECHNIQUES FOR X86 VIRTUALIZATION
by Keith Adams, Ole Agesen
Presented by Michael Wallner
May 10th, 2007, Software Systems Seminar
Department of Computer Sciences, University of Salzburg
SLIDE 2
Content
Introduction
Virtualization
  Classical
  Software
  Hardware
Comparison
Opportunities
Conclusion
SLIDE 3
Introduction and Terminology
Virtualization
  Virtual Machine (guest)
  Virtual Machine Monitor (host)
Motivation
  Resource utilization
  Development
  ...
[Figure: two VMs, each running applications (APP) on an OS kernel, on top of the VMM, which runs on the hardware]
SLIDE 4
CLASSICAL VIRTUALIZATION
SLIDE 5
Classical Virtualization
Three essential characteristics (Popek and Goldberg)
  Fidelity: runs any software
  Performance: fairly fast
  Safety: VMM manages hardware
Trap-and-Emulate
  Only real solution until recently
SLIDE 6
De-Privileging
Instructions that read or write privileged state trap
Direct execution, but at reduced privilege level
VMM intercepts traps and emulates
[Figure: privilege levels. CPL 0: Virtual Machine Monitor; CPL 1: Operating System; CPL 3: Applications]
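The de-privileging scheme above can be illustrated with a toy model. This is a minimal sketch, not VMware's implementation: the three-instruction guest ISA (`OP_ADD`, `OP_CLI`, `OP_STI`) and the names `VCpu`, `run_guest` and `vmm_emulate` are assumptions made up for illustration. Unprivileged instructions run "directly"; privileged ones trap into the VMM, which emulates them against the guest's virtual state instead of the physical CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy guest ISA (hypothetical, for illustration only). */
typedef enum { OP_ADD, OP_CLI, OP_STI } Opcode;

typedef struct {
    int  acc;      /* unprivileged state: an accumulator             */
    bool virt_if;  /* virtual interrupt flag, maintained by the VMM  */
    int  traps;    /* number of traps taken so far                   */
} VCpu;

/* In this model, only CLI and STI touch privileged state. */
static bool is_privileged(Opcode op) {
    return op == OP_CLI || op == OP_STI;
}

/* VMM trap handler: emulate the instruction on the virtual state. */
static void vmm_emulate(VCpu *v, Opcode op) {
    assert(is_privileged(op));
    v->traps++;
    v->virt_if = (op == OP_STI);
}

/* Run a guest instruction stream: direct execution plus trap-and-emulate. */
static void run_guest(VCpu *v, const Opcode *code, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (is_privileged(code[i]))
            vmm_emulate(v, code[i]);  /* trap into the VMM          */
        else
            v->acc += 1;              /* stands in for direct exec  */
    }
}
```

Running a stream with two `OP_ADD` and three privileged instructions leaves `acc` at 2 and `traps` at 3; the physical interrupt flag is never touched by the guest.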
SLIDE 7
Shadow Structures
Virtual state differs from physical state
VMM provides basic execution environment
Shadow structures
  On-CPU privileged state: maintained as image
  Off-CPU privileged data: resides in memory
SLIDE 8
Memory Traces
Use of hardware page protection mechanisms for coherency of shadow structures
Protection for memory-mapped devices
Handling a trace fault:
  Decode guest instruction
  Emulate its effect in the primary structure
  Apply change to the shadow structure
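The three steps of handling a trace fault can be sketched as follows. This is a simplified model under stated assumptions: the types `Traced` and `GuestStore` and the helper `to_shadow` are hypothetical, and the "decode" step is represented by receiving the faulting store already decoded into an index and a value.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES 8

typedef struct {
    uint32_t primary[ENTRIES];  /* guest-owned structure (write-protected) */
    uint32_t shadow[ENTRIES];   /* VMM-owned copy used by the hardware     */
    int      trace_faults;      /* number of trace faults handled          */
} Traced;

/* Hypothetical decoded form of the faulting guest store (step 1). */
typedef struct {
    unsigned index;
    uint32_t value;
} GuestStore;

/* Placeholder for deriving a shadow entry from a guest entry. A real VMM
   would, for instance, remap guest page numbers to machine page numbers. */
static uint32_t to_shadow(uint32_t guest_entry) {
    return guest_entry + 0x1000u;
}

static void handle_trace_fault(Traced *t, GuestStore st) {
    assert(st.index < ENTRIES);
    t->trace_faults++;
    t->primary[st.index] = st.value;             /* step 2: emulate on primary   */
    t->shadow[st.index]  = to_shadow(st.value);  /* step 3: keep shadow coherent */
}
```

Because every guest write to the traced structure passes through the handler, primary and shadow never drift apart; the price is one fault per write.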
SLIDE 9
Tracing Example
Use of Shadow Page Tables (SPTs) to run guest
VMware manages SPTs as cache
True page fault
  Violation of the protection policy
  Forwarded to guest
Hidden page fault
  Missing page in SPT
  No guest-visible effect
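The distinction between true and hidden page faults can be sketched as a small decision function. This is an illustrative model only: `Pte`, `classify_fault` and `handle_fault` are hypothetical names, page-table entries are reduced to two bits, and hidden faults caused by write-protection traces are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified page-table entry. */
typedef struct {
    bool present;
    bool writable;
} Pte;

typedef enum { FAULT_TRUE, FAULT_HIDDEN } FaultKind;

/* True fault: the guest's own page table forbids the access.
   Hidden fault: the guest allows it; only the shadow entry was missing. */
static FaultKind classify_fault(Pte guest, bool is_write) {
    if (!guest.present || (is_write && !guest.writable))
        return FAULT_TRUE;
    return FAULT_HIDDEN;
}

/* Returns true if the fault must be forwarded to the guest. */
static bool handle_fault(Pte guest, Pte *shadow, bool is_write) {
    assert(shadow != 0);
    if (classify_fault(guest, is_write) == FAULT_TRUE)
        return true;   /* guest-visible: forward to the guest OS      */
    *shadow = guest;   /* fill the SPT "cache"; the access is retried */
    return false;      /* no guest-visible effect                     */
}
```

A read of a page the guest has mapped read-only is a hidden fault and fills the shadow entry, whereas a write to the same page is a true fault and is forwarded.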
SLIDE 10
Refinements
Flexibility in VMM/guest OS interface
  Modify guest OS ("paravirtualization")
  Performance gains
  Extended features
Flexibility in VMM/hardware interface
  Hardware execution mode for guest OS
SLIDE 11
SOFTWARE VIRTUALIZATION
SLIDE 12
x86 Obstacles
Visibility of privileged state
Lack of traps at user level
Example: unprivileged popf
  Privileged level: changes ALU and system flags
  De-privileged level: changes ALU flags only
  No trap at de-privileged level
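The popf problem can be made concrete with a small model of the flags register. This is a sketch under simplifying assumptions: only three flags are modeled, `popf_model` is a hypothetical helper, and the rule that IOPL itself can only be changed at CPL 0 is left out. The bit positions of CF, ZF and IF are those of the x86 EFLAGS register.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_CF (1u << 0)  /* carry flag: an ALU flag               */
#define FLAG_ZF (1u << 6)  /* zero flag: an ALU flag                */
#define FLAG_IF (1u << 9)  /* interrupt-enable flag: a system flag  */

/* Simplified popf: returns the new flags value.
   ALU flags are always loaded from the popped value. IF is loaded only
   if cpl <= iopl; otherwise it is silently preserved, with no trap. */
static uint32_t popf_model(uint32_t old_flags, uint32_t popped,
                           int cpl, int iopl) {
    assert(cpl >= 0 && cpl <= 3 && iopl >= 0 && iopl <= 3);
    const uint32_t alu_mask = FLAG_CF | FLAG_ZF;
    uint32_t result = (old_flags & ~alu_mask) | (popped & alu_mask);
    if (cpl <= iopl)
        result = (result & ~FLAG_IF) | (popped & FLAG_IF);
    return result;
}
```

With IF set beforehand and a popped value that clears it, the call at CPL 0 clears IF, while the same call at CPL 1 or CPL 3 (IOPL 0) leaves IF set. A de-privileged guest kernel would therefore believe it had disabled interrupts, and the VMM never gets a trap to correct this.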
SLIDE 13
Simple Binary Translation: Interpreter
Use of an interpreter
  Prevents leakage of privileged state
  Correct implementation of non-trapping instructions
  Separation of virtual state from physical state
Fails the performance criterion
SLIDE 14
Simple Binary Translation
Binary translation combines the interpreter's semantic precision with high performance
VMware's translator has these properties:
  Binary
  Dynamic
  On demand
  System level
  Subsetting
  Adaptive
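The "dynamic" and "on demand" properties can be sketched with a toy translation cache: code is translated only when execution first reaches it, and later visits reuse the cached result. This is an assumption-laden illustration, not VMware's data structure: `TranslationCache` and `tc_lookup` are hypothetical names, the cache is a small fixed array searched linearly, and the actual translation is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

#define TC_SLOTS 16

typedef struct {
    unsigned guest_pc[TC_SLOTS];  /* guest start address of each translation */
    int      used[TC_SLOTS];      /* 1 if the slot holds a translation       */
    int      translations;        /* how often the translator actually ran   */
} TranslationCache;

/* Returns the slot holding the translation for guest_pc, translating on a
   miss. Returns -1 if the cache is full. */
static int tc_lookup(TranslationCache *tc, unsigned guest_pc) {
    assert(tc != NULL);
    for (int i = 0; i < TC_SLOTS; i++)
        if (tc->used[i] && tc->guest_pc[i] == guest_pc)
            return i;               /* hit: reuse the existing translation */
    for (int i = 0; i < TC_SLOTS; i++) {
        if (!tc->used[i]) {
            tc->used[i] = 1;
            tc->guest_pc[i] = guest_pc;
            tc->translations++;     /* miss: translate on demand */
            return i;
        }
    }
    return -1;
}
```

Looking up the same guest address twice runs the translator once, which is what lets hot loops amortize the translation cost.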
SLIDE 15
Simple Binary Translation: Example
Simple prime validation
Invoke isPrime(49)

    int isPrime(int a) {
        for (int i = 2; i < a; i++) {
            if (a % i == 0) return 0;
        }
        return 1;
    }
SLIDE 16
Simple Binary Translation: Example
Pipeline: Translation Unit -> IR -> Compiled Code Fragment

Guest code:
    isPrime:  mov  %ecx, %edi    ; %ecx = %edi (a)
              mov  %esi, $2      ; i = 2
              cmp  %esi, %ecx    ; is i >= a?
              jge  prime         ; jump if yes
    nexti:    mov  %eax, %ecx    ; %eax = a
              cdq                ; sign-extend
              idiv %esi          ; a % i
              test %edx, %edx    ; is remainder zero?
              jz   notPrime      ; jump if yes
              inc  %esi          ; i++
              cmp  %esi, %ecx    ; is i >= a?
              jl   nexti         ; jump if no
    prime:    mov  %eax, $1      ; return value in %eax
              ret
    notPrime: xor  %eax, %eax    ; %eax = 0
              ret

Compiled Code Fragment:
    isPrime': mov  %ecx, %edi
              mov  %esi, $2
              cmp  %esi, %ecx
              jge  [takenAddr]
              jmp  [fallthrAddr]

Compiled Code Fragment:
    nexti':   mov  %eax, %ecx
              cdq
              idiv %esi
              test %edx, %edx
              jz   notPrime'
              jmp  [fallthrAddr]
SLIDE 17
Simple Binary Translation: Example
Translation of the isPrime guest code from the previous slide after executing isPrime(49).
* marks the first instruction of each compiled code fragment.

    isPrime':  *mov  %ecx, %edi
                mov  %esi, $2
                cmp  %esi, %ecx
                jge  [takenAddr]
    nexti':    *mov  %eax, %ecx
                cdq
                idiv %esi
                test %edx, %edx
                jz   notPrime'
               *inc  %esi
                cmp  %esi, %ecx
                jl   nexti'
                jmp  [fallthrAddr3]
    notPrime': *xor  %eax, %eax
                pop  %r11                      ; RET
                mov  %gs:0xff39eb8(%rip), %rcx
                movzx %ecx, %r11b
                jmp  %gs:0xfc7dde0(8*%rcx)
SLIDE 18
Simple Binary Translation: Exceptions
Most instructions are translated identically; the exceptions are:
PC-relative addressing
  Translator output resides at a different location
Direct control flow
  Code layout changes, so branches need reconnection
Indirect control flow
  Dynamically computed targets
Privileged instructions
SLIDE 19
Comparison
Cost of emulating rdtsc (in cycles):
  Trap-and-emulate:     2030
  Callout-and-emulate:  1254
  In-TC emulation:       216
SLIDE 20
HARDWARE VIRTUALIZATION
SLIDE 21
x86 Architecture Extensions
Allows classical trap-and-emulate
Virtual Machine Control Block (VMCB)
  Diagnostic fields
Guest mode (VMX) vs. host mode
  vmrun instruction
SLIDE 22
Qualitative Comparison
Binary translator wins on:
  Trap elimination
  Emulation speed
  Callout avoidance
Hardware-assisted VMM wins on:
  Code density
  Precise exceptions
  System calls
SLIDE 23
EXPERIMENTS AND RESULTS
SLIDE 24
Initial Measurements
[Chart: SPECint 2000 and SPECjbb 2005; y-axis: % of native, 0% to 120% (higher is better); series: Software, Hardware]
SLIDE 25
Macrobenchmarks
[Chart: y-axis: % of native, 0% to 100% (higher is better); series: Software, Hardware]
SLIDE 26
Cost of Operations
[Chart: CPU cycles per operation, logarithmic scale from 0.1 to 100000 (smaller is better); series: native, software, hardware]
SLIDE 27
Opportunities
Faster microarchitecture implementations
Hardware VMM algorithms
Hybrid VMM
Hardware MMU
SLIDE 28
Conclusions
First generation of hardware support
  Permits trap-and-emulate
No real performance gain over software techniques
  Only at system calls
New MMU algorithms could help
SLIDE 29
A COMPARISON OF SOFTWARE AND HARDWARE TECHNIQUES FOR X86 VIRTUALIZATION
Michael Wallner