SLIDE 1

KVM Live Migration Optimization

Li, Liang; Zhang, Yang
Aug 2015

SLIDE 2

Agenda

  • Background
  • Problems
  • Solutions
  • Performance
  • Work in progress
SLIDE 3

Background

  • Live migration usage in cloud computing

Facilitate maintenance

Load balancing

Energy saving

  • Goals

Reduce total live migration time

Reduce VM down time

Improve the migration success ratio

  • Existing optimizations

RDMA

XBZRLE

Auto convergence

SLIDE 4

Problems

  • Network bandwidth could be the bottleneck

Network is usually shared.

1 Gbps networks are still widely used.

Geographic migration

  • Inefficient data processing in the RAM bulk stage

Unused pages can be skipped

Free pages can be skipped

The transmission of zero pages can be skipped

  • Time-consuming operations in the pause-and-copy stage

migration_end

blk_mig_cleanup

SLIDE 5

Solutions

  • Multiple thread (de)compression
  • Skip the unused pages in the RAM bulk stage
  • Delay the non-emergency operations
SLIDE 6

Multiple thread (de)compression

  • Time spent in different stages

Most of the time is spent sending data if the network bandwidth is low.

  • Time spent in different stages when using compression

Compression helps reduce the data traffic and decreases the time spent sending data.

Compression takes extra time.

Multiple threads are used to accelerate the (de)compression process.

[Pipeline diagrams: (1) original path: get dirty page → zero page check → send data; (2) single-thread compression: get dirty page → zero page check → compress page → send compressed page; (3) optimized: get dirty page → zero page check → compress page with multiple threads → send compressed page.]
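
As a concrete illustration of the compressed pipeline above, here is a minimal standalone C sketch (not QEMU's actual code) that does the zero page check and then compresses one 4 KiB page with zlib at level 1, the level used in the tests later in this deck. Build with: cc pipeline.c -lz

    #include <stdio.h>
    #include <string.h>
    #include <zlib.h>

    #define PAGE_SIZE 4096

    /* Zero page check: scan the whole page for a non-zero byte. */
    static int is_zero_page(const unsigned char *page)
    {
        for (int i = 0; i < PAGE_SIZE; i++) {
            if (page[i] != 0) {
                return 0;
            }
        }
        return 1;
    }

    int main(void)
    {
        unsigned char page[PAGE_SIZE] = { 0 };
        unsigned char out[compressBound(PAGE_SIZE)];   /* worst-case output */
        uLongf out_len = sizeof(out);

        memset(page, 0xAB, PAGE_SIZE / 2);             /* fake dirty data */

        if (is_zero_page(page)) {
            puts("zero page: only the page header needs to be sent");
            return 0;
        }
        /* Compression level 1 favours speed over ratio. */
        if (compress2(out, &out_len, page, PAGE_SIZE, 1) != Z_OK) {
            return 1;
        }
        printf("compressed %d -> %lu bytes\n", PAGE_SIZE, (unsigned long)out_len);
        return 0;
    }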

SLIDE 7

Multiple thread (de)compression

  • Multiple thread (de)compression is a new live migration feature

Instead of sending the guest memory directly, this solution compresses each RAM page before sending it.

It has been merged into QEMU 2.4.0.

  • Relationship between multiple thread (de)compression and XBZRLE

Both aim to reduce the data traffic on the network.

XBZRLE compresses the page updates.

Multiple thread (de)compression compresses the original page.

Multiple thread (de)compression can transfer compressed data in the RAM bulk stage; XBZRLE cannot.

In theory, combining multiple thread (de)compression with XBZRLE minimizes the data traffic.

When combined with XBZRLE, multiple thread (de)compression only takes effect in the RAM bulk stage.
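
For reference, the feature is typically enabled through the QEMU monitor before starting the migration. The commands below follow QEMU's multi-thread-compression documentation for the 2.4 era; exact parameter names should be checked against your QEMU version:

    (qemu) migrate_set_capability compress on
    (qemu) migrate_set_parameter compress_threads 8
    (qemu) migrate_set_parameter compress_level 1
    (qemu) migrate_set_parameter decompress_threads 2
    (qemu) migrate -d tcp:<dest-ip>:4444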

SLIDE 8

Multiple thread (de)compression details

[Diagram: the migration thread notifies a compression thread to start, gets the page info, waits for compression to finish if all compression threads are busy, puts the compressed data into the send buffer, and sends the data when the buffer is full; each compression thread waits to start, does the compression, and notifies the migration thread when done.]

The relationship between the migration thread and the compression threads.
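
A minimal POSIX-threads sketch of this handshake is shown below. It mirrors the diagram (workers wait to start, compress, then notify; the migration thread blocks while all compression threads are busy), but all names are illustrative and this is not QEMU's implementation. Build with: cc handoff.c -lpthread -lz

    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>
    #include <zlib.h>

    #define NTHREADS  2        /* compression threads */
    #define PAGE_SIZE 4096
    #define NPAGES    8

    typedef struct {
        pthread_t tid;
        unsigned char page[PAGE_SIZE];
        int busy;              /* 1 while the worker owns a page */
        int quit;
    } Worker;

    static Worker workers[NTHREADS];
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t page_ready = PTHREAD_COND_INITIALIZER; /* to workers */
    static pthread_cond_t page_done  = PTHREAD_COND_INITIALIZER; /* to migration */

    static void *worker_main(void *arg)
    {
        Worker *w = arg;
        unsigned char out[compressBound(PAGE_SIZE)];

        pthread_mutex_lock(&lock);
        for (;;) {
            while (!w->busy && !w->quit) {          /* wait to start */
                pthread_cond_wait(&page_ready, &lock);
            }
            if (w->quit) {
                break;
            }
            pthread_mutex_unlock(&lock);
            uLongf out_len = sizeof(out);           /* do compression */
            compress2(out, &out_len, w->page, PAGE_SIZE, 1);
            pthread_mutex_lock(&lock);
            w->busy = 0;                            /* notify migration thread */
            pthread_cond_signal(&page_done);
        }
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    /* Migration thread side: hand a page to an idle worker, blocking
     * while all compression threads are busy. */
    static void queue_page(const unsigned char *page)
    {
        pthread_mutex_lock(&lock);
        for (;;) {
            for (int i = 0; i < NTHREADS; i++) {
                if (!workers[i].busy) {
                    memcpy(workers[i].page, page, PAGE_SIZE);
                    workers[i].busy = 1;
                    pthread_cond_broadcast(&page_ready);
                    pthread_mutex_unlock(&lock);
                    return;
                }
            }
            pthread_cond_wait(&page_done, &lock);   /* all workers busy */
        }
    }

    int main(void)
    {
        unsigned char page[PAGE_SIZE];
        memset(page, 0x5A, sizeof(page));

        for (int i = 0; i < NTHREADS; i++) {
            pthread_create(&workers[i].tid, NULL, worker_main, &workers[i]);
        }
        for (int n = 0; n < NPAGES; n++) {
            queue_page(page);
        }
        pthread_mutex_lock(&lock);                  /* drain, then shut down */
        for (int i = 0; i < NTHREADS; i++) {
            while (workers[i].busy) {
                pthread_cond_wait(&page_done, &lock);
            }
            workers[i].quit = 1;
        }
        pthread_cond_broadcast(&page_ready);
        pthread_mutex_unlock(&lock);
        for (int i = 0; i < NTHREADS; i++) {
            pthread_join(workers[i].tid, NULL);
        }
        return 0;
    }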

SLIDE 9

Multiple thread (de)compression details

[Diagram: guest RAM pages are compressed and a header + compressed page is copied into the QEMUFile buffer.]

  • About data copy

A data copy happens when putting the compressed page into the QEMUFile buffer.

  • About page sequence

Within a RAM block, the order of the pages does not matter.

When a new block begins, all pages belonging to the previous block must be sent out first.
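
The ordering rule can be sketched as follows; flush_compressed_pages and the block names are hypothetical stand-ins for what in QEMU is draining the compression threads and the send buffer:

    #include <stdio.h>

    static const char *last_block;

    static void flush_compressed_pages(void)
    {
        /* In QEMU this would wait for all compression threads to finish
         * and drain the send buffer; here it is just a marker. */
        printf("  flush pending compressed pages\n");
    }

    static void send_page(const char *block, long offset)
    {
        if (block != last_block) {      /* migration moved to a new block */
            if (last_block) {
                flush_compressed_pages();
            }
            last_block = block;
        }
        printf("queue %s+0x%lx for compression\n", block, offset);
    }

    int main(void)
    {
        send_page("pc.ram", 0x0000);
        send_page("pc.ram", 0x2000);
        send_page("vga.vram", 0x0000);  /* previous block is flushed first */
        return 0;
    }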

SLIDE 10

Offload the overhead from the CPU

  • About the CPU usage

760% on the source side (zlib, 8 compression threads).

Roughly 50% on the source side when using the original implementation.

  • Solutions

Use a faster compression algorithm, such as QuickLZ or LZ4.

Use a hardware (de)compression accelerator to offload the overhead from the CPU. CPU usage can be reduced further by using the accelerator's asynchronous mode.

                            Zlib, 8 threads   LZ4, 8 threads   No compression
CPU usage                   760%              108%             51%
Total migration time (sec)  20                20               34

             Zlib   Zlib with hardware accelerator
CPU usage    760%   150%
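
To illustrate the first option, LZ4's one-shot API can stand in for zlib's compress2 on the hot path. The standalone sketch below (not QEMU code) compresses one 4 KiB page. Build with: cc lz4page.c -llz4

    #include <stdio.h>
    #include <string.h>
    #include <lz4.h>

    #define PAGE_SIZE 4096

    int main(void)
    {
        char page[PAGE_SIZE];
        char out[LZ4_COMPRESSBOUND(PAGE_SIZE)];  /* worst-case output */
        memset(page, 0x5A, sizeof(page));

        /* One-shot, speed-oriented compression of a single page. */
        int out_len = LZ4_compress_default(page, out, PAGE_SIZE, sizeof(out));
        if (out_len <= 0) {
            return 1;
        }
        printf("LZ4: %d -> %d bytes\n", PAGE_SIZE, out_len);
        return 0;
    }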

SLIDE 11

Skip unused pages in the RAM bulk stage

  • Inefficient data processing in the RAM bulk stage

Unused pages can be skipped.

Marking all pages as dirty causes needless data processing.

  • How to

Use a dirty page bitmap that contains only the used pages.

Start dirty logging before the VM starts running.

[Diagram: the current migration dirty page bitmap marks every page, used or unused; the optimized bitmap marks only the used pages.]
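
A toy C sketch of the optimized bitmap idea: only used pages get a bit set, so the bulk stage walks the set bits instead of scanning every page. It uses a GCC/Clang builtin for the bit scan and is illustrative only:

    #include <stdio.h>
    #include <stdint.h>

    #define NPAGES 64   /* toy guest with 64 pages, one bit per page */

    static uint64_t migration_bitmap;   /* bit = 1 means the page is used */

    static void mark_page_used(int page)
    {
        migration_bitmap |= UINT64_C(1) << page;
    }

    int main(void)
    {
        /* The guest actually uses only a few pages ... */
        mark_page_used(0);
        mark_page_used(3);
        mark_page_used(42);

        /* ... so the bulk stage visits only the set bits instead of
         * processing all NPAGES pages. */
        for (uint64_t bm = migration_bitmap; bm; bm &= bm - 1) {
            printf("send page %d\n", __builtin_ctzll(bm));
        }
        return 0;
    }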

SLIDE 12

Delay the non-emergency operations

  • Do the clean-up operations after the data transfer completes

Delay migration_end.

Delay blk_mig_cleanup.

SLIDE 13

Performance

  • Performance of multiple thread (de)compression

Settings: no speed limit, 8 compression threads, 2 decompression threads, compression level 1, 1 Gbps NIC, 4 GB guest RAM.

Idle guest (zlib):

                      Original way   Multi-thread (de)compression
Total time (msec)     3333           1833 (↓45%)
Downtime (msec)       100            27 (↓73%)
Transferred RAM (kB)  363536         107819 (↓70%)
Total RAM (kB)        4211524        4211524

Guest with a workload that periodically writes random numbers to a 1 GB area of memory (zlib):

                      Original way   Multi-thread (de)compression
Total time (msec)     37369          15989 (↓57%)
Downtime (msec)       337            173 (↓48%)
Transferred RAM (kB)  4274143        1699824 (↓60%)
Total RAM (kB)        4211524        4211524

SLIDE 14

Performance

  • Performance comparison between multiple thread (de)compression and XBZRLE

Migrating a guest with a workload that writes random numbers to memory; LZ4 is used for the (de)compression.

                      Original way   Multi-thread (de)compression   XBZRLE    Multi-thread (de)compression & XBZRLE
Total time (msec)     26746          14490                          17590     13522
Downtime (msec)       35             64                             185       167
Transferred RAM (kB)  3354024        1784685                        2131286   1605739
Total RAM (kB)        8405576        8405576                        8405576   8405576

SLIDE 15

Performance

  • Performance of skipping unused pages in the RAM bulk stage

Idle guest, 10 Gbps NIC.

                      Before optimization   After optimization
Total time (ms)       1386                  483 (↓65%)
Transferred RAM (KB)  446542                428300
Total RAM (KB)        8405576               8405576

SLIDE 16

Performance

  • Performance of delaying the clean-up operations

Test based on QEMU 2.4.0 + Linux kernel 4.2-rc6, idle guest; max downtime set to 0.01 s.

                Before optimization   After optimization
Downtime (ms)   38                    6 (↓84%)
Total RAM (KB)  8405576               8405576

SLIDE 17

Work in progress

  • Improve the performance of multi-thread (de)compression in a 10G network environment.

With multi-thread (de)compression enabled, the performance is currently worse.

  • Improve the performance of the hardware compression accelerator.

Using the asynchronous mode instead of the synchronous mode.

  • Using AVX instructions to accelerate zero page checking (see the sketch after this list).
  • User space network stack

Live migration based on DPDK & mTCP

  • Live migration performance optimization for 40 Gbps networks
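
For the AVX zero-page-check item above, one possible shape of the checker is sketched below: OR the page together 32 bytes at a time with AVX2 and test the accumulator. This is a hypothetical sketch, not QEMU's implementation. Build with: cc -mavx2 zerocheck.c

    #include <immintrin.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define PAGE_SIZE 4096

    static bool page_is_zero_avx2(const void *page)
    {
        const __m256i *p = page;
        __m256i acc = _mm256_setzero_si256();

        /* OR the whole page together, 32 bytes at a time. */
        for (size_t i = 0; i < PAGE_SIZE / sizeof(__m256i); i++) {
            acc = _mm256_or_si256(acc, _mm256_loadu_si256(&p[i]));
        }
        /* acc is all-zero iff every byte of the page was zero. */
        return _mm256_testz_si256(acc, acc);
    }

    int main(void)
    {
        static unsigned char page[PAGE_SIZE];   /* zero-initialized */
        printf("zero? %d\n", page_is_zero_avx2(page));
        page[123] = 1;
        printf("zero? %d\n", page_is_zero_avx2(page));
        return 0;
    }
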
SLIDE 18

Q&A?

Thank You