GPU on OpenStack
Masafumi Ohta @masafumiohta
Who am I > Working for a system integrator as a pre-sales engineer. Working on some OpenStack PoC projects: proposing an OpenStack system to a manufacturer and investigating OpenStack issues.
Why GPUs? Many cores. Some calculations run better on many small, low-speed cores than on a few large, fast ones. The low power consumption of GPUs is great for HPC end users, and the compact systems they enable are very good for us running Japanese HPC systems.
GPUs can be used via 'PCI passthrough' or GPGPU Docker; AWS presumably does the same. PCI passthrough depends on KVM. Only vSphere can split a GPU's cores across VMs. GPGPU Docker shares the GPU among containers but does not split it, and Windows cannot run as a Docker container. Can we split a GPU across VMs on KVM the way vSphere does? No: we can only attach a whole GPU unit to a VM.
With PCI passthrough, PCI devices connect directly to the VM through the Linux host, so the devices must first be detached from the physical host. This depends on KVM, not on OpenStack. One device goes to one VM: a GPU cannot be shared, nor its cores split, among VMs. That is a limitation of KVM, not OpenStack.
Figure 1: How GPU passthrough works on OpenStack (diagram: GPU cards on PCI Express x16 pass through IOMMU/VT-d and VMM/KVM on the Linux host into a Linux/Windows guest running the GPU driver and application; Nova API, Nova Scheduler, AMQP, and Nova Compute handle placement).
First, check the GPU on the KVM host with lspci:
lspci -nn | grep -i nvidia
88:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
88:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
All functions of the GPU card must be passed through: not only the GPU itself but also its HDMI audio device. Otherwise it does not work in the VM (the passthrough is incomplete).
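As a sanity check, the vendor:device IDs that must be passed through can be pulled out of the lspci output mechanically. A minimal sketch, using the sample output saved from the lspci run above:

```shell
# Extract every NVIDIA [vendor:device] pair from lspci output, so the
# VGA function and the HDMI audio function are passed through together.
lspci_out='88:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
88:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)'
echo "$lspci_out" | grep -o '\[10de:[0-9a-f]*\]' | tr -d '[]'
```

On a live host, replace the saved string with `lspci -nn | grep -i nvidia` and feed the resulting IDs to pci-stub.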
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1"
pci-stub keeps physical PCI devices unused by the Linux host. It is not loaded by default, so list it and the related modules (vfio, kvm) in /etc/modules:
pci_stub
vfio
vfio_iommu_type1
vfio_pci
kvm
kvm_intel
echo 'pci_stub ids=10de:11b4,10de:0e0a' >> /etc/initramfs-tools/modules
sudo update-initramfs -u && sudo reboot
blacklist nvidia blacklist nvidia-uvm
blacklist nouveau
Next, have pci-stub 'unbind from the physical host and bind to the VM': register the passed-through device IDs via new_id, unbind the devices from their host driver, and bind them to pci-stub.
echo 10de 11b4 > /sys/bus/pci/drivers/pci-stub/new_id
echo 10de 0e0a > /sys/bus/pci/drivers/pci-stub/new_id
echo 0000:88:00.0 > /sys/bus/pci/devices/0000:88:00.0/driver/unbind
echo 0000:88:00.1 > /sys/bus/pci/devices/0000:88:00.1/driver/unbind
echo 0000:88:00.0 > /sys/bus/pci/drivers/pci-stub/bind
echo 0000:88:00.1 > /sys/bus/pci/drivers/pci-stub/bind
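With several cards these echo lines multiply quickly, so they can be generated instead of hand-typed. A dry-run sketch (the function name is made up for illustration; it prints the commands rather than writing to sysfs, so nothing here touches real hardware):

```shell
# stub_cmds: print the unbind-from-host / bind-to-pci-stub command pair
# for each PCI address given (dry run -- pipe to sh to actually apply).
stub_cmds() {
  for addr in "$@"; do
    echo "echo 0000:$addr > /sys/bus/pci/devices/0000:$addr/driver/unbind"
    echo "echo 0000:$addr > /sys/bus/pci/drivers/pci-stub/bind"
  done
}
stub_cmds 88:00.0 88:00.1
```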
While booting, check that each device was claimed by the stub, i.e. removed from the physical machine:
pci-stub 0000:88:00.1: claimed by stub
Figure 2: GPU blacklist process while booting (Ubuntu case). The diagram shows the boot chain: UEFI/BIOS (VT-d) → GRUB (/etc/default/grub) → initramfs (/etc/initramfs-tools/modules) → modules (/etc/modules) → modprobe blacklist (/etc/modprobe.d/blacklist.conf) → pci-stub (/sys/bus/pci/drivers/pci-stub/, /sys/bus/pci/devices/$(Identifier)/driver/unbind), with the IOMMU and the blacklist applied along the way.
Unbind the GPU units (all their devices) from the physical host and bind them to pci-stub for use by the VM:
echo 10de 11b4 > /sys/bus/pci/drivers/pci-stub/new_id
echo 10de 0e0a > /sys/bus/pci/drivers/pci-stub/new_id
echo 0000:88:00.0 > /sys/bus/pci/devices/0000:88:00.0/driver/unbind
echo 0000:88:00.1 > /sys/bus/pci/devices/0000:88:00.1/driver/unbind
echo 0000:88:00.0 > /sys/bus/pci/drivers/pci-stub/bind
echo 0000:88:00.1 > /sys/bus/pci/drivers/pci-stub/bind
lspci -nn | grep -i nvidia
88:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
88:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
84:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
84:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
echo 0000:84:00.0 > /sys/bus/pci/devices/0000:84:00.0/driver/unbind
echo 0000:84:00.1 > /sys/bus/pci/devices/0000:84:00.1/driver/unbind
echo 0000:84:00.0 > /sys/bus/pci/drivers/pci-stub/bind
echo 0000:84:00.1 > /sys/bus/pci/drivers/pci-stub/bind
./nbody -benchmark -numdevices=2 -numbodies=65536
ubuntu@guestos$ lspci -nn | grep -i nvidia
00:07.0 VGA compatible controller [0300]: NVIDIA Corporation GK104GL [Quadro K4200] [10de:11b4] (rev a1)
00:08.0 VGA compatible controller [0300]: NVIDIA Corporation GK104GL [Quadro K4200] [10de:11b4] (rev a1)
On compute nodes, whitelist the GPU and define an alias for it in /etc/nova/nova.conf:
pci_passthrough_whitelist={"name":"K4200","vendor_id":"10de","product_id":"11b4"}
pci_alias={"name":"K4200","vendor_id":"10de","product_id":"11b4"}
Also, on controller nodes we should add the PCI passthrough filter to /etc/nova/nova.conf, as follows:
scheduler_available_filters=nova.scheduler.filters.all_filters
scheduler_available_filters=nova.scheduler.filters.pci_passthrough_filter.PciPassthroughFilter
scheduler_default_filters=DifferentHostFilter,RetryFilter,AvailabilityZoneFilter,RamFilter,CoreFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter,AggregateInstanceExtraSpecsFilter,PciPassthroughFilter
nova flavor-key $flavor_name set "pci_passthrough:alias"="K4200:$amount_of_gpu"
Cloud images are very small for GPU work, so they need to be resized with qemu-img. The CUDA driver needs perl (dev) packages when installing, whether from .deb or .rpm: the packages are not purely binary; they run 'make' to build from CUDA source code during installation on the system. NVIDIA says this will be fixed in a future CUDA release (7.6 or later) by adding the related perl (dev) packages to the spec file.
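The resize step looks like the following sketch; a scratch qcow2 stands in for the real cloud image here, and the filename and the +20G figure are assumptions, not from this deck:

```shell
# Grow a (stand-in) cloud image before installing CUDA inside it.
qemu-img create -f qcow2 scratch.img 2G   # stand-in for the downloaded cloud image
qemu-img resize scratch.img +20G          # grow the virtual disk by 20 GiB
qemu-img info scratch.img | grep 'virtual size'
```

On first boot, cloud-init's growpart normally expands the root filesystem to fill the larger disk.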
CUDA on Windows is fast once the installation succeeds, but it is often a bit jumpy. This might be caused by disk speed on the VM; you may be better off using ephemeral storage or something faster (SSD, NVMe, etc.). VMs also work with context switches, so heavy CUDA workloads might cause the jitter. I haven't tried to confirm this yet and should have more time to investigate why it happens.
We can't live-migrate a VM that uses PCI passthrough: the VM won't release its PCI connection on the old host. Workaround: remove the old entries below from the nova.pci_devices table in the MySQL DB, then reboot the old host. Not useful!
| 2016-08-11 00:54:45 | 2016-08-19 04:58:01 | NULL | 0 | 45 | 21 | 0000:84:00.0 | 11b4 | 10de | type-PCI | pci_0000_84_00_0 | label_10de_11b4 | available | {} | NULL | NULL | 1 | <<-- old-host
| 2016-08-11 00:54:45 | 2016-08-19 04:58:01 | NULL | 0 | 48 | 21 | 0000:88:00.0 | 11b4 | 10de | type-PCI | pci_0000_88_00_0 | label_10de_11b4 | available | {} | NULL | NULL | 1 | <<-- old-host
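The cleanup itself is a single DELETE against nova.pci_devices. A dry-run sketch that only builds the statement (the compute_node_id of 21 comes from the sample rows above; the mysql invocation is illustrative, not from the deck):

```shell
# Build (but do not run) the SQL that drops the stale passthrough rows
# left behind on the old host after a blocked live-migration.
sql="DELETE FROM pci_devices WHERE compute_node_id = 21 AND address IN ('0000:84:00.0', '0000:88:00.0');"
echo "mysql nova -e \"$sql\""
```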
Masafumi Ohta @masafumiohta