KVM, two OSes and two nVidia GPUs

ChristianVirtual

Since ESXi remains an issue with consumer GPUs, I'm starting to think about moving to KVM.

I would need some advice on the setup.
Idea:
1) CentOS 7 minimal as host
2) one CentOS guest with X and a GTX 980 Ti
3) one Windows guest with a GTX 970

I need to run the original GPU drivers, as I want to keep Folding@home running and it doesn't work with 3rd-party drivers. But I would also like both OS flavors running at the same time.

Possible?
CPU is an i7-2600S; RAM is 8 GB right now, but I can go to 16 GB.
 
There is a lot more to it than just following the article. It is also a bit out of date; you shouldn't need pci-stub now. Also, you will run into Code 43 errors within Windows, as NVIDIA will detect the GPU is running within a VM. You also want to make sure your HW is compatible.

Look through this blog: VFIO tips and tricks

While it's old, it does work. That article is complete and was tested many times. I also cover Code 43; it's in the code.
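
In case it helps while reading the article: the bit that typically matters for Code 43 on a plain qemu command line is hiding the KVM signature from the guest. A minimal, hypothetical sketch (just the CPU flag on an otherwise bare VM, not a full passthrough command):

Code:
# Hypothetical minimal example; the real launch command will have the
# vfio-pci devices, disks, etc. on top of this.
# kvm=off hides the KVM CPUID signature, which is what the NVIDIA
# driver's VM detection keys on (Code 43). On newer qemu a spoofed
# hv_vendor_id can also be added alongside the Hyper-V enlightenments.
qemu-system-x86_64 \
    -enable-kvm \
    -cpu host,kvm=off \
    -m 4096 \
    -vga std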
 
Thanks guys for the pointers; we have some days off here in Japan, Amazon delivered 4 sticks of 8 GB each, and my Asus P8Z77 supports VT-d, as does my i7-2600S.

Let's see where it gets me ... a first try got me into Code 43 and a subsequently unstable Windows 8.1 installation. Giving it another shot now.
 
It's a bit more complicated:
1) Had to update to qemu 2.5.1, as the CentOS "stock version" 2.0 doesn't support the KVM-hiding feature (relevant for nV).
2) Had to update to a newer kernel 4.5 (and recompile, as the stock kernel didn't have CONFIG_VFIO_PCI_VGA set; quick check below); hell, it's been many years since I last compiled a Linux kernel ...
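
For anyone checking their own kernel first: a quick way to see whether the running kernel already has that option before going through a rebuild (paths are the common ones; adjust if your distro stores the config elsewhere):

Code:
# /proc/config.gz only exists if CONFIG_IKCONFIG_PROC is enabled;
# otherwise fall back to the config file the distro ships in /boot.
zgrep CONFIG_VFIO_PCI_VGA /proc/config.gz 2>/dev/null \
    || grep CONFIG_VFIO_PCI_VGA /boot/config-$(uname -r)
# Expected output when the option is set:
# CONFIG_VFIO_PCI_VGA=y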

... WIP ...
 
So far, so good: installed a CentOS 7 guest (on the CentOS 7 host)

and I can see the nV 970 passed through and "nice looking", while the nV 980 Ti remains assigned to the host. The lspci tree from inside the guest:

Code:
-[0000:00]-+-00.0  Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
  +-01.0  Device 1234:1111
  +-02.0  Intel Corporation 82540EM Gigabit Ethernet Controller
  +-03.0  Intel Corporation 82371AB/EB/MB PIIX4 IDE
  +-1c.0-[01]--+-00.0  NVIDIA Corporation GM204 [GeForce GTX 970]
  |  \-00.1  NVIDIA Corporation GM204 High Definition Audio Controller
  +-1d.0  Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1
  +-1d.1  Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2
  +-1d.2  Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3
  +-1d.7  Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1
  +-1f.0  Intel Corporation 82801IB (ICH9) LPC Interface Controller
  +-1f.2  Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode]
  \-1f.3  Intel Corporation 82801I (ICH9 Family) SMBus Controller

The script to start the self-compiled qemu 2.5.1 (under a self-compiled vanilla kernel 4.5.0):
Code:
#!/bin/bash

configfile=/etc/vfio-pci1.cfg
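# Assumption based on how vfiobind() is called below: the config file
# holds one full PCI address per line (lines starting with # are skipped),
# e.g.
#   0000:02:00.0
#   0000:02:00.1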

# Unbind a device from whatever driver currently has it and hand it to vfio-pci.
vfiobind() {
  dev="$1"
  vendor=$(cat /sys/bus/pci/devices/$dev/vendor)
  device=$(cat /sys/bus/pci/devices/$dev/device)
  if [ -e /sys/bus/pci/devices/$dev/driver ]; then
    echo $dev > /sys/bus/pci/devices/$dev/driver/unbind
  fi
  # Registering the vendor:device pair makes vfio-pci claim the device.
  echo $vendor $device > /sys/bus/pci/drivers/vfio-pci/new_id
}

modprobe vfio-pci

echo "change driver for devices in " $configfile
# Rebind every non-comment line of the config file.
cat $configfile | while read line; do
  echo $line
  echo $line | grep ^# >/dev/null 2>&1 && continue
  vfiobind $line
done


# Launch the guest: the two vfio-pci devices below are the 970 and its
# HDMI audio function, attached behind an emulated PCIe root port (ioh3420).
qemu-system-x86_64 \
-name kvm \
-enable-kvm \
-M q35,accel=kvm \
-m 8192 \
-boot menu=on \
-hda /winpowered/kvmpowered/kvm.img \
-cdrom /home/cl/Downloads/CentOS-7-x86_64-Minimal-1511.iso \
-cpu host,kvm=off \
-smp 3,sockets=3,cores=1,threads=1 \
-realtime mlock=off \
-rtc base=utc,driftfix=slew \
-device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \
-device piix4-ide,bus=pcie.0,id=piix4-ide \
-device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on \
-device vfio-pci,host=02:00.1,bus=root.1,addr=00.1 \
-display gtk \
-usb \
-usbdevice host:0603:00f2 \
-usbdevice host:056e:0101 \
-msg timestamp=on
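
A quick sanity check before starting the guest (host addresses from my config; adjust for yours) that vfio-pci really owns both functions of the 970:

Code:
# Both should report "Kernel driver in use: vfio-pci" after the
# rebinding loop above has run.
lspci -nnk -s 02:00.0
lspci -nnk -s 02:00.1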

I have a minimal GNOME installed and can also compile the nV driver (361.28),

but as soon as I start nvidia-smi to check whether the card is there, the full system freezes (guest AND host).
I can't access the host via the network either; gut feeling tells me something is wrong with interrupts.

Any idea ?
 
Nice !! I used an older driver (346.87) and it works! In principle ...
I still can't start GNOME, though, so Coolbits for fan control is still missing. But now that CentOS works as a guest, I can use the same config and try Win 8.1.
 
Win 8.1, of course, crashes into blue screens on startup after the nV driver install. Even a restore point didn't recover it. Now doing a refresh install, and I might upgrade to Win 10 in the hope that it's better ...
 
While it got too late for further Windows installs, I thought I'd give the CentOS guest another run ... now it freezes again ... I guess I give up - or does someone have a good idea?? :banghead::(:sour::bigtears::hungover::cry:
 
I think the troublemakers are the nV drivers ... until they come into the game, everything runs well. Good luck with your endeavours; please share your experiences during/afterwards.
 
Finally it works as it should!

1) CentOS 7 with vanilla kernel 4.5.2
2) regular nV driver 361.28 for the host, driving a nV 980 Ti
3) qemu 2.5.1 built from source, with GTK front end
4) Windows 8.1 Pro with the CUDA 7.5 driver and MSI Afterburner for fan control, driving a nV 970



One step after the initial Windows installation was to set -vga none for the guest. I think having two VGA-capable devices in the VM causes trouble.
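
Roughly what the relevant part of the launch script looks like after the change (fragment only; the rest is unchanged from the script posted earlier):

Code:
# No emulated VGA adapter; the passed-through 970 is the only display
# device the guest sees.
    -vga none \
    -device vfio-pci,host=02:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on \
    -device vfio-pci,host=02:00.1,bus=root.1,addr=00.1 \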

To be honest: right now I'm not 100% sure why it works; I wouldn't use it for any mission-critical setup yet. But as a proof of concept I'm very happy with the outcome.

Lessons learned:
1) After some steps, make snapshots of the image file (or even clone it); I wasted quite some time reinstalling Windows, and at times the image file got corrupted (see the qemu-img sketch below).
2) Not yet confirmed, but once the VM is stopped it doesn't seem to start well again (at least with PCI passthrough); it looks like a host reboot gets all conditions clean again. Will need to verify that guess on the weekend.
3) As for performance: not impressed with the computing performance. I lose quite some "points" compared to running natively on the host. But hey, that's not my purpose; I just need a Windows folder from time to time without dedicated hardware.
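
For the record, the snapshot part of lesson 1 is just qemu-img on a shut-down guest; a sketch using my image path from above (the snapshot name is only illustrative):

Code:
# Only while the VM is shut down.
# Create an internal snapshot inside the qcow2 image ...
qemu-img snapshot -c pre-nvidia-driver /winpowered/kvmpowered/kvm.img
# ... list existing snapshots ...
qemu-img snapshot -l /winpowered/kvmpowered/kvm.img
# ... and roll back if an install goes sideways.
qemu-img snapshot -a pre-nvidia-driver /winpowered/kvmpowered/kvm.img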
 
Question to my virtualizing fellow [H] members: why is it that the performance of PCI passthrough seems so low? When I look around on the internet I see comments like "95% of native speed" etc.
I would say 50% to 75% (at best).

I have 8 GB DDR3-1600 assigned to my Win 8.1 Pro VM, which should be enough. The CPU is a bit on the slower side with the i7-2600S (I think 2.8 GHz); the qcow2 image is on an SSD in a SATA-3 slot.

When the 970 is folding natively under CentOS in the same box and slot, I get 240k PPD; under virtual Windows only 170k PPD (whatever a PPD is, it's still a good measure for comparison).
Given that Windows folds about 10% slower than Linux, I would expect around 220k PPD; the difference down to 170k PPD I would hold QEMU/KVM/VFIO "responsible" for.


Don't get me wrong, I'm really glad it's working, but any suggestions on what to look at for better performance?
 

It depends on whether you are using legacy or UEFI. For whatever reason, legacy loses more performance than UEFI. When I ran tests, UEFI gave me performance within 5% of native, which for me is more than acceptable when you factor in the power savings.
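
For anyone making the same switch: with plain qemu, booting the guest through OVMF instead of SeaBIOS is basically two pflash drives on the command line, plus a fresh Windows install since the disk needs to be GPT/UEFI. A sketch; the firmware paths depend on where your distro's OVMF/edk2 package puts the files:

Code:
# One-time: give the VM its own writable copy of the UEFI variable store.
cp /usr/share/OVMF/OVMF_VARS.fd /winpowered/kvmpowered/kvm_VARS.fd

# Then add to the qemu command line (firmware read-only, vars writable):
    -drive if=pflash,format=raw,readonly,file=/usr/share/OVMF/OVMF_CODE.fd \
    -drive if=pflash,format=raw,file=/winpowered/kvmpowered/kvm_VARS.fd \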
 
Thanks again kac77; switched to OVMF but had to reinstall Win.

The performance did not really improve, though, and I wonder if that could be caused by some parameter controlling the number of PCIe lanes (x1, x4, x8). Are you by chance aware of such an option? I couldn't find one, but I might be blind or just not aware of the proper search terms.

I also struggle with restarting the VM; it keeps freezing the whole system (host and guest). Deep freeze; only a power cycle recovers it.

Thinking about getting an X99/Xeon combo with more lanes ...
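
Related to the lane question: one thing that can be checked on the host (before starting the guest) is what link the card actually negotiated; the address is the one from my vfio config, adjust to yours:

Code:
# LnkCap = what card and slot are capable of,
# LnkSta = the width/speed actually negotiated right now.
sudo lspci -s 02:00.0 -vv | grep -E 'LnkCap:|LnkSta:'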
 

Yes, you are bandwidth starved. That chip only has 16 lanes of PCIe 2.0. I have an Intel Core i7-4930K, which is damn near a Xeon, and I have 40 lanes of PCIe 3.0. So I experienced a pretty nice performance boost by moving to UEFI. But UEFI or not, you need more bandwidth.

As for restarting, I have experienced trouble with it too, especially in Fedora with SELinux in tow. After watching the alerts and allowing more access, restarting got easier, but every once in a while a hard lock will still happen on restart.
 