
Adventures with single GPU passthrough via libvirt

Published 2023-12-06 10 min read

Over the past few days, I’ve been experimenting with first LXD and then libvirt. They’re wonderful technology. Both are management layers on top of QEMU (LXD is mostly a wrapper around Linux Containers, but it supports QEMU VMs too) that make setting up and running VMs much easier.

Below I’m detailing some lessons learned, tips and tricks, and some solutions to potential issues you might run into. Let me know if you disagree with any of these steps or have any questions! I’m still learning myself and I’d love to hear what you think.

Before we get into that, my setup:

Drivers

Note

I use libvirt. If you use LXD, these steps also work! But skip the sections below on drivers and CPU pinning and instead use the Canonical guide for installing and adding drivers. Don’t add your GPU yet. Then come back here and follow the GPU passthrough steps.

So first things first. For optimal performance, you’ll want to use virtio devices for everything, including networking and disks. It’s possible to use non-virtio devices at setup and then add these drivers and migrate later, but it’s easier to just start with them.

So you’ll want to switch your disks to virtio-scsi and your network to virtio. Then download the drivers and attach them to the VM. You’ll need to install the viostor driver for the disk and the netkvm driver for the network. You can do that by clicking “Load drivers” when you get to the part of the Windows install where it asks which disk you want to install to. Make sure you use the drivers under /netkvm/amd64 and /viostor/amd64 in the root folder! There are drivers in the /amd64 folder, but they don’t seem to work with Windows 11 despite being tagged as Windows 11 drivers.

Side note: feel free to use the latest drivers rather than the stable ones. I’ve found (and most other people online agree) that the latest drivers are still fairly stable and have few issues.
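If you’d rather do this from the command line than through virt-manager, something along these lines works as a rough sketch; the download URL is the Fedora project’s “latest” redirect, and the win11 name and sdb target are placeholders for your own setup:

# Grab the virtio-win driver ISO (adjust the URL if you prefer a pinned stable release).
wget https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/latest-virtio-win/virtio-win.iso

# Attach it to the VM as a read-only SATA CD-ROM so the Windows installer can read
# it (and find the /netkvm/amd64 and /viostor/amd64 folders) before any virtio
# driver is loaded.
virsh attach-disk win11 "$PWD/virtio-win.iso" sdb --targetbus sata --type cdrom --mode readonly --config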

SCSI

If, for some reason, you do install on a virtio disk and then want to switch to SCSI, it’s a bit more complicated than it sounds. You’ll need to boot Windows, install the viostor driver from the VirtIO drivers disk (right-click on the .inf and hit Install), and then shut things down. Add a temporary disk (don’t remove the existing one! Just add a temp disk) and attach it as SCSI. Then start Windows again, wait for a moment, shut it down, switch your actual boot disk over to the SCSI bus, and remove the temporary disk.

I found that if you just immediately switch to SCSI, Windows won’t load the driver at boot and will crash without booting. But if you connect a SCSI disk once, Windows will load the driver and keep it loaded for the next boot, when you’ll actually be booting from SCSI.
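Roughly, the temp-disk dance looks like this with virsh (the paths, size, and sdb target are just examples, and it assumes the VM already has a VirtIO SCSI controller - add one in virt-manager if not):

# Create a small throwaway disk image.
qemu-img create -f qcow2 /var/lib/libvirt/images/scsi-temp.qcow2 1G

# Attach it on the SCSI bus next to the existing boot disk (don't remove anything yet).
virsh attach-disk win11 /var/lib/libvirt/images/scsi-temp.qcow2 sdb --targetbus scsi --subdriver qcow2 --config

# Boot Windows once so it loads the driver, shut it down, switch the real
# boot disk to SCSI, and then get rid of the temporary one.
virsh detach-disk win11 sdb --config
rm /var/lib/libvirt/images/scsi-temp.qcow2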

CPU pinning

I don’t have much to say here. You should definitely pin your CPUs by following this guide if you want non-terrible performance. If you end up using the guide for other things, I found the hooks didn’t work and I had to use my own scripts for GPU passthrough and CPU governing. If you’re using LXD there’s a guide here. If your CPU has SMT (hyper-threading), make sure you pin both threads of each core to the VM; don’t pin one thread from one core and another thread from a different core.
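For illustration, here’s what the pinning looks like with plain virsh commands; the guide does the same thing in the domain XML’s cputune block, and the host CPU numbers below are made up, so check lscpu for your own thread siblings:

# See which logical CPUs are thread siblings of the same physical core.
lscpu -e=CPU,CORE,SOCKET

# Pin vCPU pairs to both threads of the same physical core
# (host CPUs 4/10 and 5/11 are examples only).
virsh vcpupin win11 0 4 --config
virsh vcpupin win11 1 10 --config
virsh vcpupin win11 2 5 --config
virsh vcpupin win11 3 11 --config

# Keep the emulator threads off the cores the VM is pinned to.
virsh emulatorpin win11 0-3 --config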

Keyboard and mouse

You’ll need to attach your keyboard and mouse as USB devices. I used virt-manager to do this since I was too lazy to find out the IDs of my keyboard and mouse manually. Just open the VM settings and add a USB device.

Edit 2023-12-18: I’ve since switched to Evdev rather than direct USB passthrough. I highly recommend passing via Evdev rather than passing the USB devices directly, since it lets you switch your keyboard and mouse back to the host system with a keyboard shortcut and is much easier than messing with USB devices. There’s a guide on the Arch Wiki here that I used. If it doesn’t work, make sure you get the right device - I got the wrong one accidentally at first 😅
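To find the right event devices, something like this helps (the device name below is a placeholder); the /dev/input/by-id path you settle on is what you reference in the domain configuration per that guide:

# Stable symlinks for input devices; the "-event-kbd" and "-event-mouse"
# entries are usually the ones you want for evdev passthrough.
ls -l /dev/input/by-id/

# If in doubt, cat a candidate and type or move the mouse: the right
# device spews garbage into the terminal. Ctrl+C to stop.
cat /dev/input/by-id/usb-Example_Keyboard-event-kbd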

GPU passthrough (VM startup)

Now comes the trickiest part. These are the basic steps I’ve found for single-GPU passthrough. I’m going to give a high-level overview, and then the actual script I use.

1. Stop the display manager and drop to multi-user.target.
2. Unbind the virtual console and the EFI framebuffer.
3. Disable Nvidia persistence mode.
4. Unload the Nvidia kernel modules.
5. Detach the GPU’s PCI functions from the host with virsh.
6. Start the VM.

At any time during the process, if you have an issue you can reboot and everything will safely reset and you can try again. I found myself rebooting liberally during my trial-and-error process of figuring these steps out.
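Before running the script, you’ll want your GPU’s PCI address (that’s the PCI_ID variable below) and a quick sanity check that the GPU’s functions sit in a clean IOMMU group. A rough way to do both, assuming an Nvidia card:

# Find the GPU's PCI address; for a card at 2d:00.0, PCI_ID is "2d".
# Modern Nvidia cards expose several functions (VGA, audio, USB-C, UCSI),
# which is why the script detaches four devices.
lspci -nn | grep -i nvidia

# List every device by IOMMU group so you can confirm the GPU isn't
# grouped together with something the host still needs.
for d in /sys/kernel/iommu_groups/*/devices/*; do
    g=${d#/sys/kernel/iommu_groups/}; g=${g%%/*}
    echo "IOMMU group $g: $(lspci -nns "${d##*/}")"
done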

Gotcha

If you have a virtual display device attached (virt-manager adds one automatically) and you try this, you’ll see a black screen. Passthrough is actually working, but the Nvidia GPU is treated as the second display, so nothing shows up on it. Set the display device to none and reboot.

My script for this is below. Replace VM_NAME with your VM’s name and PCI_ID with the first couple of digits of your GPU’s PCI address.

#!/bin/bash
VM_NAME="win11"
PCI_ID="2d"

# Only allow running as root. This is so we don't have to "sudo" everything.
if [ "$(id -u)" -ne 0 ]; then
    echo 'This script must be run as root!' >&2
    exit 1
fi

# Shut down the display manager.
echo "Shutting down display manager..."
systemctl stop gdm.service
systemctl isolate multi-user.target

# Wait for a sec, then unbind the frame buffers.
sleep 3
echo "Unbinding framebuffers..."
echo '0' > /sys/class/vtconsole/vtcon1/bind
echo 'efi-framebuffer.0' > /sys/bus/platform/drivers/efi-framebuffer/unbind

# Wait for a sec, then disable Nvidia persistence mode.
sleep 3
echo "Disabling NVIDIA persistence mode..."
nvidia-smi -pm 0

# Unload the Nvidia kernel modules. MUST be done in this order! Each one depends on the last.
echo "Unloading NVIDIA kernel modules..."
modprobe -r nvidia_drm nvidia_modeset nvidia_uvm nvidia i2c_nvidia_gpu drm_kms_helper drm

# Wait for a sec to make sure they're unloaded, then detach the GPU from the host.
sleep 3
echo "Detaching GPU..."
virsh nodedev-detach pci_0000_${PCI_ID}_00_0
virsh nodedev-detach pci_0000_${PCI_ID}_00_1
virsh nodedev-detach pci_0000_${PCI_ID}_00_2
virsh nodedev-detach pci_0000_${PCI_ID}_00_3

# Start the VM!
sleep 3
echo "Starting VM..."
virsh start "$VM_NAME"

GPU un-passthrough (VM shutdown)

The steps above are pretty much done in reverse order for VM shutdown. I’ll detail them in high-level steps below, and then give you my script again.

1. Shut down the VM and wait for it to be fully off.
2. Reattach the GPU’s PCI functions to the host.
3. Reload the Nvidia kernel modules.
4. Rebind the EFI framebuffer and the virtual console.
5. (LXD only) Reset the GPU.
6. Re-enable Nvidia persistence mode.
7. Restart the display manager.

Gotcha

If you’re using LXD, you’ll need to reset your GPU between rebinding the framebuffer and re-enabling persistence mode, otherwise the GPU will report that it’s active but your display manager will crash. As far as I can tell this physically power-cycles your GPU, but I’m not sure. Run echo "1" | sudo tee /sys/bus/pci/drivers/nvidia/${ID}/reset (with your GPU’s PCI ID, found via lspci -nn | grep -i nvidia) to do that. I found I didn’t need to with libvirt.

Alright, my script is below. Same deal - replace VM_NAME with your VM name, and PCI_ID with the first couple digits of the PCI ID of your GPU, and check your framebuffer number to make sure you bind the right one.
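Both scripts assume vtcon1 and efi-framebuffer.0; you can confirm what your machine actually uses like this:

# The vtcon whose name mentions "frame buffer" is the one to unbind/rebind.
for v in /sys/class/vtconsole/vtcon*; do
    echo "$v: $(cat "$v"/name)"
done

# The framebuffer platform device bound here (usually efi-framebuffer.0)
# is the one the scripts unbind and rebind.
ls /sys/bus/platform/drivers/efi-framebuffer/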

#!/bin/bash
VM_NAME="win11"
PCI_ID="2d"

# Only allow running as root. This is so we don't have to "sudo" everything.
if [ "$(id -u)" -ne 0 ]; then
    echo 'This script must be run as root!' >&2
    exit 1
fi

# Shut down the VM. It's an asynchronous operation, so we need to wait for it to complete too.
echo "Stopping VM..."
virsh shutdown "$VM_NAME"
until virsh domstate "$VM_NAME" | grep -q "shut off"; do
    echo "Waiting for shutdown..."
    sleep 3
done

# Reattach the GPU.
echo "Reattaching GPU..."
virsh nodedev-reattach pci_0000_${PCI_ID}_00_0
virsh nodedev-reattach pci_0000_${PCI_ID}_00_1
virsh nodedev-reattach pci_0000_${PCI_ID}_00_2
virsh nodedev-reattach pci_0000_${PCI_ID}_00_3

# Reload the Nvidia drivers. MUST be done in this order! Each one depends on the previous one.
sleep 3
echo "Reloading NVIDIA kernel modules..."
modprobe drm drm_kms_helper i2c_nvidia_gpu nvidia nvidia_uvm nvidia_modeset nvidia_drm

# Rebind the framebuffers. Change this if your framebuffer is on a different device.
echo "Binding framebuffers..."
echo 'efi-framebuffer.0' > /sys/bus/platform/drivers/efi-framebuffer/bind
echo "1" > /sys/class/vtconsole/vtcon1/bind

# For LXD only! Don't uncomment if you're on libvirt.
# sleep 3
# echo "Resetting GPU..."
# echo "1" > /sys/bus/pci/drivers/nvidia/0000:${PCI_ID}:00.0/reset

# Re-enable NVIDIA persistence mode.
echo "Enabling NVIDIA persistence mode..."
nvidia-smi -pm 1

# Moment of truth! Restart the display manager. You should see graphics come back up!
echo "Reloading display manager..."
systemctl start gdm.service

To use these scripts, just save them somewhere and run them with sudo.
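For example (the file names here are just whatever you saved them as):

chmod +x vm-start.sh vm-stop.sh

# Tears down the desktop and boots the VM with the GPU attached.
sudo ./vm-start.sh

# Run this when you're done (e.g. over SSH): it shuts the VM down
# and brings your desktop back.
sudo ./vm-stop.sh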

Conclusion

Good luck! Feel free to reach out if you have any questions or issues.