This not only enables the use of GPGPU in VMs, but also makes it possible to run Windows video games from Linux on a single GPU!
This means that one of the major problems with Linux on the desktop for power users goes away, and it also means that we can now deploy Linux-only GPU tech such as HIP on any operating system that supports this trick!
> This means that one of the major problems with Linux on the desktop for power users goes away, and it also means that we can now deploy Linux-only GPU tech such as HIP on any operating system that supports this trick!
If you're brave enough, you can already do that with GPU passthrough. It's possible to detach the entire GPU from the host, hand it to a guest, and then get it back when the guest shuts down.
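As a rough sketch of that detach/reattach flow, assuming a libvirt/QEMU setup (the PCI address below is a placeholder; find yours with lspci or virsh nodedev-list --cap pci):

    # Hand the GPU to vfio-pci so the guest can claim it
    virsh nodedev-detach pci_0000_01_00_0
    # ... boot the guest, play, shut it down ...
    # Hand the GPU back to its host driver
    virsh nodedev-reattach pci_0000_01_00_0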
This could be way more practically useful than GPU passthrough. GPU passthrough demands at least two GPUs (an integrated one counts), requires at least two monitors (or two video inputs on one monitor), and in my experience has a tendency to do wonky things when the guest shuts off, since the firmware doesn't seem to like soft resets without the power being cycled. It also requires CPU and PCIe-controller features (IOMMU support and sane device grouping) that aren't always present in order to run safely.
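Regarding those "not always present" settings: a common way to check whether the platform exposes workable IOMMU groups is the usual shell loop over sysfs (nothing vendor-specific assumed); if the GPU shares a group with devices the host can't give up, passthrough gets dicey:

    #!/bin/sh
    # List every IOMMU group and the PCI devices inside it
    for g in /sys/kernel/iommu_groups/*; do
        echo "IOMMU group ${g##*/}:"
        for d in "$g"/devices/*; do
            lspci -nns "${d##*/}"
        done
    done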
This could allow a single GPU with a single video output to be used to run games in a Windows VM, without all the hoops that GPU passthrough entails. I'd definitely be excited for it!
It only requires two GPUs if you plan on using Linux GUI applications while you game on Windows. Besides, any shared single-GPU solution is going to introduce performance overhead and display latency, both of which are undesirable for gaming. Great for non-gaming things though, but generally you don't need Windows for those anyway.
From experience, not always. If the dedicated GPU gets selected as the BIOS/boot GPU, then it might be impossible to reset it properly for the handoff. I had this problem with a 1070.
I have to say vGPU is an amazing feature, and this possibly brings it to the "average" user (as average as a user doing GPU passthrough can be).
Certainly, but that requires BIOS/UEFI fiddling, and it also means you can't use Windows and Linux at the same time, which is very important to me.
I run a Gentoo host (with dual monitors) and a third monitor on a separate GPU for Windows. I bought a laptop with discrete and onboard GPUs, and discovered that the Windows VM now lives in msrdp.exe on the laptop rather than being driven by a physical keyboard and mouse. I can still interact with the VM directly if there's some game my laptop chokes on, but so far it's not worth the hassle for the extra 10% framerate. It's amusing because my laptop has a 120 Hz display, so an "extra 10% FPS" would be nice on the laptop, but hey, we're not made of money over here.
Oh, I got sidetracked. I have a kernel command line that enables the IOMMU and "blacklists" the PCIe device the GPU sits on, so the host kernel never sees it, even when it's in use. The next thing I had to do was set up a vfio-bind script that just tells qemu which GPU it's going to use. Thirdly, and this is the unfortunate part since I forget exactly what I did, there's some weirdness with Windows in qemu with a passthrough GPU: you have to registry-hack some obscure stuff into the way Windows handles the GPU memory.
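For anyone curious, those first two pieces usually look roughly like the sketch below; the PCI IDs and address are placeholders for whichever GPU (and its audio function) you're reserving for the guest:

    # Kernel command line additions: enable the IOMMU and claim the GPU for vfio-pci at boot
    amd_iommu=on iommu=pt vfio-pci.ids=10de:1b81,10de:10f0

    # vfio-bind sketch: unbind the device from its current driver and bind it to vfio-pci
    dev=0000:0a:00.0
    echo "$dev" > /sys/bus/pci/devices/$dev/driver/unbind
    echo vfio-pci > /sys/bus/pci/devices/$dev/driver_override
    echo "$dev" > /sys/bus/pci/drivers_probe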
If I am not mistaken, 95% of all of my issues were solved by reading the Arch Linux documentation for qemu hosts/guests. My system is a Ryzen 3600, 64GB of RAM, 2x NVMe drives plus one M.2 SATA drive, a GTX 1060 and a GTX 1070. Gentoo gets 16GB of RAM (unless I need more, in which case I just shut down Windows or reset the guest memory) and the 1060. Windows gets ~47GB of RAM, the 1070, a wifi card, and a USB sound card.

One of the things you quickly realize with guests on machines like this is that consumer-grade motherboards and CPUs are garbage: there aren't enough PCIe lanes to, say, pass through a bunch of USB or SAS/SATA ports, a dedicated PCIe sound card, or FireWire. If you have an idea that you'd really like to try this out as an actual "desktop replacement", especially for replacing multiple desktops, I recommend going to at least a Threadripper, as those can expose 4-6 times as many PCIe lanes to the host OS, meaning the possibility of multiple guests on multiple GPUs, or a single "redundant" guest with USB ports, SATA ports, and PCIe sound/FireWire/whatever.
Why would anyone do this? dd if=/dev/sdb of=/mnt/nfs/backups/windows-date.img . Q.E.D.
I built a PC with two decent GPUs with the intention of doing this (one GPU for windows in the VM, one for Linux running on the host). It works great performance-wise but any game with anti-cheat will be very unhappy in a VM. I tried various workarounds which work to varying degrees but ultimately it’s a huge pain.
If it's such a cool feature, why does NVidia lock it away from non-Tesla hardware?
[EDIT]: Funny, but the answers to this question actually provide way better answers to the other question I posted in this thread (as in: what is this for).
Entirely for market segmentation. The cards they allow it on are much more expensive. With this, someone could create a cloud game-streaming service using normal consumer cards and dividing them up, for a much cheaper experience than the $5k+ cards they currently allow it on. The recent change to allow virtualization at all (removing the code 43 block) does allow some of that, but does not let you, say, take a 3090 and split it up among 4 customers, giving each of them 3060-like performance for a fraction of the cost.
The OP is referring to a GPU passthrough setup[1], which passes a GPU from the Linux host through to a Windows guest (e.g. for gaming). This is done by detaching the GPU from the host and passing it to the VM, so most setups require two GPUs, since one needs to remain with the host (although single-GPU passthrough is also possible).
Nvidia's driver used to detect that it was running inside a VM and return error code 43, blocking the card from being used (for market segmentation between GeForce and Quadro). This was usually worked around by either patching the VBIOS or hiding KVM from the guest, but it was painful and unreliable. Nvidia removed this limitation with the RTX 30 series.
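For reference, "hiding KVM" typically meant launching QEMU with CPU flags along these lines (a sketch of the commonly used workaround, not anything Nvidia documents; the vendor id string is arbitrary and other arguments are omitted):

    # Hide the hypervisor signature and spoof the Hyper-V vendor id so the guest driver doesn't bail with code 43
    qemu-system-x86_64 ... -cpu host,kvm=off,hv_vendor_id=whatever1234 ...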
This vGPU feature unlock (TFA) would allow the GPU to be virtualized without first detaching it from the host, vastly simplifying the setup and opening up the possibility of multiple VMs running on a single GPU, each with its own dedicated vGPU.
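Since Nvidia's vGPU builds on the kernel's mediated device (mdev) framework, creating a virtual GPU instance, once the driver exposes the profiles, looks roughly like this (a sketch; the PCI address and profile name are placeholders):

    # List the vGPU profiles the driver exposes for the card
    ls /sys/bus/pci/devices/0000:01:00.0/mdev_supported_types/

    # Create one instance of a chosen profile; the UUID you write here is then assigned to a VM as an mdev device
    echo $(uuidgen) > /sys/bus/pci/devices/0000:01:00.0/mdev_supported_types/nvidia-123/create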
Because otherwise people would be able to use non-Tesla GPUs for cloud compute workloads, drastically reducing the cost of cloud GPU compute, and it would also enable the use of non-Tesla GPUs in local GPGPU clusters, further reducing workstation GPU sales due to more efficient resource use.
GPUs are a duopoly due to intellectual property laws and high costs of entry (the only companies I know of that are willing to compete are Chinese, and only as a result of sanctions), so for NVidia this just allows for more profit.
Interestingly, Intel is probably the most open with its GPUs, although it wasn't always that way; perhaps they realised they couldn't compete on performance alone.
AMD does have great open source drivers, but their code merges lag further behind compared to Intel's. Also, at least a while ago, their open documentation was quite lacking for newer generations of GPUs.
Yeah, but now the comparison for many companies (e.g. an R&D dept. dabbling a bit in machine learning) becomes "buy one big box with 4x RTX 3090 for ~$10k and spin up VMs on that as needed" versus the cloud bill. Previously the cost of owning physical hardware with that capability would have been a lot higher.
This has the potential to challenge the cloud case for sporadic GPU use, since cloud vendors can't deploy consumer RTX cards. But it would require the tooling to become simple to use and reliable.
Certainly, and AWS, GCP and Azure are priced far beyond the raw hardware cost even for CPU instances; there are hosts that are 2-3x cheaper for most uses with equivalent hardware resources.
Nvidia sells an ever-greater percentage of its products to the data-center market, while consumers purchase a shrinking portion. They do not want to flatten their currently upward-trending data-center sales of high-end cards.
NVIDIA's stock price has doubled since March 2020, and most of these gains can be largely attributed to the outstanding growth of its data center segment. Data center revenue alone increased a whopping 80% year over year, bringing its revenue contribution to 37% of the total. Gaming still contributes 43% of the company's total revenues, but NVIDIA's rapid growth in data center sales fueled a 39% year-over-year increase in its companywide first-quarter revenues.
The world's growing reliance on public and private cloud services requires ever-increasing processing power, so the market available for capture is staggering in its potential. Already, NVIDIA's data center A100 GPU has been mass adopted by major cloud service providers and system builders, including Alibaba (NYSE:BABA) Cloud, Amazon (NASDAQ:AMZN) AWS, Dell Technologies (NYSE:DELL), Google (NASDAQ:GOOGL) Cloud Platform, and Microsoft (NASDAQ: MSFT) Azure.
While this is definitely welcome news, GPU VFIO passthrough has been possible for a while now. I've been playing games on my Windows VM + Linux host for a few years at least. 95% native performance without needing to dual boot has been a game-changer (heh).