Nvidia open sources GPU Kernel modules: How will that be Beneficial?

2022-05-12

Today is close to a historic day: Nvidia has posted a long announcement that it is releasing open-source (GPL/MIT dual license) kernel modules for its GPUs. There’s already a GitHub repo where you can read and contribute to the code. Note that this is limited to more recent Nvidia cards (Turing and Ampere):

Customers with Turing and Ampere GPUs can choose which modules to install. Pre-Turing customers will continue to run the closed source modules.

What’s a kernel module?

First, let’s confirm what this is about. This is NOT Nvidia going full open-source with their drivers. The kernel module is just the interface between the Linux kernel and the hardware. Modules extend the functionality of the kernel without the need to reboot the system:

For example, one type of module is the device driver, which allows the kernel to access hardware connected to the system. Without modules, we would have to build monolithic kernels and add new functionality directly into the kernel image
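
To make the idea of runtime-loadable modules more concrete, here is a minimal sketch that lists the modules currently loaded into a Linux kernel by parsing text in the /proc/modules format. The helper names and the sample text are made up for illustration, and the field layout (name, size, reference count, users, state, address) is assumed from typical /proc/modules output.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LoadedModule:
    name: str
    size_bytes: int
    ref_count: int
    used_by: List[str]


def parse_proc_modules(text: str) -> List[LoadedModule]:
    """Parse text in the /proc/modules format into LoadedModule records."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        name, size, refs, users = fields[:4]
        # The reference count may be "-" on kernels built without unload support.
        ref_count = int(refs) if refs.isdigit() else 0
        # The fourth field is "-" when nothing uses the module, otherwise a
        # comma-terminated list such as "nvidia_drm,".
        used_by = [] if users == "-" else [u for u in users.split(",") if u]
        modules.append(LoadedModule(name, int(size), ref_count, used_by))
    return modules


def nvidia_modules(text: str) -> List[str]:
    """Return the names of loaded modules whose name starts with 'nvidia'."""
    return [m.name for m in parse_proc_modules(text) if m.name.startswith("nvidia")]


# Hypothetical sample; on a real system, read the file instead:
#   text = open("/proc/modules").read()
SAMPLE = """\
nvidia_drm 73728 4 - Live 0x0000000000000000 (OE)
nvidia_modeset 1142784 6 nvidia_drm, Live 0x0000000000000000 (OE)
nvidia 39059456 270 nvidia_modeset, Live 0x0000000000000000 (OE)
ext4 942080 1 - Live 0x0000000000000000
"""

if __name__ == "__main__":
    print(nvidia_modules(SAMPLE))
```

On a machine with the Nvidia driver installed, feeding the real /proc/modules content to nvidia_modules() should show which of the Nvidia modules are loaded at that moment, with no reboot involved in loading or unloading them.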

So, the kernel modules are now open-source, but the rest of the stack remains proprietary: the GPU still runs closed firmware, and the user-mode drivers are still closed.

These changes are for the kernel modules, while the user-mode components are untouched. The user-mode components remain closed source and are published as pre-built binaries in the driver and the CUDA toolkit.

Is that closed-source aspect going to continue in the future? For now, yes, but an official answer to a GitHub issue gives some hope:

NVIDIA has just begun our open source journey, so please be patient. For now, we are not planning to upload GSP firmware at linux-firmware but this may change in the future.

From driver version R515 onwards, the Nvidia installer offers a choice between the proprietary modules and the open ones.

For now, the code released by Nvidia for the kernel modules works with both x86_64 (Intel and AMD) and aarch64 (ARM) hardware. Since the modules are now open-source, it’s conceivable that they could be ported to other hardware architectures as well (Power, anyone?).

Now, which modules are we talking about?

The Nvidia GPU modules

Each of NVIDIA’s kernel modules is made of two components:

An “OS-agnostic” component: code that is independent of the operating system.
A “kernel interface layer”: a component specific to the Linux kernel version and configuration.

Nvidia explains further on the GitHub repository:

When packaged in the NVIDIA .run installation package, the OS-agnostic component is provided as a binary: it is large and time-consuming to compile, so pre-built versions are provided so that the user does not have to compile it during every driver installation. For the nvidia.ko kernel module, this component is named “nv-kernel.o_binary”. For the nvidia-modeset.ko kernel module, this component is named “nv-modeset-kernel.o_binary”. Neither nvidia-drm.ko nor nvidia-uvm.ko have OS-agnostic components.

The kernel interface layer component for each kernel module must be built for the target kernel.
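
One way to picture the "built for the target kernel" requirement: a compiled Linux module carries a "vermagic" string whose first token is the kernel release it was built against. The sketch below compares that token with a kernel release string. The function names are hypothetical, the sample vermagic string is invented, and obtaining the real value (for example with modinfo -F vermagic nvidia) is left outside the code.

```python
import platform


def kernel_release_from_vermagic(vermagic: str) -> str:
    """Return the kernel release, i.e. the first token of a vermagic string."""
    tokens = vermagic.split()
    return tokens[0] if tokens else ""


def built_for_kernel(vermagic: str, running_release: str) -> bool:
    """True if the module's vermagic names the given kernel release."""
    return kernel_release_from_vermagic(vermagic) == running_release


if __name__ == "__main__":
    # Invented example value; a real one would come from the module itself.
    sample_vermagic = "5.15.0-30-generic SMP mod_unload modversions"
    # platform.release() reports the release of the running kernel.
    print(built_for_kernel(sample_vermagic, platform.release()))
```

This is only a simplified check: the kernel also compares the remaining vermagic tokens, so matching the release alone does not guarantee that a module will load.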

The repository highlights the following components:

kernel-open/ The kernel interface layer
kernel-open/nvidia/ The kernel interface layer for nvidia.ko
kernel-open/nvidia-drm/ The kernel interface layer for nvidia-drm.ko
kernel-open/nvidia-modeset/ The kernel interface layer for nvidia-modeset.ko
kernel-open/nvidia-uvm/ The kernel interface layer for nvidia-uvm.ko
src/ The OS-agnostic code
src/nvidia/ The OS-agnostic code for nvidia.ko
src/nvidia-modeset/ The OS-agnostic code for nvidia-modeset.ko
src/common/ Utility code used by one or more of nvidia.ko and nvidia-modeset.ko
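
The layout above can be summarized as a small lookup table. This is just an illustrative sketch: the dictionary and function names are made up, while the paths are the ones listed above, and the absence of an OS-agnostic directory for nvidia-drm.ko and nvidia-uvm.ko follows from Nvidia's note that neither has an OS-agnostic component.

```python
from typing import Dict, Optional

# Kernel interface layer directory for each module, as listed in the repository.
KERNEL_INTERFACE_DIRS: Dict[str, str] = {
    "nvidia.ko": "kernel-open/nvidia/",
    "nvidia-drm.ko": "kernel-open/nvidia-drm/",
    "nvidia-modeset.ko": "kernel-open/nvidia-modeset/",
    "nvidia-uvm.ko": "kernel-open/nvidia-uvm/",
}

# Only two modules have an OS-agnostic component.
OS_AGNOSTIC_DIRS: Dict[str, str] = {
    "nvidia.ko": "src/nvidia/",
    "nvidia-modeset.ko": "src/nvidia-modeset/",
}


def source_dirs(module: str) -> Dict[str, Optional[str]]:
    """Return the source directories that make up a given kernel module."""
    if module not in KERNEL_INTERFACE_DIRS:
        raise KeyError(f"unknown module: {module}")
    return {
        "kernel_interface": KERNEL_INTERFACE_DIRS[module],
        # None means the module has no OS-agnostic component.
        "os_agnostic": OS_AGNOSTIC_DIRS.get(module),
    }


if __name__ == "__main__":
    for name in KERNEL_INTERFACE_DIRS:
        print(name, source_dirs(name))
```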

Now for the billion-dollar question: will this make it into the Linux kernel?

Is this going upstream?

Unfortunately, no, and Nvidia is very clear on that:

The current codebase does not conform to the Linux kernel design conventions and is not a candidate for Linux upstream.

However…

There are plans to work on an upstream approach with the Linux kernel community and partners such as Canonical, Red Hat, and SUSE.

So it is not completely out of the question, but it may take a while. As a reminder, AMD’s initial open-source driver code dump could not be merged into the kernel either and required significant rework before it made it upstream, so this is not really unusual.

This can help the Nouveau driver (the truly open-source Nvidia driver, developed by the community) make some strides in terms of actual support, since it has been of very limited use on cards released after the Pascal architecture:

published source code serves as a reference to help improve the Nouveau driver. Nouveau can leverage the same firmware used by the NVIDIA driver, exposing many GPU functionalities, such as clock management and thermal management, bringing new features to the in-tree Nouveau driver.

Does this have anything to do with gaming?

Not really: this announcement has pretty much nothing to do with gaming. Gamers are NOT the kind of user that Nvidia has in mind here. You can see that clearly in the comments from SUSE:

We at SUSE are excited that NVIDIA is releasing their GPU kernel-mode driver as open source. This is a true milestone for the open-source community and accelerated computing. SUSE is proud to be the first major Linux distribution to deliver this breakthrough with SUSE Linux Enterprise 15 SP4 in June. Together, NVIDIA and SUSE power your GPU-accelerated computing needs across cloud, data center, and edge with a secure software supply chain and excellence in support

Ubuntu mentions gamers, but as a secondary target:

The new NVIDIA open-source GPU kernel modules will simplify installs and increase security for Ubuntu users, whether they’re AI/ML developers, gamers, or cloud users

This is really a much bigger deal for the enterprise/cloud market, where Nvidia is now a major provider of hardware for AI/ML applications.

So what are the key benefits of this announcement?

To sum things up a bit, we should expect the following benefits:

Higher system security, since the kernel module is no longer an opaque binary blob that you have to trust: you can now know exactly what it does, and the potential for exploitation by an attacker decreases.
Better modules eventually: since issues will be discussed in the open, and contributors are welcome, this should eventually lead to fewer bugs and issues in these modules.
Nouveau will benefit from these developments, and we may see much better Nouveau drivers in the near future as an alternative to the proprietary Nvidia drivers.
Less common architectures (other than ARM and x86) could gain Nvidia GPU support now that the kernel modules are open-source and can be ported.
Until now, the closed-source modules limited which kernel versions the drivers could run on, but that should be less of an issue with the open modules.
This also opens the door for non-Linux OSes to receive support (for example OpenBSD).

Overall, these are very positive developments, even if they only concern the GPU kernel modules. They have major repercussions in multiple areas, even beyond Linux.