Recent Proton Versions are Broken on Solus


Solus is a relatively new distro that took off a few years ago. It is not based on any other major distro branch but built from scratch. It used to be a very popular option for gaming, as it once included several tweaks to integrate Steam better (such as a tool to switch between the Steam Runtime and the native libraries) and several fixes for Unity games.

However, things have been getting worse recently, namely since Proton 5.13.x, when Valve started to release Proton versions built on their pressure-vessel tech. For several months now, running the latest Proton on Solus has been utterly broken (i.e. all games refuse to run). The fallback for now is to use earlier versions of Proton, forsaking the advancements in compatibility and performance brought by newer versions.

This affects both the native Steam client available from the repositories and the Flatpak client, which runs inside a container.

The issue has been more or less identified, although there does not seem to be an immediate fix at the moment:

They have the same high-level failure mode (game doesn’t start) but for entirely different reasons.

The feature being used here is that Steam can start individual games in their own container environment, using a tool called pressure-vessel. For native Linux games, this is an opt-in feature intended to get each game to run in a predictable environment, which is particularly helpful for games that make assumptions that were true in 2015 but are no longer valid (this is described as “Steam Linux Runtime” in the UI). For Windows games that run via a compatibility layer, a newer version of the container runtime is automatically used whenever you have Proton 5.13 or newer as your compatibility layer, because those Proton builds require a newer library stack than the one Steam has traditionally provided.

pressure-vessel uses a system copy of bwrap if available, instead of its own copy, because some kernels require a setuid bwrap (which Steam cannot provide, because it is an unprivileged user process).

For the version of Steam running directly on Solus without a container, according to @Staudey’s report in the Steam Runtime bug tracker it works fine if bwrap is not setuid, but something crashes (with no error message printed, and possibly with SIGILL) if bwrap is setuid. The workaround is to not make bwrap setuid. This is meant to work, and on Debian 10 (which genuinely requires a setuid bwrap), it does - but apparently something is sufficiently different on Solus to cause a crash. This is probably a bug in either bwrap or pressure-vessel, but at the moment we can’t tell which. If someone can get details of which process is crashing, and ideally also a backtrace, then that would be very helpful.
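To see which case applies on a given machine, one can inspect the mode bits of the system bwrap. The following is a minimal sketch, assuming bwrap is installed under the name `bwrap` somewhere on `PATH`; the helper names `is_setuid` and `describe_system_bwrap` are made up for this illustration and are not part of Steam or pressure-vessel.

```python
import os
import shutil
import stat


def is_setuid(path):
    """Return True if the file at `path` has the setuid bit set."""
    mode = os.stat(path).st_mode  # follows symlinks to the real binary
    return bool(mode & stat.S_ISUID)


def describe_system_bwrap():
    """Locate bwrap on PATH and report whether it is installed setuid."""
    path = shutil.which("bwrap")  # assumption: the binary is named "bwrap"
    if path is None:
        return "no system bwrap found on PATH"
    kind = "setuid" if is_setuid(path) else "not setuid"
    return f"{path} is {kind}"


if __name__ == "__main__":
    print(describe_system_bwrap())
```

This only reports the current state. Changing the mode of a packaged binary requires root, and the change is likely to be reverted by the next package update.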

bwrap does not need to be setuid unless your kernel requires it. The setuid mode of installation for bwrap disables some features for security reasons, and is intended to be used with kernels that do not support unprivileged creation of user namespaces, such as the distro kernels in Debian 10 or older and the non-default linux-hardened kernel in Arch Linux. This is a security trade-off in those distributions: disabling unprivileged creation of user namespaces reduces the kernel’s attack surface and prevents exploitation of some kernel bugs (such as CVE-2016-3135); but it makes it necessary for bwrap to be setuid, which means that bugs in bwrap could be used by a local attacker to get root privilege escalation (such as CVE-2020-5291).

As far as I can see from the packaging git repo, Solus’ packaged kernels (both -lts and -current) have full support for unprivileged creation of user namespaces, similar to the distro kernels in Fedora, Ubuntu and Debian 11, and Arch Linux’s default kernel. For kernels like these, a non-setuid bwrap is preferred: it offers more features than the setuid bwrap, and prevents bwrap bugs like CVE-2020-5291 from being used for root privilege escalation.
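Whether a running kernel allows unprivileged creation of user namespaces can be roughly estimated from two sysctl files. This is a sketch under stated assumptions rather than a definitive test: `/proc/sys/user/max_user_namespaces` exists on reasonably recent mainline kernels, while `/proc/sys/kernel/unprivileged_userns_clone` is a distro-specific knob (found on Debian and Arch kernels, for example) and is treated as "no extra restriction" when absent. Other restriction mechanisms are not covered, and the function names are hypothetical.

```python
from pathlib import Path


def read_int(path):
    """Return the integer stored in a sysctl file, or None if absent or unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def unprivileged_userns_allowed(proc_sys="/proc/sys"):
    """Best-effort guess at whether unprivileged user namespaces can be created."""
    root = Path(proc_sys)

    # A missing file or a value of 0 means user namespaces are unavailable.
    max_ns = read_int(root / "user" / "max_user_namespaces")
    if max_ns is None or max_ns == 0:
        return False

    # Distro-patched knob: 0 disables unprivileged creation.
    # If the file does not exist, assume no such restriction is in place.
    clone = read_int(root / "kernel" / "unprivileged_userns_clone")
    return clone != 0


if __name__ == "__main__":
    print("unprivileged user namespaces allowed:", unprivileged_userns_allowed())
```

The `proc_sys` parameter exists only so the logic can be exercised against a fake directory tree; on a real system the default is what you want.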

Having a kernel that allows unprivileged user namespaces (like Debian 11), combined with a setuid bwrap (like Debian 10), maximises the attack surface: it gives attackers access to more bugs in the kernel and more bugs in bwrap, and in fact there have been several bwrap CVEs that were really only exploitable in this situation. I would not recommend this configuration.

(Obviously, the kernel developers and the bwrap developers both fix known vulnerabilities like CVE-2016-3135 and CVE-2020-5291 as soon as they become known - the point here is to defend you against possible bugs similar to those that are not known about yet.)
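The combinations of kernel and bwrap configuration discussed above can be summarised as a small lookup. This is only an illustrative sketch: the function name `assess` and the wording of the verdicts are made up here, and the last case (neither user namespaces nor a setuid bwrap) is inferred from the statement that some kernels require a setuid bwrap rather than stated explicitly.

```python
def assess(userns_allowed, bwrap_setuid):
    """Summarise a kernel/bwrap combination as described in the text above.

    userns_allowed: kernel permits unprivileged creation of user namespaces
    bwrap_setuid:   the system bwrap is installed setuid root
    """
    verdicts = {
        # e.g. Fedora, Ubuntu, Debian 11, Arch's default kernel
        (True, False): "preferred: more bwrap features, no setuid binary to exploit",
        # e.g. Debian 10, Arch's linux-hardened kernel
        (False, True): "trade-off: smaller kernel attack surface, but bwrap must be setuid",
        # kernel like Debian 11 combined with bwrap like Debian 10
        (True, True): "not recommended: maximises the attack surface",
        # inferred: bwrap has no way to set up its sandbox
        (False, False): "unusable: bwrap cannot create containers",
    }
    return verdicts[(bool(userns_allowed), bool(bwrap_setuid))]


if __name__ == "__main__":
    for userns in (True, False):
        for setuid in (True, False):
            print(f"userns={userns!s:5} setuid={setuid!s:5} -> {assess(userns, setuid)}")
```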

Meanwhile, the Flatpak app can’t launch new containers because it’s already in a container. This is known, and not a Solus-specific issue (Steam Runtime bug, Flathub bug). I’ve been working on a solution for a while, but it requires a new version of Flatpak with several new features, which are currently waiting for review. The only thing Solus needs to do to benefit from this is to keep your packaged version of Flatpak up to date. Because the new features add new API, I would strongly discourage patching them in as a distro-specific change before they have been reviewed and merged upstream.

The new Flatpak features that will make this work with the Flatpak version of Steam will require that Flatpak is using a copy of bwrap that is not setuid. As far as I can tell from Solus’ packaging repository, this is already the case - it’s using a Flatpak-specific bundled copy, installed as flatpak-bwrap, instead of the system copy.

I actually still have one of my machines running Solus (which I occasionally use for gaming), and I would recommend that anyone considering Solus as a gaming distro be well aware of these issues before making the jump.

I am also concerned that the Solus devs seem to treat this as a low-priority task: it was first reported in November 2020, and yet in March 2021 there is still no ETA for a fix. For a distribution that used to make gaming a key differentiator, that’s quite a disappointment.