CVE-2024-0132: Critical NVIDIA AI Vulnerability Affecting Containers Using NVIDIA GPUs

SOCFortress
3 min read · Sep 30, 2024


Need Help?

The functionality discussed in this post, and so much more, is available via the SOCFortress platform. Let SOCFortress help you and your team keep your infrastructure secure.

Website: https://www.socfortress.co/

Contact Us: https://www.socfortress.co/contact_form.html

Intro

CVE-2024-0132 is a critical-severity vulnerability in the NVIDIA Container Toolkit and GPU Operator that poses a high risk to AI workloads and environments. It affects any AI application, in the cloud or on-premises, that runs a vulnerable version of the container toolkit to enable GPU support.

Affected Components:

  • NVIDIA Container Toolkit: All versions up to and including v1.16.1
  • NVIDIA GPU Operator: All versions up to and including 24.6.1

Note: The vulnerability does not impact use cases where Container Device Interface (CDI) is used.
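
The version ranges above can be turned into a quick triage check. The following is a minimal sketch, assuming plain `major.minor.patch` version strings (with an optional leading `v`); the component keys and function names are illustrative and not part of any NVIDIA tooling. It does not account for the CDI exception noted above.

```python
import re

# Last vulnerable releases, as listed under "Affected Components".
# The dictionary keys are illustrative labels, not official identifiers.
LAST_VULNERABLE = {
    "container-toolkit": (1, 16, 1),
    "gpu-operator": (24, 6, 1),
}


def parse_version(text):
    """Turn 'v1.16.1' or '24.6.1' into a tuple of ints such as (1, 16, 1)."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        # Pre-release or distro-suffixed strings are deliberately rejected
        # rather than guessed at.
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_vulnerable(component, version):
    """Return True if the version is at or below the last vulnerable release."""
    return parse_version(version) <= LAST_VULNERABLE[component]


if __name__ == "__main__":
    for component, version in [
        ("container-toolkit", "v1.16.1"),
        ("container-toolkit", "v1.16.2"),
        ("gpu-operator", "24.6.1"),
        ("gpu-operator", "24.6.2"),
    ]:
        print(component, version, is_vulnerable(component, version))
```

Comparing integer tuples rather than raw strings matters here: a string comparison would wrongly rank `1.9.0` above `1.16.1`.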

Attack flow

The attack has three main stages:

  • Creating a malicious image: The attacker crafts a specially designed image to exploit CVE-2024-0132. (Note: Specific technical details about exploiting this vulnerability are not provided at this stage.)
  • Gaining full access to the file system: The attacker runs the malicious image on the target platform. This can be done either directly (for example, in services allowing shared GPU resources) or indirectly through a supply chain or social engineering attack (e.g., a user running an AI image from an untrusted source). By exploiting the vulnerability, the attacker gains the ability to mount the entire host file system, obtaining full read access to the underlying host. This gives the attacker full visibility into the underlying infrastructure and potentially allows access to other customers’ confidential data.
  • Complete host takeover: With this access, the attacker can now reach the container runtime Unix sockets (docker.sock/containerd.sock). These sockets can be used to execute arbitrary commands on the host system with root privileges, effectively taking control of the machine (this is a known attack path for containerised systems). Note that while the vulnerability initially grants only READ access to the file system, an attacker can exploit a nuance in Unix socket behaviour: in Linux, a socket on a file system mounted read-only still accepts connections, so a process can send commands to it.
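
Because the final stage depends on reaching a container runtime socket, defenders may want to audit which sockets are visible under a given file system root (for example, from inside a running workload). The following is a minimal sketch; the candidate paths are common default locations and are an assumption, so adjust them for your environment. The function name is illustrative.

```python
import os
import stat

# Common default socket locations, relative to a file system root.
# These paths are assumptions; runtimes can be configured to use others.
RUNTIME_SOCKETS = (
    "var/run/docker.sock",
    "run/docker.sock",
    "run/containerd/containerd.sock",
)


def visible_runtime_sockets(root="/"):
    """Return candidate paths under root that exist and are Unix sockets."""
    found = []
    for relative in RUNTIME_SOCKETS:
        path = os.path.join(root, relative)
        try:
            mode = os.stat(path).st_mode
        except OSError:
            # Missing or inaccessible path: nothing to report.
            continue
        if stat.S_ISSOCK(mode):
            found.append(path)
    return found


if __name__ == "__main__":
    for path in visible_runtime_sockets():
        print("runtime socket visible:", path)
```

On many hosts `/var/run` is a symlink to `/run`, so the same socket may be reported under two paths. An empty result only means none of the listed defaults were found, not that no runtime socket is reachable.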

Mitigations

Affected organisations should upgrade to the fixed releases: NVIDIA Container Toolkit v1.16.2 and NVIDIA GPU Operator v24.6.2.

Patching is highly recommended for all container hosts running a vulnerable version of the Container Toolkit. Prioritise hosts that are likely to run containers built from images originating in untrusted sources. Further prioritisation can be achieved through runtime validation, which focuses patching efforts on instances where the toolkit is confirmed to be in use.
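
One way to approximate runtime validation on Docker hosts is to check whether the daemon configuration registers the NVIDIA runtime. The following is a minimal sketch, assuming the conventional `/etc/docker/daemon.json` location and the usual `runtimes` / `default-runtime` keys; the function names are illustrative. containerd, CRI-O and Kubernetes-level configuration are not covered and would need their own checks.

```python
import json

# Conventional Docker daemon config location; an assumption for this sketch.
DEFAULT_DAEMON_JSON = "/etc/docker/daemon.json"


def load_daemon_config(path=DEFAULT_DAEMON_JSON):
    """Load the Docker daemon config, returning {} if the file is absent."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {}


def nvidia_runtime_configured(daemon_config):
    """Return True if the config appears to register an NVIDIA runtime."""
    runtimes = daemon_config.get("runtimes", {})
    if "nvidia" in runtimes:
        return True
    if daemon_config.get("default-runtime") == "nvidia":
        return True
    # Also catch runtimes registered under a custom name that still
    # point at the nvidia-container-runtime binary.
    return any(
        isinstance(entry, dict)
        and "nvidia-container-runtime" in str(entry.get("path", ""))
        for entry in runtimes.values()
    )


if __name__ == "__main__":
    config = load_daemon_config()
    print("NVIDIA runtime configured:", nvidia_runtime_configured(config))
```

A positive result indicates that the host is set up to use the toolkit, not that a vulnerable version is installed; combine it with a version check before deciding patch priority.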

Note that Internet exposure is not a relevant factor for triaging this vulnerability, as the affected container host does not need to be publicly exposed in order to load a malicious container image. Instead, initial access vectors may include social engineering attempts against developers; supply chain scenarios such as an attacker with prior access to a container image repository; and containerised environments allowing external users to load arbitrary images (whether by design or due to a misconfiguration).

SOCFortress

SOCFortress is a SaaS company that unifies Observability, Security Monitoring, Threat Intelligence and Security Orchestration, Automation, and Response (SOAR).