Setting up Docker for CUDA support

Installing NVIDIA Container Toolkit

This guide explains how to install and configure the NVIDIA Container Toolkit, which is essential for using CUDA in Docker containers. The toolkit allows Docker containers to leverage NVIDIA GPUs, enabling GPU-accelerated computing within containerized environments.

For Debian-based systems (e.g., Ubuntu)

First, add the NVIDIA GPG key and repository:

sh
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
    && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
      sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
      sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

Update the package list:

sh
sudo apt-get update

Install the NVIDIA Container Toolkit:

sh
sudo apt-get install -y nvidia-container-toolkit

For RPM-based systems (e.g., CentOS, RHEL)

Add the NVIDIA repository:

sh
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
  sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo

Install the NVIDIA Container Toolkit:

sh
sudo yum install -y nvidia-container-toolkit

Configuring Docker

After installing the NVIDIA Container Toolkit, configure Docker to use it as a container runtime. This configuration is what allows Docker containers to access the NVIDIA GPUs on your system.

Edit the Docker daemon configuration file

The Docker daemon reads its configuration from /etc/docker/daemon.json by default. To enable GPU support, this file must register nvidia as a container runtime.
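
If you want to inspect or edit the file by hand, a correctly configured /etc/docker/daemon.json typically contains an entry like the following (the path assumes nvidia-container-runtime is installed in a directory on the daemon's PATH, which is the toolkit's default):

```json
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```

If your daemon.json already has other settings, merge this "runtimes" key into the existing object rather than replacing the file.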

Configure the NVIDIA Container Toolkit runtime

To simplify the process, you can use the nvidia-ctk command-line tool to automatically configure Docker to use the NVIDIA runtime. This command updates the Docker configuration file for you:

sh
sudo nvidia-ctk runtime configure --runtime=docker

This command sets up Docker to recognize nvidia as a valid runtime option, allowing containers to be launched with GPU support.

Restart the Docker daemon

After making changes to the Docker configuration, restart the Docker service to apply these changes:

sh
sudo systemctl restart docker

Restarting Docker ensures that it loads the new configuration, enabling GPU support for your containers. Once restarted, you can verify that everything is set up correctly by running a test container with GPU access.
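
Once the daemon is back up, you can confirm that the nvidia runtime was registered before launching any containers. This check assumes the Docker CLI is on your PATH and the daemon is running:

```sh
# List the runtimes known to the Docker daemon;
# "nvidia" should appear alongside the default "runc".
docker info --format '{{json .Runtimes}}'
```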

Verifying the installation

To verify that the NVIDIA Container Toolkit is working correctly, run a test container:

sh
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

If successful, you should see output similar to this:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10    Driver Version: 535.86.10    CUDA Version: 12.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

This output confirms that the NVIDIA GPU is accessible within the Docker container.
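
As a further check that the CUDA userspace libraries are usable inside containers (not just the driver), you can run nvidia-smi from one of NVIDIA's official CUDA images. The tag below is an example; substitute any current tag from the nvidia/cuda repository on Docker Hub that matches the CUDA version reported by your driver:

```sh
# Pull an official CUDA base image and run nvidia-smi inside it.
# The image tag is an example; pick one matching your driver's CUDA version.
sudo docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```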