Run Ollama with Docker: CPU, NVIDIA, and AMD GPU support

Learn how to run Ollama with Docker on CPU-only machines and on NVIDIA and AMD GPUs. Get started with the quick guide below and deploy Ollama in minutes.

Image details

Before we dive in, here are some key details about the official Ollama Docker image:

  • Base image: The image is built on top of Ubuntu 22.04 LTS, a stable and well-maintained Linux distribution.
  • Exposed port: The image exposes port 11434, which is used to access the Ollama service.
  • Entrypoint: The image's entrypoint is /bin/ollama, the binary that runs when the container starts.
  • Default command: The default command is serve, so a container launches /bin/ollama serve and starts the Ollama service.
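
You can verify both of these values yourself with a quick docker image inspect:

sh
docker image inspect ollama/ollama --format '{{.Config.Entrypoint}} {{.Config.Cmd}}'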

Quick start

To get started with Ollama, you can use one of the following commands, depending on your hardware:

CPU

If you're running on a CPU-only machine, you can start the Ollama container with the following command:

sh
docker run --rm --name ollama -d -v ollama:/root/.ollama -p 11434:11434 ollama/ollama

This command starts a detached (-d) container named ollama from the ollama/ollama image, removes it automatically when it stops (--rm), maps port 11434 on the host to port 11434 in the container, and mounts a named volume called ollama at /root/.ollama so downloaded models persist across container restarts.
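
Once the container is up, you can pull and chat with a model through docker exec (llama3 here is just an example model name; any model from the Ollama library works):

sh
docker exec -it ollama ollama run llama3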

NVIDIA GPU

If you have an NVIDIA GPU, you can start the Ollama container with the following command:

sh
docker run --rm --name ollama --gpus all -d -v ollama:/root/.ollama -p 11434:11434 ollama/ollama

This command is similar to the CPU-only command, but it also requests access to all available NVIDIA GPUs using the --gpus all flag.

IMPORTANT

Make sure you have the NVIDIA Container Toolkit installed and configured on your system to utilize GPU acceleration. For setup instructions, please refer to our documentation.
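
If you want to confirm the toolkit is working before starting Ollama, a common smoke test is to run nvidia-smi in a throwaway container (the CUDA image tag below is only an example; pick one that matches your driver):

sh
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi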

AMD GPU

If you have an AMD GPU, you can start the Ollama container with the following command:

sh
docker run --rm --name ollama --device /dev/kfd --device /dev/dri -d -v ollama:/root/.ollama -p 11434:11434 ollama/ollama:rocm

This command is similar to the CPU-only command, but it uses the ollama/ollama:rocm image and passes the /dev/kfd and /dev/dri device nodes through to the container with the --device flag, giving Ollama access to the AMD GPU.
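
Ollama reports the GPUs it detected when it starts, so checking the container logs is a quick way to confirm the ROCm path is working:

sh
docker logs ollama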

Docker Compose

If you prefer to use Docker Compose, you can define your Ollama service in a compose.yaml file. Here are some examples:

CPU

yaml
services:
  ollama:
    image: ollama/ollama
    ports:
      - 11434:11434
    volumes:
      - ollama:/root/.ollama

volumes:
  ollama: {}

This YAML file defines a service named ollama that uses the ollama/ollama image, maps port 11434 on the host machine to port 11434 in the container, and mounts the named volume ollama (declared at the top level) at /root/.ollama.
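
With this saved as compose.yaml, you can start the service in the background and tear it down again with the usual Compose commands:

sh
docker compose up -d
docker compose down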

NVIDIA GPU

yaml
services:
  ollama:
    image: ollama/ollama
    ports:
      - 11434:11434
    volumes:
      - ollama:/root/.ollama
    environment:
      NVIDIA_VISIBLE_DEVICES: all
    runtime: nvidia

volumes:
  ollama: {}

This YAML file is similar to the CPU-only example, but it also sets the NVIDIA_VISIBLE_DEVICES environment variable to all and selects the nvidia runtime to request GPU access; that runtime must be registered with Docker by the NVIDIA Container Toolkit.
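
Note that runtime: nvidia relies on the legacy runtime registration. Recent Docker Compose releases also support GPU reservations through the deploy key; here is a sketch of the equivalent service using that syntax, assuming a Compose version that implements device reservations:

yaml
services:
  ollama:
    image: ollama/ollama
    ports:
      - 11434:11434
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

volumes:
  ollama: {}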

AMD GPU

yaml
services:
  ollama:
    image: ollama/ollama:rocm
    ports:
      - 11434:11434
    volumes:
      - ollama:/root/.ollama
    devices:
      - /dev/kfd
      - /dev/dri

volumes:
  ollama: {}

This YAML file is similar to the CPU-only example, but it uses the ollama/ollama:rocm image and passes the /dev/kfd and /dev/dri device nodes through to the container to give Ollama access to the AMD GPU.
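
Whichever variant you run, you can confirm the service is reachable once the container is up; the root endpoint responds with a short status message:

sh
curl http://localhost:11434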