Run Ollama with Docker: CPU, NVIDIA, and AMD GPU support
Learn how to run Ollama with Docker on CPU, NVIDIA, and AMD GPU architectures. Get started with our quick guide and deploy Ollama in minutes.
Image details
Before we dive in, here are some key details about the official Ollama Docker image:
- Base image: The image is built on top of Ubuntu 22.04 LTS, a stable and well-maintained Linux distribution.
- Exposed port: The image exposes port `11434`, which is used to access the Ollama service.
- Entrypoint: The entrypoint of the image is set to `/bin/ollama`, which is the command that runs when the container starts.
- Default command: The default command for the container is `serve`, which starts the Ollama service.
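If you want to verify these details yourself, `docker image inspect` will print them straight from the image metadata. A quick sketch (the `--format` argument uses standard Go template syntax):

```sh
# Pull the image and read its metadata
docker pull ollama/ollama

# Entrypoint, default command, and exposed ports from the image config
docker image inspect ollama/ollama \
  --format 'Entrypoint: {{.Config.Entrypoint}}  Cmd: {{.Config.Cmd}}  Ports: {{.Config.ExposedPorts}}'
```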
Quick start
To get started with Ollama, you can use one of the following commands, depending on your hardware:
CPU
If you're running on a CPU-only machine, you can start the Ollama container with the following command:
```sh
docker run --rm --name ollama -d -v ollama:/root/.ollama -p 11434:11434 ollama/ollama
```
This command creates a new container named `ollama` from the `ollama/ollama` image, maps port `11434` on the host machine to port `11434` in the container, and mounts a volume at `/root/.ollama` to persist data.
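With the container running, you can load a model and start chatting, either through the bundled CLI or over the HTTP API. A minimal sketch, assuming the `llama3.2` model tag (substitute any model available in the Ollama library):

```sh
# Run a model interactively via the CLI inside the container
docker exec -it ollama ollama run llama3.2

# Or call the HTTP API from the host
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?"
}'
```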
NVIDIA GPU
If you have an NVIDIA GPU, you can start the Ollama container with the following command:
```sh
docker run --rm --name ollama --gpus all -d -v ollama:/root/.ollama -p 11434:11434 ollama/ollama
```
This command is similar to the CPU-only command, but it also requests access to all available NVIDIA GPUs using the `--gpus all` flag.
IMPORTANT
Make sure you have the NVIDIA Container Toolkit installed and configured on your system to utilize GPU acceleration. For setup instructions, please refer to our documentation.
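As a rough sketch of that setup on Ubuntu, assuming NVIDIA's apt repository is already configured (steps differ on other distributions, so treat this as illustrative rather than definitive):

```sh
# Install the toolkit and register the NVIDIA runtime with Docker
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Sanity check: the GPU should be visible from inside a container
docker run --rm --gpus all ubuntu nvidia-smi
```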
AMD GPU
If you have an AMD GPU, you can start the Ollama container with the following command:
```sh
docker run --rm --name ollama --device /dev/kfd --device /dev/dri -d -v ollama:/root/.ollama -p 11434:11434 ollama/ollama:rocm
```
This command is similar to the CPU-only command, but it uses the ROCm-enabled `ollama/ollama:rocm` image and requests access to the AMD GPU by passing the `--device` flag to mount the `/dev/kfd` and `/dev/dri` devices.
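Before starting the container, it is worth confirming that the device nodes referenced by the `--device` flags actually exist on the host, which is a quick way to catch a missing ROCm driver (a sketch; group names can vary by distribution):

```sh
# The kernel fusion driver and DRI render nodes must be present
ls -l /dev/kfd /dev/dri

# Your user typically needs to be in the video and render groups
# to access these devices without root
groups
```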
Docker Compose
If you prefer to use Docker Compose, you can define your Ollama service in a `compose.yaml` file. Here are some examples:
CPU
```yaml
services:
  ollama:
    image: ollama/ollama
    ports:
      - 11434:11434
    volumes:
      - ollama:/root/.ollama

volumes:
  ollama: {}
```
This YAML file defines a service named `ollama` that uses the `ollama/ollama` image, maps port `11434` on the host machine to port `11434` in the container, and mounts the `ollama` volume at `/root/.ollama`.
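To use this file, save it as `compose.yaml` in a directory of its own and manage the service with the usual Compose lifecycle commands:

```sh
# Start the service in the background
docker compose up -d

# Follow the logs to confirm Ollama is listening on port 11434
docker compose logs -f ollama

# Stop and remove the container; the named volume is preserved
docker compose down
```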
NVIDIA GPU
```yaml
services:
  ollama:
    image: ollama/ollama
    ports:
      - 11434:11434
    volumes:
      - ollama:/root/.ollama
    environment:
      NVIDIA_VISIBLE_DEVICES: all
    runtime: nvidia

volumes:
  ollama: {}
```
This YAML file is similar to the CPU-only example, but it also sets the `NVIDIA_VISIBLE_DEVICES` environment variable to `all` and uses the `nvidia` runtime to request access to the NVIDIA GPU.
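After bringing the service up, you can check that the GPU is actually visible inside the container. This sketch assumes a standard NVIDIA Container Toolkit installation, where the runtime injects `nvidia-smi` into the container:

```sh
docker compose up -d
docker compose exec ollama nvidia-smi
```

Note that recent Compose versions also support requesting GPUs declaratively via `deploy.resources.reservations.devices`, which avoids hard-coding the runtime.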
AMD GPU
```yaml
services:
  ollama:
    image: ollama/ollama:rocm
    ports:
      - 11434:11434
    volumes:
      - ollama:/root/.ollama
    devices:
      - /dev/kfd
      - /dev/dri

volumes:
  ollama: {}
```
This YAML file is similar to the CPU-only example, but it uses the `ollama/ollama:rocm` image and mounts the `/dev/kfd` and `/dev/dri` devices to request access to the AMD GPU.
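Once the service is up, Ollama's startup logs are a reasonable way to confirm the GPU was detected, since it reports discovered GPUs as it initializes (the exact log lines vary by version):

```sh
docker compose up -d
docker compose logs -f ollama
```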