DevPod on Kubernetes: turning devcontainer.json into a persistent remote workspace
DevPod is an open source workspace manager for reproducible development environments across Docker, Kubernetes, SSH hosts, and several cloud backends. This note documents a full Kubernetes-based remote development setup with DevPod, including persistent volume strategy, custom images, file sync, IDE integration, and the GPU issues that tend to burn the most time.
DevPod, from Loft Labs, separates environment definition from the infrastructure that runs it. The developer describes the environment in devcontainer.json, including the base image, toolchain, ports, and lifecycle hooks. DevPod then creates and manages the matching workspace on the selected Provider.
Three terms matter more than anything else:
- Provider: the infrastructure backend. DevPod supports Docker, Kubernetes, SSH, and several cloud platforms.
- Workspace: an isolated development environment instance, usually backed by a container or VM on the provider.
- devcontainer.json: a Dev Container specification file that defines the image, lifecycle hooks, port forwarding, and editor behavior.
Compared with GitHub Codespaces or Gitpod, DevPod is a client-side tool rather than a hosted platform. On a self-managed Kubernetes cluster, that means you keep control over networking, storage, security policy, and node placement.
When Kubernetes is the provider, DevPod creates a Pod to host the workspace. Most setups end up with three files:
- devcontainer.json, which defines the image, workspace directory, forwarded ports, and lifecycle commands.
- pod-manifest.yaml, which carries the Kubernetes-native parts such as security context, resource limits, and volume mounts.
- An orchestration script such as devpod.sh, which wraps devpod up, file sync, and environment bootstrap. That script is usually the glue that makes the workflow tolerable.
A typical flow looks like this:
```shell
# Create and start the workspace, which creates a Pod on Kubernetes
devpod up . --ide none --provider kubernetes

# Sync local source code to the remote workspace
rsync -az --exclude='node_modules' ./project/ remote:/workspace/project/

# Enter the development environment
devpod ssh my-workspace

# Stop the workspace, which removes the Pod but keeps the PVC
devpod stop my-workspace

# Delete everything, including the Pod and the PVC
devpod delete my-workspace
```
What matters is how devpod stop behaves. It removes the Pod but keeps the PVC. The next devpod up recreates the Pod and reattaches the same volume, so the data survives Pod recreation.
The simplest way to split environments is to keep a separate Pod manifest for each one and switch them in a wrapper script:
```shell
# Example orchestration logic: select a manifest and disk size by environment
case "$ENV" in
  prod) MANIFEST="pod-manifest.yaml";      DISK="300Gi" ;;
  dev)  MANIFEST="pod-manifest-dev.yaml";  DISK="50Gi" ;;
  test) MANIFEST="pod-manifest-test.yaml"; DISK="500Gi" ;;
esac

devpod up . --ide none \
  --provider kubernetes \
  --provider-option DISK_SIZE="$DISK" \
  --provider-option POD_MANIFEST="$MANIFEST"
```
This lets each environment define its own node selectors, quotas, and security policy while still sharing one devcontainer.json and one base image.
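The case statement above has no fallback branch, so a mistyped environment name would leave both variables empty. One way to guard against that is a small lookup function that fails loudly. This is a sketch in POSIX shell; the function name select_env is an assumption, and the manifest names and sizes are simply the ones used above.

```shell
# Hypothetical helper: map an environment name to "<manifest> <disk size>".
# Returns 1 and prints to stderr when the name is not recognized.
select_env() {
  case "$1" in
    prod) echo "pod-manifest.yaml 300Gi" ;;
    dev)  echo "pod-manifest-dev.yaml 50Gi" ;;
    test) echo "pod-manifest-test.yaml 500Gi" ;;
    *)    echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# Usage inside a wrapper script (not executed here):
#   selection=$(select_env "$ENV") || exit 1
#   MANIFEST=${selection% *}   # text before the space
#   DISK=${selection#* }       # text after the space
```

Keeping the mapping in one function also gives the wrapper a single place to add a new environment.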
Where you mount the PVC decides what survives a Pod rebuild.
Mount the PVC at the container's $HOME, for example /root. In most setups, that is the least painful option. There are a few reasons to prefer it:
- The IDE server side, such as VS Code Server or Cursor Server, installs itself under ~/.vscode-server or ~/.cursor-server. Those directories land on persistent storage automatically.
- Toolchain state such as ~/.nvm and ~/.local/bin does not need extra symlink work.
- Shell files such as ~/.bashrc also persist, so environment setup happens once instead of on every Pod restart.
If the PVC is mounted somewhere else, such as /workspace, you usually end up adding symlinks or reinstalling tooling whenever the Pod comes back.
```text
/root/                        # PVC mount point = $HOME
├── .cursor-server/           # IDE server and extensions, persistent
│   ├── cli/                  # Server binaries, disposable
│   └── extensions/           # Installed extensions, keep these
├── .nvm/                     # Node.js version manager, persistent
├── .local/bin/               # kubectl and other tools, persistent
├── .bashrc                   # Shell configuration, persistent
├── Projects/
│   ├── my-project/           # Project source code
│   └── shared-libs/          # Shared libraries
└── .config/                  # Tool configuration
```
DevPod manages the whole workspace lifecycle through the devpod CLI. These are the commands that tend to matter in daily use.
Add and configure the provider first:
```shell
# Add the Kubernetes provider
devpod provider add kubernetes

# List configured providers
devpod provider list

# Set provider options such as the namespace and Pod manifest path
devpod provider set-options kubernetes \
  --option KUBERNETES_NAMESPACE=devpod \
  --option POD_MANIFEST=pod-manifest.yaml
```
```shell
# Create and start a workspace
# --ide none skips automatic IDE attach and works well in scripts
devpod up . --provider kubernetes --ide none

# List workspace state
devpod list

# SSH into the workspace
devpod ssh my-workspace

# Stop the workspace, which removes the Pod and keeps the PVC
devpod stop my-workspace

# Delete the workspace and the PVC
devpod delete my-workspace
```
stop only removes the Pod. Everything on the PVC, including extensions, toolchain state, and source code, stays in place. The next up recreates the Pod and reattaches the volume, so the environment comes back quickly.
The Kubernetes provider accepts extra parameters through --provider-option:
```shell
devpod up . --provider kubernetes --ide none \
  --provider-option DISK_SIZE=100Gi \
  --provider-option POD_MANIFEST=pod-manifest-test.yaml \
  --provider-option KUBERNETES_NAMESPACE=devpod
```
| Option | Description |
| --- | --- |
| DISK_SIZE | PVC size, for example 50Gi or 300Gi. |
| POD_MANIFEST | Path to the custom Pod manifest. |
| KUBERNETES_NAMESPACE | Target namespace for workspace Pods. |
```shell
# Show detailed workspace status
devpod status my-workspace

# Inspect the underlying Pod directly
kubectl get pod -n devpod -l app=devpod

# Show Pod events when startup fails
kubectl describe pod my-workspace -n devpod
```
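Scripts that run these commands right after devpod up can race the Pod, which may not be reachable yet. A generic retry helper is one way to wait. The sketch below is plain POSIX shell and not part of DevPod; the function name retry and its argument order are assumptions.

```shell
# retry <attempts> <delay_seconds> <command...>
# Runs the command until it succeeds or the attempts are used up.
# Note: POSIX sh has no local variables, so attempts, delay, and n are global.
retry() {
  attempts=$1; delay=$2; shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "gave up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage (not executed here): wait up to about 2.5 minutes for SSH to answer
#   retry 30 5 ssh my-workspace.devpod true
```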
devcontainer.json is the core Dev Container file. It defines the image, lifecycle hooks, forwarded ports, editor customization, and the rest of the workspace contract. DevPod fully supports that specification. The file usually lives at .devcontainer/devcontainer.json.
A full example for remote development on Kubernetes:
```jsonc
{
  "name": "my-workspace",

  // Use a custom image with all tools preinstalled
  "image": "registry.example.com/dev/ubuntu:22.04-tools",

  // Skip first-run installation work
  "onCreateCommand": "true",

  // Mount the PVC at $HOME so IDE state and extensions persist
  // workspaceMount is left empty on purpose. DevPod v0.6.x has a known
  // .devpodignore issue, so large monorepos can get uploaded in full.
  "workspaceFolder": "/root",

  "customizations": {
    "vscode": {
      "extensions": [
        "ms-python.python",
        "ms-python.vscode-pylance",
        "ms-python.debugpy",
        "redhat.vscode-yaml",
        "ms-kubernetes-tools.vscode-kubernetes-tools"
      ],
      "settings": {
        "python.defaultInterpreterPath": "/usr/local/bin/python",
        "editor.formatOnSave": true,
        "terminal.integrated.defaultProfile.linux": "bash"
      }
    }
  },

  "forwardPorts": [8000, 8080, 5432, 6379],
  "portsAttributes": {
    "8000": { "label": "API Server" },
    "8080": { "label": "Web UI" },
    "5432": { "label": "PostgreSQL", "onAutoForward": "silent" },
    "6379": { "label": "Redis", "onAutoForward": "silent" }
  },
  "otherPortsAttributes": { "onAutoForward": "silent" }
}
```
You can define the base container either by pointing directly at an image or by building one from a Dockerfile.
The image field accepts any OCI image, including Docker Hub, GHCR, or a private registry. For remote development on Kubernetes, a prebuilt image usually saves trouble. Baking the whole toolchain into the image cuts startup time from minutes to seconds.
If you need to customize the image, use build:
```json
{
  "build": {
    "dockerfile": "Dockerfile",
    "context": "..",
    "args": {
      "PYTHON_VERSION": "3.11"
    }
  }
}
```
context defaults to ".", which means the directory that contains devcontainer.json. Setting it to ".." lets the Dockerfile reference files from the project root.
workspaceFolder is the directory the IDE opens by default after it connects. On Kubernetes, it usually makes sense to point it at the PVC mount, for example /root, so the workspace path and the persistent path are the same thing.
workspaceMount controls how local source code gets mounted into the container. It is useful in local Docker workflows. In remote Kubernetes workflows, it is often better to leave it empty. DevPod v0.6.x has a known issue in #1885 where .devpodignore can be ignored during streaming upload, which means a large workspace, including venv and node_modules, can get pushed in full. A custom rsync step gives you much better control.
The Dev Container spec defines six lifecycle hooks, in this order:
```text
initializeCommand      # runs on the host, every startup
    ↓
onCreateCommand        # runs once after first container creation
    ↓
updateContentCommand   # runs after content updates, at least once
    ↓
postCreateCommand      # runs after user assignment, user secrets available
    ↓
postStartCommand       # runs after each container start
    ↓
postAttachCommand      # runs after each IDE attach
```
Each hook accepts three forms:
- String: executed through /bin/sh.
- Array: executed directly without a shell, which is safer.
- Object: multiple named commands executed in parallel, useful when several services need to start together.
```json
{
  "postAttachCommand": {
    "api-server": "cd /root/api && python -m uvicorn main:app --port 8000",
    "worker": "cd /root/worker && python -m celery -A tasks worker"
  }
}
```
A few practical rules help here:
- If all tools are already in the image, set onCreateCommand to "true" and skip it.
- postStartCommand is a good place for startup checks or light warmup.
- The waitFor field decides which phase must finish before the IDE attaches. The default is "updateContentCommand".
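As an illustration of a light startup check for postStartCommand, the sketch below reports tools that are missing from PATH. It is a minimal POSIX shell example; the function name missing_tools and the tool list in the usage comment are assumptions, not something the Dev Container spec or DevPod provides.

```shell
# missing_tools <tool...>
# Prints each tool that is not found on PATH, one per line.
# Returns 0 when everything is present, 1 otherwise.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (not executed here), for example from a script that
# postStartCommand calls:
#   missing_tools python3 git rsync kubectl || echo "image is missing tools" >&2
```

A check like this stays cheap enough to run on every container start, and it surfaces an outdated image before the first build fails.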
You can declare extensions and settings under customizations.vscode, and they are applied automatically when the IDE connects:
```json
"customizations": {
  "vscode": {
    "extensions": [
      "ms-python.python",
      "ms-python.vscode-pylance",
      "ms-python.debugpy",
      "redhat.vscode-yaml",
      "ms-kubernetes-tools.vscode-kubernetes-tools"
    ],
    "settings": {
      "python.defaultInterpreterPath": "/usr/local/bin/python",
      "editor.formatOnSave": true,
      "terminal.integrated.defaultProfile.linux": "bash"
    }
  }
}
```
Extensions listed under extensions install automatically on first attach. With the PVC mounted at $HOME, you only pay that cost once. Settings defined here override local editor settings, which helps keep behavior consistent across a team.
Ports listed in forwardPorts are forwarded automatically after the IDE connects. When a service starts inside the container, you can usually reach it on localhost from your own machine without extra setup.
portsAttributes lets you define a label and behavior per port:
```json
"forwardPorts": [8000, 8080, 5432, 6379],

"portsAttributes": {
  "8000": { "label": "API Server" },
  "8080": { "label": "Web UI", "onAutoForward": "openBrowser" },
  "5432": { "label": "PostgreSQL", "onAutoForward": "silent" },
  "6379": { "label": "Redis", "onAutoForward": "silent" }
},

"otherPortsAttributes": { "onAutoForward": "silent" }
```
onAutoForward controls what happens when a port is forwarded automatically: "notify" shows a notification, "openBrowser" opens a browser, "silent" forwards quietly, and "ignore" does not forward the port at all. otherPortsAttributes sets the default for ports you did not list explicitly.
The Dev Container spec splits environment variables into two layers:
- containerEnv: set on the container itself, visible to all processes, and fixed for the life of that container.
- remoteEnv: only visible to IDE-launched processes such as terminals, tasks, and debuggers. This layer can reference ${containerEnv:VAR} and does not require a container rebuild when changed.
```json
{
  "containerEnv": {
    "PYTHONPATH": "/root/libs/common:/root/libs/shared"
  },
  "remoteEnv": {
    "PATH": "${containerEnv:PATH}:/root/.local/bin"
  }
}
```
Both fields also support ${localEnv:VAR}, which reads an environment variable from the host, for example ${localEnv:HOME}.
Dev Container Features are reusable units of installation code and configuration, distributed as OCI artifacts. The features field lets you add tools without editing the base image directly:
```json
{
  "features": {
    "ghcr.io/devcontainers/features/docker-in-docker:2": {},
    "ghcr.io/devcontainers/features/kubectl-helm-minikube:1": {
      "version": "latest"
    },
    "ghcr.io/devcontainers/features/node:1": {
      "version": "22"
    }
  }
}
```
You can browse the available features at containers.dev/features. For Kubernetes-based remote development, though, baking tools into the base image is usually better than paying installation time on every new workspace. Features fit local Docker prototypes better than long-lived remote workspaces.
A few fields change how the container behaves at runtime:
| Field | Default | Description |
| --- | --- | --- |
| overrideCommand | true | Overrides the image command with an infinite loop so the container stays alive. This default usually makes sense for custom development images. |
| shutdownAction | stopContainer | What happens when the IDE closes. Options include stopContainer and none. For Kubernetes, none is often the better choice. |
| init | false | When true, runs an init process (tini) as PID 1 to reap zombie processes. |
| privileged | false | Enables privileged mode. In Docker workflows this can be set here. In Kubernetes, it belongs in the Pod manifest. |
| containerUser | root or the Dockerfile USER | The user for all container operations. |
| remoteUser | same as containerUser | The user for IDE terminals and tasks. It can differ from containerUser. |
String values in devcontainer.json can use these predefined variables:
| Variable | Meaning |
| --- | --- |
| ${localEnv:VAR_NAME} | Host environment variable, with optional default value syntax: ${localEnv:VAR:default} |
| ${containerEnv:VAR_NAME} | Container environment variable, available only inside remoteEnv |
| ${localWorkspaceFolder} | Workspace path on the host |
| ${containerWorkspaceFolder} | Workspace path inside the container |
| ${devcontainerId} | Stable unique identifier for the container |
You can point the Dev Container image field at any public image, but for remote development on Kubernetes it is usually worth building a dedicated base image with the toolchain, language runtimes, and system libraries locked into image layers.
That pays off in a few ways:
- The Pod is usable as soon as it starts. You do not wait for onCreateCommand to install half the environment.
- Environment consistency improves because everyone shares the same image instead of replaying installation steps in slightly different conditions.
- When the Pod is rebuilt, the toolchain comes back with it. You are not depending on package manager availability at workspace creation time.
Good layering makes build caching much more effective. Put low-churn tools in lower layers and faster-moving pieces in upper layers. End each RUN block with apt-get clean && rm -rf /var/lib/apt/lists/* to keep layers smaller, and use --no-install-recommends to avoid pulling in packages you do not need.
The following example builds a development image with Python 3.11, common system tools, and the NVIDIA CUDA runtime:
```dockerfile
FROM ubuntu:22.04
ENV DEBIAN_FRONTEND=noninteractive

# Layer 1: system tools, Python 3.11, and all PPAs
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        software-properties-common gnupg2 wget curl ca-certificates && \
    add-apt-repository -y ppa:deadsnakes/ppa && \
    add-apt-repository -y ppa:graphics-drivers/ppa && \
    wget -qO /tmp/cuda-keyring.deb \
        https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb && \
    dpkg -i /tmp/cuda-keyring.deb && rm /tmp/cuda-keyring.deb && \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        python3.11 python3.11-venv python3.11-dev python3-pip \
        git make vim jq postgresql-client \
        openssh-server procps iproute2 iputils-ping \
        rsync htop telnet && \
    update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 1 && \
    update-alternatives --install /usr/bin/python python /usr/bin/python3.11 1 && \
    apt-get clean && rm -rf /var/lib/apt/lists/*

# Layer 2: NVIDIA driver tools such as nvidia-smi
RUN apt-get update && \
    apt-get install -y --no-install-recommends nvidia-utils-580-server && \
    apt-get clean && rm -rf /var/lib/apt/lists/*

# Layer 3: CUDA runtime libraries
RUN apt-get update && \
    apt-get install -y --no-install-recommends cuda-libraries-12-8 && \
    apt-get clean && rm -rf /var/lib/apt/lists/*
```
Several design choices here matter:
- Add PPAs and GPG keys in layer 1, before update-alternatives changes the default Python. If you switch Python first, add-apt-repository can fail with No module named 'apt_pkg' because the apt_pkg binding expects the system Python.
- Keep NVIDIA tools and CUDA libraries in separate layers. That way a driver update only rebuilds one layer.
- Install nvidia-utils-xxx-server, not nvidia-utils-xxx. On Ubuntu, the latter can be a transitional dummy package without the actual nvidia-smi binary.
- Pick cuda-libraries-12-8, roughly 1.2 GB, instead of the full cuda-toolkit-12-8, which is closer to 10 GB. Most development environments need the runtime more often than the full compiler and debugger stack.
Once the image already contains the full toolchain, devcontainer.json becomes much simpler:
```json
{
  "image": "registry.example.com/dev/ubuntu:22.04-cuda12.8",
  "onCreateCommand": "true",
  "workspaceFolder": "/root"
}
```
Setting onCreateCommand to "true" means there is nothing left to install at first startup. The Pod is ready immediately after creation.
The Pod manifest is the core Kubernetes-side configuration. It controls the things DevPod cannot express through devcontainer.json.
DevPod renders the Pod manifest as a template before it creates the Pod. These placeholders are commonly used:
| Variable | Meaning |
| --- | --- |
| {{.WorkspaceId}} | Workspace name, often reused as the Pod name and label value. |
| {{.Image}} | Image declared in devcontainer.json. |
Remote development containers often need looser permissions than production containers. These are the settings that come up most often:
| Setting | Use | Risk |
| --- | --- | --- |
| privileged: true | Docker-in-Docker, device access, debugging tools | Full access to host kernel capabilities |
| SYS_ADMIN | mount and cgroup operations | Medium |
| SYS_PTRACE | strace, gdb, and similar debugging | Low |
| NET_ADMIN | Network debugging and iptables work | Medium |
| hostNetwork: true | Direct use of the host network stack, which avoids CNI overhead | Port conflicts and loss of network isolation |
| hostPID: true | Inspect host processes for system-level debugging | Loss of process isolation |
Loosen permissions only where the workspace actually needs them, and keep these Pods isolated to dedicated namespaces or nodes so they do not interfere with production workloads.
```yaml
resources:
  requests:
    cpu: "500m"
    memory: "1Gi"
  limits:
    cpu: "16"
    memory: "64Gi"
```
Set requests low enough to keep scheduling realistic, and limits high enough to leave room for bursts. Development environments rarely sit at peak usage all day, but builds and test runs can spike hard for a short time.
DevPod includes a built-in sync path through devpod up, and it works fine for small projects. On large multi-repo workspaces, with dozens of subprojects and millions of files, two problems show up fast:
- The first sync can take a very long time, and exclusion control is limited.
- DevPod may try to upload the entire workspaceFolder, including directories you do not want remotely, such as node_modules and .git.
The usual way around this is to launch DevPod with --ide none, skip automatic sync, and then run your own rsync command with explicit include and exclude rules.
Even with --ide none, DevPod still tries to sync the local directory that matches workspaceFolder during devpod up. If that directory is large, the initial startup can still crawl. One workaround is to create a temporary empty directory and use that for the initial workspace creation:
```shell
STUB_DIR=$(mktemp -d)
devpod up "$STUB_DIR" --ide none --provider kubernetes ...
rm -rf "$STUB_DIR"
# Then sync the real source tree with rsync
```
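If the workspace command fails, the three-line version above can leave the temporary directory behind or lose the exit code. A wrapper that always cleans up and passes the status through might look like the sketch below. The function name with_stub_dir is an assumption, and it relies on the wrapped command accepting the directory as its last argument.

```shell
# with_stub_dir <command...>
# Creates an empty temporary directory, runs the command with that
# directory appended as the final argument, removes the directory,
# and returns the command's exit status.
with_stub_dir() {
  stub_dir=$(mktemp -d) || return 1
  "$@" "$stub_dir"
  rc=$?
  rm -rf "$stub_dir"
  return "$rc"
}

# Usage (not executed here); any further flags go before the directory:
#   with_stub_dir devpod up --ide none --provider kubernetes
```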
```shell
SSH_CMD="ssh my-workspace.devpod"

rsync -az \
  --exclude='node_modules' \
  --exclude='.git' \
  --exclude='__pycache__' \
  --exclude='venv' \
  --exclude='.venv' \
  --exclude='dist' \
  --exclude='.next' \
  --exclude='.temp' \
  --exclude='.logs' \
  --exclude='.vscode/sessions.json' \
  --copy-unsafe-links \
  ./my-project/ my-workspace.devpod:/root/Projects/my-project/
```
The most useful flags here are:
- -az: archive mode plus compression. Do not add --progress when you have a large number of small files. The extra output can slow the SSH stream badly enough to trigger a broken pipe.
- --copy-unsafe-links: dereferences symlinks that point outside the synced tree. In multi-repo setups this is useful because shared directories linked from elsewhere often do not resolve correctly on the remote side.
- --exclude: keep anything noisy or disposable out of the remote workspace. .vscode/sessions.json changes constantly and tends to fight with remote state, so it should stay out.
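When several projects share the same exclusion rules, the flags described above can be assembled in one place. The following is a sketch in POSIX shell, which has no arrays, so it rebuilds the positional parameters instead. The function name sync_tree and the RSYNC override variable are assumptions made for this example.

```shell
# sync_tree <src_dir> <dest> [exclude_pattern...]
# Turns each pattern into --exclude=<pattern> and runs rsync -az with
# --copy-unsafe-links. Set RSYNC to another command name to inspect
# the arguments without transferring anything.
sync_tree() {
  src=$1; dest=$2; shift 2
  # Rotate through the remaining parameters: append the rewritten
  # pattern at the end, then drop the original from the front.
  count=$#
  while [ "$count" -gt 0 ]; do
    set -- "$@" "--exclude=$1"
    shift
    count=$((count - 1))
  done
  "${RSYNC:-rsync}" -az "$@" --copy-unsafe-links "${src%/}/" "${dest%/}/"
}

# Usage (not executed here):
#   sync_tree ./my-project my-workspace.devpod:/root/Projects/my-project \
#     node_modules .git __pycache__ venv .venv dist
```

The trailing slash is normalized on both paths so rsync copies the directory contents rather than nesting the directory one level deeper.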
VS Code and Cursor run remote development by installing a server-side component inside the container. The local editor talks to that server through SSH.
The server build has to match the local IDE version, usually by commit hash. The installation flow is usually:
- Read the current commit hash from the local IDE.
- Download the matching server bundle.
- Transfer and unpack it to ~/.cursor-server/cli/servers/Stable-{commit}/ on the remote side.
The wrapper script should make installation idempotent:
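```shell
# get_ide_commit_hash and install_ide_server are placeholders for your own helpers
COMMIT=$(get_ide_commit_hash)
# \$HOME is escaped so it expands on the remote side, not on the local machine
SERVER_BIN="\$HOME/.cursor-server/cli/servers/Stable-$COMMIT/server/bin/code-server"

if $SSH_CMD "test -x $SERVER_BIN"; then
  echo "Server already installed"
else
  # Download and install the server
  install_ide_server "$COMMIT"
fi
```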
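The get_ide_commit_hash helper is left undefined in the snippet above. One possible shape for it, assuming the local editor CLI prints version, commit, and architecture on three separate lines the way code --version does, is to take the second line and validate it. The function name commit_from_version_output is an assumption, and whether a given Cursor build prints the same layout should be checked locally.

```shell
# commit_from_version_output
# Reads "<editor> --version" output on stdin and prints the second line,
# which is assumed to be the commit hash. Fails if that line is empty or
# contains anything other than lowercase hex digits.
commit_from_version_output() {
  commit=$(sed -n '2p')
  case "$commit" in
    ""|*[!0-9a-f]*)
      echo "unexpected --version output" >&2
      return 1
      ;;
  esac
  echo "$commit"
}

# Usage (not executed here):
#   get_ide_commit_hash() { cursor --version | commit_from_version_output; }
```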
Extensions live under ~/.cursor-server/extensions/ or ~/.vscode-server/extensions/. If the PVC is mounted at $HOME, those extensions persist automatically.
A common mistake is wiping the whole ~/.cursor-server directory during a server reinstall. That blows away every installed extension. The safer cleanup target is the server binary directory only:
```shell
# Wrong: removes extensions too
rm -rf ~/.cursor-server

# Right: remove only the server binaries
rm -rf ~/.cursor-server/cli
```
When you first prepare a remote environment, it can be faster to sync already installed local extensions than to redownload everything from the marketplace:
```shell
rsync -az \
  ~/.cursor-server/extensions/ \
  my-workspace.devpod:~/.cursor-server/extensions/
```
After the sync, check for broken symlinks. Some extensions include links to a local Node.js path that does not exist remotely. Those need to be replaced with real files.
```shell
# Find broken symlinks on the remote side
find ~/.cursor-server/extensions/ -type l ! -exec test -e {} \; -print

# Replace each broken link with a real copy of the target file
# fetched from the local machine
```
DevPod SSH sessions can drop when left idle because the server's SSH daemon, a firewall, or a load balancer between the client and the pod times out the connection. The standard fix is to enable SSH keepalive on the client side:
```text
# ~/.ssh/config
Host *.devpod
    ServerAliveInterval 60
    ServerAliveCountMax 10
```
ServerAliveInterval 60 sends a keepalive packet every 60 seconds. ServerAliveCountMax 10 allows up to 10 consecutive missed responses before the client closes the connection. That combination keeps the tunnel alive through typical idle timeouts and handles pauses of up to roughly 10 minutes.
For sessions opened through scripts, add the options as flags:
```shell
ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=10 my-workspace.devpod
```
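For wrapper scripts that open several SSH sessions, the two values can live in one place, together with the arithmetic behind the "roughly 10 minutes" figure: 60 seconds times 10 missed replies is 600 seconds. The helper names below are assumptions; the ssh options themselves are the ones shown above.

```shell
# keepalive_opts <interval_seconds> <max_missed>
# Prints the matching ssh -o flags on one line.
keepalive_opts() {
  echo "-o ServerAliveInterval=$1 -o ServerAliveCountMax=$2"
}

# idle_budget_seconds <interval_seconds> <max_missed>
# Approximate time an unresponsive link is tolerated before ssh gives up.
idle_budget_seconds() {
  echo $(( $1 * $2 ))
}

# Usage (not executed here); the substitution is left unquoted on purpose
# so the flags split into separate words:
#   ssh $(keepalive_opts 60 10) my-workspace.devpod
```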
For Cursor remote connections, the keepalive must be in ~/.ssh/config rather than a command flag, because Cursor manages the underlying SSH process itself and does not expose extra flags to the user.
The first IDE attach to a fresh workspace often takes anywhere from 30 seconds to several minutes because the IDE still has to:
- Establish the SSH tunnel, which adds some overhead through DevPod's SSH layer.
- Download and install the server if it is not already present.
- Initialize the installed extensions.
Later connections are much faster because the server and extensions are already sitting on the PVC.
GPU access on Kubernetes depends on several moving parts, including the host driver, the device plugin, and the container runtime hook. If any one of them is wrong, the container will come up without usable GPU devices.
The NVIDIA Device Plugin runs as a DaemonSet on GPU nodes and registers the extended resource nvidia.com/gpu with Kubernetes. A Pod requests GPUs by declaring the count in resources.limits:
```yaml
resources:
  limits:
    nvidia.com/gpu: "4"
  requests:
    nvidia.com/gpu: "4"
```
The scheduler places the Pod on a node with enough GPU capacity, and the device plugin injects the actual device nodes such as /dev/nvidia0.
Requesting GPU resources is not enough. Kubernetes also has to know which container runtime class should handle GPU device setup. That happens through the Pod's runtimeClassName field:
```yaml
spec:
  runtimeClassName: nvidia
  containers:
    - name: devpod
      # ...
```
If you omit runtimeClassName, the Pod may still get GPU quota, but the runtime will not call NVIDIA's prestart hook. The result is simple: no /dev/nvidia* devices inside the container. This is one of the most common failure modes.
privileged: true does not mean AppArmor is unconfined. On nodes with AppArmor enabled, a privileged container can still be blocked by the default profile, such as cri-containerd.apparmor.d, when it tries to access GPU device nodes.
The fix is to declare an unconfined AppArmor profile in the Pod annotations:
```yaml
metadata:
  annotations:
    container.apparmor.security.beta.kubernetes.io/devpod: unconfined
```
Here devpod is the container name. The part of the annotation key after the slash must match the container name exactly.
It is tempting to set NVIDIA_VISIBLE_DEVICES=all in the Pod manifest to expose every GPU. In a setup that already uses runtimeClassName: nvidia, that usually backfires. A manually set value can interfere with the device plugin's own injection logic.
The NVIDIA container runtime behaves like this:
- If NVIDIA_VISIBLE_DEVICES comes from the device plugin, the runtime mounts exactly the devices that value names.
- If the manifest hardcodes NVIDIA_VISIBLE_DEVICES=all, that value overrides the plugin-managed one and can break the mapping step.
The safer approach is to leave NVIDIA_VISIBLE_DEVICES alone and let the device plugin manage it. Keeping NVIDIA_DRIVER_CAPABILITIES=all is fine if the container needs full driver capability access.
nvidia-smi is the fastest way to confirm GPU visibility. There is one common trap when you install it inside the container: on some Linux distributions, packages named nvidia-utils-xxx are only transitional dummy packages. They install successfully but do not include the real nvidia-smi binary.
On Ubuntu 22.04, the reliable path is:
- Add ppa:graphics-drivers/ppa.
- Install nvidia-utils-xxx-server, with the -server suffix.
If changing the image is inconvenient, one temporary workaround is to mount host driver tools and libraries into the container with hostPath:
```yaml
volumeMounts:
  - name: host-root
    mountPath: /host
    readOnly: true
volumes:
  - name: host-root
    hostPath:
      path: /
```
After startup, add /host/usr/lib/x86_64-linux-gnu to LD_LIBRARY_PATH and call /host/usr/bin/nvidia-smi directly. It works, but it is still a workaround. The long-term fix is to bake the required driver tools into the image.
If nvidia-smi returns Failed to initialize NVML: Unknown Error, check things in this order:
- AppArmor. Confirm the Pod annotation is unconfined, and inspect the actual container profile with cat /proc/1/attr/current.
- Device nodes. Check whether ls /dev/nvidia* returns anything. If the files exist but opening them returns EPERM, the cgroup device filter is the problem, not the driver.
- Runtime class. Confirm the Pod spec sets runtimeClassName: nvidia and that the cluster actually has that RuntimeClass.
- Environment variables. Verify that NVIDIA_VISIBLE_DEVICES was not overridden manually.
- Driver versions. Make sure the user-space NVIDIA libraries in the container are compatible with the host kernel driver.
- containerd privileged_without_host_devices. If the cluster uses nvidia-container-runtime as the default runtime and this flag is false, privileged pods that do not request nvidia.com/gpu will see device files in /dev but be blocked by the eBPF cgroup program. See the next section.
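The checks in this list that can be made from inside the container fit in a short script. The sketch below covers the AppArmor profile, the device nodes, and the environment variable; the runtime class and the containerd flag have to be checked from outside the Pod. The function name gpu_preflight and its two optional path arguments are assumptions, and the arguments exist only so the function can be pointed at test paths.

```shell
# gpu_preflight [dev_dir] [apparmor_attr_file]
# Prints one line per check. Defaults point at the real locations.
gpu_preflight() {
  dev_dir=${1:-/dev}
  attr_file=${2:-/proc/1/attr/current}

  # 1. AppArmor profile applied to PID 1 in this container
  if [ -r "$attr_file" ]; then
    echo "apparmor: $(cat "$attr_file")"
  else
    echo "apparmor: not readable"
  fi

  # 2. NVIDIA device nodes
  if ls "$dev_dir"/nvidia* >/dev/null 2>&1; then
    echo "devices: present"
  else
    echo "devices: missing"
  fi

  # 3. Value set by the device plugin or overridden in the manifest
  echo "NVIDIA_VISIBLE_DEVICES: ${NVIDIA_VISIBLE_DEVICES:-unset}"
}
```

Run without arguments inside the workspace, it gives a quick snapshot to paste into a bug report before digging into node-level configuration.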
A development pod sometimes skips the nvidia.com/gpu resource request entirely, for example when the node already runs an inference service and the workspace wants to share the hardware without holding a scheduler slot. That approach works until the cluster enables GPU time-slicing.
A common part of the time-slicing setup is replacing the default containerd runc binary with nvidia-container-runtime:
```toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  BinaryName = "nvidia-container-runtime"
```
When this is in place and the containerd setting privileged_without_host_devices is false, privileged containers no longer inherit host /dev automatically. The nvidia-container-runtime attaches an eBPF cgroup program that controls device access. For pods without a device plugin allocation, that program blocks /dev/nvidiactl and friends at the open syscall level.
The device files appear in ls /dev because the runtime still creates their directory entries. But opening them returns EPERM, and NVML fails immediately:
```python
>>> import os; os.open('/dev/nvidiactl', os.O_RDWR)
PermissionError: [Errno 1] Operation not permitted: '/dev/nvidiactl'
```
The error is at the kernel cgroup level, not in the userspace library. The same block applies even if you run the host's own nvidia-smi binary via chroot /host nvidia-smi because the eBPF program acts on any process in that container's cgroup.
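The same probe can be done from a shell when Python is not available in the image. This is a minimal sketch; the function name probe_device is an assumption. Unlike the Python call, the shell cannot report the errno, so it only separates "the file is not there" from "the file is there but cannot be opened".

```shell
# probe_device <path>
# Prints "missing", "open-failed", or "ok". Returns 0 only for "ok".
probe_device() {
  if [ ! -e "$1" ]; then
    echo "missing"
    return 1
  fi
  # Open read-write inside a subshell so a failed redirection
  # cannot abort the calling script.
  if ( exec 3<>"$1" ) 2>/dev/null; then
    echo "ok"
  else
    echo "open-failed"
    return 1
  fi
}

# Usage (not executed here):
#   probe_device /dev/nvidiactl
```

A result of open-failed on a node that exists is consistent with the cgroup block described here, though a plain file-permission problem produces the same output.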
The fix is one line in /etc/containerd/config.toml:
```toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  privileged_without_host_devices = true
```
After updating the file, containerd must be restarted. If you cannot SSH directly to the node, a temporary privileged Job that mounts the host root is the standard approach. Use an image that is already cached on the node and avoid pulling from a registry:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: containerd-restart
  namespace: kube-system
spec:
  ttlSecondsAfterFinished: 60
  template:
    spec:
      restartPolicy: Never
      nodeSelector:
        kubernetes.io/hostname: "your-node"
      hostPID: true
      hostNetwork: true
      containers:
        - name: restart
          image: your-cached-image
          imagePullPolicy: IfNotPresent
          securityContext:
            privileged: true
          command:
            - /bin/bash
            - -c
            - |
              chroot /host systemctl restart containerd
              sleep 5
              chroot /host systemctl is-active containerd
          volumeMounts:
            - name: host-root
              mountPath: /host
      volumes:
        - name: host-root
          hostPath:
            path: /
```
The pod must be recreated after containerd restarts. Stop the workspace with devpod stop before applying the config change. The PVC is not touched by any of these steps.
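Before restarting containerd, it helps to confirm what the config file currently says. The sketch below reads a flag from a TOML-style file with sed. The function name flag_value is an assumption, and it deliberately stays simple: it returns the first uncommented assignment of the key and does not check which runtime table the line belongs to, so the surrounding section still needs a manual look.

```shell
# flag_value <config_file> <key>
# Prints the value of the first uncommented "<key> = <value>" line, if any.
flag_value() {
  sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*/\1/p" "$1" | head -n 1
}

# Usage (not executed here), on the node or through a /host mount:
#   flag_value /etc/containerd/config.toml privileged_without_host_devices
#   flag_value /host/etc/containerd/config.toml privileged_without_host_devices
```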
| Symptom | Cause | Fix |
| --- | --- | --- |
| Pod enters Dead or Failed state | OOM, node issues, or a bad manifest | Run devpod stop, fix the manifest, then run devpod up again. The PVC stays intact. |
| SSH exits with code 255 | The Pod is not ready yet, or the SSH tunnel dropped | Check Pod state and retry after it reaches Running. If server installation was interrupted, rerun the installation step manually. |
| rsync reports Broken pipe | Progress output flooded the SSH channel | Use rsync -az without --progress or --info=progress2. |
| add-apt-repository fails with No module named 'apt_pkg' | The default Python was switched before repository setup | Add all PPAs before calling update-alternatives. |
| IDE extensions disappear after a Pod rebuild | The reinstall script removed the extensions directory | Delete only the cli/ subtree and keep extensions/. |
| nvidia-smi: command not found | A transitional dummy package was installed | Install nvidia-utils-xxx-server from ppa:graphics-drivers/ppa. |
| NVML Unknown Error | AppArmor, runtime class, device injection, environment override, or cgroup eBPF block from privileged_without_host_devices = false | Try opening /dev/nvidiactl in Python. EPERM means cgroup block. Set privileged_without_host_devices = true in containerd config and restart. Otherwise debug: AppArmor, device nodes, runtimeClassName, then environment variables. |
| /dev/nvidia* does not exist | Missing runtimeClassName: nvidia or a broken device plugin | Confirm the RuntimeClass exists and the device plugin DaemonSet is healthy. |