ADR 0015: Kubernetes as the Sole Container Execution Path (No Docker SDK)
Status: Accepted
Date: 2026-05-22
Deciders: Project founder
Context
ADR 0002 (pod-per-task) established ephemeral containers as the only execution
model and noted that, in standalone mode, a task would run "either a Docker
container (default) or a subprocess." Phase 3 implemented a DockerExecutor
using the official Docker SDK (github.com/docker/docker) to realize the
standalone-Docker path.
That introduced two concrete problems:
- Supply chain. The Docker SDK pulls the entire Moby dependency tree and, at the time of writing, carries an advisory with no fix available (GO-2026-4887, Moby AuthZ plugin bypass). govulncheck flags it as reachable through the package's init, so importing the SDK into the control plane binary fails the CI security gate (ADR 0014) with no remediation path.
- Architectural coherence. Depending on the Docker SDK means the control plane speaks two container APIs (Kubernetes and Docker), each with its own lifecycle, watch, and cleanup semantics. That is exactly the kind of dual surface ADR 0001 set out to avoid.
The question is how to keep "run each task in an isolated container locally" without importing the Docker SDK into the control plane.
Decision
Kubernetes is the sole container execution path. The control plane creates
ephemeral pods via client-go (which it already depends on) for both production
and local development. Local "Docker" execution becomes local Kubernetes on
a single-node cluster (k3d / kind / minikube / Docker Desktop's Kubernetes),
which itself runs on the developer's Docker.
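As a rough illustration of what "the same pod for local and production" can look like, the sketch below builds a minimal ephemeral pod manifest for one task using only the standard library. The struct types, the taskPod helper, the label key, and the image name are hypothetical; the actual control plane would presumably use the typed API that ships with client-go rather than hand-rolled structs. Only the manifest fields themselves (apiVersion, kind, restartPolicy, containers) are standard Kubernetes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal hand-rolled subset of the Kubernetes Pod schema (illustrative only;
// a real control plane would use the typed API that comes with client-go).
type container struct {
	Name  string   `json:"name"`
	Image string   `json:"image"`
	Args  []string `json:"args,omitempty"`
}

type podSpec struct {
	RestartPolicy string      `json:"restartPolicy"`
	Containers    []container `json:"containers"`
}

type metadata struct {
	Name   string            `json:"name"`
	Labels map[string]string `json:"labels,omitempty"`
}

type pod struct {
	APIVersion string   `json:"apiVersion"`
	Kind       string   `json:"kind"`
	Metadata   metadata `json:"metadata"`
	Spec       podSpec  `json:"spec"`
}

// taskPod (hypothetical helper) returns the manifest for one ephemeral task
// pod. Nothing in it depends on whether the target cluster is k3d, kind, or
// production; only the cluster the manifest is submitted to differs.
func taskPod(taskID, image string, args []string) pod {
	return pod{
		APIVersion: "v1",
		Kind:       "Pod",
		Metadata: metadata{
			Name:   "task-" + taskID,
			Labels: map[string]string{"example.invalid/task-id": taskID}, // hypothetical label key
		},
		Spec: podSpec{
			RestartPolicy: "Never", // run once; the pod is not restarted on exit
			Containers:    []container{{Name: "agent", Image: image, Args: args}},
		},
	}
}

func main() {
	p := taskPod("42", "example.invalid/agent:dev", []string{"run"})
	out, err := json.MarshalIndent(p, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because the manifest carries no environment-specific fields, switching between a local single-node cluster and production is a matter of which cluster credentials the client is pointed at, not of which code path runs.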
The control plane does not import the Docker SDK (github.com/docker/docker)
or any container-engine client other than client-go.
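One way to keep this rule from regressing is a small guard that scans the control plane's dependency list for forbidden import paths. The sketch below is an assumption about how such a check might look, not an existing project tool: findForbidden and forbiddenPrefixes are hypothetical names, and the sample paths in main are illustrative. In CI, the list could come from the output of go list -deps ./... (one import path per line).

```go
package main

import (
	"fmt"
	"strings"
)

// forbiddenPrefixes lists import-path prefixes the control plane must not
// depend on. Only the Docker SDK path comes from this ADR.
var forbiddenPrefixes = []string{"github.com/docker/docker"}

// findForbidden returns every import path in deps that equals a forbidden
// prefix or lives beneath it (the prefix followed by "/"). Matching on the
// path separator avoids false positives on sibling paths that merely share
// a string prefix.
func findForbidden(deps []string) []string {
	var hits []string
	for _, dep := range deps {
		for _, p := range forbiddenPrefixes {
			if dep == p || strings.HasPrefix(dep, p+"/") {
				hits = append(hits, dep)
				break
			}
		}
	}
	return hits
}

func main() {
	// Sample input for illustration only.
	deps := []string{
		"k8s.io/client-go/kubernetes",
		"github.com/docker/docker/client",
		"github.com/docker/dockerx", // hypothetical sibling path: must not match
	}
	fmt.Println(findForbidden(deps))
}
```

A guard like this complements govulncheck rather than replacing it: it fails fast on the specific import this ADR rules out, while the vulnerability scan continues to cover everything else.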
A SubprocessExecutor remains as an explicit, dev-only escape hatch that runs
the agent directly on the host with no isolation (and a loud warning). It is
not a container path; it exists for fast local iteration without any cluster.
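The opt-in and warning can be sketched as follows. This is a minimal illustration under assumed names: the subprocessExecutor type, its allowUnisolated field, and the run method are hypothetical and do not describe the project's actual SubprocessExecutor API.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"os/exec"
)

// errSubprocessDisabled is returned when the caller has not opted in.
var errSubprocessDisabled = errors.New("subprocess executor is disabled; explicit opt-in required")

// subprocessExecutor (hypothetical shape) runs the agent directly on the host.
type subprocessExecutor struct {
	allowUnisolated bool // the explicit opt-in, e.g. wired from a flag or config key
}

// run executes name with args on the host and returns the combined
// stdout/stderr output. It refuses to run without the opt-in and logs a
// warning on every call that does run.
func (e subprocessExecutor) run(name string, args ...string) ([]byte, error) {
	if !e.allowUnisolated {
		return nil, errSubprocessDisabled
	}
	log.Printf("WARNING: running %q on the host with NO isolation (dev-only)", name)
	return exec.Command(name, args...).CombinedOutput()
}

func main() {
	e := subprocessExecutor{}
	if _, err := e.run("go", "version"); err != nil {
		fmt.Println("default:", err)
	}

	e.allowUnisolated = true
	out, err := e.run("go", "version")
	fmt.Printf("opted in: %s (err=%v)\n", out, err)
}
```

Making the zero value refuse to run means a forgotten configuration key fails closed instead of silently executing on the host.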
This refines the standalone-mode wording of ADR 0002: standalone container execution is local Kubernetes, not a direct Docker integration.
Rationale
- Clean supply chain. Removing the Docker SDK eliminates the unfixable GO-2026-4887 reachable vulnerability and a large transitive dependency tree, keeping the CI security gate (ADR 0014) green and the server binary small.
- One execution path. The same KubernetesExecutor, pod spec, watcher, and cleanup logic serve both local and production. Less code, fewer edge cases, identical behavior across environments.
- Ecosystem precedent. This mirrors how the cloud-native ecosystem evolved: Argo Workflows deprecated and removed its Docker executor in favor of Kubernetes-native executors, and Kubeflow Pipelines runs on Kubernetes (Argo/Tekton) without embedding the Docker SDK in its control plane, delegating image builds to out-of-process tools (Kaniko/BuildKit).
- govulncheck reachability fits. Depending only on client-go lets the call-graph analysis suppress the inevitable unreachable transitive advisories, rather than failing on a directly-reachable, unfixable one.
Consequences
- Local dev requires a local Kubernetes cluster. The developer guide recommends k3d cluster create (or kind). This raises the local setup bar versus a bare Docker daemon, but unifies the execution model.
- The DockerExecutor and the Docker SDK dependency are removed. If an SDK-free Docker mode is ever desired (e.g., shelling out to the docker CLI or talking to the daemon socket over net/http), it can be added later as a separate, optional executor without reintroducing the SDK; this ADR does not forbid that, only the in-process SDK dependency.
- Image build stays out-of-process. Consistent with this decision, the leoflow compile image build (ADR 0003, issue #7) should use an out-of-process builder (Kaniko/BuildKit/docker build shell-out), not the Docker SDK.
- Subprocess mode is dev-only and unisolated, gated behind an explicit opt-in and a runtime warning.
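To make the "daemon socket over net/http" option concrete, the sketch below pings a Docker daemon using only the standard library. It is an illustration of the deferred, optional path, not something this ADR adopts: newUnixHTTPClient and pingDocker are hypothetical names, the socket path is the conventional default and may differ per host, and the only daemon endpoint used is the Engine API's GET /_ping.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

// newUnixHTTPClient returns an http.Client whose connections all go to the
// given Unix socket, regardless of the host named in the request URL.
func newUnixHTTPClient(socketPath string) *http.Client {
	return &http.Client{
		Timeout: 5 * time.Second,
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
	}
}

// pingDocker calls GET /_ping on the daemon behind socketPath and returns
// the response body.
func pingDocker(socketPath string) (string, error) {
	// The host part of the URL is a placeholder; the transport dials the socket.
	resp, err := newUnixHTTPClient(socketPath).Get("http://docker/_ping")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("ping: unexpected status %s", resp.Status)
	}
	return string(body), nil
}

func main() {
	// Conventional default socket path; adjust for the host as needed.
	body, err := pingDocker("/var/run/docker.sock")
	if err != nil {
		fmt.Println("docker daemon not reachable:", err)
		return
	}
	fmt.Println("docker daemon replied:", body)
}
```

An executor built this way would add no module dependencies beyond the standard library, which is what keeps it compatible with this decision. It would still reintroduce a second container API surface, so the coherence cost noted under Context applies.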
Alternatives Rejected
- Docker SDK in the control plane (the Phase 3 DockerExecutor): rejected for the supply-chain and coherence reasons above.
- Shell out to the docker CLI: viable and SDK-free, but adds a second container API surface and requires the docker CLI on the host; deferred as an optional future executor, not the default.
- A separate out-of-process executor binary that imports the SDK: isolates the dependency from the server binary but still ships the unfixable vulnerability somewhere; not worth the added moving part for a dev convenience.