ADR 0033: Release Flow - RC Tags, E2E Gates, and Immutable Versions
Status: Accepted
Date: 2026-06-01
Effective from: v0.1.0-alpha.1 (first alpha tag); pre-alpha tags keep the
simpler direct-tag flow described in Pre-alpha exception
below.
Context
Today (pre-alpha) the release flow is `git tag vX.Y.Z` → `release.yaml` builds
+ publishes + runs install-smoke on 7 distros → if smoke fails, the gate job
retracts the release to a DRAFT. The smoke job only runs `leoflow version` +
`python --version`, which let the v0.0.1-prealpha.21 regression slip through:
multi-DAG workspaces shipped without the materialization fix, and every task
on user-installed instances failed with `ModuleNotFoundError: No module named
'dag'`.
Two gaps to close before alpha:
- The gate's E2E surface is too thin. Smoke runs `--version` only; it does not exercise the path the user actually runs (boot lite, register a subdir DAG, trigger a task). PR #251 wired the multi-DAG materialization contract; PR #C wires the matching gate (`test/e2e/lite-multidag.sh`) that would have caught the prealpha.21 break.
- There is no opt-in pre-release channel. A user wanting to validate a build BEFORE it becomes "latest" has no way to do so other than installing the latest pre-release and discovering bugs in production. For alpha, where each tag is a public commitment, this is unacceptable.
Decision
1. Versions are immutable.
A tag, once published (even as a draft), MUST NOT be moved to a different commit. Re-tagging breaks Sigstore signatures (tied to commit SHA), poisons mirrors/caches, and violates the universal semver convention. A bad release becomes a drafted release; the next good release gets the next version.
This mirrors Go modules' approach: a `retract vX.Y.Z` directive in go.mod keeps
the version visible-but-discouraged, and the next semver-greater tag becomes
the install default.
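One way to read the immutability rule operationally: a release helper never reuses a tag name and always advances to the next number. The Python sketch below is illustrative only. `assert_tag_unused` and `next_rc_tag` are hypothetical helpers, not part of the repository's tooling, and the set of existing tags is assumed to come from something like `git tag --list`.

```python
import re
from typing import Iterable

# Matches release-candidate tags such as v1.2.3-rc.4 (assumed tag shape).
RC_TAG = re.compile(r"v(\d+)\.(\d+)\.(\d+)-rc\.(\d+)")


def assert_tag_unused(tag: str, existing_tags: Iterable[str]) -> None:
    """Raise if the tag was ever pushed, even if its release is only a draft."""
    if tag in set(existing_tags):
        raise ValueError(f"{tag} already exists; cut the next version instead of re-tagging")


def next_rc_tag(base: str, existing_tags: Iterable[str]) -> str:
    """Return the next unused rc tag for a base version such as 'v1.2.3'."""
    prefix = base + "-rc."
    used = [
        int(m.group(4))
        for t in existing_tags
        if t.startswith(prefix) and (m := RC_TAG.fullmatch(t))
    ]
    # A red rc.1 stays where it is; the next candidate is rc.2, never a moved rc.1.
    return f"{base}-rc.{max(used, default=0) + 1}"
```

With `v1.2.3-rc.1` and `v1.2.3-rc.2` already present, `next_rc_tag("v1.2.3", tags)` yields `v1.2.3-rc.3`, and `assert_tag_unused("v1.2.3-rc.1", tags)` raises rather than allowing a re-tag.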
2. Two-channel release flow, from alpha onwards.
Two tag conventions, distinguished by suffix:
| Suffix | Channel | Visibility | Resolved by install.sh "latest"? |
|---|---|---|---|
| `vX.Y.Z-rc.N` | Release Candidate | DRAFT | No (drafts are skipped) |
| `vX.Y.Z` | Final (or `-alpha.N`, `-beta.N`, etc.) | PRE-RELEASE or RELEASE | Yes |
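The table can be read as a small classification function. The sketch below is a hedged illustration in Python, not the project's actual goreleaser or `release.yaml` logic. The `Channel` type and `channel_for` are hypothetical names, and the split between PRE-RELEASE (any non-rc suffix) and RELEASE (no suffix) is an assumption drawn from the table's wording.

```python
import re
from typing import NamedTuple


class Channel(NamedTuple):
    name: str
    visibility: str
    resolved_as_latest: bool


_RC = re.compile(r"v\d+\.\d+\.\d+-rc\.\d+")
_FINAL = re.compile(r"v\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?")


def channel_for(tag: str) -> Channel:
    """Map a tag to its release channel, following the suffix table."""
    if _RC.fullmatch(tag):
        # Release candidates are published as drafts and skipped by "latest".
        return Channel("release-candidate", "DRAFT", False)
    m = _FINAL.fullmatch(tag)
    if m is None:
        raise ValueError(f"not a recognised release tag: {tag!r}")
    # Assumption: a remaining suffix (-alpha.N, -beta.N, ...) means pre-release.
    visibility = "PRE-RELEASE" if m.group(1) else "RELEASE"
    return Channel("final", visibility, True)
```

Under these assumptions `channel_for("v0.1.0-alpha.1")` is a PRE-RELEASE that "latest" may resolve to, while `channel_for("v0.1.0-rc.1")` is a DRAFT that it never resolves to.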
Cutting a release:

    git tag vX.Y.Z-rc.1
    git push origin vX.Y.Z-rc.1
      → release.yaml builds + signs artifacts
      → publishes the release as DRAFT (goreleaser draft=true for -rc.N tags)
      → install-smoke runs against the drafted artifacts
      → if any smoke fails: gate keeps the draft; report posted on the tag

    Human verifies the rc.N (Lima hands-on, dogfood, whatever the release needs):
      curl -fsSL https://...install.sh | LEOFLOW_VERSION=vX.Y.Z-rc.1 sh
      → Install the explicit tag (drafts are reachable by direct tag URL).
      → Exercise whatever the rc is meant to validate.

    If rc.N is green:
      git tag vX.Y.Z
      git push origin vX.Y.Z
      → release.yaml builds + signs
      → publishes as pre-release/release
      → install-smoke runs (auto gate)
      → if green, becomes "latest"

    If rc.N is red (Lima found a bug):
      fix the bug
      git tag vX.Y.Z-rc.2   (rc.1 is never re-tagged)
      → repeat verification

The rc.N drafts are never deleted manually; `prune-prealpha.yaml` already
sweeps drafts older than its keep window.
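The resolution rule in this flow (drafts are skipped by "latest", but an explicit tag still reaches them) can be sketched as follows. This is a minimal Python illustration, not the logic of `install.sh`. The `Release` record and `resolve_version` are hypothetical, and the release list is assumed to be ordered newest first.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Release:
    tag: str
    draft: bool


def resolve_version(releases: Sequence[Release], requested: Optional[str] = None) -> str:
    """Pick the tag to install. `releases` is assumed to be ordered newest first."""
    if requested:
        # An explicit request (the LEOFLOW_VERSION case) may name a draft.
        if any(r.tag == requested for r in releases):
            return requested
        raise LookupError(f"unknown tag: {requested}")
    for r in releases:
        # "latest" skips drafts, so rc tags and retracted releases never win.
        if not r.draft:
            return r.tag
    raise LookupError("no published release available")
```

With a newer `v1.2.4-rc.1` draft sitting above a published `v1.2.3`, `resolve_version(releases)` returns `v1.2.3`, and only `resolve_version(releases, "v1.2.4-rc.1")` reaches the candidate.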
3. E2E gates that block both PRs and releases.
Two complementary CI layers, both running test/e2e/*.sh:
- PR-time (`ci.yaml`): every PR runs the full E2E suite. A failed E2E blocks merge. Catches regressions before they reach main.
- Post-tag (`release.yaml` install-smoke + an E2E step): the published artifacts are exercised end-to-end in a clean container. A failed post-tag E2E retracts the release to draft, even if the PR-time E2E was green. The PR-time run exercises a `go build` binary, not the release tarball, so install paths, packaging, and managed runtime layers are tested only here.
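The two layers reduce to two small decisions: whether a PR may merge, and what state a release ends in after the post-tag checks. The Python sketch below illustrates that logic under stated assumptions. `pr_may_merge` and `post_tag_state` are hypothetical names, check results are modelled as a name-to-passed mapping, and treating an empty result set as "not run" is a design choice of this sketch rather than something the ADR specifies.

```python
from typing import List, Mapping, Tuple


def pr_may_merge(e2e_results: Mapping[str, bool]) -> bool:
    """PR-time layer: any failed E2E script blocks the merge."""
    # An empty result set is treated as "not run" and must not unblock a merge.
    return bool(e2e_results) and all(e2e_results.values())


def post_tag_state(published_as: str, results: Mapping[str, bool]) -> Tuple[str, List[str]]:
    """Post-tag layer: return the final release state and the failed checks."""
    failed = sorted(name for name, passed in results.items() if not passed)
    # Any failure retracts the release to draft, regardless of PR-time results.
    return ("draft" if failed else published_as, failed)
```

For example, a release published as `pre-release` whose post-tag `lite-multidag` check fails ends as `("draft", ["lite-multidag"])`, even though the same script was green at PR time.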
Test suite, as of this ADR:
| Script | What it gates |
|---|---|
| `test/e2e/lite-login.sh` | The Lite happy path: setup → control plane → admin login → JWT → workspace editor. |
| `test/e2e/lite-multidag.sh` | The multi-DAG materialization contract: subdir DAG → `dag.json.source` carries `dag.py` verbatim (the property the subprocess executor depends on to materialize per-TI work dirs). |
| `test/e2e/e2e.sh` | The pod-path E2E on k3d (build images, k3d import, agent-over-gRPC, real pod-per-task). Heavy; runs in a separate workflow today, may merge into `release.yaml` install-smoke later. |
The suite grows as new release-blockers are identified. Each test SHOULD be: fast enough to run on every PR (target <60s), reality-anchored (no mocks at the boundary it gates), and named to match what it gates (the bug it would have caught, not how the test happens to be implemented).
Consequences
Positive
- A regression covered by the E2E suite cannot reach `latest` undetected. The prealpha.21 break could not happen under this flow: PR-time E2E would have failed; even if it had merged, install-smoke would have caught it post-tag and drafted the release.
- Users opt in to release candidates. `LEOFLOW_VERSION=vX.Y.Z-rc.1` is explicit; nobody touches `-rc` tags by accident.
- Versions remain immutable, signatures remain valid. A retracted draft is still verifiable (the artifact, the SHA, the cosign signature), so forensics on what failed are preserved.
Negative
- One extra step in the release ritual when cutting a stable: tag `-rc.1`, verify, then tag final. This is the cost of explicit gating; the alternative is shipping bugs first and apologizing.
- Drafted versions accumulate. `prune-prealpha.yaml` already covers this (it sweeps drafts > 90 days old, keeping the newest N). No action required; documented here for context.
- The "skip" appearance in stable history: a bad `v1.2.4` drafted + a good `v1.2.5` published is visible from the outside as "v1.2.4 missing from latest." This is correct (semver allows non-contiguous sequences) and matches what Kubernetes, Node.js, and Postgres ship, but the user experience requires release notes on `v1.2.5` that name the drafted predecessor when one exists.
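For context on the draft-accumulation point, the pruning rule ("drafts > 90 days old, keeping the newest N") can be sketched in Python as below. This is not the implementation of `prune-prealpha.yaml`. `drafts_to_prune` is a hypothetical helper, the default `keep_newest=5` is a placeholder for the unspecified N, and the sketch assumes a draft is pruned only when it is both outside the newest N and older than the age limit.

```python
from datetime import datetime, timedelta
from typing import List, Sequence, Tuple


def drafts_to_prune(
    drafts: Sequence[Tuple[str, datetime]],
    now: datetime,
    max_age_days: int = 90,
    keep_newest: int = 5,  # placeholder value; the ADR only says "N"
) -> List[str]:
    """Return tags of drafts that are old enough and outside the keep window."""
    newest_first = sorted(drafts, key=lambda d: d[1], reverse=True)
    cutoff = now - timedelta(days=max_age_days)
    # The newest `keep_newest` drafts survive regardless of age.
    return [tag for tag, created in newest_first[keep_newest:] if created < cutoff]
```

Keeping the newest drafts unconditionally means the most recent failed candidates stay available for forensics even on a quiet release line.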
Pre-alpha exception
For pre-alpha tags (`v0.0.1-prealpha.N`), the simpler direct-tag flow stays
in place: tag → build → publish as pre-release → install-smoke → gate
retracts to draft on failure. The two-channel `-rc.N` flow becomes
mandatory starting at `v0.1.0-alpha.1`, where the first public alpha
commits to a more careful release ritual.
This exception exists because pre-alpha tags are explicitly experimental and their churn (several per day during active development) would make an `-rc` step per tag a meaningful overhead with little additional value; the E2E gates run on the direct tag and catch regressions either way.
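Whether a given tag falls under the exception is a semver precedence comparison against `v0.1.0-alpha.1`. The Python sketch below is a hedged illustration of that check. `precedence_key` and `two_channel_flow_applies` are hypothetical helpers, and the ordering follows the Semantic Versioning 2.0.0 precedence rules (build metadata is not handled).

```python
import re
from typing import Tuple

_SEMVER = re.compile(r"v(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?")

# First version that must use the two-channel -rc.N flow, per this ADR.
RC_FLOW_START = "v0.1.0-alpha.1"


def precedence_key(tag: str) -> Tuple:
    """Build a sortable key that follows semver precedence."""
    m = _SEMVER.fullmatch(tag)
    if m is None:
        raise ValueError(f"not a semver tag: {tag!r}")
    core = tuple(int(part) for part in m.group(1, 2, 3))
    pre = m.group(4)
    if pre is None:
        # A version without a pre-release suffix sorts after all its pre-releases.
        return (core, (1,))
    # Numeric identifiers sort below alphanumeric ones; numbers compare as ints.
    ids = tuple((0, int(p), "") if p.isdigit() else (1, 0, p) for p in pre.split("."))
    return (core, (0,) + ids)


def two_channel_flow_applies(tag: str) -> bool:
    """True when the version is at or after the first alpha tag."""
    return precedence_key(tag) >= precedence_key(RC_FLOW_START)
```

Under this ordering `v0.0.1-prealpha.21` stays on the direct-tag flow, while `v0.1.0-alpha.1`, `v0.1.0-rc.1`, and `v0.1.0` all require the release-candidate step.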
Implementation status
- [x] Pre-alpha direct-tag + install-smoke gate (already shipped pre-ADR)
- [x] `test/e2e/lite-login.sh` E2E in PR-time CI (existed before this ADR)
- [x] `test/e2e/lite-multidag.sh` E2E in PR-time CI (this PR)
- [ ] `goreleaser.yml` recognises `-rc.N` tags and publishes as DRAFT (deferred to PR D, lands before `v0.1.0-alpha.1`)
- [ ] `release.yaml` install-smoke runs `test/e2e/lite-multidag.sh` against the installed artifact (deferred; needs a `--against-installed` flag in the script first)
- [ ] Release-notes template that mentions the drafted predecessor when one exists (deferred to documentation pass)
Related
- ADR 0014 (Supply-chain security): the Cosign + SBOM + Trivy + govulncheck gates that already protect the artifact. This ADR adds the functional gates on top.
- Memory note `alpha-release-policy`: "first `v0.1.0-alpha.1` cut only after user hands-on testing." This ADR formalises that ritual via the `-rc.N` convention.
- PR #251: the materialization fix this test guards.