google_cloud_platform: Google Cloud connection
Connect tasks to Google Cloud (GCS, BigQuery, Pub/Sub, …) with a managed Connection, in two auth modes: keyless (Workload Identity / ADC, recommended) and service-account key (encrypted at rest).
The connection follows Airflow's google_cloud_platform shape (an existing
Airflow GCP connection drops in unchanged), with cleaner short field names and
keyless as the default.
URI shape
GCP carries no host/login/password; everything lives in Extra. The control
plane delivers it as the environment variable AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT.
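As a rough illustration of a host-less connection URI, the sketch below packs Extra into a single `__extra__` query parameter, a form Airflow accepts for URI connections. The exact URI Leoflow emits is not shown on this page, so treat the shape as an assumption; the helper names, the project ID my-project, and the sample key_path are hypothetical.

```python
import json
from urllib.parse import parse_qs, quote, urlsplit


def build_gcp_conn_uri(extra: dict) -> str:
    """Pack Extra into a host-less google-cloud-platform URI (assumed shape)."""
    # Compact, key-sorted JSON keeps the URI deterministic.
    payload = json.dumps(extra, separators=(",", ":"), sort_keys=True)
    # No host, login, or password: only the query string carries data.
    return "google-cloud-platform://?__extra__=" + quote(payload, safe="")


def read_extra(uri: str) -> dict:
    """Recover the Extra dict from a URI built by build_gcp_conn_uri."""
    query = parse_qs(urlsplit(uri).query)
    return json.loads(query["__extra__"][0])


# Hypothetical values, for illustration only.
example_extra = {"project": "my-project", "key_path": "/etc/leoflow/secrets/key.json"}
uri = build_gcp_conn_uri(example_extra)
```

Exporting the resulting string as AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT is what a task would then see; read_extra is only there to show that the round trip is lossless.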
Extra fields
Short names are canonical; the legacy extra__google_cloud_platform__<name>
form is also accepted.
| Field | Meaning |
|---|---|
| keyfile_dict | Service-account JSON, inline (key in the connection). Encrypted at rest (ADR 0019). Discouraged. |
| key_path | Path to a service-account JSON file mounted from a Kubernetes Secret (the chart's taskSecret). |
| key_secret_name | Name (or full resource path) of a GCP Secret Manager secret holding the JSON key; fetched in the task via ADC. |
| project / project_id | GCP project (optional). |
| scopes / scope | OAuth scopes: a list or a comma-separated string. |
| num_retries | Pass-through to your GCP client (optional). |
Resolution order (first match wins): keyfile_dict → key_path →
key_secret_name → ADC (keyless). Leave all key fields empty for keyless.
Not handled in v1: key_secret_project_id, credential_config_file,
impersonation_chain, quota_project_id.
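The field rules above can be sketched in a few lines of Python. This is not Leoflow's resolver, only a minimal illustration under stated assumptions: that project and scopes are the canonical spellings of each alias pair, that a canonical short name wins when the legacy form is also present, and that empty values count as "not set". All function names are hypothetical.

```python
LEGACY_PREFIX = "extra__google_cloud_platform__"
# Assumption: the first spelling in each "a / b" pair of the table is canonical.
ALIASES = {"project_id": "project", "scope": "scopes"}
RESOLUTION_ORDER = ("keyfile_dict", "key_path", "key_secret_name")


def normalize_extra(extra: dict) -> dict:
    """Map legacy and alias field names to short names, dropping empty values."""
    out = {}
    for name, value in extra.items():
        if value in (None, "", [], {}):
            continue  # an empty field counts as "not set"
        is_legacy = name.startswith(LEGACY_PREFIX)
        short = name[len(LEGACY_PREFIX):] if is_legacy else name
        short = ALIASES.get(short, short)
        if is_legacy:
            out.setdefault(short, value)  # never overrides a short name
        else:
            out[short] = value  # short names win (assumption)
    return out


def parse_scopes(value) -> list:
    """Accept a list or a comma-separated string of OAuth scopes."""
    if value is None:
        return []
    items = value.split(",") if isinstance(value, str) else list(value)
    return [item.strip() for item in items if item and item.strip()]


def resolve_auth_mode(extra: dict) -> str:
    """Return the first key field that is set, or "adc" for keyless."""
    fields = normalize_extra(extra)
    for field in RESOLUTION_ORDER:
        if field in fields:
            return field
    return "adc"
```

With this sketch, an Extra that sets both key_path and key_secret_name resolves to key_path, and an Extra with no key fields resolves to keyless.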
Auth modes
Keyless (recommended)
No key in the Connection. Credentials come from Application Default Credentials:
- Pro (GKE): Workload Identity. The task pod runs as a Kubernetes SA bound to a GCP service account; no key ever touches the cluster. See the chart's task-ServiceAccount knob and issue #56.
- Lite (subprocess): your host ADC (gcloud auth application-default login or GOOGLE_APPLICATION_CREDENTIALS).
- Lite (k3d): no metadata server, so keyless is unavailable; use key mode.
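For the Lite (subprocess) case, a quick preflight can tell whether host ADC is likely to be found before a task runs. The sketch below only looks for a credentials file: the one named by GOOGLE_APPLICATION_CREDENTIALS, then the well-known file that gcloud auth application-default login writes on Linux and macOS. It does not cover Windows, a relocated gcloud config directory, or the GKE metadata server, and the function names are hypothetical.

```python
import os
from pathlib import Path


def adc_file_candidates(env=None, home=None) -> list:
    """List the files that file-based ADC would consult, in lookup order."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    candidates = []
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        candidates.append(Path(explicit))
    # Well-known location on Linux/macOS (assumes the default gcloud config dir).
    candidates.append(home / ".config" / "gcloud" / "application_default_credentials.json")
    return candidates


def host_adc_file(env=None, home=None):
    """Return the first existing ADC file, or None if none is present."""
    for path in adc_file_candidates(env, home):
        if path.is_file():
            return path
    return None
```

A None result on a Lite (subprocess) host suggests that one of the two setup steps above has not been done yet; on GKE a None result is expected, because Workload Identity supplies credentials through the metadata server rather than a file.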
Key from a Kubernetes Secret (key_path): preferred when not keyless
The key lives in a Kubernetes Secret, mounted read-only into every task pod;
the connection only references the file by key_path. The key never enters
Leoflow's DB/API/UI. Wire it via the chart:
kubectl -n leoflow create secret generic gcp-sa-key --from-file=key.json=/path/to/key.json
helm upgrade leoflow ./helm/leoflow -n leoflow --reuse-values \
--set taskSecret.name=gcp-sa-key --set taskSecret.mountPath=/etc/leoflow/secrets
Then set key_path to /etc/leoflow/secrets/key.json.
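Inside a task, reading the mounted file might look like the sketch below. It is not the example's gcp_credentials() helper, only an illustration: it loads the JSON, checks the fields every service-account key file carries, and hands the result to google-auth. The function names are hypothetical, and google-auth is imported lazily so the validation part runs without it.

```python
import json
from pathlib import Path

# Fields present in every service-account key file.
REQUIRED_FIELDS = ("type", "client_email", "private_key")


def load_service_account_info(key_path: str) -> dict:
    """Read and sanity-check a service-account JSON file."""
    info = json.loads(Path(key_path).read_text(encoding="utf-8"))
    if not isinstance(info, dict):
        raise ValueError(f"{key_path}: expected a JSON object")
    missing = [name for name in REQUIRED_FIELDS if not info.get(name)]
    if missing:
        # Report field names only; never echo key material.
        raise ValueError(f"{key_path}: missing fields {missing}")
    if info["type"] != "service_account":
        raise ValueError(f"{key_path}: not a service-account key")
    return info


def credentials_from_key_path(key_path: str, scopes=None):
    """Build google-auth credentials from the mounted key (needs google-auth)."""
    from google.oauth2 import service_account  # lazy import

    info = load_service_account_info(key_path)
    return service_account.Credentials.from_service_account_info(info, scopes=scopes)
```

A task would call credentials_from_key_path("/etc/leoflow/secrets/key.json") and pass the result to its GCP client; the early validation turns a wrong mount path or a truncated Secret into a clear error instead of an opaque auth failure.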
Key from GCP Secret Manager (key_secret_name)
The connection references a Secret Manager secret name; the task fetches it at
runtime via ADC (so the task still needs an ambient identity, typically
Workload Identity, to read the secret). Grant the task's GSA
roles/secretmanager.secretAccessor.
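Because key_secret_name may be a bare name or a full resource path, a task-side helper has to normalize it before calling Secret Manager. The sketch below shows one way to do that; defaulting to the latest version and taking the project from a separate argument are assumptions, not documented Leoflow behavior, and both function names are hypothetical. The fetch itself uses the google-cloud-secret-manager client, imported lazily.

```python
def secret_version_name(key_secret_name: str, project=None, version: str = "latest") -> str:
    """Expand a bare name or partial path to a full secret-version resource name."""
    parts = key_secret_name.strip("/").split("/")
    if parts[0] == "projects":
        if len(parts) == 4 and parts[2] == "secrets":
            return "/".join(parts + ["versions", version])
        if len(parts) == 6 and parts[2] == "secrets" and parts[4] == "versions":
            return "/".join(parts)
        raise ValueError(f"unrecognized secret path: {key_secret_name!r}")
    if len(parts) != 1 or not parts[0]:
        raise ValueError(f"unrecognized secret name: {key_secret_name!r}")
    if not project:
        raise ValueError("a bare secret name needs a project")
    return f"projects/{project}/secrets/{parts[0]}/versions/{version}"


def fetch_key_json(resource_name: str) -> str:
    """Read the secret payload via ADC (needs google-cloud-secret-manager)."""
    from google.cloud import secretmanager  # lazy import

    client = secretmanager.SecretManagerServiceClient()
    response = client.access_secret_version(request={"name": resource_name})
    return response.payload.data.decode("utf-8")
```

For example, secret_version_name("gcp-sa-key", project="my-project") yields projects/my-project/secrets/gcp-sa-key/versions/latest, which is the form access_secret_version expects.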
Key inline (keyfile_dict): discouraged
The SA JSON is stored in the connection's Extra (encrypted at rest). Convenient for dev or low-criticality use; not recommended (see Security below).
Lite vs Pro
| Auth mode | Lite (subprocess) | Lite (k3d) | Pro (GKE) |
|---|---|---|---|
| Keyless (ADC) | ✅ host ADC | ❌ (no metadata server) | ✅ Workload Identity |
| key_path (K8s Secret) | ✅ (mount a file) | ✅ (k3d Secret) | ✅ (chart taskSecret) |
| key_secret_name (Secret Manager) | ✅ (needs ADC) | ⚠️ needs ADC | ✅ (WI + secretAccessor) |
| keyfile_dict (inline) | ✅ | ✅ | ✅ |
Security: Leoflow is not a key manager
Credentials should come from the runtime identity (keyless) or from a secret the platform manages, which the connection only references. Leoflow does not aspire to store cloud keys (see ADR 0035). In order of preference:
- Keyless (Workload Identity / ADC): recommended. No key anywhere.
- Reference a platform-managed secret: the key lives in the cluster's or cloud's secret store; the connection holds only a reference. The key never enters Leoflow's DB/API/UI:
  - key_path + a mounted Kubernetes Secret (chart taskSecret);
  - key_secret_name + GCP Secret Manager (task fetches via ADC).
- keyfile_dict: discouraged. It stores the key inside the connection (encrypted at rest with LEOFLOW_SECRET_KEY, ADR 0019; delivered only over the TLS agent channel, ADR 0021). It is the cloud-key analog of how connections store a database user/password: pragmatic and Airflow-compatible, but not the desired posture. Use only for dev / low-criticality; many orgs forbid creating such keys anyway.
Example DAG + test
- Example: examples/gcp_gcs_load, which writes and reads a GCS object in both modes, with a clean gcp_credentials() helper.
- Delivery (chain-of-custody) is covered by an automated test that round-trips a synthetic key through encryption + __extra__ (no real cloud needed); a real end-to-end run against GCS is documented as manual in the example README.
See also: variables-connections.md, ADR 0019, ADR 0021.