Leoflow

The workflow orchestrator that ate Apache Airflow's lunch.
Same UI. Same vocabulary. A Go control plane instead of Python's. Zero of the pain.
Native map-reduce for ML/AI — fan-out + reduce as a Python list comprehension.


Screenshot: Leoflow Lite at localhost:8088, showing the ETL graph (extract, transform, load) running on a local cluster.

A Go control plane that keeps Airflow's proven pod-per-task model and its UI, and throws away the Python control plane that makes Airflow slow.

Author a DAG, ship a container

A DAG is a leoflow.yaml (config, bindings, packaging) plus a dag.py (Airflow SDK). They compile to one immutable artifact — dag.json + a container image.

leoflow.yaml, followed by dag.py:

schema_version: "1.0"
dag_id: etl_daily
description: Daily ETL — extract, transform, load.
owner: data-eng
tags: [etl]
python_version: "3.12"
dependencies:            # pip packages baked into the DAG's own image
  - pandas==2.2.2

"""etl_daily — a three-task ETL on the Airflow SDK."""
from airflow.sdk import DAG, task

@task
def extract() -> list[int]:
    return list(range(100))

@task
def transform(rows: list[int]) -> int:
    return sum(rows)

@task
def load(total: int) -> None:
    print("loaded:", total)

with DAG("etl_daily", schedule="0 6 * * *", catchup=False, tags=["etl"]):
    load(transform(extract()))
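
To see what this DAG computes, the same three functions can be run as plain Python, without the `@task` decorators or a scheduler. This is only a sketch of the data flow. It assumes nothing about how Leoflow executes tasks or passes values between them, and the return value added to `load` exists only to make the result easy to check:

```python
# Plain-Python sketch of the etl_daily data flow; no Airflow or Leoflow imports.
# The function bodies mirror the task bodies above so the values can be checked locally.

def extract() -> list[int]:
    return list(range(100))          # rows 0..99

def transform(rows: list[int]) -> int:
    return sum(rows)                 # 0 + 1 + ... + 99

def load(total: int) -> str:
    message = f"loaded: {total}"
    print(message)
    return message                   # returned only so the sketch is easy to assert on

result = load(transform(extract()))  # same call nesting as in the DAG body
```

In the DAG, the nested call `load(transform(extract()))` declares the dependency order extract, transform, load; here it simply runs the functions in that order.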

DAGs are immutable artifacts

A dag.json + a container image, versioned together. No re-parsing /dags, no drift. Concepts →

One image per DAG

Each DAG carries its own dependencies. No shared filesystem, no dependency hell. Architecture →

A real dev loop

leoflow lite: isolated cluster, hot reload, silver Lite badge. Edit, save, see it run. Operating modes →

Airflow-compatible API & UI

/api/v2/* and /ui/*, pinned to Airflow 3.2.x. HTTP API →

Native map-reduce for ML/AI

Fan-out + reduce as a Python list comprehension. No XCom plumbing, no broker, no special operator. Map-reduce for ML →

Map-reduce in two lines of Python

Every parallel ML workload — hyperparameter search, k-fold CV, ensemble training, batch inference, Monte Carlo — is the same pattern. Leoflow expresses it as a list comprehension:

from airflow.sdk import DAG, task

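@task
def trial(lr: float) -> dict:
    # train_one is your own training function (not defined in this snippet);
    # it is expected to return a dict that includes a "score" key.
    return train_one(lr)                            # map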

@task
def select_best(trials: list[dict]) -> dict:
    return max(trials, key=lambda r: r["score"])    # reduce

with DAG("hparam_search", schedule=None):
    select_best([trial(lr) for lr in [0.001, 0.01, 0.05, 0.1, 0.5]])

The parser captures the list shape at compile time; the runtime assembles the upstream XComs in declaration order and delivers them as a real Python list — with per-trial isolation, per-trial retry, and a null slot for any upstream that legitimately produced no result. See Map-reduce for ML → for the guarantees, limits, and the dag.json shape.
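
Because a slot can be null, a reducer that indexes every element directly (as `select_best` does above) would fail on an empty slot. The following is a minimal sketch of a null-tolerant reducer, written as plain Python and assuming a null slot arrives as `None`. The sample trial results are illustrative, not real measurements:

```python
from typing import Optional

def select_best(trials: list[Optional[dict]]) -> Optional[dict]:
    """Return the highest-scoring trial, skipping empty (None) slots."""
    completed = [t for t in trials if t is not None]
    if not completed:
        return None                  # every upstream produced no result
    return max(completed, key=lambda r: r["score"])

# Illustrative inputs in declaration order; the second slot is empty.
trials = [
    {"lr": 0.001, "score": 0.71},
    None,
    {"lr": 0.05, "score": 0.84},
]
best = select_best(trials)
```

In a DAG the function would carry the `@task` decorator; the filtering logic is unchanged.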

The dev loop

leoflow lite provision       # check + provision host deps (dev-only)
leoflow init dags/my_dag     # scaffold a project
leoflow lite dags/my_dag     # hot-reload at http://localhost:8088 (Lite edition)

Lite ships today (pre-alpha, local on a trusted network); Pro (the Kubernetes edition) is a near-term goal — see the roadmap and Editions for the split.