Local Evaluator

FoehnCast uses the local evaluator target as the default contributor runtime. The bootstrap-local path starts the validated service subset, resets disposable local state, runs one end-to-end feature and training hand-off, prepares Feast serving state, and verifies the main app and operator contracts before it reports success.

This page records the current local runtime contract that is validated by the bootstrap path and by the runtime-stack and monitoring-stack tests. It describes the current baseline, not a future migration plan.

Scope

This page describes the currently validated local evaluator target. It is not a roadmap. Future changes should be documented after they are chosen and implemented.

Target Shape

```mermaid
flowchart LR
    BOOT[./scripts/bootstrap-local.sh] --> AIR[Airflow plus Postgres metadata]
    BOOT --> APP[FastAPI app]
    BOOT --> MLF[MLflow]
    BOOT --> OBJ[MinIO objectstore]
    BOOT --> FEAST[Feast Datastore emulator]
    BOOT --> MON[Prometheus plus StatsD exporter plus Grafana]
    AIR --> FEAT[feature_pipeline DAG]
    FEAT --> CUR[(Curated feature data)]
    FEAT --> REQ[training-request asset]
    REQ --> TRN[training_pipeline DAG]
    TRN --> REG[(MLflow registry)]
    REG --> APP
    CUR --> FEAST
    APP --> MET[/metrics]
    MET --> MON
```

The important point is that the local evaluator is a real runtime target, not a mock environment:

  • the same feature, training, inference, and monitoring modules run inside the local stack
  • the bootstrap waits for a real asset-triggered training run instead of stopping after container startup
  • Feast serving state is prepared after curated features exist instead of being treated as an optional afterthought
  • rider-facing and operator-facing surfaces stay separate even in local mode

Bootstrap Responsibilities

The default contributor entrypoint is still simple:

  1. Install Docker.
  2. Clone the repository.
  3. Run ./scripts/bootstrap-local.sh.

You do not need local gcloud, Terraform, GitHub Actions variables, or a compiler toolchain for this path.

The bootstrap path owns these responsibilities:

  • reset Docker volumes, local Airflow metadata, and disposable runtime artifacts so each run starts clean
  • start the validated service subset without enabling the optional development_env container
  • wait for Airflow component health checks and the Airflow API health payload
  • verify Grafana provisioning before any pipeline run starts
  • run the feature pipeline for the selected date
  • wait for the asset-triggered training pipeline to finish successfully
  • prepare Feast local serving state and verify the live /features/online route
  • wait for app health and verify hosted-sync metrics on /metrics
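Several of the responsibilities above are wait-until-healthy steps. A generic sketch of that pattern, with the actual probe passed in as a callable (timeout and interval values are assumptions, not the script's real settings):

```python
import time

def wait_for(check, timeout_s: float = 300.0, interval_s: float = 1.0) -> bool:
    """Poll `check()` until it returns True or the timeout elapses.
    `check` is any zero-argument callable, e.g. an HTTP health probe."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval_s)
    return False

# Usage with a trivial probe that succeeds immediately:
print(wait_for(lambda: True, timeout_s=5.0))  # True
```

Keeping the probe separate from the loop lets the same helper serve the Airflow, Grafana, Feast, and app checks.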

If the preferred local ports are already occupied, the bootstrap moves the bindings to the next free ports and prints the resolved endpoints.
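One way to implement that fallback is to probe bindability with a plain TCP socket and walk upward from the preferred port; this is a sketch of the idea, not the bootstrap's actual code:

```python
import socket

def next_free_port(preferred: int, host: str = "127.0.0.1",
                   max_tries: int = 50) -> int:
    """Return `preferred` if it is bindable, else the next free port above it."""
    for port in range(preferred, preferred + max_tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            try:
                s.bind((host, port))
            except OSError:
                continue  # port occupied, try the next one
            return port
    raise RuntimeError(f"no free port in [{preferred}, {preferred + max_tries})")
```

Printing the resolved endpoints afterwards matters because the rest of the verification steps must target the ports that were actually bound.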

Runtime Surfaces

| Surface | Current role in the local evaluator | Must not become |
|---|---|---|
| FastAPI app | serve /health, /spots, /predict, /rank, /features/online, and /metrics | a hidden training or operator control plane |
| Airflow plus Postgres metadata database | run feature and training orchestration with asset hand-offs | a notebook-only demo path |
| MLflow | track runs, registry versions, and local model loading | the rider-facing interface |
| MinIO objectstore | local object-access baseline for curated features and MLflow artifacts | a replacement for the online feature layer |
| Feast Datastore emulator | required local online-serving layer above the curated contract | the primary storage contract |
| Prometheus, StatsD exporter, and Grafana | operator monitoring and provisioning validation | the product UI |
| development_env | optional notebook and dev-shell helper surface | part of the default contributor path |

This keeps the local target close to the hosted architecture. The object-access layer follows the same shape as the hosted artifact path, while Feast continues to provide the online-serving boundary instead of turning storage into a serving shortcut.
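Under that boundary, a verification client only ever touches the serving route, never the object store directly. A hypothetical probe of the live route (the `spot_id` query parameter and base URL are assumptions; the real parameters live in the FastAPI app's schema):

```python
from urllib import request

def online_features_request(base_url: str, spot_id: str) -> request.Request:
    """Build a GET request for the live /features/online route.
    The query parameter shown here is illustrative only."""
    url = f"{base_url}/features/online?spot_id={spot_id}"
    return request.Request(url, headers={"Accept": "application/json"})

req = online_features_request("http://localhost:8000", "demo-spot")
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` against the running app is what proves the Feast serving state was actually prepared.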

Verification Contract

The local evaluator reports success only after the runtime proves several real contracts.

The current verification path includes:

  • Airflow component health checks for the webserver, dag-processor, scheduler, triggerer, and metadata database
  • an Airflow API health payload check instead of treating 200 OK alone as enough
  • Grafana API checks for the checked-in dashboard, alert rules, contact point, and policies
  • a real feature_pipeline DAG test run through Airflow
  • a real asset-triggered training_pipeline success wait
  • a live /features/online request against the running app
  • app health and hosted-sync metric verification through /metrics
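The payload check in the second bullet can be sketched as parsing the health body component by component rather than trusting the HTTP status code; the component names mirror the Airflow /health response, though exact keys can vary by Airflow version:

```python
import json

def airflow_health_ok(body: str) -> bool:
    """Return True only if every listed Airflow component reports healthy.
    A 200 OK with a degraded component would still fail this check."""
    health = json.loads(body)
    components = ("metadatabase", "scheduler", "triggerer", "dag_processor")
    return all(
        health.get(name, {}).get("status") == "healthy" for name in components
    )

# Hypothetical healthy payload:
sample = json.dumps({
    "metadatabase": {"status": "healthy"},
    "scheduler": {"status": "healthy"},
    "triggerer": {"status": "healthy"},
    "dag_processor": {"status": "healthy"},
})
print(airflow_health_ok(sample))  # True
```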

That makes the local target more than a container smoke test. It exercises the same hand-offs the rest of the system pages describe.

Disposable And Retained Local State

The local evaluator resets disposable state aggressively, but it still keeps the important contracts explicit.

The current split is:

  • disposable Airflow metadata and logs are cleared by the bootstrap path
  • curated features and MLflow artifacts use the MinIO-backed local objectstore baseline for the run
  • retained prediction monitoring history lives under .state/monitoring/ instead of mixing into data/
  • pipeline summaries are rendered under airflow/reports/, with history copies under airflow/reports/history/

This boundary matters because reproducibility depends on resetting transient runtime state without pretending that monitoring history or rendered evidence are themselves product data.
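One way to keep that split explicit is to classify paths against the retained roots before any reset runs. The root paths below come from the split described above; the helper itself is an illustrative sketch, not the bootstrap's real implementation:

```python
from pathlib import PurePosixPath

# Retained roots from the documented state split; everything else in the
# runtime tree is treated as disposable in this sketch.
RETAINED_ROOTS = (".state/monitoring", "airflow/reports")

def is_retained(path: str) -> bool:
    """Return True if `path` is, or sits under, a retained root."""
    p = PurePosixPath(path)
    return any(
        p == PurePosixPath(root) or PurePosixPath(root) in p.parents
        for root in RETAINED_ROOTS
    )

print(is_retained(".state/monitoring/predictions.csv"))  # True
print(is_retained("airflow/logs/scheduler.log"))         # False
```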

Local-Only Overrides

The checked-in Grafana configuration keeps deployable-safe defaults: anonymous access, public dashboard sharing, and embedding are all disabled.

The local evaluator applies local-only access overrides so a fresh Docker run can still verify Grafana provisioning without extra manual setup. That keeps the public and hosted surface policy intact:

  • Grafana remains an operator surface
  • rider-facing evaluation still belongs to the Streamlit demo and API outputs
  • public docs should keep preferring rendered evidence over live embeds of private dashboards

Why This Target Works

  • it gives contributors a clone-and-run baseline instead of a partial demo stack
  • it proves the feature-to-training hand-off through Airflow assets before declaring the stack ready
  • it keeps the online feature contract real by preparing Feast state and calling the live route
  • it keeps optional notebook tooling outside the default path so the supported baseline stays smaller and easier to reproduce

See Architecture, Inference Pipeline, Monitoring, and Cloud Mapping for the surrounding runtime and deployment boundaries.