Skip to content

Grading Checklist

Quick reference: what we built and where to find it. Each section maps to 20% of the grade.

Architecture (20%)

Clean FTI split, feature store, model registry, containerized deployment.

What Where
Feature / Training / Inference split src/foehncast/feature_pipeline/, training_pipeline/, inference_pipeline/
Airflow DAGs with asset triggers (local) dags/feature_dag.py, training_dag.py, inference_dag.py
Cloud Run jobs + Workflows (cloud) terraform/main.tf — FTI cascade (6 h), drift detection (12 h)
Feature store (Feast) feature_repo/
Model registry (MLflow) Champion/candidate aliases in training_pipeline/register.py
6 container services containers/ (Airflow, MLflow, app, UI, monitoring, dev)
Compose with overlay pattern docker-compose.yml + objectstore.yml / gcp.yml
Local + cloud deployment scripts/bootstrap-local.sh, terraform/main.tf
Storage abstraction feature_pipeline/store.py switches between S3 and BigQuery

Docs: Architecture

Automation (20%)

Everything runs without manual steps after bootstrap.

What Where
CI (7 jobs: shell, lint, terraform, dvc, compose, test, docs) .github/workflows/ci.yml
Auto image publishing Cloud Build triggers (GCP-native, path-filtered)
Infrastructure-as-code terraform/main.tf + terraform.yml workflow
Asset-triggered training (local) Feature DAG → training-request asset → Training DAG
Cloud Workflows cascade (cloud) Cloud Scheduler → feature → training → inference (scale-to-zero)
Asset-triggered inference (local) Model registered → Inference DAG runs batch predictions
Pre-commit hooks (8) .pre-commit-config.yaml (ruff, whitespace, YAML, etc.)
Bootstrap scripts scripts/bootstrap-local.sh, scripts/bootstrap-gcp.sh
Runtime release + rollback Cloud Build triggers + Cloud Run probes (automatic rollback)

Docs: Delivery Workflow

Reproducibility (20%)

Same results on any machine.

What Where
DVC pipeline (curate + train) dvc.yaml, dvc.lock
Tracked outputs data/, reports/train_metrics.json, reports/feature_importance.png
Locked dependencies pyproject.toml + uv.lock
Pinned containers python:3.12-slim, multi-stage, uv sync --frozen
Config-driven (no magic numbers) config.yaml has spots, model params, thresholds
Data lineage in MLflow SHA-256 hash + git commit logged per run
Local smoke test = CI smoke test make smoke-local-evaluator

Docs: Feature Pipeline, Training Pipeline

Code Quality (20%)

Static analysis, tests, and consistent structure.

What Where
Linting + formatting ruff via pre-commit and make lint
Unit tests across the pipeline modules tests/
CI enforces everything Lint, test, shell checks, terraform validate on every PR
Type annotations from __future__ import annotations throughout
Clean package structure Domain subpackages + shared utilities (_bigquery.py, etc.)
Shell validation ShellCheck-style checks in CI

Docs: Repository

Monitoring (20%)

Prometheus metrics, drift detection, alerting, and visualization.

What Where
Custom Prometheus exporters monitoring/pipeline_prometheus.py, prediction_prometheus.py
Combined /metrics endpoint Feature + training + prediction + drift metrics
Drift detection (Evidently) monitoring/drift.py — statistical tests per column
Drift Cloud Run job (12 h schedule) terraform/main.tf — runs drift.py on Cloud Run
Hindcast validation monitoring/hindcast.py — predicted vs. observed
Streamlit charts (Altair + PromQL) ui/app.py — system health, drift, pipeline panels
9 alert rules prometheus_config/alerting_rules.yml
Prediction event log .state/monitoring/prediction-events.jsonl (local), BigQuery (cloud)
Scrape config in version control prometheus_config/prometheus.yml

Docs: Monitoring