A solution accelerator for building microservices on Databricks.
microbricks demonstrates how to build microservices on Databricks using Databricks Apps for the runtime, Lakebase Autoscale Postgres for per-service state, OBO authentication for end-to-end user identity, and DABs + GitHub Actions for CI/CD across dev/test/prod.
The demo domain is healthcare: six backend services that loosely follow HL7 FHIR resource boundaries, plus one frontend portal with an in-process BFF.
Status: Phases 1–7 complete. Six backend services, the BFF, the DAB bundle, and six GitHub Actions workflows are all in place. The workflows run end-to-end against the dev workspace via M2M (service-principal) auth, build every apx frontend, deploy all seven apps in parallel, and resolve each app's canonical URL (with workspace ID) from the apps API. Dev is also fully runnable from a workstation via scripts/ci-local.sh. See ROADMAP.md for the full phase breakdown and the remaining open work.
Browser → hc-portal (frontend + BFF) → 6 backend services → 6 Lakebase databases
          └── REST /api/bff/...        └── one DB per service, no shared schema
          └── GraphQL /api/graphql     └── each exposes /api/graphql (Strawberry)
          └── forwards OBO token
- 6 backend services (`patient`, `provider`, `appointment`, `lab`, `prescription`, `billing`), each its own Databricks App with its own Lakebase project. Each exposes both REST (`/api/v1/`) and GraphQL (`/api/graphql`) endpoints via Strawberry.
- 1 frontend (`hc-portal`): a BFF that orchestrates calls and joins data in-memory. Holds no data of its own. Exposes a GraphQL gateway at `/api/graphql` with DataLoader-based batch resolution and nested cross-service traversal.
- OBO auth end-to-end. Every Postgres connection is opened with the calling user's OAuth credential, so Unity Catalog enforces access at the data layer.
- No backend-to-backend calls. The BFF is the only place where data from multiple services is joined.
- Dual API surface. REST endpoints remain stable; GraphQL coexists and enables field selection, nested traversal, and partial-failure handling natively. The frontend is being migrated page-by-page from React Query (REST) to Apollo Client (GraphQL).
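To make the BFF bullets above concrete, here is a minimal sketch of an in-memory join with batched lookups and partial-failure handling. It is not the repo's code: the function names, the in-memory `PATIENTS` / `APPOINTMENTS` data, and the token string are all hypothetical stand-ins, and the real BFF would make HTTP calls to the backend services (how the OBO token travels on those calls is not shown here).

```python
import asyncio

# Hypothetical stand-ins for two backend services. In the real system these
# would be HTTP calls that forward the caller's OBO token.
PATIENTS = {"p1": {"id": "p1", "name": "Ada"}, "p2": {"id": "p2", "name": "Lin"}}
APPOINTMENTS = [
    {"id": "a1", "patient_id": "p1"},
    {"id": "a2", "patient_id": "p2"},
    {"id": "a3", "patient_id": "p1"},
    {"id": "a4", "patient_id": "p9"},  # unknown to the patient service
]

calls = []  # records each simulated downstream call


async def list_appointments(token: str) -> list[dict]:
    calls.append(("appointment", token))
    return [dict(a) for a in APPOINTMENTS]


async def get_patients_batch(ids: list[str], token: str) -> dict[str, dict]:
    calls.append(("patient", token, tuple(ids)))
    return {i: PATIENTS[i] for i in ids if i in PATIENTS}


async def appointments_with_patients(token: str) -> list[dict]:
    appointments = await list_appointments(token)
    # Deduplicate keys so the patient service sees one batched call
    # instead of one call per appointment (the DataLoader idea).
    ids = sorted({a["patient_id"] for a in appointments})
    try:
        patients = await get_patients_batch(ids, token)
    except Exception:
        # Partial failure: keep the appointments, return the join as None.
        patients = {}
    # The join happens here, in BFF memory; the services never call each other.
    return [{**a, "patient": patients.get(a["patient_id"])} for a in appointments]


result = asyncio.run(appointments_with_patients("user-obo-token"))
```

Four appointments produce exactly one call to each service, and the appointment whose patient cannot be resolved is still returned with `patient` set to `None`.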
Detailed architecture: ARCHITECTURE.md. Data model: HEALTHCARE_DATA_MODEL.md. Implementation plan: ROADMAP.md. Brand assets: docs/brand/.
| Tool | Min version | Install |
|---|---|---|
| Databricks CLI | 1.2.1 | `brew install databricks` (the GitHub Actions workflows pin 1.2.1) |
| apx | latest | `curl -fsSL https://databricks-solutions.github.io/apx/install.sh \| sh` |
| uv | latest | `curl -LsSf https://astral.sh/uv/install.sh \| sh` |
| bun | latest | `curl -fsSL https://bun.sh/install \| bash` |
| gh | latest | `brew install gh` |
| psql (optional) | 16 | `brew install postgresql@16` |
apx + bun are needed at deploy time (CI and local) because each apx project's React UI is built into src/<pkg>/__dist__/ before bundle deploy — see scripts/build-frontends.sh.
Three Databricks workspaces (FE-VM serverless type, required for Lakebase + Apps):
| Profile | Workspace | Used for |
|---|---|---|
| hc-dev | dev workspace URL | develop branch + per-feature-branch preview environments (local dev + PR) |
| hc-test | test workspace URL | release/* branches and main HEAD |
| hc-prod | prod workspace URL | tagged releases (v*) |
Configure them in ~/.databrickscfg:
databricks auth login --host https://<dev-workspace>.cloud.databricks.com --profile hc-dev
databricks auth login --host https://<test-workspace>.cloud.databricks.com --profile hc-test
databricks auth login --host https://<prod-workspace>.cloud.databricks.com --profile hc-prod

# 1. Clone
git clone https://github.com/databricks-solutions/microbricks.git
cd microbricks
# 2. Validate the bundle (no Databricks calls, just YAML + schema checks)
databricks bundle validate -t dev
# 3. Bootstrap dev: lint + tests + provision 6 Lakebase projects + alembic +
# build frontend UIs + `bundle deploy -t dev` + `bundle run` per app +
# /healthz smoke. ci-local.sh wraps every step the GitHub Actions
# `deploy-dev.yml` workflow runs in CI.
./scripts/ci-local.sh deploy dev
# 4. Seed synthetic patient/provider/appointment/lab/prescription/billing
# data into the freshly-provisioned production branches.
make seed-dev

The seven dev apps (patient, provider, appointment, lab, prescription, billing, hc-portal) come up RUNNING: the bundle run step in CI submits an app deployment per app in parallel and waits for RUNNING state. Lakebase endpoints scale to zero after 1h idle, so an inactive dev environment is ~$0.
Already have a deployed bundle and just want to (re-)launch the apps? Use the standalone helper:
# Plain trunk-dev: deploy bundle, then start all seven apps (parallel-friendly)
./scripts/deploy-and-run-bundle.sh dev
# Already deployed — just kick a new app deployment per app
./scripts/deploy-and-run-bundle.sh dev --skip-deploy --restart
# Iterate on a subset
./scripts/deploy-and-run-bundle.sh dev --only=patient,lab

For the per-PR / per-feature-branch workflow:
# Day-to-day fast loop: lint + tests + bundle-validate, no deploy.
./scripts/ci-local.sh pr-validate --no-deploy
# Full preview deploy: provision 6 feature branches off `production`, alembic
# on changed services, deploy 7 preview apps suffixed `-<slug>`, smoke-test.
./scripts/ci-local.sh pr-validate
# Tear it all down when you're done iterating.
./scripts/ci-local.sh pr-cleanup

See CONTRIBUTING.md "Running CI locally" for the full reference. Every step can also be driven by Claude Code: open the repo in Claude Code and the project-local skills (.claude/skills/) are auto-discovered.
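Preview apps are suffixed `-<slug>`, where the slug is derived from the code branch by `scripts/sanitize-branch-slug.sh`. The exact rules live in that script and are not reproduced here. The sketch below only illustrates the general idea under assumed rules (lowercase, non-alphanumerics collapsed to hyphens, a hypothetical 20-character limit); the function name and the limit are illustrative, not taken from the repo.

```python
import re


def sanitize_branch_slug(branch: str, max_len: int = 20) -> str:
    """Turn a git branch name into a short, hyphen-separated slug.

    Assumed rules (not taken from the repo): lowercase, replace every run
    of non-alphanumeric characters with one hyphen, trim hyphens, truncate.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    # Truncate, then trim again so the slug never ends with a hyphen.
    return slug[:max_len].rstrip("-")


slug = sanitize_branch_slug("feature/ABC-123_add-labs")
preview_app = f"hc-portal-{slug}"  # preview apps are suffixed -<slug>
```

Under these assumed rules the same branch name always maps to the same slug, which is what lets a cleanup step find the preview resources that a deploy step created.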
.
├── README.md # This file
├── ARCHITECTURE.md # Architecture reference
├── HEALTHCARE_DATA_MODEL.md # Per-service data model
├── CONTRIBUTING.md # GitFlow + PR rules + "Running CI locally"
├── ROADMAP.md # Phased implementation plan (all phases ✅ except 8)
├── Makefile # `make seed-dev` and per-service seed targets
├── databricks.yml # Root DAB (3 targets: dev / test / prod)
├── resources/ # DAB resource includes (apps + BFF + shared)
├── services/ # 6 backend microservices (one APX project each)
│ ├── patient/ # ✅ all six scaffolded; auth.py / db.py / migrations
│ ├── provider/ # are byte-identical except for entity names
│ ├── appointment/ # each exposes REST + GraphQL (Strawberry)
│ ├── lab/
│ ├── prescription/
│ └── billing/
├── frontend/
│ └── hc-portal/ # ✅ React UI + BFF; REST aggregation + GraphQL
│ # gateway with DataLoaders + Apollo Client
├── scripts/
│ ├── ci-local.sh # Local CI emulator — pr-validate / pr-cleanup /
│ │ # deploy {dev,test,prod} / nightly-cleanup
│ ├── build-frontends.sh # Auto-discovers apx projects under frontend/
│ │ # and runs `apx frontend build` for each
│ ├── deploy-and-run-bundle.sh # `bundle deploy` + per-app `bundle run` in
│ │ # one verb (with --only / --skip-deploy /
│ │ # --restart / --no-wait / --var passthrough)
│ ├── sanitize-branch-slug.sh # Code-branch -> Lakebase-/preview-slug transform
│ ├── lakebase-project-{up,down}.sh # Per-env Lakebase project lifecycle
│ ├── lakebase-branch-{up,down}.sh # Per-feature-branch lifecycle
│ └── seeds/ # Shared seed primitives (deterministic UUIDs)
├── tests/seeds/ # Cross-service ID-stability tests (no DB needed)
├── docs/
│ ├── brand/ # Logo + icon + Photoshop source
│ └── diagrams/ # *.drawio + exported *.png
├── .github/
│ ├── workflows/ # 6 workflows: pr-validate, pr-cleanup,
│ │ # deploy-{dev,test,prod}, nightly-orphan-cleanup
│ └── release-template.md # PR template for release/* -> main PRs
└── .claude/ # Project-local Claude Code skills (auto-discovered)
└── skills/
├── hc-microservice-scaffold/
├── hc-lakebase-branching/
├── hc-obo-auth/
├── hc-dab-deployment/
├── hc-bff-pattern/
└── hc-gitflow-cicd/
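The tree above notes that `scripts/seeds/` holds shared seed primitives with deterministic UUIDs, and that `tests/seeds/` checks cross-service ID stability without a database. One common way to get that property is name-based UUIDs (`uuid.uuid5`). The sketch below assumes that approach; the namespace string and the `seed_id` helper are hypothetical, and the repo may derive its IDs differently.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as every service
# seeds from the same one.
SEED_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "microbricks.example")


def seed_id(entity: str, index: int) -> uuid.UUID:
    """Derive a stable UUID from an entity name and a sequence number."""
    return uuid.uuid5(SEED_NAMESPACE, f"{entity}:{index}")


# The patient seed and the appointment seed can each compute the ID of
# "patient 1" on their own and still agree, with no shared database.
id_in_patient_seed = seed_id("patient", 1)
id_in_appointment_seed = seed_id("patient", 1)
other_patient = seed_id("patient", 2)
```

Because the IDs are a pure function of their inputs, a stability test can compare values computed in separate modules without connecting to any database.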
The repo ships six project-local skills under .claude/skills/. Each one codifies a piece of the reference architecture so that future contributors (and Claude Code itself) follow the same patterns.
| Ask Claude Code… | Skill that fires |
|---|---|
| "Scaffold a new service called X" | hc-microservice-scaffold |
| "Spin up a Lakebase branch for my feature" / "tear it down" | hc-lakebase-branching |
| "Add a new route to the patient service" | hc-obo-auth |
| "Deploy to test" | hc-dab-deployment |
| "Add a BFF endpoint that joins patient + appointment" | hc-bff-pattern |
| "Add a GraphQL query / migrate a page to GraphQL" | hc-graphql |
| "Cut a release / open a PR" | hc-gitflow-cicd |
The repo ships six workflows under .github/workflows/, all pinned to Databricks CLI 1.2.1:
| Workflow | Trigger | What it does |
|---|---|---|
| pr-validate.yml | PR open/sync against develop / release/* / main | Path-scoped matrix: lint + unit tests for changed services, BFF tests, bundle-validate, provision per-feature Lakebase branches, alembic, build apx frontends, parallel bundle deploy + per-app bundle run, /healthz smoke, resolve canonical URLs from the apps API, comment URLs on PR |
| pr-cleanup.yml | PR close (merged or not) | bundle destroy of the preview + tear down 6 Lakebase feature branches |
| deploy-dev.yml | Push to develop | Same shape (no path filter, all six services + portal): alembic on production branches, build frontends, bundle deploy -t dev, parallel bundle run per app, wait for RUNNING, /healthz smoke |
| deploy-test.yml | Push to release/* or main | Same shape, -t test |
| deploy-prod.yml | Push tag v* on main | Same shape, -t prod, gated by manual approval on the prod GitHub environment |
| nightly-orphan-cleanup.yml | Daily cron (04:17 UTC) + manual | GC Lakebase feature branches whose PR is closed |
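The "parallel bundle run per app" step in the table can be pictured with the following sketch. It does not invoke the Databricks CLI: `run_app` is a hypothetical stand-in that sleeps for a made-up startup time, and the real workflows use shell subshells rather than Python threads. It only shows why running the seven apps concurrently makes wall time track the slowest app instead of the sum, while each app keeps its own log.

```python
import time
from concurrent.futures import ThreadPoolExecutor

APPS = ["patient", "provider", "appointment", "lab",
        "prescription", "billing", "hc-portal"]

# Made-up startup times in seconds, scaled down for the demo.
STARTUP = {app: 0.05 + 0.02 * i for i, app in enumerate(APPS)}


def run_app(app: str) -> tuple[str, str, list[str]]:
    """Stand-in for one per-app `bundle run`: submit, wait, report state."""
    log = [f"{app}: deployment submitted"]
    time.sleep(STARTUP[app])  # simulates waiting for the app to start
    log.append(f"{app}: RUNNING")
    return app, "RUNNING", log


start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(APPS)) as pool:
    results = list(pool.map(run_app, APPS))
elapsed = time.perf_counter() - start

states = {app: state for app, state, _ in results}
logs = {app: log for app, _, log in results}  # one log per app, never interleaved
```

With these numbers the sequential cost would be the sum of all startup times, while the concurrent run finishes shortly after the slowest app does.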
A few details worth knowing:
- Auth: M2M (service-principal) via the OAuth `client_credentials` flow. Each deploy job writes a `~/.databrickscfg` profile (`hc-dev` / `hc-test` / `hc-prod`) from `vars.DATABRICKS_HOST_<ENV>` + `secrets.DATABRICKS_CLIENT_ID` + `secrets.DATABRICKS_CLIENT_SECRET` scoped to the matching GitHub environment. Smoke tests mint a short-lived bearer by POSTing `client_credentials` to `${DATABRICKS_HOST}/oidc/v1/token` (no PATs, and no `databricks auth token`, which is U2M-only). OIDC trust is the documented migration path; the inline rationale lives in `pr-validate.yml`.
- Frontends are built in CI. `scripts/build-frontends.sh` auto-discovers every apx project under `frontend/` (anything with both `pyproject.toml` and `package.json`) and runs `apx frontend build`. `bundle deploy` only syncs files, so without this step the deployed app would 404 on every page route. The bundle force-includes the resulting `src/<pkg>/__dist__/` via `sync.include` in `databricks.yml`.
- Apps deploy in parallel. `bundle deploy` wires the cross-app `CAN_USE` ACLs at deploy-time, so the seven `bundle run` calls (six services + `hc-portal`) fire concurrently. Wall time becomes `max(app startup)` rather than `sum`. Each subshell logs to its own file so a failure surfaces a clean log instead of interleaved noise.
- Smoke tests are strict. Both the status code AND the body are asserted (`200` + literal `{"ok":true}`), because Databricks Apps' OBO gateway returns 200-with-HTML for unauthenticated requests, which would fool a vanilla `curl -fsS`.
- Environment URLs come from the apps API. The Apps platform embeds the workspace ID into each app hostname (e.g. `hc-portal-<workspace-id>.azure.databricksapps.com`), so we can't construct it client-side. Each deploy workflow's final step calls `databricks apps get hc-portal -o json` and feeds the canonical URL into `environment.url` and (for previews) into the PR comment table.
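The auth and smoke-test details above can be sketched as two small helpers. This is an illustration, not the workflows' actual code (which is shell): the helper names are hypothetical, the `scope=all-apis` value is an assumption not stated in this README, and the health check parses JSON rather than comparing the literal string. Nothing here touches the network; the token request is only constructed.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_token_request(host: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) a client_credentials POST to /oidc/v1/token."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "scope": "all-apis",  # assumption: scope value is not given in this README
    }).encode()
    return urllib.request.Request(
        f"{host.rstrip('/')}/oidc/v1/token",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def is_healthy(status: int, body: str) -> bool:
    """Strict /healthz check: status 200 AND a JSON body equal to {"ok": true}."""
    if status != 200:
        return False
    try:
        return json.loads(body) == {"ok": True}
    except json.JSONDecodeError:
        # An HTML login page served with a 200 status lands here.
        return False


req = build_token_request("https://example.cloud.databricks.com/", "id", "secret")
```

A status-only check would accept the gateway's 200-with-HTML response; `is_healthy` rejects it because the body is not the expected JSON.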
Note: GH-hosted runners may not reach the dev workspace if FE-VM's managed IP allowlist is in effect (see the `github-runner-ip-acl` finding). Where that's the case, `scripts/ci-local.sh` runs the same logical pipelines from a developer's machine where the IP is already allowlisted. See `CONTRIBUTING.md` "Running CI locally" for the dev/CI division of labor.
The phased plan in ROADMAP.md is mostly done: phases 1–7 (six services + BFF + seed data + DAB bundle + dev rollout + workflows, including parallel bundle run, frontend build step in CI, M2M auth, and canonical URL resolution from the apps API) are ✅. Phase 8 wraps it up:
- Confirm CI ↔ Databricks reachability per env: verify each workspace's IP allowlist (if any) admits GH-hosted runner egress; otherwise pick one of the three options in the `github-runner-ip-acl` finding.
- Fresh-clone test: verify a stranger can go from `git clone` to a working PR with a preview app in under 30 minutes following only README + CONTRIBUTING.
- Runbooks: `docs/runbooks/{prod-rollback,hotfix,lakebase-branch-orphan}.md`.
- Cost audit: confirm idle endpoints scale to zero (already wired: 1h `suspend_timeout_duration` in `scripts/lakebase-project-up.sh` + `scripts/lakebase-branch-up.sh`); document expected dev-env monthly cost.
Future / optional follow-ups (Phases 9–14 in ROADMAP.md): service-mesh observability, saga/events demo, RLS demo, read-replica for prod, multi-region, shared APX UI library.
See LICENSE. This is a reference architecture: the data model is illustrative and the synthetic data is intentionally non-realistic.
| Library | Role | License |
|---|---|---|
| FastAPI | HTTP framework for all services and the BFF | MIT |
| Pydantic v2 | Request/response validation and settings | MIT |
| Strawberry GraphQL | GraphQL API layer in each backend service and the BFF | MIT |
| uvicorn | ASGI server | BSD-3-Clause |
| psycopg 3 | PostgreSQL driver (async + connection pooling) | LGPL-3.0 |
| SQLAlchemy | ORM used by Alembic migrations | MIT |
| Alembic | Database schema migrations | MIT |
| httpx | Async HTTP client (BFF-to-service calls) | BSD-3-Clause |
| Databricks SDK for Python | Workspace client, OBO credential generation | Apache-2.0 |
| React | Frontend UI library | MIT |
| TanStack Router | Type-safe file-based routing | MIT |
| TanStack Query | Server-state management and caching | MIT |
| Apollo Client | GraphQL client (used alongside REST hooks) | MIT |
| Vite | Frontend build tooling and dev server | MIT |
| Tailwind CSS | Utility-first CSS framework | MIT |
| shadcn/ui | Accessible UI component primitives | MIT |
| uv | Python package/project manager | MIT |
| Bun | JavaScript runtime and package manager | MIT |
