feat(authz): replace bespoke FGA with embedded OpenFGA ReBAC engine by lakhansamani · Pull Request #625 · authorizerdev/authorizer

lakhansamani · 2026-06-08T05:41:09Z

Summary

Replaces the not-yet-rolled-out Resource/Scope/Policy/Permission authorization engine (#607/#610/#611) with an OpenFGA-backed ReBAC engine. Since the old FGA was never released, it is removed entirely rather than deprecated.

What changed

Engine SPI (internal/authorization/engine) with an embedded OpenFGA implementation (memory/sqlite/postgres/mysql datastores) and external-mode scaffolding. Selected via --authorization-engine=fga.
GraphQL API
- Admin (super-admin gated, audited): _fga_write_model, _fga_get_model, _fga_write_tuples, _fga_delete_tuples, _fga_read_tuples.
- Runtime: fga_check, fga_batch_check, fga_list_objects — the subject is pinned to the authenticated token, never client-supplied; fail-closed.
Session/validate: required_permissions replaced by required_relations on session, validate_session, validate_jwt_token. Coarse roles/scope gating is unchanged.
Dashboard: new FGA admin UI — authorization-model editor, relationship tuples, access tester.
SQLite driver: standardized on modernc.org/sqlite via a local GORM dialect so the embedded OpenFGA SQL datastore links without a duplicate database/sql "sqlite" registration. Pure-Go, no CGO.

New flags

Deployment modes

single-node/dev: embedded + SQLite (migrations on boot).
HA / serverless: external Postgres/MySQL (or external OpenFGA service); run migrations as a separate init job; no embedded SQLite (ephemeral/non-shared disk).

Testing

go build ./..., go vet ./... clean.
SQLite integration + storage + graphql + authorization tests pass.
Embedded SQLite FGA verified in-process alongside GORM SQLite (no driver-registration panic).
Default binary starts without the previous "Register called twice for driver sqlite" panic.

Follow-ups (not in this PR)

SDK cleanup in authorizer-go / authorizer-js (remove required_permissions, add FGA client helpers).
Auth0 FGA → Authorizer import tool + MIGRATION.md.

Design docs included: FGA_OPENFGA_MIGRATION_PLAN.md, ENTERPRISE_AUTHZ_MODEL.md, AGENTIC_DELEGATION_DESIGN.md.

Remove the not-yet-rolled-out Resource/Scope/Policy/Permission engine (#607/#610/#611) and replace it with an OpenFGA-backed ReBAC engine.
- AuthorizationEngine SPI (internal/authorization/engine) with an embedded OpenFGA implementation (memory/sqlite/postgres/mysql datastores) plus external-mode flag scaffolding.
- GraphQL: admin _fga_write_model/_fga_get_model/_fga_write_tuples/_fga_delete_tuples/_fga_read_tuples and runtime fga_check/fga_batch_check/fga_list_objects (runtime principal pinned to the token subject).
- required_relations on session/validate_session/validate_jwt_token; coarse roles/scope gating unchanged.
- Dashboard FGA admin UI: authorization model editor, relationship tuples, access tester.
- Standardize the SQLite driver on modernc.org/sqlite via a local GORM dialect so the embedded OpenFGA SQL datastore links without a duplicate database/sql "sqlite" registration.

Flags: --authorization-engine, --fga-mode, --fga-store, --fga-store-url, --fga-external-url.

…model

Add design docs for the OpenFGA migration and the agentic-authorization program, and update the v2 roadmap.
- FGA_OPENFGA_MIGRATION_PLAN.md: phased plan, locked decisions, deployment modes (single-node / HA / serverless), implementation status.
- ENTERPRISE_AUTHZ_MODEL.md: OpenFGA model patterns (role grants, user-specific overrides, exclusions, hierarchy) with a worked example.
- AGENTIC_DELEGATION_DESIGN.md: RFC 8693 token exchange, act claim, attenuation, audit delegation chain, revocation.
- FGA_IMPLEMENTATION_AGENTS.md: program execution plan.
- ROADMAP_V2.md: agentic authorization track; corrected FGA/audit status.

…on FGA checks

- Admin introspection ops `_fga_list_users` and `_fga_expand` (super-admin gated). These reveal the access graph (who-can-access / why), so they are admin-only rather than end-user facing.
- Optional, trust-gated `user` on `fga_check`/`fga_batch_check`/`fga_list_objects`: a super-admin may query an explicit subject; an ordinary end-user token stays pinned to its own subject and a client-supplied `user` is rejected (prevents enumerating another user's access). Centralized in resolveFgaSubject; M2M/client-credentials callers to be allowed in Phase 2.
- Engine SPI: ListUsers and Expand methods on AuthorizationEngine.

Add tests for previously-uncovered surface:
- _fga_delete_tuples (removes a tuple; non-admin rejected)
- _fga_get_model (returns active model; non-admin rejected)
- trust gate enforced per decision op: fga_list_objects and fga_batch_check reject an ordinary user supplying another subject (not only fga_check)
- session query honors required_relations (separate wiring of the same helper as validate_session)

… relations

- engine.ReadModel now returns (id, dsl): _fga_get_model previously returned an empty FgaModel.id while _fga_write_model returned one. Populate it from the active OpenFGA model id.
- Add a validate_jwt_token required_relations test (the third entry point of the shared enforceRequiredRelations helper); re-logs in for a fresh access token since session ops in earlier subtests rotate the original.

…re config

The two-engine selector (--authorization-engine=policy|fga) was a vestige of the SPI design: the policy engine was removed entirely, leaving only OpenFGA. FGA is now enabled by configuring a store: --fga-store (embedded) or --fga-external-url (external). With neither set the engine is not constructed and the fga_* resolvers fail closed, identical to the previous default.
- Remove the AuthorizationEngine config field and CLI flag.
- --fga-store defaults to "" (set it to enable embedded FGA).
- Update stale comments/schema descriptions referencing the removed flag.

…ly FGA

Authorizer embeds OpenFGA in-process; it IS the engine. Trim the FGA config surface to what's actually used:
- Remove --fga-mode and --fga-external-url: external-OpenFGA-service mode was a non-functional stub (logged a warning, started no engine). HA/serverless use the embedded engine + an external SQL store (postgres/mysql), not a separate service. The AuthorizationEngine SPI still allows adding an external client later if a real need arises.
- Remove three dead flags left from the old policy engine, with zero consumers after its removal: --authorization-cache-ttl, --include-permissions-in-token, --authorization-log-all-checks.

FGA is now enabled solely by --fga-store (+ --fga-store-url). Build + full SQLite suite green.

The "not enabled" empty state referenced the removed --authorization-engine=fga flag. Rewrite it as a helpful empty state: correct enable command (--fga-store) with copy-to-clipboard, store options (memory/sqlite/postgres/mysql), and a docs link, styled to the dashboard's blue accent. Also replace the bare "No Tuples" empty state with guidance on what a tuple is and how to grant the first one.

…ride

When the main database is OpenFGA-compatible (sqlite/postgres/mysql/mariadb), FGA derives its store from --database-url automatically: no extra flags, with OpenFGA's tables living in the main DB (as the old engine did). --fga-store / --fga-store-url become overrides, required only when the main DB is unsupported (mongodb, dynamodb, cassandra, couchbase, arangodb, sqlserver) or to use a dedicated store.
- config.FGAStoreConfig() resolves the store (explicit override > main-DB derivation > disabled); unit-tested across the matrix.
- Migrations run on boot for SQL stores (idempotent, goose-locked → HA-safe).
- Dashboard "not enabled" copy updated to explain auto-reuse + the override.

Verified: a SQLite-configured instance auto-enables FGA (reused_main_db=true) with no --fga-store and no driver-registration panic.
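The resolution order described above (explicit override > main-DB derivation > disabled) can be sketched roughly as follows. This is a minimal illustration, not the actual config.FGAStoreConfig implementation: the `StoreConfig` type, the `resolveStore` function, its parameters, and the mapping of mariadb onto the mysql datastore are all assumptions made for the example.

```go
package main

import "fmt"

// StoreConfig is a hypothetical result type for this sketch.
type StoreConfig struct {
	Enabled      bool
	Store        string // memory, sqlite, postgres or mysql
	URL          string
	ReusedMainDB bool
}

// derivable lists the main database types named above as OpenFGA-compatible.
// Mapping mariadb onto the mysql datastore is an assumption of this sketch.
var derivable = map[string]string{
	"sqlite":   "sqlite",
	"postgres": "postgres",
	"mysql":    "mysql",
	"mariadb":  "mysql",
}

// resolveStore applies: explicit override > main-DB derivation > disabled.
func resolveStore(fgaStore, fgaStoreURL, dbType, dbURL string) StoreConfig {
	if fgaStore != "" {
		// An explicit --fga-store always wins, even on an unsupported main DB.
		return StoreConfig{Enabled: true, Store: fgaStore, URL: fgaStoreURL}
	}
	if store, ok := derivable[dbType]; ok {
		// No override: reuse the main database for OpenFGA's tables.
		return StoreConfig{Enabled: true, Store: store, URL: dbURL, ReusedMainDB: true}
	}
	// Unsupported main DB and no override: FGA stays disabled.
	return StoreConfig{}
}

func main() {
	fmt.Printf("%+v\n", resolveStore("", "", "sqlite", "authorizer.db"))
	fmt.Printf("%+v\n", resolveStore("", "", "mongodb", "mongodb://localhost"))
	fmt.Printf("%+v\n", resolveStore("postgres", "postgres://fga", "mongodb", "mongodb://localhost"))
}
```

Returning the zero value for the disabled case keeps the "not configured" path explicit, which matches the fail-closed behavior described for the resolvers.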

…thout store

- config: for every database OpenFGA can't use (mongodb, dynamodb, cassandra, scylla, couchbase, arangodb, sqlserver, libsql, cockroachdb, yugabyte, planetscale), FGAStoreConfig returns disabled when --fga-store/--fga-store-url are unset; an explicit --fga-store still enables it.
- integration: validate_session without required_relations succeeds when no FGA engine is configured, so the instance works normally without FGA.

Replace the raw-DSL-only model editor with a visual builder that generates OpenFGA DSL under the hood, plus a "DSL (advanced)" escape hatch:
- ModelBuilder: add/edit types, relations and permissions via forms (direct assignment (chips), unions, and inheritance ("X from Y")), no DSL knowledge needed.
- modelDsl.ts: generateDsl / parseDsl (best-effort) / validateModel / plain-English summarize + 3 starter templates (document sharing, folder inheritance, org/team/project). Verified round-trip; advanced constructs (and / but not / conditions) keep the user in DSL mode.
- Model page: Builder <-> DSL tabs, template chips, live "what this model means" summary, clearer intro copy. Loads an existing model into the builder when representable, else opens DSL.

… nav

Turn the three Authorization pages into a clear guided workflow:
- AuthSteps: a shared, clickable stepper (1 Define model → 2 Grant access → 3 Test access) shown on each page, with done/current/upcoming states. Steps stay deep-linkable so admins can jump directly.
- Each page now leads with "Step N · <title>", a concrete worked Example callout (document-sharing running example), and a "Next →" link to continue.
- "RBAC — your roles" model template generated from the instance's configured roles (fetched via admin _env), with role-name sanitization. Round-trip verified.
- Sidebar: the Authorization group is now collapsible (chevron, aria-expanded), default-open when on an authorization route.

…t tree

The hand-rolled form builder was fragile (delete bug, cluttered layout). Replace it with a robust master-detail tree editor:
- react-arborist tree shows types -> relations (expand/collapse, keyboard nav, per-node add/delete, selection); a detail pane edits the selected node's name, assignable types, and computed terms. Builder | DSL stays as two tabs.
- All model edits go through pure, unit-tested mutation helpers in modelDsl.ts (add/delete/rename type & relation, add/remove assignable & computed); this eliminates the in-place-mutation delete bug at the source. Verified by a standalone mutation test.
- Removed the bespoke ModelBuilder.tsx.

…le catalog

Replace the confusing tree/builder + Builder/DSL sub-tabs with one simple, example-driven editor:
- A catalog of 9 ready-to-use OpenFGA model examples (raw DSL, so they use the full language): document sharing, folder hierarchy, organizations & teams, RBAC roles, groups, block list (exclusion), multi-tenant SaaS, GitHub-style repos, and time-bound access (conditions), plus a dynamic "Your roles" example. Each card shows a description; clicking loads it into the editor.
- One DSL editor + a live plain-English summary + Save. No tree, no builder, no model sub-tabs. CRUD is load/edit/save.
- All 9 examples validated against the OpenFGA DSL transformer (the same one the backend uses on save).

Removed react-arborist and ModelTree.tsx.

The collapsible group header was styled as a faded uppercase section label (text-gray-400, uppercase), which read as a disabled item. Style it like a normal nav entry (text-sm, gray-700, blue-50 when active).

…pper

- DocsLinks: links to OpenFGA / ReBAC concepts, modeling guide, DSL reference, and relationship tuples, shown on the Model and Grant-access pages.
- Grant-access page: "Common grant patterns" cards (direct, assign a role, grant a whole role via role#assignee, public user:*, and grant-on-a-folder so all resources inherit) that prefill the form, plus a tip on avoiding a tuple per object id.
- Model page: switching to an example now confirms if there are unsaved changes and shows a toast; a note explains there is one active model and saving makes a new immutable version active.
- Stepper now marks a step done only when actually complete (model saved / tuples exist), so step 1 isn't checked when no model exists.

Add an "About model versions" info panel: one active model, saving creates a new immutable version, earlier versions are retained, OpenFGA models are append-only (a version can't be deleted individually), and separate models need separate stores.

OpenFGA models are append-only: individual versions cannot be deleted. Reset is the only way to remove a model and all its past versions and start fresh.
- engine: add Reset() to the AuthorizationEngine SPI; OpenFGA impl deletes the store (model + all versions + tuples) and creates a new empty one
- graphql: add _fga_reset mutation, super-admin gated and audited (admin.fga_reset). Refused while any relationship tuples still exist so live grants are never dropped silently; callers must delete tuples first
- dashboard: "Danger zone" on the model page. Disabled with a link to the Grant access page while tuples exist; otherwise a typed-confirmation dialog (type RESET) before wiping
- test: TestOpenFGAEngine_Reset covers store rotation, model clearing, tuple removal, and engine reuse

- Add engine.ErrNoModel sentinel; ReadModel returns it on a fresh store so callers treat "no model yet" as an empty state, not a failure. FgaGetModel maps it to an empty model for the dashboard's starting view. Fail-closed is unchanged: Check/BatchCheck/ListObjects still deny on a model-less store.
- Add authorizer_fga_checks_total, authorizer_fga_check_duration_seconds and authorizer_fga_operations_total, recorded across the FGA resolvers. Only low-cardinality constant labels are ever used as label values.
- Tests: ErrNoModel sentinel (engine), empty-model GraphQL state + metric recording (integration), metric helpers (unit).

…ubject

- Step 1 is now two-mode: a roles × permissions matrix (RbacBuilder, the default for non-developers) that generates a standard OpenFGA RBAC model, plus the Advanced (DSL) editor. No syntax to learn to define a model.
- Example catalogs (model examples and grant patterns) moved into modal popups so the editor and the add-tuple form stay the focus.
- Tester gains a User (subject) field so a super-admin can check any subject; result copy reflects the checked subject. Server already gates the override to admins.
- Grant page guards against writing tuples before a model exists, and only blocks on a genuine no-model error, never on a transient failure.
- Drop the dead _env.ROLES / AdminRolesQuery fetch.
- Add vitest + modelDsl.test.ts unit coverage (rbacModel, parse, summarize, example catalog).

- Add admin-only _admin_meta query (AdminMeta type) returning the configured roles / default_roles / protected_roles. Super-admin gated; the non-deprecated replacement for the role bits of _env (deprecated in v2).
- Dashboard model builder seeds its roles × permissions matrix from the real configured roles via _admin_meta, falling back to a generic set. The builder mounts only after the roles fetch settles so it never locks in the fallback.
- Test: admin_meta_test.go (super-admin gated, returns configured roles).

…tion

- Add docs/fga-rebac-guide.md: app vs FGA roles, identifying subjects by user:<id> (not names), org→project→resource hierarchy (grant once, inherit everywhere), and fine-grained grants that coexist with inheritance.
- Add "Org → project → resource" and "Company roles (RBAC)" model examples; make both concentric (editor implies viewer; permissions reference the next more-powerful one) per OpenFGA's concentric-relationships guidance.
- Add hierarchy_test.go proving inheritance from one org-level grant, scoped fine-grained grants, and concentric view, all keyed by user:<id>.
- Grant form nudges admins to use the user's id, not a name.

…date in CI

Reviewed every shipped model against openfga/agent-skills (the official OpenFGA modeling rules):
- Folder hierarchy example: chain owner down (`owner from parent_folder`) so a folder owner can edit its documents. This was the documented "parent role forgotten on child types" anti-pattern; rename parent → parent_folder per the naming convention; add folder can_view.
- Organizations & teams example: add can_view so apps check a permission, not the member relation directly.
- Model editor placeholder: concentric (editor implies viewer) instead of independent viewer/editor unioned in can_view.
- Add examples_validation_test.go: extracts every DSL from the dashboard catalog, the editor placeholder, and docs/fga-rebac-guide.md and writes each through the real embedded engine (the in-repo equivalent of `fga model validate`), so a malformed example can never ship.

- Replace every user:alice example, placeholder and grant-pattern prefill with the user:<id> / user:<user-id> convention the docs recommend (names aren't unique or stable); point admins at the Users page for the id.
- Fix the Grant access form alignment: the id hint under the User column made it taller than the other columns in the items-end grid; the hint is now a full-width row below the inputs so all fields and the Add button align.

…, id-only examples

- The model builder now always starts from the standard admin/editor/viewer matrix; the instance's configured roles are offered as one-click suggestion chips instead of being forced in as the seed (app roles like "user" make poor object-scoped FGA roles).
- Grant-pattern prefill uses folder:<folder-id>; ReBAC guide examples now use numeric object ids (organization:101, project:201, resource:301). Objects, like users, are identified by id, never by name. role:* objects stay keyed by role name by design.

…lver per file

BREAKING (branch-only, never released): replaces fga_check, fga_batch_check and fga_list_objects.
- Public surface is now exactly two operations:
  - check_permissions(checks: [{relation, object, contextual_tuples?}], user?) → results echo each pair with allowed (a single check is a batch of one).
  - list_permissions(relation, object_type, user?) → objects.
- Subject trust gate (resolveFgaSubject): defaults to the caller's token subject; an explicit `user` (bare id normalized to user:<id>) is honored only for super-admins or when it equals the caller's own subject; anything else is rejected, never silently ignored.
- Resolvers restructured one-per-file: fga.go (shared helpers + gate), check_permissions.go, list_permissions.go, fga_write_model.go, fga_get_model.go, fga_write_tuples.go, fga_delete_tuples.go, fga_read_tuples.go, fga_list_users.go, fga_expand.go, fga_reset.go.
- Dashboard: Access Tester page removed (the wizard is now 2 steps); per-user verification moved to Users table → "View Permissions" modal, which calls list_permissions with an explicit subject under the admin session.
- Metrics labels: check_permissions / list_permissions.
- Integration tests rewritten, including a new self-specification case (non-admin passing their own subject is honored).

Adding a tuple whose relation or object type isn't in the active model surfaced OpenFGA's raw gRPC error ("rpc error: code = Code(2000) desc = Invalid tuple ..."), which read as "can't add grant access".
- Map tuple-validation errors in _fga_write_tuples/_fga_delete_tuples to a friendly message that keeps OpenFGA's reason and points at Step 1; raw error stays in the debug log. Covered by an integration test (also asserts no gRPC internals leak).
- Grant-pattern modal now states tuples must match YOUR model; the folder pattern notes it needs a folder type.

All program design docs (FGA migration plan, agentic delegation design, enterprise authz model, implementation agents, migration-tool spec) and the ReBAC guide now live in the authorizer-docs repo under specs/. References in CLAUDE.md and ROADMAP_V2.md point there. The docs-guide DSL validation subtest is removed with the guide; dashboard example validation stays.

check_permissions accepted unbounded contextual-tuple arrays from any authenticated caller, relying on the embedded OpenFGA default limit as the only guard. Enforce an explicit cap (100) in toContextualTuples with unit coverage so the boundary no longer depends on engine configuration.

…init

The engine created a fresh OpenFGA store on every boot whenever no StoreID was passed (and no caller ever persisted one), so on SQL-backed deployments a restart orphaned the model and every tuple, and all checks failed with 'no authorization model written yet' until an admin rebuilt everything. New() now recovers the existing store by exact name via ListStores and adopts the store's latest authorization model, so persistent deployments survive restarts with zero operator action. Covered by a restart-continuity test that boots a second engine on the same SQLite file and asserts the original decisions still hold.

Engine-init failure no longer log.Fatal()s the instance: FGA is optional, so init errors (e.g. missing DDL rights for OpenFGA migrations) now log and leave the engine nil. Permission APIs fail closed, core auth keeps serving.

Also inlines the no-op strconvItoa wrapper.
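The recover-by-name idea can be sketched independently of OpenFGA. Everything below is hypothetical: the `storeAPI` interface, `findOrCreateStore`, and the in-memory fake are illustrative stand-ins and do not reflect the real OpenFGA client or server signatures.

```go
package main

import "fmt"

// Store is a hypothetical minimal store record.
type Store struct{ ID, Name string }

// storeAPI is a hypothetical abstraction over store listing and creation.
type storeAPI interface {
	ListStores() ([]Store, error)
	CreateStore(name string) (Store, error)
}

// findOrCreateStore adopts an existing store with the exact name and only
// creates a new one when none exists, so a restart does not orphan data.
func findOrCreateStore(api storeAPI, name string) (store Store, recovered bool, err error) {
	stores, err := api.ListStores()
	if err != nil {
		return Store{}, false, err
	}
	for _, s := range stores {
		if s.Name == name { // exact-name match only
			return s, true, nil
		}
	}
	s, err := api.CreateStore(name)
	return s, false, err
}

// memAPI is an in-memory fake standing in for a persistent datastore.
type memAPI struct{ stores []Store }

func (m *memAPI) ListStores() ([]Store, error) { return m.stores, nil }

func (m *memAPI) CreateStore(name string) (Store, error) {
	s := Store{ID: fmt.Sprintf("store-%d", len(m.stores)+1), Name: name}
	m.stores = append(m.stores, s)
	return s, nil
}

func main() {
	api := &memAPI{}
	first, _, _ := findOrCreateStore(api, "authorizer")          // first boot creates
	second, recovered, _ := findOrCreateStore(api, "authorizer") // "restart" recovers
	fmt.Println(first.ID == second.ID, recovered)
}
```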

…ers omitted

relation and object_type are now optional on list_permissions. When either is omitted, every matching (type, relation) pair of the active model is enumerated, so an empty input answers "what can this user access?" in one call. Pairs come from the new TypeRelations engine SPI method and are expanded via ListObjects with bounded concurrency (5) so a single request cannot saturate the embedded engine.

The response now carries (object, relation) detail in permissions[] and an explicit truncated flag when the 1000-entry cap is hit, replacing the previous silent truncation. The subject trust gate is unchanged: callers enumerate their own access unless super-admin.
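Bounded fan-out with an explicit truncation flag can be sketched with a buffered channel used as a semaphore. The `Pair` and `Permission` types, the `enumerate` function, and the `list` callback (standing in for a per-pair ListObjects call) are hypothetical; the worker count and entry limit are passed in rather than hard-coded to 5 and 1000.

```go
package main

import (
	"fmt"
	"sync"
)

// Pair and Permission are hypothetical shapes for this sketch.
type Pair struct{ Type, Relation string }
type Permission struct{ Object, Relation string }

// enumerate expands every pair through list with at most `workers` calls in
// flight, stops collecting at `limit` entries, and reports truncation.
func enumerate(pairs []Pair, list func(Pair) []string, workers, limit int) (perms []Permission, truncated bool) {
	sem := make(chan struct{}, workers) // counting semaphore
	var mu sync.Mutex
	var wg sync.WaitGroup
	for _, p := range pairs {
		wg.Add(1)
		sem <- struct{}{} // blocks while `workers` lookups are running
		go func(p Pair) {
			defer wg.Done()
			defer func() { <-sem }()
			objects := list(p)
			mu.Lock()
			defer mu.Unlock()
			for _, o := range objects {
				if len(perms) >= limit {
					truncated = true // surfaced to the caller, never silent
					return
				}
				perms = append(perms, Permission{Object: o, Relation: p.Relation})
			}
		}(p)
	}
	wg.Wait()
	return perms, truncated
}

func main() {
	pairs := []Pair{{"document", "viewer"}, {"document", "editor"}, {"folder", "viewer"}}
	list := func(p Pair) []string { return []string{p.Type + ":1", p.Type + ":2"} }
	perms, truncated := enumerate(pairs, list, 5, 1000)
	fmt.Println(len(perms), truncated) // 6 false
}
```

Result order is not deterministic in this sketch because pairs complete concurrently; a caller that needs stable output would sort the entries afterwards.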

The Users-table permissions modal now treats both filters as optional, matching the new list_permissions API: an empty form lists every permission the user holds. Results render as (object, permission) rows instead of bare object ids, and a notice appears when the server truncated at 1000 entries.

FGA tuples and permission lookups need the user's UUID; admins previously had to open the user detail view to get it. The ID now shows muted and monospaced under the email with a one-click copy button (existing clipboard + toast pattern); the click does not trigger the row's detail view.

The Users-table permissions modal now fetches everything the user can access the moment it opens — no filter input or button click required. The form is purely a narrowing filter (Apply filters / Refresh), skeleton rows show while loading, and all state resets on close so the next open starts fresh for any user.

…verride

TestFGADisabled now asserts that ALL admin FGA ops (including every write path (_fga_write_model, _fga_write_tuples, _fga_delete_tuples, _fga_reset) plus _fga_get_model, _fga_read_tuples, _fga_list_users, _fga_expand and the public list_permissions) return the not-enabled error when no engine is configured, even for a super admin. This proves no FGA record can be created via the API on an unsupported database without --fga-store, and is the exact error that switches the dashboard's Authorization tab into its FgaNotEnabled state.

TestFGAExplicitStoreOverrideForUnsupportedDB proves the other direction at the config→engine seam: a mongodb main DB with explicit --fga-store/--fga-store-url resolves to an enabled FGA config, and an engine built from it exactly as cmd/root.go wires it serves model writes, tuple writes, and checks.

Adds the first component-level dashboard tests: FgaNotEnabled (what the Authorization tab shows on databases without OpenFGA support and no --fga-store) must explain the state and surface the exact flags that fix it, and isFgaNotEnabledError — the single decision point that switches the tab into that state — is covered for the backend message, case variants, unrelated errors, and missing input. Component tests opt into jsdom per file; pure DSL tests stay on the node environment. New dev-only deps: jsdom, @testing-library/react, @testing-library/dom.

Resolves the #625 collision with the service-layer extraction:
- service deps swap AuthorizationProvider -> engine.AuthorizationEngine
- old permissions/permission_check service ops removed; CheckPermissions/ListPermissions ported into the transport-agnostic service layer
- session/validate_jwt_token/validate_session gates moved from required_permissions to required_relations (FGA, fail-closed)
- graphql check_permissions/list_permissions become thin service adapters
- proto: Permissions RPC replaced by CheckPermissions + ListPermissions (POST /v1/check_permissions, /v1/list_permissions, MCP-exposed); required_relations on Session/ValidateJwtToken/ValidateSession
- cmd/mcp.go wires the embedded OpenFGA engine like root.go

lakhansamani added 30 commits June 8, 2026 11:10

fix(dashboard): Authorization nav no longer looks disabled

6007c15

The collapsible group header was styled as a faded uppercase section label (text-gray-400, uppercase), which read as a disabled item. Style it like a normal nav entry (text-sm, gray-700, blue-50 when active).

docs(specs): v1→v2 migration tool design spec

b473534

docs: point openfga-modeling skill reference at the new permission APIs

094ea8a

docs(fga): note exact-string self-match semantics in the trust gate

0e374df

lakhansamani added 9 commits June 11, 2026 10:03

lakhansamani merged commit 3fd4777 into main Jun 12, 2026
2 checks passed

lakhansamani deleted the feat/fga-engine-spi branch June 12, 2026 02:21

lakhansamani mentioned this pull request Jun 12, 2026

feat(api): multi-protocol public API surface (GraphQL + gRPC + REST + MCP) #620

Merged

5 tasks

lakhansamani merged 39 commits into main from feat/fga-engine-spi
