Skip to content

refactor(service): extract internal/service layer; migrate SignUp (Phase 1)#615

Closed
lakhansamani wants to merge 1 commit into
feat/proto-skeletonfrom
feat/service-layer
Closed

refactor(service): extract internal/service layer; migrate SignUp (Phase 1)#615
lakhansamani wants to merge 1 commit into
feat/proto-skeletonfrom
feat/service-layer

Conversation

@lakhansamani

Copy link
Copy Markdown
Contributor

Stacked on top of #614. Review and merge after Phase 0 is in.

Summary

  • Introduces internal/service as the home for transport-agnostic public-API operations. Each method takes RequestMetadata (host, IP, UA, raw request) and returns a typed response plus a ResponseSideEffects bag the transport applies (today: cookies; later: redirects, trailing headers).
  • SignUp is the first migrated operation — chosen as the seam-proof because it exercises every gin coupling the legacy resolvers had: `GetHost`, `GetIP`/`GetUserAgent`, token issuance, session + mfa cookie writes, verification email/SMS dispatch.
  • The GraphQL resolver in `internal/graphql/signup.go` becomes a thin adapter: build `RequestMetadata` from `gin.Context`, call the service, apply side-effects. This is the seam Phase 2's gRPC and grpc-gateway REST handlers will reuse.

Supporting cleanups (all transport-agnostic mirrors of existing helpers)

  • `parsers.GetHostFromRequest` / `GetAppURLFromRequest` (raw `*http.Request`); gin wrappers now delegate.
  • `cookie.BuildSessionCookies` / `BuildMfaSessionCookies` (return `[]*http.Cookie`); gin wrappers now delegate.

Wiring

  • `service.Provider` constructed in `cmd/root.go` and `integration_tests/test_helper.go`.
  • `graphql.Dependencies` and `http_handlers.Dependencies` each grow a `ServiceProvider` field.

Known follow-up

`TokenProvider.CreateAuthToken` still takes `*gin.Context` even though it doesn't read from it. The service synthesizes a minimal shim (`&gin.Context{Request: meta.Request}`) at the call site, with a TODO comment. Refactoring `CreateAuthToken` to take `*http.Request` directly is a Phase 2 cleanup (~10 callers).

Test plan

  • All 11 `TestSignup` sub-cases pass
  • Full SQLite integration suite (71s) still green — no behaviour drift
  • `go build ./...` succeeds
  • `go vet ./...` clean (one pre-existing mongodb warning unchanged)
  • CI green on this stacked PR

Follow-ups

  • Phase 2 — proto for 9 public services + gRPC server + grpc-gateway REST (will migrate remaining 17 public ops into `internal/service` along the way).
  • Phase 4 — MCP server reading proto annotations.

🤖 Generated with Claude Code

…SignUp (Phase 1)

Introduces internal/service as the home for transport-agnostic public-API
operations. Each method takes a RequestMetadata (host, IP, UA, raw request)
and returns a typed response plus a ResponseSideEffects bag that the
transport applies (today: cookies; later: redirects, trailing headers).

SignUp is the first migrated operation — chosen as the seam-proof because
it exercises every gin coupling the legacy resolvers had: GetHost,
GetIP/GetUserAgent, token issuance, session + mfa cookie writes,
verification email/SMS dispatch.

The GraphQL resolver in internal/graphql/signup.go becomes a thin adapter:
build RequestMetadata from gin.Context, call the service, apply side-effects.
This is the seam Phase 2's gRPC and gRPC-gateway REST handlers will reuse.

Supporting changes (all transport-agnostic mirrors of existing helpers,
with the gin wrappers delegating to them so behaviour is identical):
- parsers.GetHostFromRequest / GetAppURLFromRequest (raw *http.Request)
- cookie.BuildSessionCookies / BuildMfaSessionCookies (return []*http.Cookie)

Wiring:
- service.Provider constructed in cmd/root.go and integration_tests/test_helper.go
- graphql.Dependencies + http_handlers.Dependencies grow a ServiceProvider field
- http_handlers.GraphqlHandler passes it through to graphql.New

Known follow-up: TokenProvider.CreateAuthToken still takes *gin.Context
even though it doesn't use it. The service synthesizes a minimal shim
(`&gin.Context{Request: meta.Request}`); refactoring CreateAuthToken to take
*http.Request directly is a Phase 2 cleanup tracked by a TODO at the call site.

Tests:
- All 11 TestSignup sub-cases pass
- Full SQLite integration suite (71s) still green — no behaviour drift

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@lakhansamani lakhansamani left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Principal-engineer review — Phase 1 (refactor(service): extract internal/service)

Reviewed against feat/proto-skeleton. Diff is ~1.2k lines, mostly the SignUp move and the gin/http helper splits.

Strengths

  • Seam shape is solid. RequestMetadata + ResponseSideEffects + MetaFromGin / ApplyToGin is a clean transport adapter. The Build*Cookies / GetHostFromRequest mirrors are the right way to split — old gin wrappers are now thin shims, so call sites pay zero cost.
  • Cookie semantics are preserved exactly. Both _session and _session_domain are emitted with the same Name/MaxAge/Path/Domain/Secure/HttpOnly/SameSite shape; the "." + domain rule (skipped for localhost) is intact in BuildSessionCookies. SameSite policy for MFA (Lax-when-insecure / None-when-secure) is preserved verbatim.
  • ApplyToGin is per-cookie SetSameSite-then-SetCookie — this is actually more correct than the old code, because future ops can return cookies with mixed SameSite without stomping each other.
  • Audit/IP/UA preserved. meta.IPAddress / meta.UserAgent captured by value before the goroutine — same behavior as the inline utils.GetIP(gc.Request) call.
  • Embedding pattern (*config.Config + Dependencies) matches graphqlProvider / httpProvider exactly. Idiomatic for this codebase, no surprises.
  • gcShim claim is correct. Verified auth_token.go: CreateAuthToken(gc, cfg) only forwards to CreateSessionToken/CreateAccessToken/CreateIDToken — none touch gc. The shim is safe today; the TODO is appropriate.

Issues

Important

  1. service.Meta is dead code in this PR. Provider interface declares Meta(ctx, meta) (*model.Meta, *ResponseSideEffects, error) and meta.go implements it, but nothing calls it — the GraphQL Meta() resolver still has the original signature Meta(ctx context.Context) (*model.Meta, error) and was not migrated to delegate. Either (a) migrate the resolver in this PR for consistency with SignUp, or (b) drop Meta from the interface and the file, defer to a later phase. As-is it's silent scope creep — the PR title/body promises only SignUp.

  2. No service-layer-direct tests. The whole point of extracting a transport-agnostic layer is to make it testable without spinning up gin. TestSignup still exercises the resolver path — fine as a regression net, but the seam itself is unverified. Bare minimum I'd want before Phase 2:

    • service/signup_test.go calling provider.SignUp directly with a hand-built RequestMetadata and stub providers (at least the happy email-verification + mobile-OTP + auth-token branches).
    • cookie/cookie_test.go for BuildSessionCookies and BuildMfaSessionCookies — these are now load-bearing helpers, and the "." + domain/localhost branch is exactly the kind of thing a copy-paste in Phase 2 will get wrong.
    • The new TestGetHostFromRequest / TestGetAppURLFromRequest are good, keep doing that.
  3. RequestMetadata.Request escape hatch will outlive its TODO. Documented as a "legacy" field for TokenProvider, but Phase 2 will introduce gRPC handlers that won't have a real *http.Request (or will need to synthesize one with empty headers). The gcShim trick works for CreateAuthToken because that call doesn't read it, but the moment someone calls GetAccessToken(gc) from the service layer, gRPC breaks silently. Worth either (a) refactoring CreateAuthToken in this PR to take *http.Request (the TODO promises a "Phase 2 cleanup, ~10 callers" — do it before gRPC lands, not after), or (b) explicit panic/log in the shim path when meta.Request == nil so gRPC fails loud instead of NPE.

Nits

  1. ResponseSideEffects.AddCookie doc says "safe on a zero-value receiver" — true for a non-nil *ResponseSideEffects pointing to a zero struct, but a nil pointer panics. Reword to "safe on a zero-value struct".
  2. Email-verification branch returns side, nil where side is empty — fine, but inconsistent with the early validation returns of nil, nil, err. Pick one style.
  3. service.New returns (Provider, error) but never errors. Matches the codebase pattern, so OK — but the TODO // TODO - Add any validation here should either be filled in (validate non-nil deps + non-nil config) or removed.
  4. MetaFromGin returns zero RequestMetadata{} on nil gc/Request. Defensive but downstream code will silently produce nonsense (empty hostname, no audit IP/UA). Probably better to make it a programming error — caller should never pass nil.
  5. gcShim literal &gin.Context{Request: meta.Request} creates a gin.Context with no engine, Keys, or middleware. If any helper down-the-stack calls gc.GetString/gc.Get it will return zero values silently. Document this constraint on the shim's TODO line, or wrap in a helper newSyntheticGin(r *http.Request) that future readers can grep for.

Test coverage gaps (recap)

  • internal/service/signup_test.go — direct, no gin
  • internal/cookie/cookie_test.goBuildSessionCookies / BuildMfaSessionCookies (domain edge cases: localhost, single-label, IPv6, ports)
  • internal/service/sideeffects_test.goMetaFromGin with realistic gin.Context, ApplyToGin with mixed-SameSite cookies, nil-safety of both
  • A regression test that proves Set-Cookie headers on the SignUp HTTP response are byte-identical pre/post-refactor — easy to assert against httptest recorder, and it's the one thing the existing test suite doesn't cover.

Verdict

Mergeable after addressing (1) and a sentence-level decision on (3). (2) ideally lands in this PR but I'd accept a follow-up issue tracking it before Phase 2 merges — these tests get harder to write once gRPC is layered on top. The refactor itself is faithful and the seam is well-chosen; SignUp was the right pick to prove it.

@lakhansamani

Copy link
Copy Markdown
Contributor Author

Superseded by #620, which consolidates this stack into a single PR against main. All blocking review findings from this PR were addressed in #620; see its body for the per-finding traceback.

lakhansamani added a commit that referenced this pull request Jun 12, 2026
… MCP) (#620)

* feat(api): multi-protocol public API surface (GraphQL + gRPC + REST + MCP)

Adds gRPC + grpc-gateway REST + MCP surfaces for the public GraphQL ops
(no `_` prefix), driven from a single proto source of truth. GraphQL stays
unchanged; admin ops stay GraphQL-only.

Consolidates the previously-stacked PRs #614#615#616#617#618#619
into a single change against main.

PROTO (proto/)
  - buf v2 module rooted at buf.build/authorizerdev/authorizer
  - Single AuthorizerService with 19 RPCs whose names match GraphQL ops
    1:1: Signup, Login, Logout, MagicLinkLogin, VerifyEmail,
    ResendVerifyEmail, VerifyOtp, ResendOtp, ForgotPassword, ResetPassword,
    Profile, UpdateProfile, DeactivateAccount, Revoke, Session,
    ValidateJwtToken, ValidateSession, Meta, Permissions
  - common/v1: annotations (required_permissions, mcp_tool, audit_log,
    public), pagination, errors, shared AppData
  - Each RPC's response wrapped in a per-RPC message so buf STANDARD's
    RPC_REQUEST_RESPONSE_UNIQUE lint passes; shared inner types (AuthResponse,
    User, Meta) live in proto/authorizer/v1/types.proto
  - google.api.http annotations drive REST: GET /v1/{method} for trivially-
    empty queries (meta, profile, permissions, logout), POST /v1/{method}
    otherwise. Snake_case method paths mirror GraphQL identifiers.
  - buf STANDARD lint + format both enforced in CI; bufbuild/buf-action@v1
    runs lint always, breaking-check on PRs, format -d --exit-code always

TRANSPORT-AGNOSTIC SERVICE LAYER (internal/service/)
  - sideeffects.go: RequestMetadata + ResponseSideEffects + MetaFromGin /
    ApplyToGin / MetaFromGRPC / ApplyToGRPC bridges
  - provider.go: service.Provider interface
  - signup.go, meta.go: migrated from internal/graphql; resolvers become
    thin transport adapters
  - Supporting helpers: parsers.GetHostFromRequest/GetAppURLFromRequest,
    cookie.BuildSessionCookies/BuildMfaSessionCookies (existing gin
    wrappers now delegate to these so behaviour is byte-identical)

gRPC SERVER (internal/grpcsrv/)
  - server.go: AuthorizerService registered, gRPC reflection (gated on
    --enable-grpc-reflection), gRPC health checking, graceful shutdown
  - interceptors: recovery (panic → codes.Internal), logging (per-code
    level), validate (protovalidate)
  - handlers/authorizer.go: Meta delegates to service.Meta; the other 18
    methods inherit UnimplementedAuthorizerServiceServer and return
    codes.Unimplemented until their handler migrates from internal/graphql
  - transport/grpc_metadata.go: gRPC metadata ↔ RequestMetadata bridge
    (extracts cookies from grpcgateway-cookie, preserves multi-cookie
    Set-Cookie responses)

REST GATEWAY (internal/gateway/)
  - mount.go: serves grpc-gateway via in-process bufconn dial — no extra
    TCP hop, no TLS plumbing
  - JSONPb marshaler: UseProtoNames=true so REST payloads match GraphQL's
    snake_case shape
  - Mounted at /v1/* under the existing gin router (shares CORS, security
    headers, rate limit, logger middleware automatically)
  - /openapi.json serves the merged swagger spec (embedded via go:embed
    from gen/openapi/openapi.go so it works regardless of cwd)

MCP SERVER (internal/mcp/)
  - scanner.go: walks grpc.Server.GetServiceInfo() + protoregistry.GlobalFiles,
    reads the mcp_tool annotation on each method to build a tool registry
  - schema.go: derives JSON Schema from proto request descriptors, with
    cycle guard for self-recursive types (google.protobuf.Value)
  - server.go: registers tools dynamically on a github.com/modelcontextprotocol/
    go-sdk Server; tool handlers unmarshal JSON args into a dynamicpb.Message,
    invoke the gRPC method via an in-process bufconn, marshal the response
    back to JSON. gRPC errors surface as CallToolResult{IsError:true} so
    the LLM gets actionable text
  - Today's MCP-exposed tools (from proto annotations): meta, profile,
    session, permissions. Credential-bearing methods stay unexposed
  - `authorizer mcp` subcommand (cmd/mcp.go) serves over stdio for
    `claude mcp add authorizer -- /path/to/authorizer mcp ...`

CLI (cmd/root.go, cmd/mcp.go, internal/config/config.go)
  - --grpc-port (default 9091; collision-checked against --http-port and
    --metrics-port at startup), --enable-grpc-reflection (default true),
    --grpc-tls-cert / -key / -insecure (TLS plumbing placeholders; TLS
    implementation is a follow-up PR)
  - server.Run starts HTTP + metrics + gRPC + REST gateway listeners with
    shared graceful shutdown

TESTS
  - internal/parsers/url_test.go        GetHostFromRequest priority + spoof rejection
  - internal/cookie/cookie_test.go      BuildSessionCookies/BuildMfaSessionCookies shape
  - internal/service/sideeffects_test.go MetaFromGin/ApplyToGin nil-safety + roundtrip
  - internal/grpcsrv/interceptors/      recovery / logging / validate
  - internal/grpcsrv/transport/         gRPC metadata bridge (cookies, fallbacks)
  - internal/mcp/schema_test.go         flat scalars, nested message, cycle-safety regression
  - internal/integration_tests/grpc_meta_test.go      AuthorizerService.Meta
  - internal/integration_tests/grpc_surface_test.go   all 18 stubs return Unimplemented + gRPC health
  - internal/integration_tests/rest_meta_test.go      GET /v1/meta through gateway
  - internal/integration_tests/rest_openapi_test.go   /openapi.json serves embedded spec
  - internal/integration_tests/mcp_test.go            tools/list + tools/call meta
  - internal/integration_tests/mcp_stubs_test.go      stub returns CallToolResult{IsError:true}
  - Existing GraphQL integration suite still passes (65–70s, no behaviour drift)

What's NOT in this PR (deferred)
  - --grpc-tls-cert / -key / -insecure are wired into config but not yet
    enforced; TLS implementation lands in a follow-up alongside metrics-
    listener TLS
  - 18 of the 19 gRPC methods (and their REST mirrors + MCP tools) are
    Unimplemented stubs; each becomes real as its op migrates from
    internal/graphql into internal/service in follow-up PRs. The
    annotation-driven MCP scanner + gateway routing means follow-ups
    don't need to touch the gRPC/REST/MCP scaffolding — only add the
    service-layer method and the handler delegation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(api,mcp): migrate 7 stubs; security audit fixes; lock stdio-only MCP (#621)

Implements 7 of the 17 stubbed AuthorizerService methods (Profile,
Permissions, Logout, Revoke, ValidateJwtToken, ValidateSession, Session)
following the established service-layer pattern, and addresses the
security audit findings against the MCP surface.

SECURITY AUDIT FIXES

C1 — Session response carries access_token / refresh_token / id_token /
authenticator_secret / recovery_codes. The proto annotation on Session
flipped to mcp_tool.exposed = false so those credentials never land in
an LLM transcript. Session remains available via gRPC + REST + GraphQL
for legitimate browser/server-to-server consumers.

H1 — MCP→gRPC auth propagation. New `--mcp-bearer` flag on the
`authorizer mcp` subcommand; the MCP server stamps `Authorization:
Bearer <token>` on every outgoing gRPC call. Identity-bearing tools
(profile, permissions) now have a caller to attribute to; anonymous
runs still work for the public Meta tool but identity-bearing tools
surface a clean unauthorized error.

H2 — Recovery interceptor redacts panic values. The recovered value is
no longer dumped via `.Interface("panic", r)` (which would have logged
credentials if a handler ever panicked with the request struct); only
the panic type is logged for triage. Regression test included.

STDIO-ONLY MCP TRANSPORT

internal/mcp/server.go — explicit type-level documentation: stdio is
the ONLY supported transport. The Server has no RunHTTP / RunTCP /
RunSSE methods, intentionally.

internal/mcp/transport_test.go — `TestServer_StdioOnly` reflects over
*Server's exported methods and fails the build if anyone adds a method
whose name suggests a network transport (RunHTTP, ListenTCP, ServeWS,
etc.). To add a transport: implement an MCP-side auth interceptor
first, then update the allow-list.

cmd/mcp.go — docstring + CLI long help explicitly state "stdio only".

7 STUB MIGRATIONS

internal/service: profile.go, permissions.go, logout.go, revoke.go,
validate_jwt_token.go, validate_session.go, session.go,
permission_check.go (shared helper). All follow the SignUp pattern:
take RequestMetadata, return (result, *ResponseSideEffects, error).

internal/grpcsrv/handlers: authorizer.go grows 4 real method
implementations (Profile, Permissions, Logout, Revoke,
ValidateJwtToken, ValidateSession, Session). project.go adds
projectUser / projectAuthResponse / projectAppData / claimsToAppData /
protoToModelPermissions helpers shared across methods.

internal/graphql: resolvers for the seven ops become thin delegations
(same pattern as Signup + Meta).

internal/cookie: BuildDeleteSessionCookies added; DeleteSession now
delegates to it (transport-agnostic mirror of the existing pattern).

internal/service/provider.go: Dependencies grows AuthorizationProvider;
the four new methods land on the Provider interface. All call sites
(cmd/root, cmd/mcp, test_helper) wire it through.

TESTS

- TestRecovery_DoesNotLogCredentialBearingPanicValue (H2 regression)
- TestServer_StdioOnly (transport lock-down)
- TestMCPListAndCallMeta now expects 3 MCP tools (meta/profile/permissions);
  session was DROPPED per C1.
- TestMCPToolErrorSurfacesAsIsErrorResult exercises anonymous call to
  identity-bearing tool (formerly the "stubbed tool" test).
- TestAuthorizerServiceStubsReturnUnimplemented shrunk by 7 entries.
- Full SQLite integration suite (67s) still green — no regression on
  the existing GraphQL behaviour for any of the 7 migrated ops.

STILL STUBBED (10 ops, follow-up PRs)

Login, MagicLinkLogin, VerifyEmail, ResendVerifyEmail, VerifyOtp,
ResendOtp, ForgotPassword, ResetPassword, UpdateProfile,
DeactivateAccount. Each is a substantial state machine; better as
focused individual PRs than rushed in a batch.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(api): typed errors + REST status codes, logout POST, signup gRPC, fmt/lint

Addresses the multi-protocol API review findings.

REST/gRPC correctness (a): introduce transport-agnostic typed errors
(internal/service/errors.go, ErrorKind) and a gRPC ErrorMap interceptor
so business errors map to proper codes (InvalidArgument->400,
Unauthenticated->401, PermissionDenied->403, NotFound->404,
FailedPrecondition->400) instead of collapsing to Unknown/500. All
migrated service methods classify their client-facing errors; messages
are unchanged so GraphQL behaviour is byte-identical.

Logout GET->POST (b): logout mutates state and is audited, so it must
not be a safe GET (RFC 9110 9.2.1, CSRF). Proto annotation + regen.

REST error envelope (d): gateway WithErrorHandler emits a stable
snake_case envelope {"code","message"}; WithRoutingErrorHandler keeps
true HTTP statuses (e.g. 405 on method mismatch instead of 501).

Signup gRPC handler (4): wire service.SignUp into the gRPC/REST/MCP
surface (was a stub despite the service method existing).

Fix latent nil-Request panic: MetaFromGRPC now synthesizes an
*http.Request from the gRPC metadata so the gin-shim TokenProvider
helpers in Profile/Permissions/Logout/Session/ValidateSession don't
dereference nil over gRPC/REST.

Tooling: add make fmt (fmt-go/fmt-ts) and make lint (lint-go/lint-ts)
plus .golangci.yml (skips generated code).

Docs: document Stripe-aligned REST conventions (snake_case paths,
method-by-effect, /v1 prefix, error envelope) and correct the mapping
table to the as-implemented paths.

Tests: cross-protocol error-message consistency (GraphQL==gRPC==REST),
REST status-code/envelope coverage, logout-is-POST, MetaFromGRPC request
synthesis. project.go AppData converters de-duplicated.

* feat(api): expose check_permissions/list_permissions on gRPC, REST, and MCP

- typed ErrFgaNotEnabled as FailedPrecondition (gRPC FailedPrecondition,
  REST 400 failed_precondition) instead of an opaque internal error
- FGA integration setup wires the service layer with the embedded engine
- surface tests: 20-RPC assertion, fail-closed + validation coverage for
  both permission RPCs over gRPC and REST, MCP tool list and nested-schema
  coverage (check_permissions/list_permissions replace the permissions tool)
- docs/grpc-rest-api-spec.md updated to the new permission surface and
  required_relations gates

* refactor: review fixes — token-derived FGA subject, shared engine init

- session/validate_session pass the token-validated claims.Subject (not the
  re-fetched user record ID) to enforceRequiredRelations, matching main
- extract initAuthzEngine into cmd/fga_engine.go; root.go and the mcp
  subcommand now share one OpenFGA init path

* fix(cli): mcp subcommand inherits server flags

RootCmd registered its flags as local flags, which cobra does not
propagate to subcommands — the documented `authorizer mcp
--database-type=... --client-id=...` invocation failed with
'unknown flag'. Register them as persistent flags so the mcp
subcommand shares the full server flag surface and rootArgs storage.

Verified end-to-end over stdio: initialize handshake, tools/list
(meta, profile, check_permissions, list_permissions), nested input
schema, public meta call, and fail-closed IsError results for
anonymous identity-bearing calls.

* ci: skip buf breaking until main carries the proto module

buf breaking diffs against main#subdir=proto, but proto/ first lands in
this PR — the check can only fail before merge ('Module had no .proto
files'). Gate it on the base branch actually having protos, and disable
the action's PR comment which the job token lacks permission to post.

* fix(gateway,mcp): propagate authorizer host so issuer validation works off-HTTP

Two fixes found by live end-to-end smoke testing of the new surfaces:

- REST gateway: the in-process bufconn call carries ':authority=bufconn',
  so the service layer resolved the host as http://bufconn and JWT issuer
  validation rejected every token on /v1/*. A WithMetadata annotator now
  forwards the original request's host via parsers.GetHostFromRequest
  (same spoof-hardened resolution as the gin path) as x-authorizer-url,
  which transport.MetaFromGRPC already reads first.
- MCP: stampAuth now also stamps x-authorizer-url from the new
  --mcp-authorizer-url flag, so identity-bearing tools (profile,
  check_permissions, list_permissions) pass issuer validation when
  --mcp-bearer is set.

Regression tests: TestRESTGatewayForwardsAuthorizerHost (REST signup must
mint iss=<forwarded host>, then round-trip on /v1/profile) and
TestStampAuth (both metadata keys).

* test(e2e): release smoke suite for all public API surfaces

make smoke builds the real binary and runs one black-box scenario across
GraphQL, REST, gRPC, and MCP stdio: seed an OpenFGA model + tuple, sign a
user up, then assert the identical check_permissions / list_permissions
decision (allow + deny) on every surface, plus REST fail-closed/validation
envelopes and the MCP handshake + tool discovery with a real bearer token.

Gated behind the smoke build tag so regular test runs skip it; the release
workflow runs it as a required job before the Docker image is built.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lakhansamani added a commit that referenced this pull request Jun 12, 2026
… MCP) (#620)

* feat(api): multi-protocol public API surface (GraphQL + gRPC + REST + MCP)

Adds gRPC + grpc-gateway REST + MCP surfaces for the public GraphQL ops
(no `_` prefix), driven from a single proto source of truth. GraphQL stays
unchanged; admin ops stay GraphQL-only.

Consolidates the previously-stacked PRs #614#615#616#617#618#619
into a single change against main.

PROTO (proto/)
  - buf v2 module rooted at buf.build/authorizerdev/authorizer
  - Single AuthorizerService with 19 RPCs whose names match GraphQL ops
    1:1: Signup, Login, Logout, MagicLinkLogin, VerifyEmail,
    ResendVerifyEmail, VerifyOtp, ResendOtp, ForgotPassword, ResetPassword,
    Profile, UpdateProfile, DeactivateAccount, Revoke, Session,
    ValidateJwtToken, ValidateSession, Meta, Permissions
  - common/v1: annotations (required_permissions, mcp_tool, audit_log,
    public), pagination, errors, shared AppData
  - Each RPC's response wrapped in a per-RPC message so buf STANDARD's
    RPC_REQUEST_RESPONSE_UNIQUE lint passes; shared inner types (AuthResponse,
    User, Meta) live in proto/authorizer/v1/types.proto
  - google.api.http annotations drive REST: GET /v1/{method} for trivially-
    empty queries (meta, profile, permissions, logout), POST /v1/{method}
    otherwise. Snake_case method paths mirror GraphQL identifiers.
  - buf STANDARD lint + format both enforced in CI; bufbuild/buf-action@v1
    runs lint always, breaking-check on PRs, format -d --exit-code always

TRANSPORT-AGNOSTIC SERVICE LAYER (internal/service/)
  - sideeffects.go: RequestMetadata + ResponseSideEffects + MetaFromGin /
    ApplyToGin / MetaFromGRPC / ApplyToGRPC bridges
  - provider.go: service.Provider interface
  - signup.go, meta.go: migrated from internal/graphql; resolvers become
    thin transport adapters
  - Supporting helpers: parsers.GetHostFromRequest/GetAppURLFromRequest,
    cookie.BuildSessionCookies/BuildMfaSessionCookies (existing gin
    wrappers now delegate to these so behaviour is byte-identical)

gRPC SERVER (internal/grpcsrv/)
  - server.go: AuthorizerService registered, gRPC reflection (gated on
    --enable-grpc-reflection), gRPC health checking, graceful shutdown
  - interceptors: recovery (panic → codes.Internal), logging (per-code
    level), validate (protovalidate)
  - handlers/authorizer.go: Meta delegates to service.Meta; the other 18
    methods inherit UnimplementedAuthorizerServiceServer and return
    codes.Unimplemented until their handler migrates from internal/graphql
  - transport/grpc_metadata.go: gRPC metadata ↔ RequestMetadata bridge
    (extracts cookies from grpcgateway-cookie, preserves multi-cookie
    Set-Cookie responses)

REST GATEWAY (internal/gateway/)
  - mount.go: serves grpc-gateway via in-process bufconn dial — no extra
    TCP hop, no TLS plumbing
  - JSONPb marshaler: UseProtoNames=true so REST payloads match GraphQL's
    snake_case shape
  - Mounted at /v1/* under the existing gin router (shares CORS, security
    headers, rate limit, logger middleware automatically)
  - /openapi.json serves the merged swagger spec (embedded via go:embed
    from gen/openapi/openapi.go so it works regardless of cwd)

MCP SERVER (internal/mcp/)
  - scanner.go: walks grpc.Server.GetServiceInfo() + protoregistry.GlobalFiles,
    reads the mcp_tool annotation on each method to build a tool registry
  - schema.go: derives JSON Schema from proto request descriptors, with
    cycle guard for self-recursive types (google.protobuf.Value)
  - server.go: registers tools dynamically on a github.com/modelcontextprotocol/
    go-sdk Server; tool handlers unmarshal JSON args into a dynamicpb.Message,
    invoke the gRPC method via an in-process bufconn, marshal the response
    back to JSON. gRPC errors surface as CallToolResult{IsError:true} so
    the LLM gets actionable text
  - Today's MCP-exposed tools (from proto annotations): meta, profile,
    session, permissions. Credential-bearing methods stay unexposed
  - `authorizer mcp` subcommand (cmd/mcp.go) serves over stdio for
    `claude mcp add authorizer -- /path/to/authorizer mcp ...`

CLI (cmd/root.go, cmd/mcp.go, internal/config/config.go)
  - --grpc-port (default 9091; collision-checked against --http-port and
    --metrics-port at startup), --enable-grpc-reflection (default true),
    --grpc-tls-cert / -key / -insecure (TLS plumbing placeholders; TLS
    implementation is a follow-up PR)
  - server.Run starts HTTP + metrics + gRPC + REST gateway listeners with
    shared graceful shutdown

TESTS
  - internal/parsers/url_test.go        GetHostFromRequest priority + spoof rejection
  - internal/cookie/cookie_test.go      BuildSessionCookies/BuildMfaSessionCookies shape
  - internal/service/sideeffects_test.go MetaFromGin/ApplyToGin nil-safety + roundtrip
  - internal/grpcsrv/interceptors/      recovery / logging / validate
  - internal/grpcsrv/transport/         gRPC metadata bridge (cookies, fallbacks)
  - internal/mcp/schema_test.go         flat scalars, nested message, cycle-safety regression
  - internal/integration_tests/grpc_meta_test.go      AuthorizerService.Meta
  - internal/integration_tests/grpc_surface_test.go   all 18 stubs return Unimplemented + gRPC health
  - internal/integration_tests/rest_meta_test.go      GET /v1/meta through gateway
  - internal/integration_tests/rest_openapi_test.go   /openapi.json serves embedded spec
  - internal/integration_tests/mcp_test.go            tools/list + tools/call meta
  - internal/integration_tests/mcp_stubs_test.go      stub returns CallToolResult{IsError:true}
  - Existing GraphQL integration suite still passes (65–70s, no behaviour drift)

What's NOT in this PR (deferred)
  - --grpc-tls-cert / -key / -insecure are wired into config but not yet
    enforced; TLS implementation lands in a follow-up alongside metrics-
    listener TLS
  - 18 of the 19 gRPC methods (and their REST mirrors + MCP tools) are
    Unimplemented stubs; each becomes real as its op migrates from
    internal/graphql into internal/service in follow-up PRs. The
    annotation-driven MCP scanner + gateway routing means follow-ups
    don't need to touch the gRPC/REST/MCP scaffolding — only add the
    service-layer method and the handler delegation

* feat(api,mcp): migrate 7 stubs; security audit fixes; lock stdio-only MCP (#621)

Implements 7 of the 17 stubbed AuthorizerService methods (Profile,
Permissions, Logout, Revoke, ValidateJwtToken, ValidateSession, Session)
following the established service-layer pattern, and addresses the
security audit findings against the MCP surface.

SECURITY AUDIT FIXES

C1 — Session response carries access_token / refresh_token / id_token /
authenticator_secret / recovery_codes. The proto annotation on Session
flipped to mcp_tool.exposed = false so those credentials never land in
an LLM transcript. Session remains available via gRPC + REST + GraphQL
for legitimate browser/server-to-server consumers.

H1 — MCP→gRPC auth propagation. New `--mcp-bearer` flag on the
`authorizer mcp` subcommand; the MCP server stamps `Authorization:
Bearer <token>` on every outgoing gRPC call. Identity-bearing tools
(profile, permissions) now have a caller to attribute to; anonymous
runs still work for the public Meta tool but identity-bearing tools
surface a clean unauthorized error.

H2 — Recovery interceptor redacts panic values. The recovered value is
no longer dumped via `.Interface("panic", r)` (which would have logged
credentials if a handler ever panicked with the request struct); only
the panic type is logged for triage. Regression test included.

STDIO-ONLY MCP TRANSPORT

internal/mcp/server.go — explicit type-level documentation: stdio is
the ONLY supported transport. The Server has no RunHTTP / RunTCP /
RunSSE methods, intentionally.

internal/mcp/transport_test.go — `TestServer_StdioOnly` reflects over
*Server's exported methods and fails the build if anyone adds a method
whose name suggests a network transport (RunHTTP, ListenTCP, ServeWS,
etc.). To add a transport: implement an MCP-side auth interceptor
first, then update the allow-list.

cmd/mcp.go — docstring + CLI long help explicitly state "stdio only".

7 STUB MIGRATIONS

internal/service: profile.go, permissions.go, logout.go, revoke.go,
validate_jwt_token.go, validate_session.go, session.go,
permission_check.go (shared helper). All follow the SignUp pattern:
take RequestMetadata, return (result, *ResponseSideEffects, error).

internal/grpcsrv/handlers: authorizer.go grows 4 real method
implementations (Profile, Permissions, Logout, Revoke,
ValidateJwtToken, ValidateSession, Session). project.go adds
projectUser / projectAuthResponse / projectAppData / claimsToAppData /
protoToModelPermissions helpers shared across methods.

internal/graphql: resolvers for the seven ops become thin delegations
(same pattern as Signup + Meta).

internal/cookie: BuildDeleteSessionCookies added; DeleteSession now
delegates to it (transport-agnostic mirror of the existing pattern).

internal/service/provider.go: Dependencies grows AuthorizationProvider;
the four new methods land on the Provider interface. All call sites
(cmd/root, cmd/mcp, test_helper) wire it through.

TESTS

- TestRecovery_DoesNotLogCredentialBearingPanicValue (H2 regression)
- TestServer_StdioOnly (transport lock-down)
- TestMCPListAndCallMeta now expects 3 MCP tools (meta/profile/permissions);
  session was DROPPED per C1.
- TestMCPToolErrorSurfacesAsIsErrorResult exercises anonymous call to
  identity-bearing tool (formerly the "stubbed tool" test).
- TestAuthorizerServiceStubsReturnUnimplemented shrunk by 7 entries.
- Full SQLite integration suite (67s) still green — no regression on
  the existing GraphQL behaviour for any of the 7 migrated ops.

STILL STUBBED (10 ops, follow-up PRs)

Login, MagicLinkLogin, VerifyEmail, ResendVerifyEmail, VerifyOtp,
ResendOtp, ForgotPassword, ResetPassword, UpdateProfile,
DeactivateAccount. Each is a substantial state machine; better as
focused individual PRs than rushed in a batch.

* feat(api): typed errors + REST status codes, logout POST, signup gRPC, fmt/lint

Addresses the multi-protocol API review findings.

REST/gRPC correctness (a): introduce transport-agnostic typed errors
(internal/service/errors.go, ErrorKind) and a gRPC ErrorMap interceptor
so business errors map to proper codes (InvalidArgument->400,
Unauthenticated->401, PermissionDenied->403, NotFound->404,
FailedPrecondition->400) instead of collapsing to Unknown/500. All
migrated service methods classify their client-facing errors; messages
are unchanged so GraphQL behaviour is byte-identical.

Logout GET->POST (b): logout mutates state and is audited, so it must
not be a safe GET (RFC 9110 9.2.1, CSRF). Proto annotation + regen.

REST error envelope (d): gateway WithErrorHandler emits a stable
snake_case envelope {"code","message"}; WithRoutingErrorHandler keeps
true HTTP statuses (e.g. 405 on method mismatch instead of 501).

Signup gRPC handler (4): wire service.SignUp into the gRPC/REST/MCP
surface (was a stub despite the service method existing).

Fix latent nil-Request panic: MetaFromGRPC now synthesizes an
*http.Request from the gRPC metadata so the gin-shim TokenProvider
helpers in Profile/Permissions/Logout/Session/ValidateSession don't
dereference nil over gRPC/REST.

Tooling: add make fmt (fmt-go/fmt-ts) and make lint (lint-go/lint-ts)
plus .golangci.yml (skips generated code).

Docs: document Stripe-aligned REST conventions (snake_case paths,
method-by-effect, /v1 prefix, error envelope) and correct the mapping
table to the as-implemented paths.

Tests: cross-protocol error-message consistency (GraphQL==gRPC==REST),
REST status-code/envelope coverage, logout-is-POST, MetaFromGRPC request
synthesis. project.go AppData converters de-duplicated.

* feat(api): expose check_permissions/list_permissions on gRPC, REST, and MCP

- typed ErrFgaNotEnabled as FailedPrecondition (gRPC FailedPrecondition,
  REST 400 failed_precondition) instead of an opaque internal error
- FGA integration setup wires the service layer with the embedded engine
- surface tests: 20-RPC assertion, fail-closed + validation coverage for
  both permission RPCs over gRPC and REST, MCP tool list and nested-schema
  coverage (check_permissions/list_permissions replace the permissions tool)
- docs/grpc-rest-api-spec.md updated to the new permission surface and
  required_relations gates

* refactor: review fixes — token-derived FGA subject, shared engine init

- session/validate_session pass the token-validated claims.Subject (not the
  re-fetched user record ID) to enforceRequiredRelations, matching main
- extract initAuthzEngine into cmd/fga_engine.go; root.go and the mcp
  subcommand now share one OpenFGA init path

* fix(cli): mcp subcommand inherits server flags

RootCmd registered its flags as local flags, which cobra does not
propagate to subcommands — the documented `authorizer mcp
--database-type=... --client-id=...` invocation failed with
'unknown flag'. Register them as persistent flags so the mcp
subcommand shares the full server flag surface and rootArgs storage.

Verified end-to-end over stdio: initialize handshake, tools/list
(meta, profile, check_permissions, list_permissions), nested input
schema, public meta call, and fail-closed IsError results for
anonymous identity-bearing calls.

* ci: skip buf breaking until main carries the proto module

buf breaking diffs against main#subdir=proto, but proto/ first lands in
this PR — the check can only fail before merge ('Module had no .proto
files'). Gate it on the base branch actually having protos, and disable
the action's PR comment which the job token lacks permission to post.

* fix(gateway,mcp): propagate authorizer host so issuer validation works off-HTTP

Two fixes found by live end-to-end smoke testing of the new surfaces:

- REST gateway: the in-process bufconn call carries ':authority=bufconn',
  so the service layer resolved the host as http://bufconn and JWT issuer
  validation rejected every token on /v1/*. A WithMetadata annotator now
  forwards the original request's host via parsers.GetHostFromRequest
  (same spoof-hardened resolution as the gin path) as x-authorizer-url,
  which transport.MetaFromGRPC already reads first.
- MCP: stampAuth now also stamps x-authorizer-url from the new
  --mcp-authorizer-url flag, so identity-bearing tools (profile,
  check_permissions, list_permissions) pass issuer validation when
  --mcp-bearer is set.

Regression tests: TestRESTGatewayForwardsAuthorizerHost (REST signup must
mint iss=<forwarded host>, then round-trip on /v1/profile) and
TestStampAuth (both metadata keys).

* test(e2e): release smoke suite for all public API surfaces

make smoke builds the real binary and runs one black-box scenario across
GraphQL, REST, gRPC, and MCP stdio: seed an OpenFGA model + tuple, sign a
user up, then assert the identical check_permissions / list_permissions
decision (allow + deny) on every surface, plus REST fail-closed/validation
envelopes and the MCP handshake + tool discovery with a real bearer token.

Gated behind the smoke build tag so regular test runs skip it; the release
workflow runs it as a required job before the Docker image is built.

---------
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant