test+fix: address principal-engineer review findings (Phases 0-4) #618

Closed
lakhansamani wants to merge 1 commit into feat/mcp-server from feat/test-coverage

Conversation

@lakhansamani

Stacked on top of #617 (which is stacked on #616, #615, and #614). Review and merge after Phase 4.

Summary

Adds the missing integration/e2e test coverage flagged across reviews of #614 / #615 / #616 / #617, and fixes the blocking bugs those reviews surfaced. Test additions are spread across every layer the PR stack introduces; fixes are deliberately scoped to issues that would have hit production on the first deploy.

New tests (9 files, ~50 sub-cases)

| File | Coverage |
| --- | --- |
| `internal/parsers/url_test.go` | `GetHostFromRequest` priority + spoof rejection; `GetAppURLFromRequest` |
| `internal/cookie/cookie_test.go` | `BuildSessionCookies`/`BuildMfaSessionCookies` shape, MaxAge, Secure, HttpOnly, SameSite; `ParseSameSite` |
| `internal/service/sideeffects_test.go` | `MetaFromGin`/`ApplyToGin` nil-safety + cookie round-trip |
| `internal/grpcsrv/interceptors/interceptors_test.go` | Recovery (panic→Internal), Logging (per-code level), Validate (protovalidate) |
| `internal/grpcsrv/transport/grpc_metadata_test.go` | `MetaFromGRPC` host/IP/UA/auth/cookies + `:authority` fallback |
| `internal/mcp/schema_test.go` | Flat scalars, nested message, cycle-safety regression test for AppData→Struct→Value |
| `internal/integration_tests/grpc_surface_test.go` | All 18 stub RPCs verified to return `codes.Unimplemented` + gRPC health endpoint |
| `internal/integration_tests/rest_openapi_test.go` | `/openapi.json` serves valid swagger 2.0 |
| `internal/integration_tests/mcp_stubs_test.go` | MCP stubbed-tool call returns `CallToolResult{IsError:true}` |

Blocking fixes from review

| # | Where | Fix |
| --- | --- | --- |
| #616-1 | `cmd/root.go` | `--grpc-port` default 8081 collided with `--metrics-port`. Moved gRPC to 9091. Added triple-port collision check (HTTP/metrics/gRPC); fails fast at startup. |
| #616-2 | `internal/grpcsrv/transport/grpc_metadata.go` | `MetaFromGRPC` now extracts cookies from `grpcgateway-cookie` / `cookie` metadata. Without this fix, every authenticated flow would break the moment SessionService stubs became real handlers. |
| #616-3 | same file | `ApplyToGRPC` now emits all Set-Cookie metadata entries (was dropping all but first). |
| #616-4 | `internal/server/http_routes.go` + new `gen/openapi/openapi.go` | `/openapi.json` was reading via relative path (broke in Docker/tests). Now embedded via `//go:embed`. |
| #617-1 | `internal/mcp/schema.go` | `schemaForMessage` tracks visited descriptors. Without this, exposing any tool whose request contained `google.protobuf.Value` (e.g. AppData→Struct→Value→repeated Value) would stack-overflow at boot. |
| #617-2 | `internal/mcp/server.go` | Tool execution errors now surface as `CallToolResult{IsError:true}` (MCP-spec way) instead of opaque JSON-RPC failures. |
| #617-3 | `internal/mcp/scanner.go` | Removed unused `descriptorOptionsCarrier` dead-code struct + always-failing type assertion. |

Deliberately deferred (tracked but not in this PR)

  • `--grpc-tls-cert/-key/-insecure` placeholders — flagged by reviewer; deferred to a focused TLS PR alongside the metrics-listener TLS work.
  • Secrets in REST path (`DELETE /v1/refresh-tokens/{token}` etc.) — flagged by reviewer; moving to body-borne tokens is a proto-design change worth its own PR + discussion.
  • MCP auth metadata propagation — flagged by reviewer; today's MCP-exposed tools are `GetMeta` (no auth) + 3 stubs. Real wiring lands when the stubs become real handlers.

Test plan

  • Full `go test ./...` green (17 packages, 69s integration suite)
  • `go build ./...` clean
  • CI green on this stacked PR

🤖 Generated with Claude Code

Adds the missing integration/e2e test coverage flagged across reviews of
#614 / #615 / #616 / #617, and fixes the blocking bugs those reviews
surfaced. Test additions are spread across every layer the PR stack
introduces; fixes are deliberately scoped to issues that would have hit
production on the first deploy.

NEW TESTS

internal/parsers/url_test.go
  - TestGetHostFromRequest: header priority + spoof rejection
  - TestGetAppURLFromRequest

internal/cookie/cookie_test.go
  - TestBuildSessionCookies: 3 hostname scenarios, validates domain-scoped
    cookie picks up the apex, MaxAge/Secure/HttpOnly/SameSite all preserved
  - TestBuildMfaSessionCookies + insecure-Lax-SameSite variant
  - TestParseSameSite

internal/service/sideeffects_test.go
  - MetaFromGin / ApplyToGin nil-safety + cookie round-trip
  - ResponseSideEffects.AddCookie nil tolerance

internal/grpcsrv/interceptors/interceptors_test.go
  - Recovery turns panic into codes.Internal; stack stays server-side
  - Recovery passes normal errors through unchanged
  - Logging emits info/warn/error at the right thresholds per gRPC code
  - Validate rejects bad requests (protovalidate enforces email format)
  - Validate allows valid requests and tolerates non-proto request types

internal/grpcsrv/transport/grpc_metadata_test.go
  - MetaFromGRPC extracts host/IP/UA/auth/cookies; :authority fallback
  - cookiesFromMetadata parses multi-header cookies
  - ApplyToGRPC nil-safety

internal/mcp/schema_test.go
  - schemaForMessage for flat scalars + nested message (AppData)
  - **TestSchemaForMessage_CycleSafe** would have caught the original
    stack-overflow bug
  - Documents current oneof flattening as known limitation

internal/integration_tests/grpc_surface_test.go
  - TestGRPCStubsReturnUnimplemented: every one of the 18 stubbed RPCs
    verified to return codes.Unimplemented (locks down the contract so
    future migrations can't accidentally regress to OK or panic)
  - TestGRPCHealthCheckProtocol: grpc.health.v1.Health responds SERVING

internal/integration_tests/rest_openapi_test.go
  - /openapi.json serves valid swagger 2.0 with non-empty paths object

internal/integration_tests/mcp_stubs_test.go
  - MCP call to a stubbed tool returns CallToolResult{IsError:true} with
    "Unimplemented" in the text (vs the previous JSON-RPC-level error)

BLOCKING FIXES FROM REVIEW

cmd/root.go
  - --grpc-port default 8081 collided with --metrics-port. Moved gRPC to
    9091. Added a triple-port collision check (HTTP/metrics/gRPC) at
    startup; fail fast with a specific error message.
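
A minimal sketch of what such a startup check could look like. The function name, signature, and error wording are hypothetical, and the HTTP port value 8080 is an assumption used only for illustration; only the flag names and the 8081/9091 values come from the text above.

```go
package main

import "fmt"

// checkPortCollisions returns an error naming the first pair of listeners
// that were configured with the same port. Name and signature are
// hypothetical; they are not taken from cmd/root.go.
func checkPortCollisions(httpPort, metricsPort, grpcPort int) error {
	type listener struct {
		flag string
		port int
	}
	listeners := []listener{
		{"--http-port", httpPort},
		{"--metrics-port", metricsPort},
		{"--grpc-port", grpcPort},
	}
	seen := make(map[int]string, len(listeners))
	for _, l := range listeners {
		if other, ok := seen[l.port]; ok {
			return fmt.Errorf("%s and %s both use port %d", other, l.flag, l.port)
		}
		seen[l.port] = l.flag
	}
	return nil
}

func main() {
	// Old defaults: gRPC and metrics both on 8081 (HTTP port assumed).
	fmt.Println(checkPortCollisions(8080, 8081, 8081))
	// New gRPC default of 9091 no longer collides.
	fmt.Println(checkPortCollisions(8080, 8081, 9091))
}
```

Running the check before any listener is opened keeps the failure at startup, with a message that names both flags, rather than surfacing later as a bind error.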

internal/grpcsrv/transport/grpc_metadata.go
  - MetaFromGRPC now extracts cookies from grpcgateway-cookie / cookie
    metadata. The old version silently dropped them, which would have
    broken every authenticated flow as soon as the SessionService stubs
    were replaced with real handlers.
  - ApplyToGRPC now emits ALL Set-Cookie metadata entries via md.Append.
    The old version sent only the first cookie, breaking the host-scoped
    + domain-scoped session-cookie pair.
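
The two cookie fixes can be sketched with the standard library alone. Here a plain `map[string][]string` stands in for gRPC's `metadata.MD` (which has that underlying type); the function bodies, the lowercase `set-cookie` key, and the cookie names are assumptions for illustration, not the project's actual code.

```go
package main

import (
	"fmt"
	"net/http"
)

// cookiesFromMetadata collects cookies from every value stored under the
// "grpcgateway-cookie" and "cookie" keys, so multi-header input is kept.
func cookiesFromMetadata(md map[string][]string) []*http.Cookie {
	var out []*http.Cookie
	for _, key := range []string{"grpcgateway-cookie", "cookie"} {
		for _, v := range md[key] {
			// Reuse net/http's Cookie header parser via a throwaway request.
			req := http.Request{Header: http.Header{"Cookie": {v}}}
			out = append(out, req.Cookies()...)
		}
	}
	return out
}

// setCookieMetadata appends one entry per cookie instead of overwriting,
// so a host-scoped and a domain-scoped cookie both survive.
func setCookieMetadata(md map[string][]string, cookies []*http.Cookie) {
	for _, c := range cookies {
		md["set-cookie"] = append(md["set-cookie"], c.String())
	}
}

func main() {
	in := map[string][]string{
		"grpcgateway-cookie": {"session=abc; other=1"},
		"cookie":             {"mfa=xyz"},
	}
	for _, c := range cookiesFromMetadata(in) {
		fmt.Println(c.Name, c.Value)
	}

	out := map[string][]string{}
	setCookieMetadata(out, []*http.Cookie{
		{Name: "session", Value: "abc"},
		{Name: "session_domain", Value: "abc"},
	})
	fmt.Println(len(out["set-cookie"]))
}
```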

internal/mcp/schema.go
  - schemaForMessage tracks visited descriptors. Without this, exposing
    any tool whose request contained google.protobuf.Value (e.g. AppData
    → Struct → Value → repeated Value) would stack-overflow at server
    boot. Cycle short-circuits to opaque `object`.
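
To illustrate the cycle guard, here is a toy version that uses a hand-rolled `message` type instead of real protobuf descriptors. Everything in it is hypothetical: the type, the function name, and the choice to track only the current recursion path (the text above says only that visited descriptors are tracked).

```go
package main

import "fmt"

// message is a toy stand-in for a protobuf message descriptor.
// A nil entry in fields marks a scalar field.
type message struct {
	name   string
	fields map[string]*message
}

// schemaFor builds a JSON-Schema-like map. A message that is already on the
// current recursion path is emitted as an opaque object, which ends the cycle.
func schemaFor(m *message, visited map[string]bool) map[string]any {
	if visited[m.name] {
		return map[string]any{"type": "object"}
	}
	visited[m.name] = true
	defer delete(visited, m.name) // the same type may still appear on sibling branches

	props := map[string]any{}
	for fname, sub := range m.fields {
		if sub == nil {
			props[fname] = map[string]any{"type": "string"}
			continue
		}
		props[fname] = schemaFor(sub, visited)
	}
	return map[string]any{"type": "object", "properties": props}
}

func main() {
	// Value -> ListValue -> Value mirrors the recursion in google.protobuf.Value.
	value := &message{name: "Value"}
	list := &message{name: "ListValue", fields: map[string]*message{"values": value}}
	value.fields = map[string]*message{"string_value": nil, "list_value": list}

	fmt.Println(schemaFor(value, map[string]bool{}))
}
```

Without the `visited` check the call in `main` would recurse until the stack is exhausted, which is the failure mode described above.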

internal/mcp/server.go
  - Tool execution errors (gRPC Unimplemented / PermissionDenied / etc.)
    now surface as CallToolResult{IsError:true} with the gRPC status
    message — the MCP-spec way to give the LLM actionable text instead
    of a low-level JSON-RPC failure.
  - Replaced the fragile JSON-null string comparison with a
    strings.TrimSpace-based isJSONNull helper.
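
A rough sketch of both changes, using a toy `toolResult` struct in place of the MCP SDK's `CallToolResult`. The struct, the `callTool` wrapper, the `isJSONNull` signature, and the error text are all assumptions; the real helper may treat other inputs (such as an empty payload) differently.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// toolResult is a toy stand-in for the SDK's CallToolResult.
type toolResult struct {
	IsError bool
	Text    string
}

// errStub imitates a gRPC Unimplemented status message (hypothetical wording).
var errStub = errors.New("Unimplemented: method not implemented")

// isJSONNull reports whether raw is the JSON literal null, ignoring
// surrounding whitespace.
func isJSONNull(raw string) bool {
	return strings.TrimSpace(raw) == "null"
}

// callTool turns an execution error into a result the model can read,
// instead of failing the whole JSON-RPC call.
func callTool(invoke func() (string, error)) toolResult {
	out, err := invoke()
	if err != nil {
		return toolResult{IsError: true, Text: err.Error()}
	}
	return toolResult{Text: out}
}

func main() {
	fmt.Printf("%+v\n", callTool(func() (string, error) { return "", errStub }))
	fmt.Println(isJSONNull(" null\n"), isJSONNull(`"null"`))
}
```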

internal/mcp/scanner.go
  - Removed unused descriptorOptionsCarrier dead-code struct + always-
    failing type assertion in mcpToolFromMethod. Function is now what
    it was always trying to be: 4 lines.

internal/server/http_routes.go + gen/openapi/openapi.go
  - /openapi.json no longer reads via relative path (which broke in
    Docker / tests where cwd ≠ repo root). The spec is now embedded
    via go:embed.

NOT FIXED IN THIS PR (deferred)
  - --grpc-tls-cert/-key/-insecure flags are still placeholders. Tracked
    for a TLS-implementation PR alongside the metrics-listener TLS work.
  - Secrets in REST path (DELETE /v1/refresh-tokens/{token} etc.) flagged
    by reviewer; the migration to body-borne tokens is a proto-design
    change worth its own PR + design discussion.
  - MCP auth metadata propagation — punted; today's MCP-exposed tools are
    GetMeta (no auth needed) + 3 stubs. Real wiring lands when those
    stubs become real handlers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lakhansamani

Superseded by #620, which consolidates this stack into a single PR against main. All blocking review findings from this PR were addressed in #620; see its body for the per-finding traceback.

lakhansamani added a commit that referenced this pull request Jun 12, 2026
… MCP) (#620)

* feat(api): multi-protocol public API surface (GraphQL + gRPC + REST + MCP)

Adds gRPC + grpc-gateway REST + MCP surfaces for the public GraphQL ops
(no `_` prefix), driven from a single proto source of truth. GraphQL stays
unchanged; admin ops stay GraphQL-only.

Consolidates the previously-stacked PRs #614, #615, #616, #617, #618,
and #619 into a single change against main.

PROTO (proto/)
  - buf v2 module rooted at buf.build/authorizerdev/authorizer
  - Single AuthorizerService with 19 RPCs whose names match GraphQL ops
    1:1: Signup, Login, Logout, MagicLinkLogin, VerifyEmail,
    ResendVerifyEmail, VerifyOtp, ResendOtp, ForgotPassword, ResetPassword,
    Profile, UpdateProfile, DeactivateAccount, Revoke, Session,
    ValidateJwtToken, ValidateSession, Meta, Permissions
  - common/v1: annotations (required_permissions, mcp_tool, audit_log,
    public), pagination, errors, shared AppData
  - Each RPC's response wrapped in a per-RPC message so buf STANDARD's
    RPC_REQUEST_RESPONSE_UNIQUE lint passes; shared inner types (AuthResponse,
    User, Meta) live in proto/authorizer/v1/types.proto
  - google.api.http annotations drive REST: GET /v1/{method} for trivially-
    empty queries (meta, profile, permissions, logout), POST /v1/{method}
    otherwise. Snake_case method paths mirror GraphQL identifiers.
  - buf STANDARD lint + format both enforced in CI; bufbuild/buf-action@v1
    runs lint always, breaking-check on PRs, format -d --exit-code always

TRANSPORT-AGNOSTIC SERVICE LAYER (internal/service/)
  - sideeffects.go: RequestMetadata + ResponseSideEffects + MetaFromGin /
    ApplyToGin / MetaFromGRPC / ApplyToGRPC bridges
  - provider.go: service.Provider interface
  - signup.go, meta.go: migrated from internal/graphql; resolvers become
    thin transport adapters
  - Supporting helpers: parsers.GetHostFromRequest/GetAppURLFromRequest,
    cookie.BuildSessionCookies/BuildMfaSessionCookies (existing gin
    wrappers now delegate to these so behaviour is byte-identical)

gRPC SERVER (internal/grpcsrv/)
  - server.go: AuthorizerService registered, gRPC reflection (gated on
    --enable-grpc-reflection), gRPC health checking, graceful shutdown
  - interceptors: recovery (panic → codes.Internal), logging (per-code
    level), validate (protovalidate)
  - handlers/authorizer.go: Meta delegates to service.Meta; the other 18
    methods inherit UnimplementedAuthorizerServiceServer and return
    codes.Unimplemented until their handler migrates from internal/graphql
  - transport/grpc_metadata.go: gRPC metadata ↔ RequestMetadata bridge
    (extracts cookies from grpcgateway-cookie, preserves multi-cookie
    Set-Cookie responses)

REST GATEWAY (internal/gateway/)
  - mount.go: serves grpc-gateway via in-process bufconn dial — no extra
    TCP hop, no TLS plumbing
  - JSONPb marshaler: UseProtoNames=true so REST payloads match GraphQL's
    snake_case shape
  - Mounted at /v1/* under the existing gin router (shares CORS, security
    headers, rate limit, logger middleware automatically)
  - /openapi.json serves the merged swagger spec (embedded via go:embed
    from gen/openapi/openapi.go so it works regardless of cwd)

MCP SERVER (internal/mcp/)
  - scanner.go: walks grpc.Server.GetServiceInfo() + protoregistry.GlobalFiles,
    reads the mcp_tool annotation on each method to build a tool registry
  - schema.go: derives JSON Schema from proto request descriptors, with
    cycle guard for self-recursive types (google.protobuf.Value)
  - server.go: registers tools dynamically on a github.com/modelcontextprotocol/
    go-sdk Server; tool handlers unmarshal JSON args into a dynamicpb.Message,
    invoke the gRPC method via an in-process bufconn, marshal the response
    back to JSON. gRPC errors surface as CallToolResult{IsError:true} so
    the LLM gets actionable text
  - Today's MCP-exposed tools (from proto annotations): meta, profile,
    session, permissions. Credential-bearing methods stay unexposed
  - `authorizer mcp` subcommand (cmd/mcp.go) serves over stdio for
    `claude mcp add authorizer -- /path/to/authorizer mcp ...`

CLI (cmd/root.go, cmd/mcp.go, internal/config/config.go)
  - --grpc-port (default 9091; collision-checked against --http-port and
    --metrics-port at startup), --enable-grpc-reflection (default true),
    --grpc-tls-cert / -key / -insecure (TLS plumbing placeholders; TLS
    implementation is a follow-up PR)
  - server.Run starts HTTP + metrics + gRPC + REST gateway listeners with
    shared graceful shutdown

TESTS
  - internal/parsers/url_test.go        GetHostFromRequest priority + spoof rejection
  - internal/cookie/cookie_test.go      BuildSessionCookies/BuildMfaSessionCookies shape
  - internal/service/sideeffects_test.go MetaFromGin/ApplyToGin nil-safety + roundtrip
  - internal/grpcsrv/interceptors/      recovery / logging / validate
  - internal/grpcsrv/transport/         gRPC metadata bridge (cookies, fallbacks)
  - internal/mcp/schema_test.go         flat scalars, nested message, cycle-safety regression
  - internal/integration_tests/grpc_meta_test.go      AuthorizerService.Meta
  - internal/integration_tests/grpc_surface_test.go   all 18 stubs return Unimplemented + gRPC health
  - internal/integration_tests/rest_meta_test.go      GET /v1/meta through gateway
  - internal/integration_tests/rest_openapi_test.go   /openapi.json serves embedded spec
  - internal/integration_tests/mcp_test.go            tools/list + tools/call meta
  - internal/integration_tests/mcp_stubs_test.go      stub returns CallToolResult{IsError:true}
  - Existing GraphQL integration suite still passes (65–70s, no behaviour drift)

What's NOT in this PR (deferred)
  - --grpc-tls-cert / -key / -insecure are wired into config but not yet
    enforced; TLS implementation lands in a follow-up alongside metrics-
    listener TLS
  - 18 of the 19 gRPC methods (and their REST mirrors + MCP tools) are
    Unimplemented stubs; each becomes real as its op migrates from
    internal/graphql into internal/service in follow-up PRs. The
    annotation-driven MCP scanner + gateway routing means follow-ups
    don't need to touch the gRPC/REST/MCP scaffolding — only add the
    service-layer method and the handler delegation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(api,mcp): migrate 7 stubs; security audit fixes; lock stdio-only MCP (#621)

Implements 7 of the 17 stubbed AuthorizerService methods (Profile,
Permissions, Logout, Revoke, ValidateJwtToken, ValidateSession, Session)
following the established service-layer pattern, and addresses the
security audit findings against the MCP surface.

SECURITY AUDIT FIXES

C1 — Session response carries access_token / refresh_token / id_token /
authenticator_secret / recovery_codes. The proto annotation on Session
flipped to mcp_tool.exposed = false so those credentials never land in
an LLM transcript. Session remains available via gRPC + REST + GraphQL
for legitimate browser/server-to-server consumers.

H1 — MCP→gRPC auth propagation. New `--mcp-bearer` flag on the
`authorizer mcp` subcommand; the MCP server stamps `Authorization:
Bearer <token>` on every outgoing gRPC call. Identity-bearing tools
(profile, permissions) now have a caller to attribute to; anonymous
runs still work for the public Meta tool but identity-bearing tools
surface a clean unauthorized error.

H2 — Recovery interceptor redacts panic values. The recovered value is
no longer dumped via `.Interface("panic", r)` (which would have logged
credentials if a handler ever panicked with the request struct); only
the panic type is logged for triage. Regression test included.
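
The redaction idea in H2 can be shown without any gRPC dependency. In this sketch the `safely` wrapper, the `loginRequest` type, and the log-line format are hypothetical; the only point carried over from the text is that the recovered value's type is recorded and its contents are not.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// loginRequest is a hypothetical credential-bearing request struct.
type loginRequest struct{ Email, Password string }

// safely runs handler and converts a panic into a generic error. The log
// line records only the Go type of the recovered value, never its fields.
func safely(handler func()) (logLine string, err error) {
	defer func() {
		if r := recover(); r != nil {
			logLine = fmt.Sprintf("panic recovered: type=%T", r)
			err = errors.New("internal error")
		}
	}()
	handler()
	return "", nil
}

// leaks reports whether a log line contains the given secret.
func leaks(logLine, secret string) bool {
	return strings.Contains(logLine, secret)
}

func main() {
	logLine, err := safely(func() {
		panic(loginRequest{Email: "a@example.com", Password: "hunter2"})
	})
	fmt.Println(logLine, err, leaks(logLine, "hunter2"))
}
```

Formatting with `%T` rather than `%v` is what keeps the struct's fields out of the log while still giving enough signal for triage.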

STDIO-ONLY MCP TRANSPORT

internal/mcp/server.go — explicit type-level documentation: stdio is
the ONLY supported transport. The Server has no RunHTTP / RunTCP /
RunSSE methods, intentionally.

internal/mcp/transport_test.go — `TestServer_StdioOnly` reflects over
*Server's exported methods and fails the build if anyone adds a method
whose name suggests a network transport (RunHTTP, ListenTCP, ServeWS,
etc.). To add a transport: implement an MCP-side auth interceptor
first, then update the allow-list.
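
A reflection-based guard of this kind might look roughly like the following. The types, the `RunStdio` method name, and the hint list are illustrative assumptions; the text above describes an allow-list, whereas this sketch flags names by substring for brevity.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// transportHints are name fragments that suggest a network transport.
var transportHints = []string{"HTTP", "TCP", "SSE", "WS"}

// server exposes only a stdio entry point (hypothetical method name).
type server struct{}

func (*server) RunStdio() error { return nil }

// leakyServer shows what the guard is meant to catch.
type leakyServer struct{}

func (*leakyServer) RunStdio() error { return nil }
func (*leakyServer) RunHTTP() error  { return nil }

// networkMethods returns the exported methods of v whose names contain any
// hint. For non-interface types, reflect lists exported methods only.
func networkMethods(v any, hints []string) []string {
	var found []string
	t := reflect.TypeOf(v)
	for i := 0; i < t.NumMethod(); i++ {
		name := t.Method(i).Name
		for _, h := range hints {
			if strings.Contains(name, h) {
				found = append(found, name)
				break
			}
		}
	}
	return found
}

func main() {
	fmt.Println(networkMethods(&server{}, transportHints))
	fmt.Println(networkMethods(&leakyServer{}, transportHints))
}
```

In a real test the second result would trigger `t.Fatalf`, so adding a transport method forces a deliberate change to the guard.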

cmd/mcp.go — docstring + CLI long help explicitly state "stdio only".

7 STUB MIGRATIONS

internal/service: profile.go, permissions.go, logout.go, revoke.go,
validate_jwt_token.go, validate_session.go, session.go,
permission_check.go (shared helper). All follow the SignUp pattern:
take RequestMetadata, return (result, *ResponseSideEffects, error).

internal/grpcsrv/handlers: authorizer.go grows 7 real method
implementations (Profile, Permissions, Logout, Revoke,
ValidateJwtToken, ValidateSession, Session). project.go adds
projectUser / projectAuthResponse / projectAppData / claimsToAppData /
protoToModelPermissions helpers shared across methods.

internal/graphql: resolvers for the seven ops become thin delegations
(same pattern as Signup + Meta).

internal/cookie: BuildDeleteSessionCookies added; DeleteSession now
delegates to it (transport-agnostic mirror of the existing pattern).

internal/service/provider.go: Dependencies grows AuthorizationProvider;
the seven new methods land on the Provider interface. All call sites
(cmd/root, cmd/mcp, test_helper) wire it through.

TESTS

- TestRecovery_DoesNotLogCredentialBearingPanicValue (H2 regression)
- TestServer_StdioOnly (transport lock-down)
- TestMCPListAndCallMeta now expects 3 MCP tools (meta/profile/permissions);
  session was DROPPED per C1.
- TestMCPToolErrorSurfacesAsIsErrorResult exercises anonymous call to
  identity-bearing tool (formerly the "stubbed tool" test).
- TestAuthorizerServiceStubsReturnUnimplemented shrunk by 7 entries.
- Full SQLite integration suite (67s) still green — no regression on
  the existing GraphQL behaviour for any of the 7 migrated ops.

STILL STUBBED (10 ops, follow-up PRs)

Login, MagicLinkLogin, VerifyEmail, ResendVerifyEmail, VerifyOtp,
ResendOtp, ForgotPassword, ResetPassword, UpdateProfile,
DeactivateAccount. Each is a substantial state machine; better as
focused individual PRs than rushed in a batch.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(api): typed errors + REST status codes, logout POST, signup gRPC, fmt/lint

Addresses the multi-protocol API review findings.

REST/gRPC correctness (a): introduce transport-agnostic typed errors
(internal/service/errors.go, ErrorKind) and a gRPC ErrorMap interceptor
so business errors map to proper codes (InvalidArgument->400,
Unauthenticated->401, PermissionDenied->403, NotFound->404,
FailedPrecondition->400) instead of collapsing to Unknown/500. All
migrated service methods classify their client-facing errors; messages
are unchanged so GraphQL behaviour is byte-identical.
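
As a sketch of the typed-error idea, the following collapses the two hops (service error to gRPC code, gRPC code to HTTP status) into one function. The `Error` struct, the `Kind*` constants, and `HTTPStatus` are hypothetical names; only the kind-to-status pairs are taken from the list above.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrorKind classifies a business error independently of any transport.
type ErrorKind int

const (
	KindInternal ErrorKind = iota
	KindInvalidArgument
	KindUnauthenticated
	KindPermissionDenied
	KindNotFound
	KindFailedPrecondition
)

// Error pairs a kind with the unchanged client-facing message.
type Error struct {
	Kind ErrorKind
	Msg  string
}

func (e *Error) Error() string { return e.Msg }

// errPlain is an unclassified error used to show the fallback.
var errPlain = errors.New("boom")

// HTTPStatus maps a (possibly wrapped) typed error to an HTTP status.
// Unclassified errors fall back to 500.
func HTTPStatus(err error) int {
	var e *Error
	if !errors.As(err, &e) {
		return http.StatusInternalServerError
	}
	switch e.Kind {
	case KindInvalidArgument, KindFailedPrecondition:
		return http.StatusBadRequest
	case KindUnauthenticated:
		return http.StatusUnauthorized
	case KindPermissionDenied:
		return http.StatusForbidden
	case KindNotFound:
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	wrapped := fmt.Errorf("profile: %w", &Error{Kind: KindUnauthenticated, Msg: "unauthorized"})
	fmt.Println(HTTPStatus(wrapped))  // 401
	fmt.Println(HTTPStatus(errPlain)) // 500
}
```

Because the message lives on the error and the kind is separate, a GraphQL resolver can keep returning `Msg` verbatim while the other transports read `Kind`.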

Logout GET->POST (b): logout mutates state and is audited, so it must
not be a safe GET (RFC 9110 9.2.1, CSRF). Proto annotation + regen.

REST error envelope (d): gateway WithErrorHandler emits a stable
snake_case envelope {"code","message"}; WithRoutingErrorHandler keeps
true HTTP statuses (e.g. 405 on method mismatch instead of 501).

Signup gRPC handler (4): wire service.SignUp into the gRPC/REST/MCP
surface (was a stub despite the service method existing).

Fix latent nil-Request panic: MetaFromGRPC now synthesizes an
*http.Request from the gRPC metadata so the gin-shim TokenProvider
helpers in Profile/Permissions/Logout/Session/ValidateSession don't
dereference nil over gRPC/REST.

Tooling: add make fmt (fmt-go/fmt-ts) and make lint (lint-go/lint-ts)
plus .golangci.yml (skips generated code).

Docs: document Stripe-aligned REST conventions (snake_case paths,
method-by-effect, /v1 prefix, error envelope) and correct the mapping
table to the as-implemented paths.

Tests: cross-protocol error-message consistency (GraphQL==gRPC==REST),
REST status-code/envelope coverage, logout-is-POST, MetaFromGRPC request
synthesis. project.go AppData converters de-duplicated.

* feat(api): expose check_permissions/list_permissions on gRPC, REST, and MCP

- typed ErrFgaNotEnabled as FailedPrecondition (gRPC FailedPrecondition,
  REST 400 failed_precondition) instead of an opaque internal error
- FGA integration setup wires the service layer with the embedded engine
- surface tests: 20-RPC assertion, fail-closed + validation coverage for
  both permission RPCs over gRPC and REST, MCP tool list and nested-schema
  coverage (check_permissions/list_permissions replace the permissions tool)
- docs/grpc-rest-api-spec.md updated to the new permission surface and
  required_relations gates

* refactor: review fixes — token-derived FGA subject, shared engine init

- session/validate_session pass the token-validated claims.Subject (not the
  re-fetched user record ID) to enforceRequiredRelations, matching main
- extract initAuthzEngine into cmd/fga_engine.go; root.go and the mcp
  subcommand now share one OpenFGA init path

* fix(cli): mcp subcommand inherits server flags

RootCmd registered its flags as local flags, which cobra does not
propagate to subcommands — the documented `authorizer mcp
--database-type=... --client-id=...` invocation failed with
'unknown flag'. Register them as persistent flags so the mcp
subcommand shares the full server flag surface and rootArgs storage.

Verified end-to-end over stdio: initialize handshake, tools/list
(meta, profile, check_permissions, list_permissions), nested input
schema, public meta call, and fail-closed IsError results for
anonymous identity-bearing calls.

* ci: skip buf breaking until main carries the proto module

buf breaking diffs against main#subdir=proto, but proto/ first lands in
this PR — the check can only fail before merge ('Module had no .proto
files'). Gate it on the base branch actually having protos, and disable
the action's PR comment which the job token lacks permission to post.

* fix(gateway,mcp): propagate authorizer host so issuer validation works off-HTTP

Two fixes found by live end-to-end smoke testing of the new surfaces:

- REST gateway: the in-process bufconn call carries ':authority=bufconn',
  so the service layer resolved the host as http://bufconn and JWT issuer
  validation rejected every token on /v1/*. A WithMetadata annotator now
  forwards the original request's host via parsers.GetHostFromRequest
  (same spoof-hardened resolution as the gin path) as x-authorizer-url,
  which transport.MetaFromGRPC already reads first.
- MCP: stampAuth now also stamps x-authorizer-url from the new
  --mcp-authorizer-url flag, so identity-bearing tools (profile,
  check_permissions, list_permissions) pass issuer validation when
  --mcp-bearer is set.
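
The stamping step can be pictured as below, again with `map[string][]string` standing in for gRPC's `metadata.MD`. The signature and the copy-on-write behaviour are assumptions; the lowercase `authorization` key follows the gRPC metadata convention, and `x-authorizer-url` is the key named above.

```go
package main

import "fmt"

// stampAuth returns a copy of md with the bearer token and authorizer URL
// added. Empty values are skipped, so anonymous runs send neither key.
func stampAuth(md map[string][]string, bearer, authorizerURL string) map[string][]string {
	out := make(map[string][]string, len(md)+2)
	for k, v := range md {
		out[k] = append([]string(nil), v...)
	}
	if bearer != "" {
		out["authorization"] = []string{"Bearer " + bearer}
	}
	if authorizerURL != "" {
		out["x-authorizer-url"] = []string{authorizerURL}
	}
	return out
}

func main() {
	// Token and URL values are placeholders.
	md := stampAuth(nil, "example-token", "https://auth.example.com")
	fmt.Println(md["authorization"], md["x-authorizer-url"])

	// Anonymous run: nothing is stamped.
	fmt.Println(len(stampAuth(nil, "", "")))
}
```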

Regression tests: TestRESTGatewayForwardsAuthorizerHost (REST signup must
mint iss=<forwarded host>, then round-trip on /v1/profile) and
TestStampAuth (both metadata keys).

* test(e2e): release smoke suite for all public API surfaces

make smoke builds the real binary and runs one black-box scenario across
GraphQL, REST, gRPC, and MCP stdio: seed an OpenFGA model + tuple, sign a
user up, then assert the identical check_permissions / list_permissions
decision (allow + deny) on every surface, plus REST fail-closed/validation
envelopes and the MCP handshake + tool discovery with a real bearer token.

Gated behind the smoke build tag so regular test runs skip it; the release
workflow runs it as a required job before the Docker image is built.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lakhansamani added a commit that referenced this pull request Jun 12, 2026
… MCP) (#620)

* feat(api): multi-protocol public API surface (GraphQL + gRPC + REST + MCP)

Adds gRPC + grpc-gateway REST + MCP surfaces for the public GraphQL ops
(no `_` prefix), driven from a single proto source of truth. GraphQL stays
unchanged; admin ops stay GraphQL-only.

Consolidates the previously-stacked PRs #614#615#616#617#618#619
into a single change against main.

PROTO (proto/)
  - buf v2 module rooted at buf.build/authorizerdev/authorizer
  - Single AuthorizerService with 19 RPCs whose names match GraphQL ops
    1:1: Signup, Login, Logout, MagicLinkLogin, VerifyEmail,
    ResendVerifyEmail, VerifyOtp, ResendOtp, ForgotPassword, ResetPassword,
    Profile, UpdateProfile, DeactivateAccount, Revoke, Session,
    ValidateJwtToken, ValidateSession, Meta, Permissions
  - common/v1: annotations (required_permissions, mcp_tool, audit_log,
    public), pagination, errors, shared AppData
  - Each RPC's response wrapped in a per-RPC message so buf STANDARD's
    RPC_REQUEST_RESPONSE_UNIQUE lint passes; shared inner types (AuthResponse,
    User, Meta) live in proto/authorizer/v1/types.proto
  - google.api.http annotations drive REST: GET /v1/{method} for trivially-
    empty queries (meta, profile, permissions, logout), POST /v1/{method}
    otherwise. Snake_case method paths mirror GraphQL identifiers.
  - buf STANDARD lint + format both enforced in CI; bufbuild/buf-action@v1
    runs lint always, breaking-check on PRs, format -d --exit-code always

TRANSPORT-AGNOSTIC SERVICE LAYER (internal/service/)
  - sideeffects.go: RequestMetadata + ResponseSideEffects + MetaFromGin /
    ApplyToGin / MetaFromGRPC / ApplyToGRPC bridges
  - provider.go: service.Provider interface
  - signup.go, meta.go: migrated from internal/graphql; resolvers become
    thin transport adapters
  - Supporting helpers: parsers.GetHostFromRequest/GetAppURLFromRequest,
    cookie.BuildSessionCookies/BuildMfaSessionCookies (existing gin
    wrappers now delegate to these so behaviour is byte-identical)

gRPC SERVER (internal/grpcsrv/)
  - server.go: AuthorizerService registered, gRPC reflection (gated on
    --enable-grpc-reflection), gRPC health checking, graceful shutdown
  - interceptors: recovery (panic → codes.Internal), logging (per-code
    level), validate (protovalidate)
  - handlers/authorizer.go: Meta delegates to service.Meta; the other 18
    methods inherit UnimplementedAuthorizerServiceServer and return
    codes.Unimplemented until their handler migrates from internal/graphql
  - transport/grpc_metadata.go: gRPC metadata ↔ RequestMetadata bridge
    (extracts cookies from grpcgateway-cookie, preserves multi-cookie
    Set-Cookie responses)

REST GATEWAY (internal/gateway/)
  - mount.go: serves grpc-gateway via in-process bufconn dial — no extra
    TCP hop, no TLS plumbing
  - JSONPb marshaler: UseProtoNames=true so REST payloads match GraphQL's
    snake_case shape
  - Mounted at /v1/* under the existing gin router (shares CORS, security
    headers, rate limit, logger middleware automatically)
  - /openapi.json serves the merged swagger spec (embedded via go:embed
    from gen/openapi/openapi.go so it works regardless of cwd)

MCP SERVER (internal/mcp/)
  - scanner.go: walks grpc.Server.GetServiceInfo() + protoregistry.GlobalFiles,
    reads the mcp_tool annotation on each method to build a tool registry
  - schema.go: derives JSON Schema from proto request descriptors, with
    cycle guard for self-recursive types (google.protobuf.Value)
  - server.go: registers tools dynamically on a github.com/modelcontextprotocol/
    go-sdk Server; tool handlers unmarshal JSON args into a dynamicpb.Message,
    invoke the gRPC method via an in-process bufconn, marshal the response
    back to JSON. gRPC errors surface as CallToolResult{IsError:true} so
    the LLM gets actionable text
  - Today's MCP-exposed tools (from proto annotations): meta, profile,
    session, permissions. Credential-bearing methods stay unexposed
  - `authorizer mcp` subcommand (cmd/mcp.go) serves over stdio for
    `claude mcp add authorizer -- /path/to/authorizer mcp ...`

CLI (cmd/root.go, cmd/mcp.go, internal/config/config.go)
  - --grpc-port (default 9091; collision-checked against --http-port and
    --metrics-port at startup), --enable-grpc-reflection (default true),
    --grpc-tls-cert / -key / -insecure (TLS plumbing placeholders; TLS
    implementation is a follow-up PR)
  - server.Run starts HTTP + metrics + gRPC + REST gateway listeners with
    shared graceful shutdown

TESTS
  - internal/parsers/url_test.go        GetHostFromRequest priority + spoof rejection
  - internal/cookie/cookie_test.go      BuildSessionCookies/BuildMfaSessionCookies shape
  - internal/service/sideeffects_test.go MetaFromGin/ApplyToGin nil-safety + roundtrip
  - internal/grpcsrv/interceptors/      recovery / logging / validate
  - internal/grpcsrv/transport/         gRPC metadata bridge (cookies, fallbacks)
  - internal/mcp/schema_test.go         flat scalars, nested message, cycle-safety regression
  - internal/integration_tests/grpc_meta_test.go      AuthorizerService.Meta
  - internal/integration_tests/grpc_surface_test.go   all 18 stubs return Unimplemented + gRPC health
  - internal/integration_tests/rest_meta_test.go      GET /v1/meta through gateway
  - internal/integration_tests/rest_openapi_test.go   /openapi.json serves embedded spec
  - internal/integration_tests/mcp_test.go            tools/list + tools/call meta
  - internal/integration_tests/mcp_stubs_test.go      stub returns CallToolResult{IsError:true}
  - Existing GraphQL integration suite still passes (65–70s, no behaviour drift)

What's NOT in this PR (deferred)
  - --grpc-tls-cert / -key / -insecure are wired into config but not yet
    enforced; TLS implementation lands in a follow-up alongside metrics-
    listener TLS
  - 18 of the 19 gRPC methods (and their REST mirrors + MCP tools) are
    Unimplemented stubs; each becomes real as its op migrates from
    internal/graphql into internal/service in follow-up PRs. The
    annotation-driven MCP scanner + gateway routing means follow-ups
    don't need to touch the gRPC/REST/MCP scaffolding — only add the
    service-layer method and the handler delegation

* feat(api,mcp): migrate 7 stubs; security audit fixes; lock stdio-only MCP (#621)

Implements 7 of the 17 stubbed AuthorizerService methods (Profile,
Permissions, Logout, Revoke, ValidateJwtToken, ValidateSession, Session)
following the established service-layer pattern, and addresses the
security audit findings against the MCP surface.

SECURITY AUDIT FIXES

C1 — Session response carries access_token / refresh_token / id_token /
authenticator_secret / recovery_codes. The proto annotation on Session
flipped to mcp_tool.exposed = false so those credentials never land in
an LLM transcript. Session remains available via gRPC + REST + GraphQL
for legitimate browser/server-to-server consumers.

H1 — MCP→gRPC auth propagation. New `--mcp-bearer` flag on the
`authorizer mcp` subcommand; the MCP server stamps `Authorization:
Bearer <token>` on every outgoing gRPC call. Identity-bearing tools
(profile, permissions) now have a caller to attribute to; anonymous
runs still work for the public Meta tool but identity-bearing tools
surface a clean unauthorized error.

H2 — Recovery interceptor redacts panic values. The recovered value is
no longer dumped via `.Interface("panic", r)` (which would have logged
credentials if a handler ever panicked with the request struct); only
the panic type is logged for triage. Regression test included.

STDIO-ONLY MCP TRANSPORT

internal/mcp/server.go — explicit type-level documentation: stdio is
the ONLY supported transport. The Server has no RunHTTP / RunTCP /
RunSSE methods, intentionally.

internal/mcp/transport_test.go — `TestServer_StdioOnly` reflects over
*Server's exported methods and fails the build if anyone adds a method
whose name suggests a network transport (RunHTTP, ListenTCP, ServeWS,
etc.). To add a transport: implement an MCP-side auth interceptor
first, then update the allow-list.
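
The reflection check can be sketched roughly as follows. This is an assumption-laden illustration, not the real test: the `Server` stand-in, the `leakyServer` counter-example, the `networkHints` list, and `networkLikeMethods` are all hypothetical, and the real test may match names differently.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Server is a stand-in for the MCP server type; RunStdio is an assumed method name.
type Server struct{}

func (s *Server) RunStdio() error { return nil }

// leakyServer shows what the check is meant to catch.
type leakyServer struct{}

func (s *leakyServer) RunStdio() error { return nil }
func (s *leakyServer) RunHTTP() error  { return nil }

// networkHints are name fragments that suggest a network transport (illustrative list).
var networkHints = []string{"HTTP", "TCP", "SSE", "WS", "Listen", "Serve"}

// networkLikeMethods returns the exported methods of v's type whose names
// contain a network hint and are not explicitly allowed.
func networkLikeMethods(v any, allow map[string]bool) []string {
	var bad []string
	t := reflect.TypeOf(v)
	for i := 0; i < t.NumMethod(); i++ {
		name := t.Method(i).Name
		if allow[name] {
			continue
		}
		for _, hint := range networkHints {
			if strings.Contains(name, hint) {
				bad = append(bad, name)
				break
			}
		}
	}
	return bad
}

func main() {
	fmt.Println(networkLikeMethods(&Server{}, nil))      // []
	fmt.Println(networkLikeMethods(&leakyServer{}, nil)) // [RunHTTP]
}
```

A test built on this would fail whenever the returned slice is non-empty, which forces anyone adding a transport to touch the allow-list deliberately.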

cmd/mcp.go — docstring + CLI long help explicitly state "stdio only".

7 STUB MIGRATIONS

internal/service: profile.go, permissions.go, logout.go, revoke.go,
validate_jwt_token.go, validate_session.go, session.go,
permission_check.go (shared helper). All follow the SignUp pattern:
take RequestMetadata, return (result, *ResponseSideEffects, error).

internal/grpcsrv/handlers: authorizer.go grows 7 real method
implementations (Profile, Permissions, Logout, Revoke,
ValidateJwtToken, ValidateSession, Session). project.go adds
projectUser / projectAuthResponse / projectAppData / claimsToAppData /
protoToModelPermissions helpers shared across methods.

internal/graphql: resolvers for the seven ops become thin delegations
(same pattern as Signup + Meta).

internal/cookie: BuildDeleteSessionCookies added; DeleteSession now
delegates to it (transport-agnostic mirror of the existing pattern).
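
As a rough illustration of what "transport-agnostic" means here: the builder returns cookie values instead of writing them to a gin context, so any transport can apply them. The function name, parameters, and cookie attributes below are assumptions; the real `BuildDeleteSessionCookies` signature is not shown in this PR text.

```go
package main

import (
	"fmt"
	"net/http"
)

// buildDeleteCookies is a hypothetical stand-in for BuildDeleteSessionCookies.
// It returns expiring cookies as plain values; the caller decides how to
// attach them to a response (gin, gRPC metadata, REST gateway, ...).
func buildDeleteCookies(names []string, domain string, secure bool) []*http.Cookie {
	out := make([]*http.Cookie, 0, len(names))
	for _, name := range names {
		out = append(out, &http.Cookie{
			Name:     name,
			Value:    "",
			Path:     "/",
			Domain:   domain,
			MaxAge:   -1, // negative MaxAge tells the browser to delete the cookie now
			Secure:   secure,
			HttpOnly: true,
		})
	}
	return out
}

func main() {
	for _, c := range buildDeleteCookies([]string{"session"}, "example.com", true) {
		fmt.Println(c.String())
	}
}
```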

internal/service/provider.go: Dependencies grows AuthorizationProvider;
the new methods land on the Provider interface. All call sites
(cmd/root, cmd/mcp, test_helper) wire it through.

TESTS

- TestRecovery_DoesNotLogCredentialBearingPanicValue (H2 regression)
- TestServer_StdioOnly (transport lock-down)
- TestMCPListAndCallMeta now expects 3 MCP tools (meta/profile/permissions);
  session was DROPPED per C1.
- TestMCPToolErrorSurfacesAsIsErrorResult exercises anonymous call to
  identity-bearing tool (formerly the "stubbed tool" test).
- TestAuthorizerServiceStubsReturnUnimplemented shrunk by 7 entries.
- Full SQLite integration suite (67s) still green — no regression on
  the existing GraphQL behaviour for any of the 7 migrated ops.

STILL STUBBED (10 ops, follow-up PRs)

Login, MagicLinkLogin, VerifyEmail, ResendVerifyEmail, VerifyOtp,
ResendOtp, ForgotPassword, ResetPassword, UpdateProfile,
DeactivateAccount. Each is a substantial state machine; better as
focused individual PRs than rushed in a batch.

* feat(api): typed errors + REST status codes, logout POST, signup gRPC, fmt/lint

Addresses the multi-protocol API review findings.

REST/gRPC correctness (a): introduce transport-agnostic typed errors
(internal/service/errors.go, ErrorKind) and a gRPC ErrorMap interceptor
so business errors map to proper codes (InvalidArgument->400,
Unauthenticated->401, PermissionDenied->403, NotFound->404,
FailedPrecondition->400) instead of collapsing to Unknown/500. All
migrated service methods classify their client-facing errors; messages
are unchanged so GraphQL behaviour is byte-identical.
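
A minimal sketch of the typed-error idea, assuming a shape for `ErrorKind` that may differ from the real internal/service/errors.go. The `Kind*` constants, the `Error` struct, and `HTTPStatus` are illustrative names; only the kind-to-status pairs come from the list above.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrorKind classifies a business error independently of any transport.
type ErrorKind int

const (
	KindUnknown ErrorKind = iota
	KindInvalidArgument
	KindUnauthenticated
	KindPermissionDenied
	KindNotFound
	KindFailedPrecondition
)

// Error carries a kind plus the unchanged client-facing message.
type Error struct {
	Kind    ErrorKind
	Message string
}

func (e *Error) Error() string { return e.Message }

// errOpaque is an unclassified error, used to show the fallback path.
var errOpaque = errors.New("boom")

// HTTPStatus maps an error to a REST status; unclassified errors become 500.
func HTTPStatus(err error) int {
	if err == nil {
		return http.StatusOK
	}
	var se *Error
	if !errors.As(err, &se) {
		return http.StatusInternalServerError
	}
	switch se.Kind {
	case KindInvalidArgument, KindFailedPrecondition:
		return http.StatusBadRequest
	case KindUnauthenticated:
		return http.StatusUnauthorized
	case KindPermissionDenied:
		return http.StatusForbidden
	case KindNotFound:
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	wrapped := fmt.Errorf("profile: %w", &Error{Kind: KindNotFound, Message: "user not found"})
	fmt.Println(HTTPStatus(wrapped))   // 404
	fmt.Println(HTTPStatus(errOpaque)) // 500
}
```

Because classification uses `errors.As`, a service method can wrap a typed error with extra context without losing its kind.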

Logout GET->POST (b): logout mutates state and is audited, so it must
not be a safe GET (RFC 9110 9.2.1, CSRF). Proto annotation + regen.

REST error envelope (d): gateway WithErrorHandler emits a stable
snake_case envelope {"code","message"}; WithRoutingErrorHandler keeps
true HTTP statuses (e.g. 405 on method mismatch instead of 501).
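
The envelope shape can be illustrated with the standard library alone. This is a sketch, not the gateway's error handler: `errorEnvelope`, `writeError`, and `render` are hypothetical names, and the message text is made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// errorEnvelope is the stable two-field error body.
type errorEnvelope struct {
	Code    string `json:"code"`
	Message string `json:"message"`
}

// writeError emits the envelope with the given HTTP status.
func writeError(w http.ResponseWriter, status int, code, message string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(errorEnvelope{Code: code, Message: message})
}

// render is an example-only helper that returns what a client would see.
func render(status int, code, message string) (int, string) {
	rec := httptest.NewRecorder()
	writeError(rec, status, code, message)
	return rec.Code, rec.Body.String()
}

func main() {
	status, body := render(http.StatusBadRequest, "failed_precondition", "example message")
	fmt.Print(status, " ", body)
}
```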

Signup gRPC handler (4): wire service.SignUp into the gRPC/REST/MCP
surface (was a stub despite the service method existing).

Fix latent nil-Request panic: MetaFromGRPC now synthesizes an
*http.Request from the gRPC metadata so the gin-shim TokenProvider
helpers in Profile/Permissions/Logout/Session/ValidateSession don't
dereference nil over gRPC/REST.
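
A sketch of the request-synthesis idea, using a plain `map[string][]string` in place of gRPC metadata so it runs without external modules. `requestFromMetadata` is a hypothetical name and the real `MetaFromGRPC` may populate more fields; the point is only that header- and cookie-reading helpers get a non-nil `*http.Request`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// requestFromMetadata builds a minimal *http.Request from metadata pairs.
// Pseudo-headers such as ":authority" are not copied into Header; the
// authority is surfaced as req.Host instead.
func requestFromMetadata(md map[string][]string) *http.Request {
	req := &http.Request{Header: make(http.Header), URL: &url.URL{}}
	for key, values := range md {
		if strings.HasPrefix(key, ":") {
			continue
		}
		for _, v := range values {
			req.Header.Add(key, v) // Add canonicalizes the key, e.g. "cookie" -> "Cookie"
		}
	}
	if authority := md[":authority"]; len(authority) > 0 {
		req.Host = authority[0]
	}
	return req
}

func main() {
	req := requestFromMetadata(map[string][]string{
		"authorization": {"Bearer abc"},
		"cookie":        {"session=xyz"},
		":authority":    {"auth.example.com"},
	})
	fmt.Println(req.Header.Get("Authorization"), req.Host)
}
```

Helpers that call `req.Header.Get` or `req.Cookie` then behave the same whether the call arrived over HTTP or gRPC.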

Tooling: add make fmt (fmt-go/fmt-ts) and make lint (lint-go/lint-ts)
plus .golangci.yml (skips generated code).

Docs: document Stripe-aligned REST conventions (snake_case paths,
method-by-effect, /v1 prefix, error envelope) and correct the mapping
table to the as-implemented paths.

Tests: cross-protocol error-message consistency (GraphQL==gRPC==REST),
REST status-code/envelope coverage, logout-is-POST, MetaFromGRPC request
synthesis. project.go AppData converters de-duplicated.

* feat(api): expose check_permissions/list_permissions on gRPC, REST, and MCP

- typed ErrFgaNotEnabled as FailedPrecondition (gRPC FailedPrecondition,
  REST 400 failed_precondition) instead of an opaque internal error
- FGA integration setup wires the service layer with the embedded engine
- surface tests: 20-RPC assertion, fail-closed + validation coverage for
  both permission RPCs over gRPC and REST, MCP tool list and nested-schema
  coverage (check_permissions/list_permissions replace the permissions tool)
- docs/grpc-rest-api-spec.md updated to the new permission surface and
  required_relations gates

* refactor: review fixes — token-derived FGA subject, shared engine init

- session/validate_session pass the token-validated claims.Subject (not the
  re-fetched user record ID) to enforceRequiredRelations, matching main
- extract initAuthzEngine into cmd/fga_engine.go; root.go and the mcp
  subcommand now share one OpenFGA init path

* fix(cli): mcp subcommand inherits server flags

RootCmd registered its flags as local flags, which cobra does not
propagate to subcommands — the documented `authorizer mcp
--database-type=... --client-id=...` invocation failed with
'unknown flag'. Register them as persistent flags so the mcp
subcommand shares the full server flag surface and rootArgs storage.

Verified end-to-end over stdio: initialize handshake, tools/list
(meta, profile, check_permissions, list_permissions), nested input
schema, public meta call, and fail-closed IsError results for
anonymous identity-bearing calls.

* ci: skip buf breaking until main carries the proto module

buf breaking diffs against main#subdir=proto, but proto/ first lands in
this PR — the check can only fail before merge ('Module had no .proto
files'). Gate it on the base branch actually having protos, and disable
the action's PR comment which the job token lacks permission to post.

* fix(gateway,mcp): propagate authorizer host so issuer validation works off-HTTP

Two fixes found by live end-to-end smoke testing of the new surfaces:

- REST gateway: the in-process bufconn call carries ':authority=bufconn',
  so the service layer resolved the host as http://bufconn and JWT issuer
  validation rejected every token on /v1/*. A WithMetadata annotator now
  forwards the original request's host via parsers.GetHostFromRequest
  (same spoof-hardened resolution as the gin path) as x-authorizer-url,
  which transport.MetaFromGRPC already reads first.
- MCP: stampAuth now also stamps x-authorizer-url from the new
  --mcp-authorizer-url flag, so identity-bearing tools (profile,
  check_permissions, list_permissions) pass issuer validation when
  --mcp-bearer is set.

Regression tests: TestRESTGatewayForwardsAuthorizerHost (REST signup must
mint iss=<forwarded host>, then round-trip on /v1/profile) and
TestStampAuth (both metadata keys).
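
The stamping behaviour can be sketched as below, again with a plain map standing in for outgoing gRPC metadata. The function body is an assumption; only the two key names (`authorization`, `x-authorizer-url`) and the "anonymous runs still work" behaviour come from the descriptions in this PR.

```go
package main

import "fmt"

// stampAuth returns a copy of md with the auth-related keys set.
// Empty inputs are skipped so anonymous calls stay anonymous.
func stampAuth(md map[string][]string, bearer, authorizerURL string) map[string][]string {
	out := make(map[string][]string, len(md)+2)
	for k, v := range md {
		out[k] = append([]string(nil), v...)
	}
	if bearer != "" {
		out["authorization"] = []string{"Bearer " + bearer}
	}
	if authorizerURL != "" {
		out["x-authorizer-url"] = []string{authorizerURL}
	}
	return out
}

func main() {
	md := stampAuth(nil, "tok", "https://auth.example.com")
	fmt.Println(md["authorization"], md["x-authorizer-url"])
}
```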

* test(e2e): release smoke suite for all public API surfaces

make smoke builds the real binary and runs one black-box scenario across
GraphQL, REST, gRPC, and MCP stdio: seed an OpenFGA model + tuple, sign a
user up, then assert the identical check_permissions / list_permissions
decision (allow + deny) on every surface, plus REST fail-closed/validation
envelopes and the MCP handshake + tool discovery with a real bearer token.

Gated behind the smoke build tag so regular test runs skip it; the release
workflow runs it as a required job before the Docker image is built.

---------