HDDS-15273. Add OIDC WebIdentity STS design #10338
@ChenSammi @errose28 @fmorg-git @Tejaskriya please take a look
| title: OIDC AssumeRoleWithWebIdentity for Ozone STS | ||
| summary: Web identity support for Ozone STS using OIDC and Ranger authorization | ||
| date: 2026-05-13 | ||
| status: proposed |
Please add the Jira and author fields, similar to https://github.com/apache/ozone/pull/9223/changes. The Jira should be different from HDDS-13323 so that the two implementations are not interleaved.
| credentials into `SignatureInfo.sessionToken`. | ||
| - `EndpointBase` and `S3STSEndpointBase` propagate the session token into | ||
| `S3Auth`. | ||
| - `OzoneManagerProtocolClientSideTranslatorPB` copies `S3Auth` into |
This statement was already listed above.
| not call Keycloak, refresh JWKS, revalidate JWTs, or otherwise depend on current | ||
| external IdP state during Ratis apply or replay. Credential expiration is | ||
| computed by the leader before replication and stored as | ||
| `credentialExpirationEpochSeconds` so replay does not depend on the apply-time |
Curious why a new field `credentialExpirationEpochSeconds` is needed instead of the existing `expirationEpochSeconds`?
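To make the replay property in the quoted hunk concrete, here is a minimal sketch, assuming the leader computes the expiration once and the apply path only copies the replicated value. The class and method names (`ExpirationSketch`, `computeExpirationEpochSeconds`, `applyExpiration`) are hypothetical and are not taken from the design or from Ozone.

```java
import java.time.Clock;
import java.time.Duration;

/** Hypothetical sketch only; not the Ozone implementation. */
final class ExpirationSketch {

  /** Leader side: evaluated once, before the request is replicated. */
  static long computeExpirationEpochSeconds(Clock leaderClock, Duration requested) {
    return leaderClock.instant().plus(requested).getEpochSecond();
  }

  /**
   * Apply/replay side: uses the replicated value as-is and never reads a
   * clock, so every replica and every replay sees the same expiration.
   */
  static long applyExpiration(long replicatedExpirationEpochSeconds) {
    return replicatedExpirationEpochSeconds;
  }
}
```

Whether the stored value lives in a new field or in the existing one is exactly the open question above; the sketch is the same either way.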
| - only for the STS application path; | ||
| - only for `Action=AssumeRoleWithWebIdentity`; | ||
| - only when `ozone.sts.web.identity.enabled=true`; |
The current STS configuration flag is `ozone.s3g.sts.http.enabled`. Should the new one be `ozone.s3g.sts.web.identity.enabled`?
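The three conditions in the quoted hunk amount to a single guard. A minimal sketch, assuming the configuration is available as a plain string map and using the key name as written in the design (which this comment proposes renaming); `WebIdentityGateSketch` and `accepts` are hypothetical names:

```java
import java.util.Map;

/** Hypothetical sketch only; not the Ozone implementation. */
final class WebIdentityGateSketch {
  // Key name as written in the design doc; its final name is under review.
  static final String ENABLED_KEY = "ozone.sts.web.identity.enabled";
  static final String ACTION = "AssumeRoleWithWebIdentity";

  /** True only when all three conditions from the design hold. */
  static boolean accepts(boolean isStsPath, String action, Map<String, String> conf) {
    return isStsPath
        && ACTION.equals(action)
        && Boolean.parseBoolean(conf.getOrDefault(ENABLED_KEY, "false"));
  }
}
```

The sketch defaults the flag to `false` so that a missing key leaves the feature off; that default is an assumption, not something the quoted text states.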
| - `RoleSessionName=<session>` | ||
| - `WebIdentityToken=<OIDC JWT>` | ||
| - `DurationSeconds=<optional>` | ||
| - `Policy=<optional, only if the existing STS AssumeRole session policy path is |
The current AssumeRole flow supports session policies and converts the IAM resources to Ozone objects, permissions and actions for Ranger to consume. The Ranger authorizer defines the embedded session policy format in the STS token - it is opaque to Ozone.
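For reference, the request parameters visible in the quoted hunk could be assembled into a form-encoded body roughly as follows. This is a sketch under assumptions: it covers only `Action`, `RoleSessionName`, `WebIdentityToken`, and the optional `DurationSeconds`, it omits `Policy` and any parameters not shown in the hunk, and `WebIdentityRequestSketch` and `formBody` are hypothetical names.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical sketch only; not the Ozone implementation. */
final class WebIdentityRequestSketch {

  /** Builds a form body from the parameters named in the design excerpt. */
  static String formBody(String sessionName, String jwt, Integer durationSeconds) {
    Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
    params.put("Action", "AssumeRoleWithWebIdentity");
    params.put("RoleSessionName", sessionName);
    params.put("WebIdentityToken", jwt);
    if (durationSeconds != null) {
      params.put("DurationSeconds", durationSeconds.toString()); // optional
    }
    return params.entrySet().stream()
        .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
        .collect(Collectors.joining("&"));
  }

  private static String encode(String value) {
    return URLEncoder.encode(value, StandardCharsets.UTF_8);
  }
}
```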
| The common request-shape extension point is | ||
| `AssumeRoleWithWebIdentityRequest`, with | ||
| `IAccessAuthorizer.generateAssumeRoleWithWebIdentitySessionPolicy()` as the |
Curious why the current `IAccessAuthorizer.generateAssumeRoleSessionPolicy()` method can't be used once the JWT is validated and a valid Ozone Kerberos user is identified?
| - `authType=ASSUME_ROLE` for existing tokens, with `originalAccessKeyId` | ||
| preserved; | ||
| - `authType=WEB_IDENTITY` for new tokens, with effective user, groups, issuer, |
Who would the effective user be in the case of `WEB_IDENTITY`?
| Errors should use STS/S3-compatible codes where possible: | ||
| - invalid or expired JWT: `InvalidIdentityToken`; | ||
| - disabled feature: `AccessDenied` or `InvalidAction`; |
I believe `FEATURE_NOT_ENABLED` is used for a disabled feature.
| - disabled feature: `AccessDenied` or `InvalidAction`; | ||
| - unauthorized role assumption: `AccessDenied`; | ||
| - unsupported optional parameter: `InvalidParameterValue`; | ||
| - internal validation or revocation failures: fail closed. |
What response code would occur if the JWKS server is down/unresponsive?
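The unambiguous rows of the quoted error list can be captured as a small lookup. This is a sketch, assuming an enum is an acceptable shape; `WebIdentityFailure` and `stsCode` are hypothetical names. The disabled-feature case and internal failures (including an unreachable JWKS endpoint) are left out on purpose because their codes are still under discussion in this review.

```java
/** Hypothetical sketch only; covers just the settled rows of the design's list. */
enum WebIdentityFailure {
  INVALID_OR_EXPIRED_JWT("InvalidIdentityToken"),
  UNAUTHORIZED_ROLE_ASSUMPTION("AccessDenied"),
  UNSUPPORTED_OPTIONAL_PARAMETER("InvalidParameterValue");

  private final String stsCode;

  WebIdentityFailure(String stsCode) {
    this.stsCode = stsCode;
  }

  /** Error code string returned to the STS client. */
  String stsCode() {
    return stsCode;
  }
}
```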
| - fake/Ranger authorizer allows `tomato-user` to assume a test role and denies | ||
| `denied-user`. | ||
| Full Ranger container testing is optional for the MVP. Unit and mock-layer tests |
I think it would be a good idea to have full Ranger container testing. For example, I think the `tomato-user` case won't work if that user isn't also in Ranger, correct?
What changes were proposed in this pull request?
This PR adds the design document for OIDC/WebIdentity support in Apache Ozone STS.
The design describes how Ozone STS can support an `AssumeRoleWithWebIdentity` flow, allowing an OIDC token issued by an external identity provider such as Keycloak to be exchanged for temporary S3 credentials.
This is a design-document-only PR. It does not introduce runtime code changes.
The implementation remains in PR #10266:
The design covers:
- … `x-amz-security-token` for subsequent S3 access.
- … `AssumeRole`.

This design does not propose replacing Kerberos daemon authentication, does not add OFS OIDC login, does not add CLI device-code login, and does not make Keycloak Authorization Services the Ozone policy engine.
This design PR is split from the implementation PR so the design can be reviewed independently and documentation edits do not require rerunning the full implementation CI.
The operator/runtime Keycloak/Ranger guide remains in the implementation PR for now because it is tied to implementation config and runtime behavior.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-15273
How was this patch tested?
This is a design-document-only PR.
The patch was checked with: