Skip to content

HDDS-14935. [STS] Handle Latent Inconsistencies in S3 API Acl Checks#10009

Merged
ChenSammi merged 7 commits into
apache:HDDS-13323-stsfrom
fmorg-git:HDDS-14935
May 27, 2026
Merged

HDDS-14935. [STS] Handle Latent Inconsistencies in S3 API Acl Checks#10009
ChenSammi merged 7 commits into
apache:HDDS-13323-stsfrom
fmorg-git:HDDS-14935

Conversation

@fmorg-git
Copy link
Copy Markdown
Contributor

@fmorg-git fmorg-git commented Mar 31, 2026

Please describe your PR in detail:

  • Currently, S3 APIs are not consistent in how ACL checks are applied. For example, PutObject (i.e. OMKeyCreateRequest, OMAllocateBlockRequest, OMKeyCommitRequest), DeleteObject (i.e. OMKeyDeleteRequest), PutObjectTagging (i.e. S3PutObjectTaggingRequest), etc. perform their ACL checks in preExecute() which is on the OM leader RPC thread.

However, APIs like DeleteBucket (i.e. OMBucketDeleteRequest), PutBucketAcl (i.e. OMBucketSetAclRequest), etc. perform their ACL checks in validateAndUpdateCache() which is on the Ratis apply thread. This affects STS in that the STSTokenIdentifier ThreadLocal currently is not available on the Ratis apply thread, so if the STS token has an inline session policy, some ACL checks that should pass would fail. This ticket addresses the inconsistency by ensuring the ThreadLocal is always available on the Ratis apply thread via updates to OzoneManagerStateMachine.

A separate PR is already open to move the checks to the correct place (#9653 and https://issues.apache.org/jira/browse/HDDS-13855), but this ticket is a fallback in case any future API has the check in the incorrect place, so it won't break STS.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14935

How was this patch tested?

unit tests, smoke tests

@fmorg-git fmorg-git marked this pull request as draft March 31, 2026 01:07
@kerneltime kerneltime requested a review from sumitagrawl March 31, 2026 05:00
@github-actions
Copy link
Copy Markdown

This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.

@github-actions github-actions Bot added the stale label Apr 22, 2026
@fmorg-git
Copy link
Copy Markdown
Contributor Author

adding comment to remove stale label

@fmorg-git
Copy link
Copy Markdown
Contributor Author

hi @ChenSammi @sodonnel - just a quick reminder that this PR is ready to review. Thanks!

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Comment on lines +121 to +129
final OMRequest.Builder requestBuilder = getOmRequest().toBuilder()
.setUserInfo(getUserIfNotExists(ozoneManager))
.setLayoutVersion(layoutVersion).build();
.setLayoutVersion(layoutVersion);

if (requestBuilder.hasS3Authentication()) {
requestBuilder.setS3Authentication(
resolveS3Authentication(requestBuilder.getS3Authentication(), OzoneManager.getStsTokenIdentifier()));
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated to fix all overriding preExecute implementations and to add a checkstyle rule for future ones - 6d43f39

Comment on lines +669 to +688
if (ozoneManager.isSecurityEnabled() && request.hasS3Authentication()) {
// STS token verification runs on the leader RPC path so we don't need to recheck here on the apply
// after the log is committed
STSSecurityUtil.ensureResolvedStsFieldsInvariants(request);

final OzoneManagerProtocolProtos.S3Authentication s3Auth = request.getS3Authentication();
if (s3Auth.hasSessionToken() && !s3Auth.getSessionToken().isEmpty()) {
// ThreadLocal carries session policy for OmMetadataReader
final STSTokenIdentifier rehydratedTokenIdentifier = new STSTokenIdentifier(
s3Auth.hasResolvedStsTempAccessKeyId() ? s3Auth.getResolvedStsTempAccessKeyId() : "",
s3Auth.hasResolvedStsOriginalAccessKeyId() ? s3Auth.getResolvedStsOriginalAccessKeyId() : "",
s3Auth.hasResolvedStsRoleArn() ? s3Auth.getResolvedStsRoleArn() : "",
java.time.Instant.MAX, // ensure it deterministically is not expired
"", // no secretAccessKey needed
s3Auth.hasResolvedStsSessionPolicy() ? s3Auth.getResolvedStsSessionPolicy() : "",
null // no encryption key needed
);
OzoneManager.setStsTokenIdentifier(rehydratedTokenIdentifier);
isStsThreadLocalSet = true;
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated - c483af5


return getOmRequest().toBuilder()
// super.preExecute resolves S3Authentication (STS) for Ratis apply. Merge SetAclRequest changes on top.
final OMRequest request = super.preExecute(ozoneManager);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned OMBucketDeleteRequest in the JIRA description. Shall we call the super.preExecute in OMBucketDeleteRequest too?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two other requests, OMBucketAddAclRequest, OMBucketRemoveAclRequest. So we should consider move this to OMBucketAclRequest.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, we should call super.preExecute() in OMBucketDeleteRequest as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned OMBucketDeleteRequest in the JIRA description. Shall we call the super.preExecute in OMBucketDeleteRequest too?

OMBucketDeleteRequest doesn't override preExecute() method, so it automatically gets the resolved sts fields behavior. The issue happens when a subclass overrides preExecute() but does not call super.preExecute().

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two other requests, OMBucketAddAclRequest, OMBucketRemoveAclRequest. So we should consider move this to OMBucketAclRequest.

updated as part of 6d43f39. I changed the subclasses directly.

OzoneManagerProtocolProtos.S3Authentication s3Auth, STSTokenIdentifier stsTokenIdentifier) {
final OzoneManagerProtocolProtos.S3Authentication.Builder s3AuthBuilder = s3Auth.toBuilder();

if (s3Auth.hasSessionToken() && !s3Auth.getSessionToken().isEmpty() && stsTokenIdentifier != null) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "s3Auth.hasSessionToken() && !s3Auth.getSessionToken().isEmpty()" and null stsTokenIdentifier allowed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no this is not allowed. Updated to throw an error if this happens and a unit test coverage - b1ed2d2

Fabian Morgan added 3 commits May 20, 2026 19:14
…ss that overrides preExecute calls super.preExecute. Add checkstyle rule for this as well
@fmorg-git fmorg-git requested review from ChenSammi and Russole May 21, 2026 02:40
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 40 out of 40 changed files in this pull request and generated 1 comment.

Comment thread hadoop-hdds/dev-support/checkstyle/checkstyle.xml
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

@ChenSammi ChenSammi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last patch LGTM.

@ChenSammi ChenSammi merged commit 5665462 into apache:HDDS-13323-sts May 27, 2026
90 of 91 checks passed
@fmorg-git
Copy link
Copy Markdown
Contributor Author

Thanks all for the review and merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants