Skip to content

fix: sigv4 auth to use base64-encoded content sha256 and custom canonical request#3120

Open
plusplusjiajia wants to merge 1 commit intoapache:mainfrom
plusplusjiajia:fix-sigv4-auth
Open

fix: sigv4 auth to use base64-encoded content sha256 and custom canonical request#3120
plusplusjiajia wants to merge 1 commit intoapache:mainfrom
plusplusjiajia:fix-sigv4-auth

Conversation

@plusplusjiajia
Copy link
Member

@plusplusjiajia plusplusjiajia commented Mar 4, 2026

Rationale for this change

This PR fixes the SigV4 request signing implementation in pyiceberg to align with the behavior of the [Iceberg Java SDK]
The SigV4 canonical request must use the hex-encoded SHA-256 of the payload for signature computation, while the x-amz-content-sha256 header uses base64 encoding. The default botocore.auth.SigV4Auth uses the x-amz-content-sha256 header value directly in the canonical request, which is incorrect when the header is base64-encoded.

Are these changes tested?

Yes

Are there any user-facing changes?

No

@plusplusjiajia plusplusjiajia changed the title Fix SigV4 auth to use base64-encoded content SHA256 and custom canonical request fix: sigv4 auth to use base64-encoded content sha256 and custom canonical request Mar 5, 2026
@JingsongLi
Copy link

Hi @kevinjqliu Can you take a look?

@kevinjqliu
Copy link
Contributor

thanks for the PR! i understand this is to align with the java SigV4 implementation. Could you help me understand the specific scenario in which this is currently breaking? (i dont know much about sigv4)

Copy link
Contributor

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reading up on SigV4 java implementation

but some initial comments

def canonical_request(self, request: Any) -> str:
cr = super().canonical_request(request)
# Replace the last line (body_checksum) with hex-encoded payload hash.
return cr.rsplit("\n", 1)[0] + "\n" + self.payload(request)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using rsplit feels fragile, can we reuse the same logic from canonical_request but just replace the last part with self.payload(request)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im confused why the _IcebergSigV4Auth class is needed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally we should test against an integration test before merging. unit tests can only do so much

@kevinjqliu
Copy link
Contributor

Ok so from what ive gathered

SigV4 is AWS's request signing mechanism. We're addressing the scenario where both SigV4 and OAuth are required for REST catalog authentication.
The REST catalog requires an OAuth token, and AWS requires the request to also be SigV4-signed. The request is first authenticated by the delegate auth session, which stamps the OAuth bearer token into the Authorization header. That header is then relocated to Original-Authorization before SigV4 signs the request — freeing up Authorization for the SigV4 signature while keeping the OAuth token present and covered by the HMAC.

Are you running your own IRC on AWS? i think glue IRC doesnt need the oauth part

Comment on lines +725 to +727
if request.body:
body_bytes = request.body.encode("utf-8") if isinstance(request.body, str) else request.body
content_sha256_header = base64.b64encode(hashlib.sha256(body_bytes).digest()).decode()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont think we need to base64 encode this. the java implementation doesnt do this. neither does the python reference implementation

@kevinjqliu
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants