feat(auth): implement SigV4 authentication for REST catalog#616
feat(auth): implement SigV4 authentication for REST catalog#616plusplusjiajia wants to merge 8 commits intoapache:mainfrom
Conversation
bfc98f7 to
9ddbdbd
Compare
915c87b to
d1c0732
Compare
| ICEBERG_PRECHECK(delegate_type != AuthProperties::kAuthTypeSigV4, | ||
| "Cannot delegate a SigV4 auth manager to another SigV4 auth manager"); |
There was a problem hiding this comment.
add delegate_type in error message?
There was a problem hiding this comment.
add delegate_type in error message?
Good idea, done.
| - name: Install dependencies | ||
| shell: bash | ||
| run: sudo apt-get update && sudo apt-get install -y libcurl4-openssl-dev | ||
| run: sudo apt-get update && sudo apt-get install -y libcurl4-openssl-dev ninja-build |
There was a problem hiding this comment.
n00b question why is ninja-build required here?
There was a problem hiding this comment.
Good question — I added a build step in this PR so the linter can see the SigV4 code (needs compile_commands.json from a real build). I used cmake -G Ninja for speed and to be consistent with the other CI workflows, and Ninja is not preinstalled on ubuntu-24.04, hence the extra ninja-build package. Happy to switch to Make if you'd prefer.
| if (session_token_it != properties.end() && !session_token_it->second.empty()) { | ||
| Aws::Auth::AWSCredentials credentials(access_key_it->second.c_str(), | ||
| secret_key_it->second.c_str(), | ||
| session_token_it->second.c_str()); | ||
| return std::make_shared<Aws::Auth::SimpleAWSCredentialsProvider>(credentials); | ||
| } | ||
| Aws::Auth::AWSCredentials credentials(access_key_it->second.c_str(), | ||
| secret_key_it->second.c_str()); | ||
| return std::make_shared<Aws::Auth::SimpleAWSCredentialsProvider>(credentials); |
There was a problem hiding this comment.
Nit: could do only one return if Credentials are created in the conditional statement.
There was a problem hiding this comment.
Nit: could do only one return if Credentials are created in the conditional statement.
Nice catch, done.
| auto it = properties.find(AuthProperties::kSigV4SigningName); | ||
| if (it != properties.end() && !it->second.empty()) { | ||
| return it->second; | ||
| } |
There was a problem hiding this comment.
if(properties.count(AuthProperties::kSigV4SigningName) > 0) {
// do work
}
might be a little less verbose than
auto it = properties.find(AuthProperties::kSigV4SigningName);
if (it != properties.end() && !it->second.empty()) {
return it->second;
}
There was a problem hiding this comment.
Thanks for the suggestion! I chose to keep the !it->second.empty() check on purpose — the intent is for an explicitly-empty value (e.g., a env var set to "") to also fall through to the legacy key / default.
| const TableIdentifier& table, | ||
| const std::unordered_map<std::string, std::string>& properties, | ||
| std::shared_ptr<AuthSession> parent) { | ||
| auto* sigv4_parent = dynamic_cast<SigV4AuthSession*>(parent.get()); |
There was a problem hiding this comment.
use checked_pointer_cast instead
There was a problem hiding this comment.
use
checked_pointer_castinstead
Done here as well.
| Result<std::shared_ptr<AuthSession>> SigV4AuthManager::ContextualSession( | ||
| const std::unordered_map<std::string, std::string>& context, | ||
| std::shared_ptr<AuthSession> parent) { | ||
| auto* sigv4_parent = dynamic_cast<SigV4AuthSession*>(parent.get()); |
There was a problem hiding this comment.
use
checked_pointer_cast
Thanks, that's much better! Done.
| std::string signing_region_; | ||
| std::string signing_name_; | ||
| std::shared_ptr<Aws::Auth::AWSCredentialsProvider> credentials_provider_; | ||
| /// Shared signer instance, matching Java's single Aws4Signer per manager. |
There was a problem hiding this comment.
I find this comment a bit confusing especially given that signer_ is a unique pointer that will be destroyed when SigV4AuthSession is destructed
There was a problem hiding this comment.
You're right, sorry about the confusion — the "shared signer" wording was misleading since signer_ is owned per-session via unique_ptr. I've removed the comment.
|
|
||
| std::unordered_map<std::string, std::string> headers; | ||
| ASSERT_THAT(session_result.value()->Authenticate(headers), IsOk()); | ||
| ASSERT_THAT(session_result.value()->Authenticate(headers, {}), IsOk()); |
There was a problem hiding this comment.
it feels like Authenticate coudl accept a default value for the second parameter
There was a problem hiding this comment.
Good point, thanks! Done
| /// - IOError: Network or connection errors when reaching auth server | ||
| /// - RestError: HTTP errors from authentication service | ||
| virtual Status Authenticate(std::unordered_map<std::string, std::string>& headers) = 0; | ||
| virtual Status Authenticate(std::unordered_map<std::string, std::string>& headers, |
There was a problem hiding this comment.
The current design splits the request context into two separate parameters (headers as in-out + HTTPRequestContext as a separate struct).
The Java implementation uses a cleaner "request-in, request-out" pattern where authenticate() receives the full HTTPRequest and returns a new immutable request with auth headers, I'd suggest aligning with Java by introducing an HTTPRequest type that encapsulates method, url, headers, and body together, and changing the signature to:
virtual Result Authenticate(const HTTPRequest& request) = 0;
I'm open for this
There was a problem hiding this comment.
Thanks @lishuxu ! Agreed — aligning with Java's request-in/request-out pattern is the right call. I'll address this in the current PR: introducing an HTTPRequest type (encapsulating method, url, headers, body), changing the signature to Result Authenticate(const HTTPRequest& request). Will push an update shortly — PTAL when it's ready.
| // ---- SigV4 AWS credential entries ---- | ||
|
|
||
| /// AWS region for SigV4 signing. | ||
| inline static const std::string kSigV4SigningRegion = "rest.signing-region"; |
There was a problem hiding this comment.
We can remove the legacy key kSigV4Region/kSigV4Service
| // ---- SigV4 AWS credential entries ---- | ||
|
|
||
| /// AWS region for SigV4 signing. |
There was a problem hiding this comment.
The names are self-explanatory. I think we can remove them to keep the code concise.
| inline static const std::string kSigV4DelegateAuthType = | ||
| "rest.auth.sigv4.delegate-auth-type"; | ||
|
|
||
| // ---- SigV4 AWS credential entries ---- |
There was a problem hiding this comment.
Nit: the // ---- SigV4 AWS credential entries ---- section header is redundant given the // ---- SigV4 entries ---- block above already covers SigV4 config. Merge them into a single section.
| } | ||
|
|
||
| #ifdef ICEBERG_BUILD_SIGV4 | ||
| Result<std::unique_ptr<AuthManager>> MakeSigV4AuthManager( |
There was a problem hiding this comment.
MakeSigV4AuthManager is implemented directly in auth_managers.cc, while all other factory functions (MakeNoopAuthManager, MakeBasicAuthManager, MakeOAuth2Manager) are defined in their own translation units and only declared in auth_manager_internal.h. Suggest moving the implementation to sigv4_auth_manager.cc for consistency.
There was a problem hiding this comment.
Good catch! Done.
|
|
||
| #include "iceberg/catalog/rest/auth/auth_manager_internal.h" | ||
| #ifdef ICEBERG_BUILD_SIGV4 | ||
| # include "iceberg/catalog/rest/auth/sigv4_auth_manager.h" |
There was a problem hiding this comment.
Nit: auth_properties.h should come before the #ifdef ICEBERG_BUILD_SIGV4 block to maintain alphabetical include order.
There was a problem hiding this comment.
Thanks for catching it!
Implement AWS SigV4 authentication for the REST catalog client, following Java's
RESTSigV4AuthManagerandRESTSigV4AuthSession.AuthSession::Authenticate()withHTTPRequestContext(method, url, body) for SigV4 request signingSigV4AuthSession: delegate-first auth → relocate conflicting Authorization header → sign with AWS SDKSigV4AuthManager: wraps delegate AuthManager (default OAuth2), resolves credentials from properties or default chainSignerChecksumParamsoutput: empty body → hexEMPTY_BODY_SHA256; non-empty body →Base64(SHA256(body))