From 50cfe94f231038db00930633d2fb23a7cfb1dd6d Mon Sep 17 00:00:00 2001
From: Brett Nicholas <7547222+bigbrett@users.noreply.github.com>
Date: Mon, 27 Apr 2026 15:50:19 -0600
Subject: [PATCH] Async support for CMAC

---
 docs/draft/async-crypto.md  | 101 +++-
 src/wh_client_crypto.c      | 962 ++++++++++++++++++++++++++++++------
 src/wh_message_crypto.c     |   1 +
 src/wh_server_crypto.c      |  40 +-
 test/wh_test_crypto.c       | 762 ++++++++++++++++++++++++++++
 wolfhsm/wh_client.h         |  16 +-
 wolfhsm/wh_client_crypto.h  | 207 ++++++++
 wolfhsm/wh_message_crypto.h |  40 +-
 8 files changed, 1954 insertions(+), 175 deletions(-)

diff --git a/docs/draft/async-crypto.md b/docs/draft/async-crypto.md
index 7bd257924..92fdb3caf 100644
--- a/docs/draft/async-crypto.md
+++ b/docs/draft/async-crypto.md
@@ -620,6 +620,105 @@ The existing blocking wrappers (`wh_Client_AesCbc`, `wh_Client_AesCtr`,
 thin shells that call the new async primitives in a poll loop, so blocking and
 async paths share identical wire behaviour.
 
+## CMAC
+
+CMAC is the latest algorithm to receive native async support. Unlike SHA, the
+existing blocking `wh_Client_Cmac` was already a single oneshot wire round
+trip when called with a complete message + key + output. To preserve that
+1-RTT behavior while also exposing streaming Update/Final pairs, CMAC ships
+with **three** async pairs (six functions total per non-DMA / DMA variant):
+
+```c
+/* Oneshot: full message + key + output in a single round trip */
+int wh_Client_CmacGenerateRequest (whClientContext*, Cmac*, CmacType,
+                                   const uint8_t* key, uint32_t keyLen,
+                                   const uint8_t* in, uint32_t inLen,
+                                   uint32_t outMacLen);
+int wh_Client_CmacGenerateResponse(whClientContext*, Cmac*,
+                                   uint8_t* outMac, uint32_t* outMacLen);
+
+/* Streaming: separate Update and Final phases, possibly multiple Updates */
+int wh_Client_CmacUpdateRequest   (whClientContext*, Cmac*, CmacType,
+                                   const uint8_t* key, uint32_t keyLen,
+                                   const uint8_t* in, uint32_t inLen,
+                                   bool* requestSent);
+int wh_Client_CmacUpdateResponse  (whClientContext*, Cmac*);
+
+int wh_Client_CmacFinalRequest    (whClientContext*, Cmac*);
+int wh_Client_CmacFinalResponse   (whClientContext*, Cmac*,
+                                   uint8_t* outMac, uint32_t* outMacLen);
+```
+
+Each function has a DMA counterpart that transfers the input via DMA. The
+output MAC is always returned inline (16 bytes max). Note the slight naming
+asymmetry: the oneshot is `wh_Client_CmacGenerateDmaRequest`/`Response`,
+while streaming uses `wh_Client_CmacDmaUpdate{Request,Response}` and
+`wh_Client_CmacDmaFinal{Request,Response}`.
+
+### Why three pairs
+
+- **Generate** is a true oneshot. The server dispatches to
+  `wc_AesCmacGenerate_ex`, which performs init / update / final in one
+  call. This preserves the 1-RTT performance of the existing blocking API
+  for callers that have the full message in hand at once.
+- **Update / Final** form the streaming pair. Each Update sends the
+  current input chunk plus the full CMAC state (`buffer`, `bufferSz`,
+  `digest`, `totalSz`); each Response carries back the updated state.
+  Final sends an empty input with `outSz = AES_BLOCK_SIZE`, telling the
+  server to finalize and return the MAC.
+
+### Why no client-side partial-block buffering
+
+SHA does client-side partial-block buffering because `wc_Sha256Update`
+processes complete blocks immediately and leaves bufferSz = 0 after
+absorbing whole blocks. CMAC is different: `wc_CmacUpdate` deliberately
+withholds the *last* whole block in its partial buffer until the next
+Update arrives (or Final is called), because the last block has special
+key-derived XOR handling. As a result, after a server-side Update the
+CMAC buffer can hold any value from 0..AES_BLOCK_SIZE bytes. Imposing a
+"bufferSz must be 0 on the wire" invariant (as SHA does) would break
+correctness, so CMAC instead round-trips the entire CMAC state on every
+Update Request/Response pair.
+
+### Blocking-wrapper dispatch
+
+`wh_Client_Cmac` and `wh_Client_CmacDma` retain their existing signatures
+and now auto-detect the oneshot case at the top of the function. The
+wrapper delegates to `CmacGenerate*` for a single round trip when all of
+the following hold:
+
+- a complete message is supplied (`in`/`inLen` non-NULL/non-zero);
+- an output buffer is supplied (`outMac`/`outMacLen` non-NULL, `*outMacLen > 0`);
+- for non-DMA, `inLen` fits the inline cap
+  `WH_MESSAGE_CRYPTO_CMAC_MAX_INLINE_GENERATE_SZ` (DMA has no per-call cap);
+- *and* either an explicit key is provided (`key`/`keyLen`), which matches
+  `wc_AesCmacGenerate_ex` semantics — prior cmac state is irrelevant and may
+  even be uninitialized — *or* the cmac struct is in fresh state
+  (`bufferSz == 0 && totalSz == 0`) so an HSM-cached keyId can be used
+  without losing in-progress data.
+
+Otherwise (incremental usage, mid-stream state, Final-only call, or
+oversize input on non-DMA), the wrapper falls back to the streaming
+`CmacUpdate*` + `CmacFinal*` pair.
+
+### Per-request key
+
+CMAC's server is stateless: every Request must carry the key. HSM-cached
+keys are referenced by `keyId` (set on the cmac via
+`wh_Client_CmacSetKeyId`). For inline keys, the bytes are stashed into
+`cmac->aes.devKey` on the first Request and replayed on subsequent
+Update/Final Requests automatically.
+
+### DMA wire format
+
+The CMAC DMA request struct (`whMessageCrypto_CmacAesDmaRequest`) gained
+a new `inlineInSz` field carrying inline trailing input (for an
+assembled first block, when client-side buffering is desired by some
+caller). The current async clients always pass `inlineInSz = 0` and
+route input either via DMA (for Update/Generate) or omit input
+(Final). The field is reserved for future client-side buffering use
+without another wire-format change.
+
 ## Roadmap: Remaining Algorithms
 
 The async split pattern will be applied algorithm by algorithm to all crypto
@@ -639,6 +738,7 @@ the full set of operations and their planned async status.
 | AES-CTR | `wh_Client_AesCtr{,Dma}{Request,Response}` | Non-DMA and DMA variants |
 | AES-ECB | `wh_Client_AesEcb{,Dma}{Request,Response}` | Non-DMA and DMA variants |
 | AES-GCM | `wh_Client_AesGcm{,Dma}{Request,Response}` | Non-DMA and DMA variants; AAD supports DMA |
+| CMAC | Generate / Update / Final Request/Response | Three async pairs: oneshot Generate (1-RTT) plus streaming Update/Final. Non-DMA and DMA variants. Blocking wrapper auto-dispatches to oneshot when conditions allow. |
 
 **Planned:**
 
@@ -652,7 +752,6 @@ the full set of operations and their planned async status.
 | Curve25519 | `wh_Client_Curve25519SharedSecret{Request,Response}` | Low | Single-shot |
 | Ed25519 Sign | `wh_Client_Ed25519Sign{Request,Response}` | Low | Single-shot |
 | Ed25519 Verify | `wh_Client_Ed25519Verify{Request,Response}`| Low | Single-shot |
-| CMAC | `wh_Client_Cmac{Request,Response}` | Medium | Streaming (Init/Update/Final), so follows SHA-style pattern rather than the one-shot AES pattern |
 | ML-DSA Sign | `wh_Client_MlDsaSign{Request,Response}` | Low | Post-quantum; single-shot |
 | ML-DSA Verify | `wh_Client_MlDsaVerify{Request,Response}` | Low | Post-quantum; single-shot |
 
diff --git a/src/wh_client_crypto.c b/src/wh_client_crypto.c
index d18198125..023edeb84 100644
--- a/src/wh_client_crypto.c
+++ b/src/wh_client_crypto.c
@@ -4710,122 +4710,445 @@ int wh_Client_CmacGetKeyId(Cmac* key, whNvmId* outId)
 
 #ifndef NO_AES
 
-int wh_Client_Cmac(whClientContext* ctx, Cmac* cmac, CmacType type,
-                   const uint8_t* key, uint32_t keyLen, const uint8_t* in,
-                   uint32_t inLen, uint8_t* outMac, uint32_t* outMacLen)
+/* Resolve the key source for a CMAC request: if the caller didn't provide
+ * inline bytes and the cmac struct has cached bytes, use those. HSM keys
+ * (non-erased keyId) are resolved server-side. */
+static void _CmacResolveClientKey(Cmac* cmac, const uint8_t** inout_key,
+                                  uint32_t* inout_keyLen, whKeyId* out_key_id)
 {
-    int ret = WH_ERROR_OK;
-    uint16_t group = WH_MESSAGE_GROUP_CRYPTO;
-    uint16_t action = WC_ALGO_TYPE_CMAC;
-    whMessageCrypto_CmacAesRequest* req = NULL;
-    whMessageCrypto_CmacAesResponse* res = NULL;
-    uint8_t* dataPtr = NULL;
+    whKeyId key_id = WH_DEVCTX_TO_KEYID(cmac->devCtx);
 
-    if (ctx == NULL || cmac == NULL) {
+    if (*inout_key == NULL && *inout_keyLen == 0 && WH_KEYID_ISERASED(key_id) &&
+        cmac->aes.keylen > 0) {
+        *inout_key = (const uint8_t*)cmac->aes.devKey;
+        *inout_keyLen = cmac->aes.keylen;
+    }
+    *out_key_id = key_id;
+}
+
+/* Reject keys that would overflow cmac->aes.devKey (32 bytes) when cached
+ * client-side, matching the server's own keySz > AES_256_KEY_SIZE check.
+ * Must be called after _CmacResolveClientKey and before any state mutation
+ * or transport send. */
+static int _CmacValidateInlineKeyLen(uint32_t keyLen)
+{
+    if (keyLen > AES_256_KEY_SIZE) {
         return WH_ERROR_BADARGS;
     }
+    return WH_ERROR_OK;
+}
+
+/* Enforce wolfCrypt's CMAC tag length contract locally to fail fast on a
+ * transaction that would return an error from server */
+static int _CmacValidateTagLen(uint32_t outMacLen)
+{
+    if (outMacLen > 0 &&
+        (outMacLen < WC_CMAC_TAG_MIN_SZ || outMacLen > WC_CMAC_TAG_MAX_SZ)) {
+        return WH_ERROR_BUFFER_SIZE;
+    }
+    return WH_ERROR_OK;
+}
 
-    whKeyId key_id = WH_DEVCTX_TO_KEYID(cmac->devCtx);
-    uint32_t mac_len =
-        ((outMac == NULL) || (outMacLen == NULL)) ? 0 : *outMacLen;
+int wh_Client_CmacGenerateRequest(whClientContext* ctx, Cmac* cmac,
+                                  CmacType type, const uint8_t* key,
+                                  uint32_t keyLen, const uint8_t* in,
+                                  uint32_t inLen, uint32_t outMacLen)
+{
+    whMessageCrypto_CmacAesRequest* req;
+    uint8_t* dataPtr;
+    uint8_t* req_in;
+    uint8_t* req_key;
+    uint32_t hdr_sz;
+    whKeyId key_id;
+    int ret;
 
-    /* For non-HSM keys on incremental calls (update/final with no key argument
-     * provided), send the stored key bytes so the server can reconstruct the
-     * CMAC context */
-    if (key == NULL && keyLen == 0 && WH_KEYID_ISERASED(key_id) &&
-        (inLen != 0 || mac_len != 0)) {
-        key = (const uint8_t*)cmac->aes.devKey;
-        keyLen = cmac->aes.keylen;
+    if (ctx == NULL || cmac == NULL || in == NULL || inLen == 0 ||
+        outMacLen == 0 || (key == NULL && keyLen != 0)) {
+        return WH_ERROR_BADARGS;
     }
 
-    /* Update type and return success for 0 length data, nothing else to do */
-    if ((inLen == 0) && (keyLen == 0) && (mac_len == 0)) {
-        /* Update the type */
-        cmac->type = type;
-        return WH_ERROR_OK;
+    ret = _CmacValidateTagLen(outMacLen);
+    if (ret != WH_ERROR_OK) {
+        return ret;
     }
 
-    WH_DEBUG_CLIENT_VERBOSE(
-        "cmac key:%p key_len:%d in:%p in_len:%d out:%p out_len:%d "
-        "keyId:%x\n",
-        key, (int)keyLen, in, (int)inLen, outMac, (int)mac_len, key_id);
+    _CmacResolveClientKey(cmac, &key, &keyLen, &key_id);
+
+    ret = _CmacValidateInlineKeyLen(keyLen);
+    if (ret != WH_ERROR_OK) {
+        return ret;
+    }
 
-    /* Get data pointer */
     dataPtr = (uint8_t*)wh_CommClient_GetDataPtr(ctx->comm);
     if (dataPtr == NULL) {
         return WH_ERROR_BADARGS;
     }
 
-    /* Setup generic header and get pointer to request data */
     req = (whMessageCrypto_CmacAesRequest*)_createCryptoRequest(
         dataPtr, WC_ALGO_TYPE_CMAC, ctx->cryptoAffinity);
-    uint8_t* req_in = (uint8_t*)(req + 1);
-    uint8_t* req_key = req_in + inLen;
-    uint32_t hdr_sz =
-        sizeof(whMessageCrypto_GenericRequestHeader) + sizeof(*req);
+    hdr_sz = sizeof(whMessageCrypto_GenericRequestHeader) + sizeof(*req);
+    req_in = (uint8_t*)(req + 1);
+    req_key = req_in + inLen;
 
-    if (inLen > WOLFHSM_CFG_COMM_DATA_LEN - hdr_sz ||
-        keyLen > WOLFHSM_CFG_COMM_DATA_LEN - hdr_sz - inLen) {
+    if (inLen > WH_MESSAGE_CRYPTO_CMAC_MAX_INLINE_GENERATE_SZ ||
+        keyLen > (uint32_t)WOLFHSM_CFG_COMM_DATA_LEN - hdr_sz - inLen) {
         return WH_ERROR_BADARGS;
     }
-    uint16_t req_len = hdr_sz + inLen + keyLen;
 
-    /* Setup request packet */
+    memset(&req->resumeState, 0, sizeof(req->resumeState));
     req->inSz = inLen;
+    req->outSz = outMacLen;
     req->keyId = key_id;
     req->keySz = keyLen;
-    req->outSz = mac_len;
 
-    /* Pack non-sensitive CMAC state into request */
+    memcpy(req_in, in, inLen);
+    if (key != NULL && keyLen > 0) {
+        memcpy(req_key, key, keyLen);
+    }
+
+    ret = wh_Client_SendRequest(ctx, WH_MESSAGE_GROUP_CRYPTO, WC_ALGO_TYPE_CMAC,
+                                (uint16_t)(hdr_sz + inLen + keyLen), dataPtr);
+    if (ret == WH_ERROR_OK) {
+        cmac->type = type;
+        if (key != NULL && keyLen > 0 && WH_KEYID_ISERASED(key_id) &&
+            key != (const uint8_t*)cmac->aes.devKey) {
+            memcpy((void*)cmac->aes.devKey, key, keyLen);
+            cmac->aes.keylen = keyLen;
+        }
+    }
+    return ret;
+}
+
+int wh_Client_CmacGenerateResponse(whClientContext* ctx, Cmac* cmac,
+                                   uint8_t* outMac, uint32_t* outMacLen)
+{
+    whMessageCrypto_CmacAesResponse* res = NULL;
+    uint8_t* dataPtr;
+    uint16_t group;
+    uint16_t action;
+    uint16_t res_len = 0;
+    int ret;
+
+    if (ctx == NULL || cmac == NULL || outMac == NULL || outMacLen == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    dataPtr = (uint8_t*)wh_CommClient_GetDataPtr(ctx->comm);
+    if (dataPtr == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    ret = wh_Client_RecvResponse(ctx, &group, &action, &res_len, dataPtr);
+    if (ret != WH_ERROR_OK) {
+        return ret;
+    }
+
+    ret = _getCryptoResponse(dataPtr, WC_ALGO_TYPE_CMAC, (uint8_t**)&res);
+    /* wolfCrypt allows positive error codes on success */
+    if (ret >= 0) {
+        /* Restore state from response (server has finalized; buffer/digest
+         * carry the post-finalization state). */
+        ret = wh_Crypto_CmacAesRestoreStateFromMsg(cmac, &res->resumeState);
+        if (ret >= 0) {
+            ret = _CmacValidateTagLen(*outMacLen);
+        }
+        if (ret >= 0) {
+            if (res->outSz < *outMacLen) {
+                *outMacLen = res->outSz;
+            }
+            memcpy(outMac, (uint8_t*)(res + 1), *outMacLen);
+        }
+    }
+    return ret;
+}
+
+int wh_Client_CmacUpdateRequest(whClientContext* ctx, Cmac* cmac, CmacType type,
+                                const uint8_t* key, uint32_t keyLen,
+                                const uint8_t* in, uint32_t inLen,
+                                bool* requestSent)
+{
+    whMessageCrypto_CmacAesRequest* req;
+    uint8_t* dataPtr;
+    uint8_t* req_in;
+    uint8_t* req_key;
+    uint32_t hdr_sz;
+    whKeyId key_id;
+    int ret;
+
+    if (ctx == NULL || cmac == NULL || requestSent == NULL ||
+        (in == NULL && inLen != 0) || (key == NULL && keyLen != 0)) {
+        return WH_ERROR_BADARGS;
+    }
+    *requestSent = false;
+
+    _CmacResolveClientKey(cmac, &key, &keyLen, &key_id);
+
+    ret = _CmacValidateInlineKeyLen(keyLen);
+    if (ret != WH_ERROR_OK) {
+        return ret;
+    }
+
+    /* Empty update with no key: nothing to send, just record type. */
+    if (inLen == 0 && keyLen == 0) {
+        cmac->type = type;
+        return WH_ERROR_OK;
+    }
+
+    dataPtr = (uint8_t*)wh_CommClient_GetDataPtr(ctx->comm);
+    if (dataPtr == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    req = (whMessageCrypto_CmacAesRequest*)_createCryptoRequest(
+        dataPtr, WC_ALGO_TYPE_CMAC, ctx->cryptoAffinity);
+    hdr_sz = sizeof(whMessageCrypto_GenericRequestHeader) + sizeof(*req);
+    req_in = (uint8_t*)(req + 1);
+    req_key = req_in + inLen;
+
+    if (inLen > (uint32_t)WOLFHSM_CFG_COMM_DATA_LEN - hdr_sz ||
+        keyLen > (uint32_t)WOLFHSM_CFG_COMM_DATA_LEN - hdr_sz - inLen) {
+        return WH_ERROR_BADARGS;
+    }
+
+    /* Wire request: input + (optional) key + full state round-trip. The
+     * server may leave a partial (or whole) block in cmac->buffer after
+     * wc_CmacUpdate, so we faithfully round-trip the entire state on every
+     * Request/Response pair. */
+    req->inSz = inLen;
+    req->outSz = 0;
+    req->keyId = key_id;
+    req->keySz = keyLen;
     wh_Crypto_CmacAesSaveStateToMsg(&req->resumeState, cmac);
 
-    /* copy input data to request, if relevant */
-    if ((in != NULL) && (inLen > 0)) {
+    if (in != NULL && inLen > 0) {
         memcpy(req_in, in, inLen);
     }
-    if ((key != NULL) && (keyLen > 0)) {
+    if (key != NULL && keyLen > 0) {
         memcpy(req_key, key, keyLen);
     }
 
-    /* Send request */
-    ret = wh_Client_SendRequest(ctx, group, action, req_len, (uint8_t*)dataPtr);
+    ret = wh_Client_SendRequest(ctx, WH_MESSAGE_GROUP_CRYPTO, WC_ALGO_TYPE_CMAC,
+                                (uint16_t)(hdr_sz + inLen + keyLen), dataPtr);
     if (ret == WH_ERROR_OK) {
-        /* Update the local type since call succeeded */
-        cmac->type = type;
-
-        /* If using non-HSM keys, store key bytes locally for future calls */
-        if (key != NULL && keyLen > 0 && WH_KEYID_ISERASED(key_id)) {
+        *requestSent = true;
+        cmac->type = type;
+        if (key != NULL && keyLen > 0 && WH_KEYID_ISERASED(key_id) &&
+            key != (const uint8_t*)cmac->aes.devKey) {
             memcpy((void*)cmac->aes.devKey, key, keyLen);
             cmac->aes.keylen = keyLen;
         }
+    }
+    return ret;
+}
 
+int wh_Client_CmacUpdateResponse(whClientContext* ctx, Cmac* cmac)
+{
+    whMessageCrypto_CmacAesResponse* res = NULL;
+    uint8_t* dataPtr;
+    uint16_t group;
+    uint16_t action;
+    uint16_t res_len = 0;
+    int ret;
 
-        uint16_t res_len = 0;
-        do {
-            ret = wh_Client_RecvResponse(ctx, &group, &action, &res_len,
-                                         (uint8_t*)dataPtr);
-        } while (ret == WH_ERROR_NOTREADY);
-        if (ret == WH_ERROR_OK) {
-            /* Get response */
-            ret =
-                _getCryptoResponse(dataPtr, WC_ALGO_TYPE_CMAC, (uint8_t**)&res);
-            /* wolfCrypt allows positive error codes on success in some
-             * scenarios */
-            if (ret >= 0) {
-                /* Restore non-sensitive state from server response */
-                ret = wh_Crypto_CmacAesRestoreStateFromMsg(cmac,
-                                                           &res->resumeState);
-
-                /* Copy out finalized CMAC if present */
-                if (ret == 0 && outMac != NULL && outMacLen != NULL) {
-                    if (res->outSz < *outMacLen) {
-                        *outMacLen = res->outSz;
-                    }
-                    uint8_t* res_mac = (uint8_t*)(res + 1);
-                    memcpy(outMac, res_mac, *outMacLen);
-                }
+    if (ctx == NULL || cmac == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    dataPtr = (uint8_t*)wh_CommClient_GetDataPtr(ctx->comm);
+    if (dataPtr == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    ret = wh_Client_RecvResponse(ctx, &group, &action, &res_len, dataPtr);
+    if (ret != WH_ERROR_OK) {
+        return ret;
+    }
+
+    ret = _getCryptoResponse(dataPtr, WC_ALGO_TYPE_CMAC, (uint8_t**)&res);
+    if (ret >= 0) {
+        /* Restore full state from server. The server may leave a partial
+         * (or whole) block in its buffer after wc_CmacUpdate (CMAC's last
+         * block has special handling), so we round-trip the whole state. */
+        ret = wh_Crypto_CmacAesRestoreStateFromMsg(cmac, &res->resumeState);
+    }
+    return ret;
+}
+
+int wh_Client_CmacFinalRequest(whClientContext* ctx, Cmac* cmac)
+{
+    whMessageCrypto_CmacAesRequest* req;
+    uint8_t* dataPtr;
+    uint8_t* req_in;
+    uint8_t* req_key;
+    const uint8_t* key = NULL;
+    uint32_t keyLen = 0;
+    whKeyId key_id;
+    uint32_t hdr_sz;
+    int ret;
+
+    if (ctx == NULL || cmac == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    _CmacResolveClientKey(cmac, &key, &keyLen, &key_id);
+
+    ret = _CmacValidateInlineKeyLen(keyLen);
+    if (ret != WH_ERROR_OK) {
+        return ret;
+    }
+
+    dataPtr = (uint8_t*)wh_CommClient_GetDataPtr(ctx->comm);
+    if (dataPtr == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    req = (whMessageCrypto_CmacAesRequest*)_createCryptoRequest(
+        dataPtr, WC_ALGO_TYPE_CMAC, ctx->cryptoAffinity);
+    hdr_sz = sizeof(whMessageCrypto_GenericRequestHeader) + sizeof(*req);
+    req_in = (uint8_t*)(req + 1);
+    req_key = req_in;
+
+    if (keyLen > (uint32_t)WOLFHSM_CFG_COMM_DATA_LEN - hdr_sz) {
+        return WH_ERROR_BADARGS;
+    }
+
+    /* Final: no new input — server uses the round-tripped state (which
+     * already includes any partial/whole block left in cmac->buffer from
+     * the previous Update) and finalizes. */
+    req->inSz = 0;
+    req->outSz = AES_BLOCK_SIZE;
+    req->keyId = key_id;
+    req->keySz = keyLen;
+    wh_Crypto_CmacAesSaveStateToMsg(&req->resumeState, cmac);
+
+    if (key != NULL && keyLen > 0) {
+        memcpy(req_key, key, keyLen);
+    }
+
+    return wh_Client_SendRequest(ctx, WH_MESSAGE_GROUP_CRYPTO,
+                                 WC_ALGO_TYPE_CMAC, (uint16_t)(hdr_sz + keyLen),
+                                 dataPtr);
+}
+
+int wh_Client_CmacFinalResponse(whClientContext* ctx, Cmac* cmac,
+                                uint8_t* outMac, uint32_t* outMacLen)
+{
+    whMessageCrypto_CmacAesResponse* res = NULL;
+    uint8_t* dataPtr;
+    uint16_t group;
+    uint16_t action;
+    uint16_t res_len = 0;
+    int ret;
+
+    if (ctx == NULL || cmac == NULL || outMac == NULL || outMacLen == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    dataPtr = (uint8_t*)wh_CommClient_GetDataPtr(ctx->comm);
+    if (dataPtr == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    ret = wh_Client_RecvResponse(ctx, &group, &action, &res_len, dataPtr);
+    if (ret != WH_ERROR_OK) {
+        return ret;
+    }
+
+    ret = _getCryptoResponse(dataPtr, WC_ALGO_TYPE_CMAC, (uint8_t**)&res);
+    if (ret >= 0) {
+        /* Restore final state from response (server's bufferSz is 0 after
+         * Final). */
+        ret = wh_Crypto_CmacAesRestoreStateFromMsg(cmac, &res->resumeState);
+        if (ret >= 0) {
+            ret = _CmacValidateTagLen(*outMacLen);
+        }
+        if (ret >= 0) {
+            if (res->outSz < *outMacLen) {
+                *outMacLen = res->outSz;
             }
+            memcpy(outMac, (uint8_t*)(res + 1), *outMacLen);
+        }
+    }
+    return ret;
+}
+
+int wh_Client_Cmac(whClientContext* ctx, Cmac* cmac, CmacType type,
+                   const uint8_t* key, uint32_t keyLen, const uint8_t* in,
+                   uint32_t inLen, uint8_t* outMac, uint32_t* outMacLen)
+{
+    int ret = WH_ERROR_OK;
+
+    if (ctx == NULL || cmac == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    /* No-op init: record type only. */
+    if (inLen == 0 && keyLen == 0 && (outMac == NULL || outMacLen == NULL)) {
+        cmac->type = type;
+        return WH_ERROR_OK;
+    }
+
+    if (outMac != NULL && outMacLen != NULL) {
+        ret = _CmacValidateTagLen(*outMacLen);
+        if (ret != WH_ERROR_OK) {
+            return ret;
+        }
+    }
+
+    /* Oneshot fast path: input + output present and the input fits inline,
+     * plus either an explicit key (matches wc_AesCmacGenerate_ex semantics —
+     * fresh oneshot, prior cmac state is irrelevant and may be uninitialized)
+     * or fresh cmac state with a cached HSM keyId. The Generate request
+     * resets state server-side. */
+    if (in != NULL && inLen > 0 && outMac != NULL && outMacLen != NULL &&
+        *outMacLen > 0 &&
+        inLen <= WH_MESSAGE_CRYPTO_CMAC_MAX_INLINE_GENERATE_SZ &&
+        ((key != NULL && keyLen > 0) ||
+         (cmac->bufferSz == 0 && cmac->totalSz == 0))) {
+        ret = wh_Client_CmacGenerateRequest(ctx, cmac, type, key, keyLen, in,
+                                            inLen, *outMacLen);
+        if (ret == WH_ERROR_OK) {
+            do {
+                ret = wh_Client_CmacGenerateResponse(ctx, cmac, outMac,
+                                                     outMacLen);
+            } while (ret == WH_ERROR_NOTREADY);
+        }
+        return ret;
+    }
+
+    /* Streaming path: Update + Final. The existing blocking semantic is
+     * a single-shot Update (no chunking), so just one Update call. */
+    if (in != NULL && inLen > 0) {
+        bool sent = false;
+        ret = wh_Client_CmacUpdateRequest(ctx, cmac, type, key, keyLen, in,
+                                          inLen, &sent);
+        if (ret == WH_ERROR_OK && sent) {
+            do {
+                ret = wh_Client_CmacUpdateResponse(ctx, cmac);
+            } while (ret == WH_ERROR_NOTREADY);
+        }
+    }
+    else if (key != NULL && keyLen > 0) {
+        /* Key-provision only (no input, no output): cache key client-side
+         * via Update with no input. Server returns updated state (which
+         * is effectively unchanged since no data was processed). */
+        bool sent = false;
+        ret = wh_Client_CmacUpdateRequest(ctx, cmac, type, key, keyLen, NULL, 0,
+                                          &sent);
+        if (ret == WH_ERROR_OK && sent) {
+            do {
+                ret = wh_Client_CmacUpdateResponse(ctx, cmac);
+            } while (ret == WH_ERROR_NOTREADY);
+        }
+    }
+
+    if (ret == WH_ERROR_OK && outMac != NULL && outMacLen != NULL) {
+        ret = wh_Client_CmacFinalRequest(ctx, cmac);
+        if (ret == WH_ERROR_OK) {
+            do {
+                ret = wh_Client_CmacFinalResponse(ctx, cmac, outMac, outMacLen);
+            } while (ret == WH_ERROR_NOTREADY);
         }
     }
     return ret;
@@ -4835,139 +5158,468 @@ int wh_Client_Cmac(whClientContext* ctx, Cmac* cmac, CmacType type,
 
 #ifdef WOLFHSM_CFG_DMA
 
-int wh_Client_CmacDma(whClientContext* ctx, Cmac* cmac, CmacType type,
-                      const uint8_t* key, uint32_t keyLen, const uint8_t* in,
-                      uint32_t inLen, uint8_t* outMac, uint32_t* outMacLen)
+
+/* Stash the DMA input mapping for POST cleanup on the matching Response. */
+static void _CmacDmaStashInput(whClientContext* ctx, uintptr_t inAddr,
+                               uintptr_t clientAddr, uint64_t inSz)
 {
-    int ret = WH_ERROR_OK;
-    whMessageCrypto_CmacAesDmaRequest* req = NULL;
-    whMessageCrypto_CmacAesDmaResponse* res = NULL;
-    uint8_t* dataPtr = NULL;
-    uintptr_t inAddr = 0;
+    ctx->dma.asyncCtx.cmac.inAddr = inAddr;
+    ctx->dma.asyncCtx.cmac.clientAddr = clientAddr;
+    ctx->dma.asyncCtx.cmac.inSz = inSz;
+}
 
-    if (ctx == NULL || cmac == NULL) {
+/* Run POST DMA cleanup if a mapping was stashed, then clear the stash. */
+static void _CmacDmaPostCleanup(whClientContext* ctx)
+{
+    if (ctx->dma.asyncCtx.cmac.inSz > 0) {
+        uintptr_t inAddr = ctx->dma.asyncCtx.cmac.inAddr;
+        (void)wh_Client_DmaProcessClientAddress(
+            ctx, ctx->dma.asyncCtx.cmac.clientAddr, (void**)&inAddr,
+            ctx->dma.asyncCtx.cmac.inSz, WH_DMA_OPER_CLIENT_READ_POST,
+            (whDmaFlags){0});
+        ctx->dma.asyncCtx.cmac.inSz = 0;
+    }
+}
+
+int wh_Client_CmacGenerateDmaRequest(whClientContext* ctx, Cmac* cmac,
+                                     CmacType type, const uint8_t* key,
+                                     uint32_t keyLen, const uint8_t* in,
+                                     uint32_t inLen, uint32_t outMacLen)
+{
+    whMessageCrypto_CmacAesDmaRequest* req;
+    uint8_t* dataPtr;
+    uint8_t* req_key;
+    uint32_t hdr_sz;
+    uintptr_t inAddr = 0;
+    bool inAddrAcquired = false;
+    whKeyId key_id;
+    int ret;
+
+    if (ctx == NULL || cmac == NULL || in == NULL || inLen == 0 ||
+        outMacLen == 0 || (key == NULL && keyLen != 0)) {
+        return WH_ERROR_BADARGS;
+    }
+
+    ret = _CmacValidateTagLen(outMacLen);
+    if (ret != WH_ERROR_OK) {
+        return ret;
+    }
+
+    /* Fail-fast on occupied transport to avoid leaking the DMA mapping if
+     * SendRequest rejects the request. */
+    if (wh_CommClient_IsRequestPending(ctx->comm) == 1) {
+        return WH_ERROR_REQUEST_PENDING;
+    }
+
+    _CmacResolveClientKey(cmac, &key, &keyLen, &key_id);
+
+    ret = _CmacValidateInlineKeyLen(keyLen);
+    if (ret != WH_ERROR_OK) {
+        return ret;
+    }
+
+    dataPtr = (uint8_t*)wh_CommClient_GetDataPtr(ctx->comm);
+    if (dataPtr == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    req = (whMessageCrypto_CmacAesDmaRequest*)_createCryptoRequest(
+        dataPtr, WC_ALGO_TYPE_CMAC, ctx->cryptoAffinity);
+    memset(req, 0, sizeof(*req));
+
+    hdr_sz = sizeof(whMessageCrypto_GenericRequestHeader) + sizeof(*req);
+    req_key = (uint8_t*)(req + 1);
+
+    if (keyLen > (uint32_t)WOLFHSM_CFG_COMM_DATA_LEN - hdr_sz) {
+        return WH_ERROR_BADARGS;
+    }
+
+    req->outSz = outMacLen;
+    req->keyId = key_id;
+    req->keySz = keyLen;
+    req->inlineInSz = 0;
+    req->input.sz = inLen;
+
+    if (key != NULL && keyLen > 0) {
+        memcpy(req_key, key, keyLen);
+    }
+
+    ret = wh_Client_DmaProcessClientAddress(ctx, (uintptr_t)in, (void**)&inAddr,
+                                            inLen, WH_DMA_OPER_CLIENT_READ_PRE,
+                                            (whDmaFlags){0});
+    if (ret == WH_ERROR_OK) {
+        inAddrAcquired = true;
+        req->input.addr = inAddr;
+        _CmacDmaStashInput(ctx, inAddr, (uintptr_t)in, inLen);
+    }
+
+    if (ret == WH_ERROR_OK) {
+        ret = wh_Client_SendRequest(ctx, WH_MESSAGE_GROUP_CRYPTO_DMA,
+                                    WC_ALGO_TYPE_CMAC,
+                                    (uint16_t)(hdr_sz + keyLen), dataPtr);
+    }
+
+    if (ret == WH_ERROR_OK) {
+        cmac->type = type;
+        if (key != NULL && keyLen > 0 && WH_KEYID_ISERASED(key_id) &&
+            key != (const uint8_t*)cmac->aes.devKey) {
+            memcpy((void*)cmac->aes.devKey, key, keyLen);
+            cmac->aes.keylen = keyLen;
+        }
+    }
+    else if (inAddrAcquired) {
+        _CmacDmaPostCleanup(ctx);
+    }
+    return ret;
+}
+
+int wh_Client_CmacGenerateDmaResponse(whClientContext* ctx, Cmac* cmac,
+                                      uint8_t* outMac, uint32_t* outMacLen)
+{
+    whMessageCrypto_CmacAesDmaResponse* res = NULL;
+    uint8_t* dataPtr;
+    uint16_t respSz = 0;
+    int ret;
+
+    if (ctx == NULL || cmac == NULL || outMac == NULL || outMacLen == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    dataPtr = (uint8_t*)wh_CommClient_GetDataPtr(ctx->comm);
+    if (dataPtr == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    ret = wh_Client_RecvResponse(ctx, NULL, NULL, &respSz, dataPtr);
+    if (ret == WH_ERROR_NOTREADY) {
+        return ret;
+    }
+
+    if (ret == WH_ERROR_OK) {
+        ret = _getCryptoResponse(dataPtr, WC_ALGO_TYPE_CMAC, (uint8_t**)&res);
+        if (ret >= 0) {
+            ret = wh_Crypto_CmacAesRestoreStateFromMsg(cmac, &res->resumeState);
+            if (ret >= 0) {
+                ret = _CmacValidateTagLen(*outMacLen);
+            }
+            if (ret >= 0) {
+                if (res->outSz < *outMacLen) {
+                    *outMacLen = res->outSz;
+                }
+                memcpy(outMac, (uint8_t*)(res + 1), *outMacLen);
+            }
+        }
+    }
+
+    _CmacDmaPostCleanup(ctx);
+    return ret;
+}
+
+int wh_Client_CmacDmaUpdateRequest(whClientContext* ctx, Cmac* cmac,
+                                   CmacType type, const uint8_t* key,
+                                   uint32_t keyLen, const uint8_t* in,
+                                   uint32_t inLen, bool* requestSent)
+{
+    whMessageCrypto_CmacAesDmaRequest* req;
+    uint8_t* dataPtr;
+    uint8_t* req_key;
+    uintptr_t inAddr = 0;
+    bool inAddrAcquired = false;
+    whKeyId key_id;
+    uint32_t hdr_sz;
+    int ret;
+
+    if (ctx == NULL || cmac == NULL || requestSent == NULL ||
+        (in == NULL && inLen != 0) || (key == NULL && keyLen != 0)) {
         return WH_ERROR_BADARGS;
     }
+    *requestSent = false;
 
-    whKeyId key_id = WH_DEVCTX_TO_KEYID(cmac->devCtx);
-    uint32_t mac_len =
-        ((outMac == NULL) || (outMacLen == NULL)) ? 0 : *outMacLen;
+    _CmacResolveClientKey(cmac, &key, &keyLen, &key_id);
 
-    /* For non-HSM keys on subsequent calls (no key provided), send the
-     * stored key bytes so the server can reconstruct the CMAC context */
-    if (key == NULL && keyLen == 0 && WH_KEYID_ISERASED(key_id) &&
-        (inLen != 0 || mac_len != 0)) {
-        key = (const uint8_t*)cmac->aes.devKey;
-        keyLen = cmac->aes.keylen;
+    ret = _CmacValidateInlineKeyLen(keyLen);
+    if (ret != WH_ERROR_OK) {
+        return ret;
     }
 
-    /* Return success for a call with NULL params, or 0 len's */
-    if ((inLen == 0) && (keyLen == 0) && (mac_len == 0)) {
-        /* Update the type */
+    /* Empty update with no key: nothing to send, just record type. */
+    if (inLen == 0 && keyLen == 0) {
         cmac->type = type;
         return WH_ERROR_OK;
     }
 
-    WH_DEBUG_CLIENT_VERBOSE(
-        "cmac dma key:%p key_len:%d in:%p in_len:%d out:%p out_len:%d "
-        "keyId:%x\n",
-        key, (int)keyLen, in, (int)inLen, outMac, (int)mac_len, key_id);
+    /* Fail-fast on occupied transport before acquiring any DMA mapping. */
+    if (wh_CommClient_IsRequestPending(ctx->comm) == 1) {
+        return WH_ERROR_REQUEST_PENDING;
+    }
 
-    /* Get data pointer from the context to use as request/response storage */
     dataPtr = (uint8_t*)wh_CommClient_GetDataPtr(ctx->comm);
     if (dataPtr == NULL) {
         return WH_ERROR_BADARGS;
     }
 
-    /* Setup generic header and get pointer to request data */
     req = (whMessageCrypto_CmacAesDmaRequest*)_createCryptoRequest(
         dataPtr, WC_ALGO_TYPE_CMAC, ctx->cryptoAffinity);
     memset(req, 0, sizeof(*req));
 
-    uint8_t* req_key = (uint8_t*)(req + 1);
-    uint32_t hdr_sz =
-        sizeof(whMessageCrypto_GenericRequestHeader) + sizeof(*req);
+    hdr_sz = sizeof(whMessageCrypto_GenericRequestHeader) + sizeof(*req);
+    req_key = (uint8_t*)(req + 1);
 
-    if (keyLen > WOLFHSM_CFG_COMM_DATA_LEN - hdr_sz) {
+    if (keyLen > (uint32_t)WOLFHSM_CFG_COMM_DATA_LEN - hdr_sz) {
         return WH_ERROR_BADARGS;
     }
-    uint16_t req_len = hdr_sz + keyLen;
-
-    /* Setup request fields */
-    req->outSz = mac_len;
-    req->keyId = key_id;
-    req->keySz = keyLen;
 
-    /* Pack non-sensitive CMAC state into request */
+    /* Wire request: full state round-trip + (optional) inline key + DMA
+     * input. Server may leave a partial/whole block in cmac->buffer after
+     * wc_CmacUpdate, so resumeState carries the entire CMAC state. */
     wh_Crypto_CmacAesSaveStateToMsg(&req->resumeState, cmac);
+    req->outSz = 0;
+    req->keyId = key_id;
+    req->keySz = keyLen;
+    req->inlineInSz = 0;
 
-    /* Copy key bytes into trailing data */
-    if ((key != NULL) && (keyLen > 0)) {
+    if (key != NULL && keyLen > 0) {
         memcpy(req_key, key, keyLen);
     }
 
-    /* DMA for input data only */
-    if (ret == WH_ERROR_OK && in != NULL && inLen != 0) {
-        req->input.sz = inLen;
-        ret = wh_Client_DmaProcessClientAddress(
-            ctx, (uintptr_t)in, (void**)&inAddr, req->input.sz,
+    /* PRE DMA translate for the input (if any). */
+    if (inLen > 0) {
+        ret = wh_Client_DmaProcessClientAddress(
+            ctx, (uintptr_t)in, (void**)&inAddr, inLen,
             WH_DMA_OPER_CLIENT_READ_PRE, (whDmaFlags){0});
         if (ret == WH_ERROR_OK) {
+            inAddrAcquired = true;
+            req->input.sz = inLen;
             req->input.addr = inAddr;
+            _CmacDmaStashInput(ctx, inAddr, (uintptr_t)in, inLen);
         }
     }
+    else {
+        ret = WH_ERROR_OK;
+    }
 
     if (ret == WH_ERROR_OK) {
-        /* Send the request */
         ret = wh_Client_SendRequest(ctx, WH_MESSAGE_GROUP_CRYPTO_DMA,
-                                    WC_ALGO_TYPE_CMAC, req_len,
-                                    (uint8_t*)dataPtr);
+                                    WC_ALGO_TYPE_CMAC,
+                                    (uint16_t)(hdr_sz + keyLen), dataPtr);
     }
 
     if (ret == WH_ERROR_OK) {
-        /* Update the local type since call succeeded */
-        cmac->type = type;
-
-        /* Store key bytes locally for future calls (non-HSM keys) */
-        if (key != NULL && keyLen > 0 && WH_KEYID_ISERASED(key_id)) {
+        *requestSent = true;
+        cmac->type = type;
+        if (key != NULL && keyLen > 0 && WH_KEYID_ISERASED(key_id) &&
+            key != (const uint8_t*)cmac->aes.devKey) {
             memcpy((void*)cmac->aes.devKey, key, keyLen);
             cmac->aes.keylen = keyLen;
         }
+    }
+    else if (inAddrAcquired) {
+        _CmacDmaPostCleanup(ctx);
+    }
+    return ret;
+}
 
-        uint16_t respSz = 0;
-        do {
-            ret = wh_Client_RecvResponse(ctx, NULL, NULL, &respSz,
-                                         (uint8_t*)dataPtr);
-        } while (ret == WH_ERROR_NOTREADY);
+int wh_Client_CmacDmaUpdateResponse(whClientContext* ctx, Cmac* cmac)
+{
+    whMessageCrypto_CmacAesDmaResponse* res = NULL;
+    uint8_t* dataPtr;
+    uint16_t respSz = 0;
+    int ret;
 
-        if (ret == WH_ERROR_OK) {
-            ret =
-                _getCryptoResponse(dataPtr, WC_ALGO_TYPE_CMAC, (uint8_t**)&res);
-            /* wolfCrypt allows positive error codes on success */
-            if (ret >= 0) {
-                /* Restore non-sensitive state from server response */
-                ret = wh_Crypto_CmacAesRestoreStateFromMsg(cmac,
-                                                           &res->resumeState);
-
-                /* Copy out finalized CMAC if present */
-                if (ret == 0 && outMac != NULL && outMacLen != NULL) {
-                    if (res->outSz < *outMacLen) {
-                        *outMacLen = res->outSz;
-                    }
-                    uint8_t* res_mac = (uint8_t*)(res + 1);
-                    memcpy(outMac, res_mac, *outMacLen);
-                }
+    if (ctx == NULL || cmac == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    dataPtr = (uint8_t*)wh_CommClient_GetDataPtr(ctx->comm);
+    if (dataPtr == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    ret = wh_Client_RecvResponse(ctx, NULL, NULL, &respSz, dataPtr);
+    if (ret == WH_ERROR_NOTREADY) {
+        return ret;
+    }
+
+    if (ret == WH_ERROR_OK) {
+        ret = _getCryptoResponse(dataPtr, WC_ALGO_TYPE_CMAC, (uint8_t**)&res);
+        if (ret >= 0) {
+            /* Restore full state from server (includes any partial/whole
+             * block left in the server's wc_CmacUpdate buffer). */
+            ret = wh_Crypto_CmacAesRestoreStateFromMsg(cmac, &res->resumeState);
+        }
+    }
+
+    _CmacDmaPostCleanup(ctx);
+    return ret;
+}
+
+int wh_Client_CmacDmaFinalRequest(whClientContext* ctx, Cmac* cmac)
+{
+    whMessageCrypto_CmacAesDmaRequest* req;
+    uint8_t* dataPtr;
+    uint8_t* req_key;
+    const uint8_t* key = NULL;
+    uint32_t keyLen = 0;
+    whKeyId key_id;
+    uint32_t hdr_sz;
+    int ret;
+
+    if (ctx == NULL || cmac == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    _CmacResolveClientKey(cmac, &key, &keyLen, &key_id);
+
+    ret = _CmacValidateInlineKeyLen(keyLen);
+    if (ret != WH_ERROR_OK) {
+        return ret;
+    }
+
+    dataPtr = (uint8_t*)wh_CommClient_GetDataPtr(ctx->comm);
+    if (dataPtr == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    req = (whMessageCrypto_CmacAesDmaRequest*)_createCryptoRequest(
+        dataPtr, WC_ALGO_TYPE_CMAC, ctx->cryptoAffinity);
+    memset(req, 0, sizeof(*req));
+
+    hdr_sz = sizeof(whMessageCrypto_GenericRequestHeader) + sizeof(*req);
+    req_key = (uint8_t*)(req + 1);
+
+    if (keyLen > (uint32_t)WOLFHSM_CFG_COMM_DATA_LEN - hdr_sz) {
+        return WH_ERROR_BADARGS;
+    }
+
+    /* Final: no new input — server uses the round-tripped state and
+     * finalizes. No DMA addresses are used. */
+    wh_Crypto_CmacAesSaveStateToMsg(&req->resumeState, cmac);
+    req->outSz = AES_BLOCK_SIZE;
+    req->keyId = key_id;
+    req->keySz = keyLen;
+    req->inlineInSz = 0;
+
+    if (key != NULL && keyLen > 0) {
+        memcpy(req_key, key, keyLen);
+    }
+
+    return wh_Client_SendRequest(ctx, WH_MESSAGE_GROUP_CRYPTO_DMA,
+                                 WC_ALGO_TYPE_CMAC, (uint16_t)(hdr_sz + keyLen),
+                                 dataPtr);
+}
+
+int wh_Client_CmacDmaFinalResponse(whClientContext* ctx, Cmac* cmac,
+                                   uint8_t* outMac, uint32_t* outMacLen)
+{
+    whMessageCrypto_CmacAesDmaResponse* res = NULL;
+    uint8_t* dataPtr;
+    uint16_t respSz = 0;
+    int ret;
+
+    if (ctx == NULL || cmac == NULL || outMac == NULL || outMacLen == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    dataPtr = (uint8_t*)wh_CommClient_GetDataPtr(ctx->comm);
+    if (dataPtr == NULL) {
+        return WH_ERROR_BADARGS;
+    }
+
+    ret = wh_Client_RecvResponse(ctx, NULL, NULL, &respSz, dataPtr);
+    if (ret != WH_ERROR_OK) {
+        return ret;
+    }
+
+    ret = _getCryptoResponse(dataPtr, WC_ALGO_TYPE_CMAC, (uint8_t**)&res);
+    if (ret >= 0) {
+        ret = wh_Crypto_CmacAesRestoreStateFromMsg(cmac, &res->resumeState);
+        if (ret >= 0) {
+            ret = _CmacValidateTagLen(*outMacLen);
+        }
+        if (ret >= 0) {
+            if (res->outSz < *outMacLen) {
+                *outMacLen = res->outSz;
             }
+            memcpy(outMac, (uint8_t*)(res + 1), *outMacLen);
         }
     }
+    return ret;
+}
 
-    /* Post DMA cleanup for input address */
-    if (in != NULL && inAddr != 0) {
-        (void)wh_Client_DmaProcessClientAddress(
-            ctx, (uintptr_t)in, (void**)&inAddr, inLen,
-            WH_DMA_OPER_CLIENT_READ_POST, (whDmaFlags){0});
+int wh_Client_CmacDma(whClientContext* ctx, Cmac* cmac, CmacType type,
+                      const uint8_t* key, uint32_t keyLen, const uint8_t* in,
+                      uint32_t inLen, uint8_t* outMac, uint32_t* outMacLen)
+{
+    int ret = WH_ERROR_OK;
+
+    if (ctx == NULL || cmac == NULL) {
+        return WH_ERROR_BADARGS;
     }
 
+    /* No-op init.
*/ + if (inLen == 0 && keyLen == 0 && (outMac == NULL || outMacLen == NULL)) { + cmac->type = type; + return WH_ERROR_OK; + } + + if (outMac != NULL && outMacLen != NULL) { + ret = _CmacValidateTagLen(*outMacLen); + if (ret != WH_ERROR_OK) { + return ret; + } + } + + /* Oneshot fast path: input + output present, plus either an explicit key + * (matches wc_AesCmacGenerate_ex semantics: a fresh oneshot, so prior cmac + * state is irrelevant and may be uninitialized) or fresh cmac state with + * a cached HSM keyId. The Generate request resets state server-side. */ + if (in != NULL && inLen > 0 && outMac != NULL && outMacLen != NULL && + *outMacLen > 0 && + ((key != NULL && keyLen > 0) || + (cmac->bufferSz == 0 && cmac->totalSz == 0))) { + ret = wh_Client_CmacGenerateDmaRequest(ctx, cmac, type, key, keyLen, in, + inLen, *outMacLen); + if (ret == WH_ERROR_OK) { + do { + ret = wh_Client_CmacGenerateDmaResponse(ctx, cmac, outMac, + outMacLen); + } while (ret == WH_ERROR_NOTREADY); + } + return ret; + } + + /* Streaming path: DMA Update + DMA Final. The existing blocking wrapper + * issues a single-shot Update (no chunking), so just one Update call.
*/ + if (in != NULL && inLen > 0) { + bool sent = false; + ret = wh_Client_CmacDmaUpdateRequest(ctx, cmac, type, key, keyLen, in, + inLen, &sent); + if (ret == WH_ERROR_OK && sent) { + do { + ret = wh_Client_CmacDmaUpdateResponse(ctx, cmac); + } while (ret == WH_ERROR_NOTREADY); + } + } + else if (key != NULL && keyLen > 0) { + bool sent = false; + ret = wh_Client_CmacDmaUpdateRequest(ctx, cmac, type, key, keyLen, NULL, + 0, &sent); + if (ret == WH_ERROR_OK && sent) { + do { + ret = wh_Client_CmacDmaUpdateResponse(ctx, cmac); + } while (ret == WH_ERROR_NOTREADY); + } + } + + if (ret == WH_ERROR_OK && outMac != NULL && outMacLen != NULL) { + ret = wh_Client_CmacDmaFinalRequest(ctx, cmac); + if (ret == WH_ERROR_OK) { + do { + ret = wh_Client_CmacDmaFinalResponse(ctx, cmac, outMac, + outMacLen); + } while (ret == WH_ERROR_NOTREADY); + } + } return ret; } #endif /* WOLFHSM_CFG_DMA */ diff --git a/src/wh_message_crypto.c b/src/wh_message_crypto.c index a4e6d697b..0ba44b342 100644 --- a/src/wh_message_crypto.c +++ b/src/wh_message_crypto.c @@ -958,6 +958,7 @@ int wh_MessageCrypto_TranslateCmacAesDmaRequest( WH_T32(magic, dest, src, outSz); WH_T32(magic, dest, src, keySz); + WH_T32(magic, dest, src, inlineInSz); WH_T16(magic, dest, src, keyId); ret = wh_MessageCrypto_TranslateCmacAesState(magic, &src->resumeState, diff --git a/src/wh_server_crypto.c b/src/wh_server_crypto.c index 33f7b1fb0..bd486d095 100644 --- a/src/wh_server_crypto.c +++ b/src/wh_server_crypto.c @@ -5757,8 +5757,15 @@ static int _HandleCmacDma(whServerContext* ctx, uint16_t magic, int devId, return ret; } - /* Validate variable-length fields fit within inSize */ + /* Validate variable-length fields fit within inSize. 
Trailing layout: + * uint8_t in[inlineInSz] + * uint8_t key[keySz] + */ uint32_t available = inSize - sizeof(whMessageCrypto_CmacAesDmaRequest); + if (req.inlineInSz > available) { + return WH_ERROR_BADARGS; + } + available -= req.inlineInSz; if (req.keySz > available) { return WH_ERROR_BADARGS; } @@ -5769,8 +5776,9 @@ static int _HandleCmacDma(whServerContext* ctx, uint16_t magic, int devId, word32 len; /* Pointers to inline trailing data */ - uint8_t* key = + uint8_t* inlineIn = (uint8_t*)(cryptoDataIn) + sizeof(whMessageCrypto_CmacAesDmaRequest); + uint8_t* key = inlineIn + req.inlineInSz; uint8_t* out = (uint8_t*)(cryptoDataOut) + sizeof(whMessageCrypto_CmacAesDmaResponse); @@ -5783,8 +5791,10 @@ static int _HandleCmacDma(whServerContext* ctx, uint16_t magic, int devId, uint32_t tmpKeyLen = sizeof(tmpKey); Cmac cmac[1]; - /* Attempt oneshot if input and output are both present */ - if (req.input.sz != 0 && req.outSz != 0) { + /* Oneshot fast path: DMA input only (no inline), output requested. The + * streaming protocol never produces outSz>0 with DMA input (Final is + * inline-only), so this branch is only taken by CmacGenerateDma. */ + if (req.inlineInSz == 0 && req.input.sz != 0 && req.outSz != 0) { len = req.outSz; /* Translate DMA address for input */ @@ -5828,9 +5838,14 @@ static int _HandleCmacDma(whServerContext* ctx, uint16_t magic, int devId, } } else { + /* Streaming update/final with optional client-side assembled first + * block (inline) plus DMA whole blocks. Final carries partial tail + * inline only. 
*/ WH_DEBUG_SERVER_VERBOSE( - "dma cmac begin keySz:%d inSz:%d outSz:%d keyId:%x\n", - (int)req.keySz, (int)req.input.sz, (int)req.outSz, req.keyId); + "dma cmac begin keySz:%d inlineInSz:%d dmaInSz:%d outSz:%d " + "keyId:%x\n", + (int)req.keySz, (int)req.inlineInSz, (int)req.input.sz, + (int)req.outSz, req.keyId); /* Resolve key */ ret = @@ -5849,7 +5864,15 @@ static int _HandleCmacDma(whServerContext* ctx, uint16_t magic, int devId, ret = wh_Crypto_CmacAesRestoreStateFromMsg(cmac, &req.resumeState); } - /* Handle CMAC update with DMA input */ + /* Feed inline input first (assembled first block on Update, or + * partial tail on Final). */ + if (ret == 0 && req.inlineInSz != 0) { + ret = wc_CmacUpdate(cmac, inlineIn, req.inlineInSz); + WH_DEBUG_SERVER_VERBOSE("dma cmac inline update done. ret:%d\n", + ret); + } + + /* Feed DMA input (whole blocks on Update; never present on Final). */ if (ret == 0 && req.input.sz != 0) { ret = wh_Server_DmaProcessClientAddress( ctx, req.input.addr, &inAddr, req.input.sz, @@ -5859,7 +5882,8 @@ static int _HandleCmacDma(whServerContext* ctx, uint16_t magic, int devId, } if (ret == WH_ERROR_OK) { ret = wc_CmacUpdate(cmac, inAddr, req.input.sz); - WH_DEBUG_SERVER_VERBOSE("dma cmac update done. ret:%d\n", ret); + WH_DEBUG_SERVER_VERBOSE("dma cmac dma update done. ret:%d\n", + ret); } } diff --git a/test/wh_test_crypto.c b/test/wh_test_crypto.c index 8ab5de942..0a7a4d78e 100644 --- a/test/wh_test_crypto.c +++ b/test/wh_test_crypto.c @@ -9263,6 +9263,760 @@ static int whTestCrypto_Cmac(whClientContext* ctx, int devId, WC_RNG* rng) } return ret; } + +/* Direct exercise of the new async non-DMA CMAC primitives. 
*/ +static int whTestCrypto_CmacAsync(whClientContext* ctx, int devId, WC_RNG* rng) +{ + int ret = WH_ERROR_OK; + Cmac cmac[1]; + uint8_t tag[AES_BLOCK_SIZE] = {0}; + uint32_t tagSz; + whKeyId keyId; + uint8_t labelIn[WH_NVM_LABEL_LEN] = "CMAC Async Label"; + + (void)rng; + +#ifdef WOLFSSL_AES_128 + /* NIST SP 800-38B AES-128 vectors. m_long covers the 0/40/64-byte test + * messages by prefix; the tags below are the canonical NIST outputs. */ + const byte k128[] = {0x2b, 0x7e, 0x15, 0x16, 0x28, 0xae, 0xd2, 0xa6, + 0xab, 0xf7, 0x15, 0x88, 0x09, 0xcf, 0x4f, 0x3c}; + const byte m128[] = {0x6b, 0xc1, 0xbe, 0xe2, 0x2e, 0x40, 0x9f, 0x96, + 0xe9, 0x3d, 0x7e, 0x11, 0x73, 0x93, 0x17, 0x2a}; + const byte m_long[64] = { + 0x6b, 0xc1, 0xbe, 0xe2, 0x2e, 0x40, 0x9f, 0x96, 0xe9, 0x3d, 0x7e, + 0x11, 0x73, 0x93, 0x17, 0x2a, 0xae, 0x2d, 0x8a, 0x57, 0x1e, 0x03, + 0xac, 0x9c, 0x9e, 0xb7, 0x6f, 0xac, 0x45, 0xaf, 0x8e, 0x51, 0x30, + 0xc8, 0x1c, 0x46, 0xa3, 0x5c, 0xe4, 0x11, 0xe5, 0xfb, 0xc1, 0x19, + 0x1a, 0x0a, 0x52, 0xef, 0xf6, 0x9f, 0x24, 0x45, 0xdf, 0x4f, 0x9b, + 0x17, 0xad, 0x2b, 0x41, 0x7b, 0xe6, 0x6c, 0x37, 0x10}; + const byte t128[] = {0x07, 0x0a, 0x16, 0xb4, 0x6b, 0x4d, 0x41, 0x44, + 0xf7, 0x9b, 0xdd, 0x9d, 0xd0, 0x4a, 0x28, 0x7c}; + const byte t128_0[] = {0xbb, 0x1d, 0x69, 0x29, 0xe9, 0x59, 0x37, 0x28, + 0x7f, 0xa3, 0x7d, 0x12, 0x9b, 0x75, 0x67, 0x46}; + const byte t128_320[] = {0xdf, 0xa6, 0x67, 0x47, 0xde, 0x9a, 0xe6, 0x30, + 0x30, 0xca, 0x32, 0x61, 0x14, 0x97, 0xc8, 0x27}; + const byte t128_512[] = {0x51, 0xf0, 0xbe, 0xbf, 0x7e, 0x3b, 0x9d, 0x92, + 0xfc, 0x49, 0x74, 0x17, 0x79, 0x36, 0x3c, 0xfe}; + + /* Case A: oneshot Generate via the async pair, with a cached HSM key. 
*/ + if (ret == 0) { + keyId = WH_KEYID_ERASED; + ret = wh_Client_KeyCache(ctx, WH_NVM_FLAGS_USAGE_SIGN, labelIn, + sizeof(labelIn), (uint8_t*)k128, sizeof(k128), + &keyId); + if (ret != 0) { + WH_ERROR_PRINT("Async CMAC: KeyCache(A) failed %d\n", ret); + } + } + if (ret == 0) { + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0) { + ret = wh_Client_CmacSetKeyId(cmac, keyId); + } + } + if (ret == 0) { + ret = wh_Client_CmacGenerateRequest(ctx, cmac, WC_CMAC_AES, NULL, 0, + m128, sizeof(m128), AES_BLOCK_SIZE); + } + if (ret == 0) { + memset(tag, 0, sizeof(tag)); + tagSz = sizeof(tag); + do { + ret = wh_Client_CmacGenerateResponse(ctx, cmac, tag, &tagSz); + } while (ret == WH_ERROR_NOTREADY); + } + if (ret == 0 && memcmp(tag, t128, AES_BLOCK_SIZE) != 0) { + WH_ERROR_PRINT("Async CMAC: Generate MAC mismatch (case A)\n"); + ret = -1; + } + if (keyId != WH_KEYID_ERASED) { + (void)wh_Client_KeyEvict(ctx, keyId); + } + + /* Case B: streaming Update + Final via the async pair. 
*/ + if (ret == 0) { + keyId = WH_KEYID_ERASED; + ret = wh_Client_KeyCache(ctx, WH_NVM_FLAGS_USAGE_SIGN, labelIn, + sizeof(labelIn), (uint8_t*)k128, sizeof(k128), + &keyId); + } + if (ret == 0) { + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0) { + ret = wh_Client_CmacSetKeyId(cmac, keyId); + } + } + if (ret == 0) { + bool sent = false; + ret = wh_Client_CmacUpdateRequest(ctx, cmac, WC_CMAC_AES, NULL, 0, m128, + sizeof(m128), &sent); + if (ret == 0 && sent) { + do { + ret = wh_Client_CmacUpdateResponse(ctx, cmac); + } while (ret == WH_ERROR_NOTREADY); + } + } + if (ret == 0) { + ret = wh_Client_CmacFinalRequest(ctx, cmac); + } + if (ret == 0) { + memset(tag, 0, sizeof(tag)); + tagSz = sizeof(tag); + do { + ret = wh_Client_CmacFinalResponse(ctx, cmac, tag, &tagSz); + } while (ret == WH_ERROR_NOTREADY); + } + if (ret == 0 && memcmp(tag, t128, AES_BLOCK_SIZE) != 0) { + WH_ERROR_PRINT("Async CMAC: streaming MAC mismatch (case B)\n"); + ret = -1; + } + if (keyId != WH_KEYID_ERASED) { + (void)wh_Client_KeyEvict(ctx, keyId); + } + + /* Case C: oneshot Generate with inline key bytes (no HSM keyId). */ + if (ret == 0) { + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + } + if (ret == 0) { + ret = wh_Client_CmacGenerateRequest(ctx, cmac, WC_CMAC_AES, k128, + sizeof(k128), m128, sizeof(m128), + AES_BLOCK_SIZE); + } + if (ret == 0) { + memset(tag, 0, sizeof(tag)); + tagSz = sizeof(tag); + do { + ret = wh_Client_CmacGenerateResponse(ctx, cmac, tag, &tagSz); + } while (ret == WH_ERROR_NOTREADY); + } + if (ret == 0 && memcmp(tag, t128, AES_BLOCK_SIZE) != 0) { + WH_ERROR_PRINT("Async CMAC: inline-key MAC mismatch (case C)\n"); + ret = -1; + } + + /* Case D: empty-message streaming (Final with no preceding Update). + * Exercises the zero-input round-trip of the resumeState. 
*/ + if (ret == 0) { + keyId = WH_KEYID_ERASED; + ret = wh_Client_KeyCache(ctx, WH_NVM_FLAGS_USAGE_SIGN, labelIn, + sizeof(labelIn), (uint8_t*)k128, sizeof(k128), + &keyId); + } + if (ret == 0) { + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0) { + ret = wh_Client_CmacSetKeyId(cmac, keyId); + } + } + if (ret == 0) { + ret = wh_Client_CmacFinalRequest(ctx, cmac); + } + if (ret == 0) { + memset(tag, 0, sizeof(tag)); + tagSz = sizeof(tag); + do { + ret = wh_Client_CmacFinalResponse(ctx, cmac, tag, &tagSz); + } while (ret == WH_ERROR_NOTREADY); + } + if (ret == 0 && memcmp(tag, t128_0, AES_BLOCK_SIZE) != 0) { + WH_ERROR_PRINT("Async CMAC: empty MAC mismatch (case D)\n"); + ret = -1; + } + if (keyId != WH_KEYID_ERASED) { + (void)wh_Client_KeyEvict(ctx, keyId); + } + + /* Case E: non-block-aligned single Update (40 bytes = 2 full blocks + + * 8-byte tail). Confirms the server holds the tail in its buffer and + * round-trips it back through resumeState. */ + if (ret == 0) { + keyId = WH_KEYID_ERASED; + ret = wh_Client_KeyCache(ctx, WH_NVM_FLAGS_USAGE_SIGN, labelIn, + sizeof(labelIn), (uint8_t*)k128, sizeof(k128), + &keyId); + } + if (ret == 0) { + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0) { + ret = wh_Client_CmacSetKeyId(cmac, keyId); + } + } + if (ret == 0) { + bool sent = false; + ret = wh_Client_CmacUpdateRequest(ctx, cmac, WC_CMAC_AES, NULL, 0, + m_long, 40, &sent); + if (ret == 0 && sent) { + do { + ret = wh_Client_CmacUpdateResponse(ctx, cmac); + } while (ret == WH_ERROR_NOTREADY); + } + } + if (ret == 0) { + ret = wh_Client_CmacFinalRequest(ctx, cmac); + } + if (ret == 0) { + memset(tag, 0, sizeof(tag)); + tagSz = sizeof(tag); + do { + ret = wh_Client_CmacFinalResponse(ctx, cmac, tag, &tagSz); + } while (ret == WH_ERROR_NOTREADY); + } + if (ret == 0 && memcmp(tag, t128_320, AES_BLOCK_SIZE) != 0) { + WH_ERROR_PRINT("Async CMAC: 40-byte MAC mismatch (case E)\n"); + ret = -1; + } + if 
(keyId != WH_KEYID_ERASED) { + (void)wh_Client_KeyEvict(ctx, keyId); + } + + /* Case F: multi-Update streaming, split at non-block boundary + * (27 + 37 = 64 bytes). This is the canonical regression test for + * partial-block state round-tripping between calls. */ + if (ret == 0) { + keyId = WH_KEYID_ERASED; + ret = wh_Client_KeyCache(ctx, WH_NVM_FLAGS_USAGE_SIGN, labelIn, + sizeof(labelIn), (uint8_t*)k128, sizeof(k128), + &keyId); + } + if (ret == 0) { + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0) { + ret = wh_Client_CmacSetKeyId(cmac, keyId); + } + } + if (ret == 0) { + bool sent = false; + ret = wh_Client_CmacUpdateRequest(ctx, cmac, WC_CMAC_AES, NULL, 0, + m_long, 27, &sent); + if (ret == 0 && sent) { + do { + ret = wh_Client_CmacUpdateResponse(ctx, cmac); + } while (ret == WH_ERROR_NOTREADY); + } + } + if (ret == 0) { + bool sent = false; + ret = wh_Client_CmacUpdateRequest(ctx, cmac, WC_CMAC_AES, NULL, 0, + m_long + 27, 37, &sent); + if (ret == 0 && sent) { + do { + ret = wh_Client_CmacUpdateResponse(ctx, cmac); + } while (ret == WH_ERROR_NOTREADY); + } + } + if (ret == 0) { + ret = wh_Client_CmacFinalRequest(ctx, cmac); + } + if (ret == 0) { + memset(tag, 0, sizeof(tag)); + tagSz = sizeof(tag); + do { + ret = wh_Client_CmacFinalResponse(ctx, cmac, tag, &tagSz); + } while (ret == WH_ERROR_NOTREADY); + } + if (ret == 0 && memcmp(tag, t128_512, AES_BLOCK_SIZE) != 0) { + WH_ERROR_PRINT("Async CMAC: split-update MAC mismatch (case F)\n"); + ret = -1; + } + if (keyId != WH_KEYID_ERASED) { + (void)wh_Client_KeyEvict(ctx, keyId); + } + + /* Case G: reject inline keys longer than AES_256_KEY_SIZE in both + * Generate and Update request paths (defends against the devKey + * overflow). 
*/ + if (ret == 0) { + uint8_t bigKey[AES_256_KEY_SIZE + 1] = {0}; + bool sent = true; + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0 && + wh_Client_CmacGenerateRequest(ctx, cmac, WC_CMAC_AES, bigKey, + sizeof(bigKey), m128, sizeof(m128), + AES_BLOCK_SIZE) != WH_ERROR_BADARGS) { + WH_ERROR_PRINT( + "Async CMAC: oversize keyLen not BADARGS (Generate)\n"); + ret = -1; + } + if (ret == 0 && wh_Client_CmacUpdateRequest( + ctx, cmac, WC_CMAC_AES, bigKey, sizeof(bigKey), + m128, sizeof(m128), &sent) != WH_ERROR_BADARGS) { + WH_ERROR_PRINT( + "Async CMAC: oversize keyLen not BADARGS (Update)\n"); + ret = -1; + } + } + + /* Case H: argument validation. NULL ctx / cmac / requestSent must + * yield BADARGS without sending anything. */ + if (ret == 0) { + bool sent = true; + if (wh_Client_CmacGenerateRequest(NULL, cmac, WC_CMAC_AES, NULL, 0, + m128, 1, + AES_BLOCK_SIZE) != WH_ERROR_BADARGS) { + WH_ERROR_PRINT("Async CMAC: NULL ctx not BADARGS (Generate)\n"); + ret = -1; + } + if (ret == 0 && + wh_Client_CmacUpdateRequest(NULL, cmac, WC_CMAC_AES, NULL, 0, m128, + 1, &sent) != WH_ERROR_BADARGS) { + WH_ERROR_PRINT("Async CMAC: NULL ctx not BADARGS (Update)\n"); + ret = -1; + } + if (ret == 0 && + wh_Client_CmacFinalRequest(NULL, cmac) != WH_ERROR_BADARGS) { + WH_ERROR_PRINT("Async CMAC: NULL ctx not BADARGS (Final)\n"); + ret = -1; + } + } + + /* Case I: NULL key with nonzero keyLen must BADARGS in both Request + * paths (defends against a NULL deref in the inline-key memcpy). 
*/ + if (ret == 0) { + bool sent = true; + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0 && + wh_Client_CmacGenerateRequest(ctx, cmac, WC_CMAC_AES, NULL, + AES_128_KEY_SIZE, m128, sizeof(m128), + AES_BLOCK_SIZE) != WH_ERROR_BADARGS) { + WH_ERROR_PRINT( + "Async CMAC: NULL key + nonzero keyLen not BADARGS (Gen)\n"); + ret = -1; + } + if (ret == 0 && wh_Client_CmacUpdateRequest( + ctx, cmac, WC_CMAC_AES, NULL, AES_128_KEY_SIZE, + m128, sizeof(m128), &sent) != WH_ERROR_BADARGS) { + WH_ERROR_PRINT( + "Async CMAC: NULL key + nonzero keyLen not BADARGS (Upd)\n"); + ret = -1; + } + } + + /* Case J: wh_Client_Cmac must reject caller tag lengths outside + * [WC_CMAC_TAG_MIN_SZ, WC_CMAC_TAG_MAX_SZ] with WH_ERROR_BUFFER_SIZE, + * matching wolfCrypt's wc_CmacFinal contract. Streaming Final hardcodes + * outSz=AES_BLOCK_SIZE on the wire, so without this validation a + * sub-min caller buffer would silently receive a truncated tag and an + * over-max length would be silently clamped rather than rejected. */ + if (ret == 0) { + uint8_t smallTag[3]; + uint32_t smallSz = sizeof(smallTag); + uint8_t bigTag[AES_BLOCK_SIZE + 1]; + uint32_t bigSz = sizeof(bigTag); + if (wh_Client_Cmac(ctx, cmac, WC_CMAC_AES, k128, sizeof(k128), m128, + sizeof(m128), smallTag, + &smallSz) != WH_ERROR_BUFFER_SIZE) { + WH_ERROR_PRINT( + "Async CMAC: sub-min outMacLen not WH_ERROR_BUFFER_SIZE\n"); + ret = -1; + } + if (ret == 0 && wh_Client_Cmac(ctx, cmac, WC_CMAC_AES, k128, + sizeof(k128), m128, sizeof(m128), bigTag, + &bigSz) != WH_ERROR_BUFFER_SIZE) { + WH_ERROR_PRINT( + "Async CMAC: over-max outMacLen not WH_ERROR_BUFFER_SIZE\n"); + ret = -1; + } + } + + /* Case K: wh_Client_CmacFinalResponse must also reject bad tag lengths + * (defense-in-depth for direct users of the async Final pair). Drive a + * full Update via the async pair, then call FinalRequest and finally + * FinalResponse with a sub-min outMacLen.
*/ + if (ret == 0) { + keyId = WH_KEYID_ERASED; + ret = wh_Client_KeyCache(ctx, WH_NVM_FLAGS_USAGE_SIGN, labelIn, + sizeof(labelIn), (uint8_t*)k128, sizeof(k128), + &keyId); + } + if (ret == 0) { + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0) { + ret = wh_Client_CmacSetKeyId(cmac, keyId); + } + } + if (ret == 0) { + bool sent = false; + ret = wh_Client_CmacUpdateRequest(ctx, cmac, WC_CMAC_AES, NULL, 0, m128, + sizeof(m128), &sent); + if (ret == 0 && sent) { + do { + ret = wh_Client_CmacUpdateResponse(ctx, cmac); + } while (ret == WH_ERROR_NOTREADY); + } + } + if (ret == 0) { + ret = wh_Client_CmacFinalRequest(ctx, cmac); + } + if (ret == 0) { + uint8_t smallTag[3] = {0}; + uint32_t smallSz = sizeof(smallTag); + int finalRet; + do { + finalRet = + wh_Client_CmacFinalResponse(ctx, cmac, smallTag, &smallSz); + } while (finalRet == WH_ERROR_NOTREADY); + if (finalRet != WH_ERROR_BUFFER_SIZE) { + WH_ERROR_PRINT("Async CMAC: FinalResponse sub-min outMacLen not " + "WH_ERROR_BUFFER_SIZE\n"); + ret = -1; + } + } + if (keyId != WH_KEYID_ERASED) { + (void)wh_Client_KeyEvict(ctx, keyId); + } +#endif /* WOLFSSL_AES_128 */ + + if (ret == 0) { + WH_TEST_PRINT("CMAC ASYNC DEVID=0x%X SUCCESS\n", devId); + } + return ret; +} + +#ifdef WOLFHSM_CFG_DMA +/* Direct exercise of the new async DMA CMAC primitives. 
*/ +static int whTestCrypto_CmacDmaAsync(whClientContext* ctx, int devId, + WC_RNG* rng) +{ + int ret = WH_ERROR_OK; + Cmac cmac[1]; + uint8_t tag[AES_BLOCK_SIZE] = {0}; + uint32_t tagSz; + whKeyId keyId; + uint8_t labelIn[WH_NVM_LABEL_LEN] = "CMAC DMA Async Label"; + + (void)rng; + +#ifdef WOLFSSL_AES_128 + const byte k128[] = {0x2b, 0x7e, 0x15, 0x16, 0x28, 0xae, 0xd2, 0xa6, + 0xab, 0xf7, 0x15, 0x88, 0x09, 0xcf, 0x4f, 0x3c}; + const byte m128[] = {0x6b, 0xc1, 0xbe, 0xe2, 0x2e, 0x40, 0x9f, 0x96, + 0xe9, 0x3d, 0x7e, 0x11, 0x73, 0x93, 0x17, 0x2a}; + const byte m_long[64] = { + 0x6b, 0xc1, 0xbe, 0xe2, 0x2e, 0x40, 0x9f, 0x96, 0xe9, 0x3d, 0x7e, + 0x11, 0x73, 0x93, 0x17, 0x2a, 0xae, 0x2d, 0x8a, 0x57, 0x1e, 0x03, + 0xac, 0x9c, 0x9e, 0xb7, 0x6f, 0xac, 0x45, 0xaf, 0x8e, 0x51, 0x30, + 0xc8, 0x1c, 0x46, 0xa3, 0x5c, 0xe4, 0x11, 0xe5, 0xfb, 0xc1, 0x19, + 0x1a, 0x0a, 0x52, 0xef, 0xf6, 0x9f, 0x24, 0x45, 0xdf, 0x4f, 0x9b, + 0x17, 0xad, 0x2b, 0x41, 0x7b, 0xe6, 0x6c, 0x37, 0x10}; + const byte t128[] = {0x07, 0x0a, 0x16, 0xb4, 0x6b, 0x4d, 0x41, 0x44, + 0xf7, 0x9b, 0xdd, 0x9d, 0xd0, 0x4a, 0x28, 0x7c}; + const byte t128_320[] = {0xdf, 0xa6, 0x67, 0x47, 0xde, 0x9a, 0xe6, 0x30, + 0x30, 0xca, 0x32, 0x61, 0x14, 0x97, 0xc8, 0x27}; + const byte t128_512[] = {0x51, 0xf0, 0xbe, 0xbf, 0x7e, 0x3b, 0x9d, 0x92, + 0xfc, 0x49, 0x74, 0x17, 0x79, 0x36, 0x3c, 0xfe}; + + /* Case A: DMA oneshot Generate. 
*/ + if (ret == 0) { + keyId = WH_KEYID_ERASED; + ret = wh_Client_KeyCache(ctx, WH_NVM_FLAGS_USAGE_SIGN, labelIn, + sizeof(labelIn), (uint8_t*)k128, sizeof(k128), + &keyId); + } + if (ret == 0) { + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0) { + ret = wh_Client_CmacSetKeyId(cmac, keyId); + } + } + if (ret == 0) { + ret = wh_Client_CmacGenerateDmaRequest(ctx, cmac, WC_CMAC_AES, NULL, 0, + m128, sizeof(m128), + AES_BLOCK_SIZE); + } + if (ret == 0) { + memset(tag, 0, sizeof(tag)); + tagSz = sizeof(tag); + do { + ret = wh_Client_CmacGenerateDmaResponse(ctx, cmac, tag, &tagSz); + } while (ret == WH_ERROR_NOTREADY); + } + if (ret == 0 && memcmp(tag, t128, AES_BLOCK_SIZE) != 0) { + WH_ERROR_PRINT("Async DMA CMAC: Generate mismatch (case A)\n"); + ret = -1; + } + if (keyId != WH_KEYID_ERASED) { + (void)wh_Client_KeyEvict(ctx, keyId); + } + + /* Case B: DMA streaming Update + Final. */ + if (ret == 0) { + keyId = WH_KEYID_ERASED; + ret = wh_Client_KeyCache(ctx, WH_NVM_FLAGS_USAGE_SIGN, labelIn, + sizeof(labelIn), (uint8_t*)k128, sizeof(k128), + &keyId); + } + if (ret == 0) { + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0) { + ret = wh_Client_CmacSetKeyId(cmac, keyId); + } + } + if (ret == 0) { + bool sent = false; + ret = wh_Client_CmacDmaUpdateRequest(ctx, cmac, WC_CMAC_AES, NULL, 0, + m128, sizeof(m128), &sent); + if (ret == 0 && sent) { + do { + ret = wh_Client_CmacDmaUpdateResponse(ctx, cmac); + } while (ret == WH_ERROR_NOTREADY); + } + } + if (ret == 0) { + ret = wh_Client_CmacDmaFinalRequest(ctx, cmac); + } + if (ret == 0) { + memset(tag, 0, sizeof(tag)); + tagSz = sizeof(tag); + do { + ret = wh_Client_CmacDmaFinalResponse(ctx, cmac, tag, &tagSz); + } while (ret == WH_ERROR_NOTREADY); + } + if (ret == 0 && memcmp(tag, t128, AES_BLOCK_SIZE) != 0) { + WH_ERROR_PRINT("Async DMA CMAC: streaming mismatch (case B)\n"); + ret = -1; + } + if (keyId != WH_KEYID_ERASED) { + 
(void)wh_Client_KeyEvict(ctx, keyId); + } + + /* Case C: DMA streaming with a 40-byte non-block-aligned message. */ + if (ret == 0) { + keyId = WH_KEYID_ERASED; + ret = wh_Client_KeyCache(ctx, WH_NVM_FLAGS_USAGE_SIGN, labelIn, + sizeof(labelIn), (uint8_t*)k128, sizeof(k128), + &keyId); + } + if (ret == 0) { + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0) { + ret = wh_Client_CmacSetKeyId(cmac, keyId); + } + } + if (ret == 0) { + bool sent = false; + ret = wh_Client_CmacDmaUpdateRequest(ctx, cmac, WC_CMAC_AES, NULL, 0, + m_long, 40, &sent); + if (ret == 0 && sent) { + do { + ret = wh_Client_CmacDmaUpdateResponse(ctx, cmac); + } while (ret == WH_ERROR_NOTREADY); + } + } + if (ret == 0) { + ret = wh_Client_CmacDmaFinalRequest(ctx, cmac); + } + if (ret == 0) { + memset(tag, 0, sizeof(tag)); + tagSz = sizeof(tag); + do { + ret = wh_Client_CmacDmaFinalResponse(ctx, cmac, tag, &tagSz); + } while (ret == WH_ERROR_NOTREADY); + } + if (ret == 0 && memcmp(tag, t128_320, AES_BLOCK_SIZE) != 0) { + WH_ERROR_PRINT("Async DMA CMAC: 40-byte MAC mismatch (case C)\n"); + ret = -1; + } + if (keyId != WH_KEYID_ERASED) { + (void)wh_Client_KeyEvict(ctx, keyId); + } + + /* Case D: DMA multi-Update streaming, split at non-block boundary + * (27 + 37 = 64 bytes). Regression test for state round-tripping. 
*/ + if (ret == 0) { + keyId = WH_KEYID_ERASED; + ret = wh_Client_KeyCache(ctx, WH_NVM_FLAGS_USAGE_SIGN, labelIn, + sizeof(labelIn), (uint8_t*)k128, sizeof(k128), + &keyId); + } + if (ret == 0) { + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0) { + ret = wh_Client_CmacSetKeyId(cmac, keyId); + } + } + if (ret == 0) { + bool sent = false; + ret = wh_Client_CmacDmaUpdateRequest(ctx, cmac, WC_CMAC_AES, NULL, 0, + m_long, 27, &sent); + if (ret == 0 && sent) { + do { + ret = wh_Client_CmacDmaUpdateResponse(ctx, cmac); + } while (ret == WH_ERROR_NOTREADY); + } + } + if (ret == 0) { + bool sent = false; + ret = wh_Client_CmacDmaUpdateRequest(ctx, cmac, WC_CMAC_AES, NULL, 0, + m_long + 27, 37, &sent); + if (ret == 0 && sent) { + do { + ret = wh_Client_CmacDmaUpdateResponse(ctx, cmac); + } while (ret == WH_ERROR_NOTREADY); + } + } + if (ret == 0) { + ret = wh_Client_CmacDmaFinalRequest(ctx, cmac); + } + if (ret == 0) { + memset(tag, 0, sizeof(tag)); + tagSz = sizeof(tag); + do { + ret = wh_Client_CmacDmaFinalResponse(ctx, cmac, tag, &tagSz); + } while (ret == WH_ERROR_NOTREADY); + } + if (ret == 0 && memcmp(tag, t128_512, AES_BLOCK_SIZE) != 0) { + WH_ERROR_PRINT("Async DMA CMAC: split-update MAC mismatch (case D)\n"); + ret = -1; + } + if (keyId != WH_KEYID_ERASED) { + (void)wh_Client_KeyEvict(ctx, keyId); + } + + /* Case E: reject inline keys longer than AES_256_KEY_SIZE in both DMA + * Generate and Update request paths. 
*/ + if (ret == 0) { + uint8_t bigKey[AES_256_KEY_SIZE + 1] = {0}; + bool sent = true; + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0 && + wh_Client_CmacGenerateDmaRequest( + ctx, cmac, WC_CMAC_AES, bigKey, sizeof(bigKey), m128, + sizeof(m128), AES_BLOCK_SIZE) != WH_ERROR_BADARGS) { + WH_ERROR_PRINT( + "Async DMA CMAC: oversize keyLen not BADARGS (Gen)\n"); + ret = -1; + } + if (ret == 0 && wh_Client_CmacDmaUpdateRequest( + ctx, cmac, WC_CMAC_AES, bigKey, sizeof(bigKey), + m128, sizeof(m128), &sent) != WH_ERROR_BADARGS) { + WH_ERROR_PRINT( + "Async DMA CMAC: oversize keyLen not BADARGS (Upd)\n"); + ret = -1; + } + } + + /* Case F: argument validation. */ + if (ret == 0) { + bool sent = true; + if (wh_Client_CmacGenerateDmaRequest(NULL, cmac, WC_CMAC_AES, NULL, 0, + m128, 1, AES_BLOCK_SIZE) != + WH_ERROR_BADARGS) { + WH_ERROR_PRINT("Async DMA CMAC: NULL ctx not BADARGS (Gen)\n"); + ret = -1; + } + if (ret == 0 && wh_Client_CmacDmaUpdateRequest( + NULL, cmac, WC_CMAC_AES, NULL, 0, m128, 1, &sent) != + WH_ERROR_BADARGS) { + WH_ERROR_PRINT("Async DMA CMAC: NULL ctx not BADARGS (Upd)\n"); + ret = -1; + } + if (ret == 0 && + wh_Client_CmacDmaFinalRequest(NULL, cmac) != WH_ERROR_BADARGS) { + WH_ERROR_PRINT("Async DMA CMAC: NULL ctx not BADARGS (Fin)\n"); + ret = -1; + } + } + + /* Case G: NULL key with nonzero keyLen must BADARGS in both DMA Request + * paths (defends against a NULL deref in the inline-key memcpy). 
*/ + if (ret == 0) { + bool sent = true; + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0 && + wh_Client_CmacGenerateDmaRequest( + ctx, cmac, WC_CMAC_AES, NULL, AES_128_KEY_SIZE, m128, + sizeof(m128), AES_BLOCK_SIZE) != WH_ERROR_BADARGS) { + WH_ERROR_PRINT("Async DMA CMAC: NULL key + nonzero keyLen not " + "BADARGS (Gen)\n"); + ret = -1; + } + if (ret == 0 && wh_Client_CmacDmaUpdateRequest( + ctx, cmac, WC_CMAC_AES, NULL, AES_128_KEY_SIZE, + m128, sizeof(m128), &sent) != WH_ERROR_BADARGS) { + WH_ERROR_PRINT("Async DMA CMAC: NULL key + nonzero keyLen not " + "BADARGS (Upd)\n"); + ret = -1; + } + } + + /* Case H: wh_Client_CmacDma must reject caller tag lengths outside + * [WC_CMAC_TAG_MIN_SZ, WC_CMAC_TAG_MAX_SZ] with WH_ERROR_BUFFER_SIZE, + * and so must wh_Client_CmacDmaFinalResponse for direct users of the + * async pair. */ + if (ret == 0) { + uint8_t smallTag[3]; + uint32_t smallSz = sizeof(smallTag); + uint8_t bigTag[AES_BLOCK_SIZE + 1]; + uint32_t bigSz = sizeof(bigTag); + if (wh_Client_CmacDma(ctx, cmac, WC_CMAC_AES, k128, sizeof(k128), m128, + sizeof(m128), smallTag, + &smallSz) != WH_ERROR_BUFFER_SIZE) { + WH_ERROR_PRINT( + "Async DMA CMAC: sub-min outMacLen not WH_ERROR_BUFFER_SIZE\n"); + ret = -1; + } + if (ret == 0 && + wh_Client_CmacDma(ctx, cmac, WC_CMAC_AES, k128, sizeof(k128), m128, + sizeof(m128), bigTag, + &bigSz) != WH_ERROR_BUFFER_SIZE) { + WH_ERROR_PRINT("Async DMA CMAC: over-max outMacLen not " + "WH_ERROR_BUFFER_SIZE\n"); + ret = -1; + } + } + if (ret == 0) { + keyId = WH_KEYID_ERASED; + ret = wh_Client_KeyCache(ctx, WH_NVM_FLAGS_USAGE_SIGN, labelIn, + sizeof(labelIn), (uint8_t*)k128, sizeof(k128), + &keyId); + } + if (ret == 0) { + ret = wc_InitCmac_ex(cmac, NULL, 0, WC_CMAC_AES, NULL, NULL, devId); + if (ret == 0) { + ret = wh_Client_CmacSetKeyId(cmac, keyId); + } + } + if (ret == 0) { + bool sent = false; + ret = wh_Client_CmacDmaUpdateRequest(ctx, cmac, WC_CMAC_AES, NULL, 0, + m128, sizeof(m128), 
&sent); + if (ret == 0 && sent) { + do { + ret = wh_Client_CmacDmaUpdateResponse(ctx, cmac); + } while (ret == WH_ERROR_NOTREADY); + } + } + if (ret == 0) { + ret = wh_Client_CmacDmaFinalRequest(ctx, cmac); + } + if (ret == 0) { + uint8_t smallTag[3] = {0}; + uint32_t smallSz = sizeof(smallTag); + int finalRet; + do { + finalRet = + wh_Client_CmacDmaFinalResponse(ctx, cmac, smallTag, &smallSz); + } while (finalRet == WH_ERROR_NOTREADY); + if (finalRet != WH_ERROR_BUFFER_SIZE) { + WH_ERROR_PRINT("Async DMA CMAC: FinalResponse sub-min outMacLen " + "not WH_ERROR_BUFFER_SIZE\n"); + ret = -1; + } + } + if (keyId != WH_KEYID_ERASED) { + (void)wh_Client_KeyEvict(ctx, keyId); + } +#endif /* WOLFSSL_AES_128 */ + + if (ret == 0) { + WH_TEST_PRINT("CMAC DMA ASYNC DEVID=0x%X SUCCESS\n", devId); + } + return ret; +} +#endif /* WOLFHSM_CFG_DMA */ + #endif /* WOLFSSL_CMAC && !NO_AES && WOLFSSL_AES_DIRECT */ #ifdef HAVE_DILITHIUM @@ -11137,10 +11891,18 @@ int whTest_CryptoClientConfig(whClientConfig* config) i = 0; while ((ret == WH_ERROR_OK) && (i < WH_NUM_DEVIDS)) { ret = whTestCrypto_Cmac(client, WH_DEV_IDS_ARRAY[i], rng); + if (ret == WH_ERROR_OK) { + ret = whTestCrypto_CmacAsync(client, WH_DEV_IDS_ARRAY[i], rng); + } if (ret == WH_ERROR_OK) { i++; } } +#ifdef WOLFHSM_CFG_DMA + if (ret == WH_ERROR_OK) { + ret = whTestCrypto_CmacDmaAsync(client, WH_DEV_ID_DMA, rng); + } +#endif /* WOLFHSM_CFG_DMA */ #endif /* WOLFSSL_CMAC && !NO_AES && WOLFSSL_AES_DIRECT */ #ifndef NO_RSA diff --git a/wolfhsm/wh_client.h b/wolfhsm/wh_client.h index d5d33a617..33e025264 100644 --- a/wolfhsm/wh_client.h +++ b/wolfhsm/wh_client.h @@ -138,13 +138,23 @@ typedef struct { uint64_t aadSz; } whClientDmaAsyncAes; +/* Per-operation async DMA context for CMAC: same shape as SHA since CMAC + * DMA only transfers the input buffer. Stashed across Request/Response for + * POST cleanup. inSz == 0 means "nothing to clean up". 
*/ +typedef struct { + uintptr_t inAddr; + uintptr_t clientAddr; + uint64_t inSz; +} whClientDmaAsyncCmac; + /* Async DMA context union. Only one DMA request can be in flight at a time * per client context, so a single union suffices. Each Response function * knows which member to access based on its own operation type. */ typedef union { - whClientDmaAsyncSha sha; - whClientDmaAsyncRng rng; - whClientDmaAsyncAes aes; + whClientDmaAsyncSha sha; + whClientDmaAsyncRng rng; + whClientDmaAsyncAes aes; + whClientDmaAsyncCmac cmac; } whClientDmaAsyncCtx; typedef struct { diff --git a/wolfhsm/wh_client_crypto.h b/wolfhsm/wh_client_crypto.h index ec7e9c46b..168cb73a1 100644 --- a/wolfhsm/wh_client_crypto.h +++ b/wolfhsm/wh_client_crypto.h @@ -1699,6 +1699,124 @@ int wh_Client_Cmac(whClientContext* ctx, Cmac* cmac, CmacType type, const uint8_t* key, uint32_t keyLen, const uint8_t* in, uint32_t inLen, uint8_t* outMac, uint32_t* outMacLen); +/** + * @brief Async request half of a non-DMA CMAC oneshot generate. + * + * Serializes and sends a single request that performs init + update + final + * on the server in one round trip. The server returns the MAC in the matching + * Response. Does NOT wait for a reply. + * + * Contract: at most one outstanding async request may be in flight per + * whClientContext. The caller MUST call wh_Client_CmacGenerateResponse before + * issuing any other async Request on the same ctx. Any existing streaming + * state in the cmac struct is silently reset — this is a oneshot, equivalent + * to wc_AesCmacGenerate_ex. + * + * @param[in] ctx Client context. + * @param[in,out] cmac CMAC context (type and non-HSM key bytes are cached on + * success). + * @param[in] type CMAC type (e.g., WC_CMAC_AES). + * @param[in] key Inline key bytes, or NULL if using a cached/HSM key. + * @param[in] keyLen Key length in bytes (0 if using a cached/HSM key). + * @param[in] in Input data. Must not be NULL. + * @param[in] inLen Input length. 
Must be > 0 and must not exceed + * WH_MESSAGE_CRYPTO_CMAC_MAX_INLINE_GENERATE_SZ. + * @param[in] outMacLen Requested MAC length in bytes. Must be > 0. + * @return WH_ERROR_OK on success, WH_ERROR_BADARGS for invalid args or + * oversize input, or a negative error from the transport. On any + * error the cmac struct is left unchanged. + */ +int wh_Client_CmacGenerateRequest(whClientContext* ctx, Cmac* cmac, + CmacType type, const uint8_t* key, + uint32_t keyLen, const uint8_t* in, + uint32_t inLen, uint32_t outMacLen); + +/** + * @brief Async response half of a non-DMA CMAC oneshot generate. + * + * Single-shot RecvResponse; returns WH_ERROR_NOTREADY if the server has not + * yet replied. On success, restores state from the response, copies the MAC + * into outMac (truncated to *outMacLen) and updates *outMacLen to the actual + * number of bytes written. + */ +int wh_Client_CmacGenerateResponse(whClientContext* ctx, Cmac* cmac, + uint8_t* outMac, uint32_t* outMacLen); + +/** + * @brief Async request half of a non-DMA CMAC streaming Update. + * + * Serializes and sends an Update request carrying inLen bytes of inline + * input plus the full CMAC state (digest + buffer + bookkeeping) via + * resumeState. The server runs wc_CmacUpdate against the round-tripped + * state, so all partial-block accounting happens server-side and the + * post-Update state is returned in the matching Response. Does NOT wait + * for a reply. + * + * Contract: at most one outstanding async request may be in flight per + * whClientContext (enforced by the comm layer). If *requestSent is true, the + * caller MUST call wh_Client_CmacUpdateResponse before issuing any other + * async Request on the same ctx. + * + * Key handling: if key/keyLen are provided, the bytes are cached client-side + * so subsequent Update/Final calls can replay them. If using an HSM-cached + * key, set it via wh_Client_CmacSetKeyId before the first Update and pass + * NULL / 0 for key/keyLen. 
+ * + * @param[in] ctx Client context. + * @param[in,out] cmac CMAC context (full state round-tripped on success, + * type and cached key bytes updated on SendRequest + * success). + * @param[in] type CMAC type (written to cmac->type on success). + * @param[in] key Optional inline key bytes (NULL for cached/HSM key). + * @param[in] keyLen Key length in bytes (must not exceed + * AES_256_KEY_SIZE; 0 for cached/HSM key). + * @param[in] in Input data (may be NULL only if inLen == 0). + * @param[in] inLen Input length. Must fit in the comm buffer alongside + * the request header and key bytes. + * @param[out] requestSent Set to true if a server request was sent and a + * matching Response call is required; false only + * when inLen == 0 and keyLen == 0 (no-op). + * @return WH_ERROR_OK on success, WH_ERROR_BADARGS on invalid arguments or + * when inLen exceeds the per-call capacity. + */ +int wh_Client_CmacUpdateRequest(whClientContext* ctx, Cmac* cmac, CmacType type, + const uint8_t* key, uint32_t keyLen, + const uint8_t* in, uint32_t inLen, + bool* requestSent); + +/** + * @brief Async response half of a non-DMA CMAC streaming Update. + * + * Single-shot RecvResponse; returns WH_ERROR_NOTREADY if the server has not + * yet replied. On success, restores the full CMAC state (buffer, bufferSz, + * digest, totalSz) from the response — the server may leave a partial or + * whole block in its buffer after wc_CmacUpdate (CMAC's last block has + * special handling), so that bookkeeping is round-tripped back to the + * client. MUST only be called if the matching Request returned + * requestSent == true. + */ +int wh_Client_CmacUpdateResponse(whClientContext* ctx, Cmac* cmac); + +/** + * @brief Async request half of a non-DMA CMAC streaming Final. + * + * Sends a Final request with no inline input — the round-tripped + * resumeState carries the current cmac->buffer (0..AES_BLOCK_SIZE-1 bytes) + * as the trailing partial block for the server to finalize. 
Key material + * travels with the request when available. + */ +int wh_Client_CmacFinalRequest(whClientContext* ctx, Cmac* cmac); + +/** + * @brief Async response half of a non-DMA CMAC streaming Final. + * + * Single-shot RecvResponse. Restores final state from the response, then + * copies the MAC into outMac (truncated to *outMacLen) and updates + * *outMacLen. + */ +int wh_Client_CmacFinalResponse(whClientContext* ctx, Cmac* cmac, + uint8_t* outMac, uint32_t* outMacLen); + /** * @brief Associates a CMAC key with a specific key ID. @@ -1748,6 +1866,95 @@ int wh_Client_CmacGetKeyId(Cmac* key, whNvmId* outId); int wh_Client_CmacDma(whClientContext* ctx, Cmac* cmac, CmacType type, const uint8_t* key, uint32_t keyLen, const uint8_t* in, uint32_t inLen, uint8_t* outMac, uint32_t* outMacLen); + +/** + * @brief Async request half of a DMA CMAC oneshot generate. + * + * Performs PRE address translation for the input buffer, sends the DMA + * request, and stashes the translated address for POST cleanup in the + * matching Response. Does NOT wait for a reply. The server processes the + * oneshot in a single round trip via wc_AesCmacGenerate_ex. + * + * Contract: at most one outstanding async request may be in flight per + * whClientContext. The caller MUST call wh_Client_CmacGenerateDmaResponse + * before issuing any other async Request on the same ctx, and must keep the + * input buffer valid until the Response completes. Any existing streaming state + * in the cmac struct is silently reset — this is a oneshot, equivalent to + * wc_AesCmacGenerate_ex. + */ +int wh_Client_CmacGenerateDmaRequest(whClientContext* ctx, Cmac* cmac, + CmacType type, const uint8_t* key, + uint32_t keyLen, const uint8_t* in, + uint32_t inLen, uint32_t outMacLen); + +/** + * @brief Async response half of a DMA CMAC oneshot generate. + * + * Single-shot RecvResponse; returns WH_ERROR_NOTREADY if the server has not + * yet replied. On any non-NOTREADY exit, performs POST DMA cleanup on the + * input buffer. 
On success, copies the MAC into outMac (truncated to + * *outMacLen), updates *outMacLen, and restores the post-finalization CMAC + * state (buffer, bufferSz, digest, totalSz) from the response. The AES + * round key and CMAC subkey material in cmac are NOT reset — callers + * recycling the cmac struct must reinitialize it via wc_InitCmac_ex. + */ +int wh_Client_CmacGenerateDmaResponse(whClientContext* ctx, Cmac* cmac, + uint8_t* outMac, uint32_t* outMacLen); + +/** + * @brief Async request half of a DMA CMAC streaming Update. + * + * Performs PRE address translation for the input buffer, round-trips the + * full CMAC state to the server via resumeState, and sends every byte of + * the input via DMA. No client-side partial-block buffering and no inline + * trailing data — the server runs wc_CmacUpdate against the round-tripped + * state. Stashes the translated input address for POST cleanup in the + * matching Response. Does NOT wait for a reply. + * + * Contract: at most one outstanding async request may be in flight per + * whClientContext. If *requestSent is true, the caller MUST keep the input + * buffer valid and call wh_Client_CmacDmaUpdateResponse before issuing any + * other async Request. *requestSent is false only when inLen == 0 and + * keyLen == 0 (no-op). + */ +int wh_Client_CmacDmaUpdateRequest(whClientContext* ctx, Cmac* cmac, + CmacType type, const uint8_t* key, + uint32_t keyLen, const uint8_t* in, + uint32_t inLen, bool* requestSent); + +/** + * @brief Async response half of a DMA CMAC streaming Update. + * + * Single-shot RecvResponse; returns WH_ERROR_NOTREADY if the server has + * not yet replied. On any non-NOTREADY exit, performs POST DMA cleanup + * for the input buffer. On success, restores the full CMAC state (buffer, + * bufferSz, digest, totalSz) from the response — including any + * partial/whole block left in the server's wc_CmacUpdate buffer. 
+ */ +int wh_Client_CmacDmaUpdateResponse(whClientContext* ctx, Cmac* cmac); + +/** + * @brief Async request half of a DMA CMAC streaming Final. + * + * Sends a Final request with no DMA addresses and no inline input — the + * round-tripped resumeState carries the partial-block tail + * (0..AES_BLOCK_SIZE-1 bytes) for the server to finalize. Key material + * travels with the request when available. + */ +int wh_Client_CmacDmaFinalRequest(whClientContext* ctx, Cmac* cmac); + +/** + * @brief Async response half of a DMA CMAC streaming Final. + * + * Single-shot RecvResponse. Copies the MAC into outMac (truncated to + * *outMacLen), updates *outMacLen, and restores the post-finalization + * CMAC state (buffer, bufferSz, digest, totalSz) from the response. The + * AES round key and CMAC subkey material in cmac are NOT reset — callers + * recycling the cmac struct must reinitialize it via wc_InitCmac_ex. No + * DMA cleanup is needed (Final doesn't use DMA addresses). + */ +int wh_Client_CmacDmaFinalResponse(whClientContext* ctx, Cmac* cmac, + uint8_t* outMac, uint32_t* outMacLen); #endif /* WOLFHSM_CFG_DMA */ #endif /* WOLFSSL_CMAC */ diff --git a/wolfhsm/wh_message_crypto.h b/wolfhsm/wh_message_crypto.h index ac3417743..83c56ff58 100644 --- a/wolfhsm/wh_message_crypto.h +++ b/wolfhsm/wh_message_crypto.h @@ -903,6 +903,15 @@ typedef struct { */ } whMessageCrypto_CmacAesResponse; +/* Maximum number of input bytes that wh_Client_CmacGenerateRequest can carry + * inline in one message. Oneshot CMAC accepts arbitrary input length, so + * this is NOT block-aligned. Conservatively reserves AES_256_KEY_SIZE (32) + * bytes for the key. 
*/ +#define WH_MESSAGE_CRYPTO_CMAC_MAX_INLINE_GENERATE_SZ \ + (WOLFHSM_CFG_COMM_DATA_LEN - \ + (uint32_t)sizeof(whMessageCrypto_GenericRequestHeader) - \ + (uint32_t)sizeof(whMessageCrypto_CmacAesRequest) - 32u) + int wh_MessageCrypto_TranslateCmacAesState( uint16_t magic, const whMessageCrypto_CmacAesState* src, whMessageCrypto_CmacAesState* dest); @@ -1091,16 +1100,31 @@ int wh_MessageCrypto_TranslateSha2DmaResponse( uint16_t magic, const whMessageCrypto_Sha2DmaResponse* src, whMessageCrypto_Sha2DmaResponse* dest); -/* CMAC-AES DMA Request - only input data goes via DMA; state, key, and output - * are passed inline in the message for cross-architecture safety */ +/* CMAC-AES DMA Request - state, key, and output are passed inline in the + * message for cross-architecture safety. Input may be carried via DMA + * (input.sz bytes, whole-block aligned) AND/OR inline (inlineInSz bytes) — + * when both are present the inline portion is logically processed BEFORE the + * DMA portion. The assembled first block from the client's partial-block + * buffer goes inline; the bulk of whole-block input goes via DMA. On Final + * the tail (0..AES_BLOCK_SIZE-1 bytes) goes inline with input.sz = 0. 
+ * + * Wire layout in the comm buffer: + * whMessageCrypto_GenericRequestHeader + * whMessageCrypto_CmacAesDmaRequest + * uint8_t in[inlineInSz] + * uint8_t key[keySz] + */ typedef struct { whMessageCrypto_CmacAesState resumeState; /* portable CMAC state */ - whMessageCrypto_DmaBuffer input; /* Input data via DMA */ - uint32_t outSz; /* output MAC size (0 = not finalizing) */ - uint32_t keySz; /* inline key size (0 = use keyId) */ - uint16_t keyId; /* HSM key ID */ - uint8_t WH_PAD[6]; - /* Trailing data: uint8_t key[keySz] */ + whMessageCrypto_DmaBuffer input; /* Whole-block DMA input */ + uint32_t outSz; /* output MAC size (0 = not finalizing) */ + uint32_t keySz; /* inline key size (0 = use keyId) */ + uint32_t inlineInSz; /* inline trailing input size (assembled first + * block on Update, partial tail on Final, 0 on + * oneshot Generate) */ + uint16_t keyId; /* HSM key ID */ + uint8_t WH_PAD[2]; + /* Trailing data: uint8_t in[inlineInSz]; uint8_t key[keySz]; */ } whMessageCrypto_CmacAesDmaRequest; /* CMAC-AES DMA Response - state and output MAC returned inline */