Skip to content

dyn RNG#37

Open
ounsworth wants to merge 15 commits into
bcgit:release/0.1.2alphafrom
ounsworth:feature/mikeo/dyn_rng
Open

dyn RNG#37
ounsworth wants to merge 15 commits into
bcgit:release/0.1.2alphafrom
ounsworth:feature/mikeo/dyn_rng

Conversation

@ounsworth

Copy link
Copy Markdown
Contributor

This change set makes the RNG trait dyn-compatible and also leverages it within ML-DSA and ML-KEM so that keygen() and encaps() have versions that can accept a &dyn RNG and source their randomness from that.

This change brings bc-rust a bit closer to bc-java.

@romen romen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is passing &mut dyn RNG really necessary?

Rather than

fn keygen_from_rng<R: RNG>(rng: R) -> Result<(PK, SK), KEMError> { ... }

Committing to dynamic dispatch is not necessarily always evil, but does prevent the compiler to optimize as much, and does have some runtime performance penalty due to vtable indirection to resolve the dispatching.

On the other hand, if you intentionally want to support RNG implementations that are known only at runtime and not at compile time, a dyn RNG ref is the way to go.

Comment thread QUALITY_AND_STYLE.md Outdated
@Quant-TheodoreFelix

Copy link
Copy Markdown

The choice of &mut dyn RNG seems appropriate for this code path. While @romen point about the cost of dynamic dispatch is generally correct, it is almost negligible here, and there are actually several reasons why dyn is advantageous in this case. These can be summarized as follows:

  1. The performance impact is negligible. keygen/encaps call the RNG once at a coarse granularity, after which they perform thousands of cycles of lattice operations such as NTT and sampling. The vtable indirect call cost is only a few nanoseconds, while the subsequent work takes microseconds to milliseconds. The problematic case for dynamic dispatch is when the RNG is called in a tight inner hot loop on a per-byte basis — this calling pattern is not that.

  2. dyn appears to be better for binary size. keygen/encaps are already monomorphized per set of const generic parameters. Adding a <R: RNG> generic on top would multiply the code size by the number of concrete RNG types used. For a no_std/embedded-oriented library, using dyn is the better choice to prevent code bloat. This is a clear advantage of dyn.

  3. It aligns with runtime RNG selection and bc-java compatibility. The very goal of this PR — bc-java parity — presupposes “RNG determined at runtime.” &mut dyn RNG / Box<dyn RNG> is the natural expression for this, whereas generics force the type to be fixed at compile time.

  4. There is no loss in usability at call sites. Callers with a concrete RNG type can simply pass &mut concrete_rng, which is automatically coerced (unsized) to &mut dyn RNG. Even those who want static dispatch lose nothing at the call site.

That said, if we really want to leave room for static dispatch, a common idiomatic compromise is to keep the public methods generic over <R: RNG + ?Sized> and delegate to a single dyn implementation. However, due to point 4 above, the practical benefit is small. Additionally, since this PR chose the opposite approach (+ ?Sized generics) in Sp80090ADrbg, the same RNG crate now contains two different conventions. It would be good to either unify them or document the intent clearly, e.g., “RNG is a public abstraction that requires object safety, while Sp80090ADrbg is for internal use and keeps monomorphization.”

Conclusion: There is no need to revert dyn RNG. However, I recommend following up with cleanup items such as unifying the RNG injection surface on the core trait, handling Default guarantees, ensuring from_rng consistency with ?Sized, and separating the symmetric-key traits.

@ounsworth

Copy link
Copy Markdown
Contributor Author

Thanks @romen . I am still very much wrapping my head around how dyn works. This discussion is helpful.

I hadn't really considered that these two versions are alternate ways to achieve the same behaviour, but with some different tradeoffs

    fn keygen_from_rng(rng: &mut dyn RNG) -> Result<(PK, SK), KEMError> {
        let mut seed = KeyMaterial::<64>::new();
        rng.fill_keymaterial_out(&mut seed)?;
        Self::keygen_from_seed(&seed)
    }

vs

    fn keygen_from_rng<R: RNG>(rng: R) -> Result<(PK, SK), KEMError> {
        let mut seed = KeyMaterial::<64>::new();
        rng.fill_keymaterial_out(&mut seed)?;
        Self::keygen_from_seed(&seed)
    }

My feeling is that Dyn is correct here; especially if we expect this to be highly dynamic at runtime -- for example if the calling application wakes up, looks around, decides what kinds of OS or HW RNGs are present on its system, and then chooses an RNG impl to provide to the alg. I also think that Dyn will play nicer with RNGFactory, which again is an attempt to abstract away the concrete type
https://github.com/bcgit/bc-rust/blob/main/crypto/factory/src/rng_factory.rs#L61

But this is great. Please continue educating me on the tradeoffs!

@ounsworth

Copy link
Copy Markdown
Contributor Author

@Quant-TheodoreFelix I have to ask, are all your comments AI-generated?

I find this comment very odd:

Conclusion: There is no need to revert dyn RNG. However, I recommend following up with cleanup items such as ... separating the symmetric-key traits.

That is referring to a TODO comment that I added in this PR, that is completely unrelated to this discussion, so why would you "recommend" that?

Comment thread crypto/core/src/traits.rs
pub trait SymmetricCipher<const KEY_LEN: usize, const INIT_DATA_LEN: usize>:
Algorithm + Secret
{
#[cfg(std)]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#[cfg(std)] never evaluates true and methods are silently dropped. Standard gate feature is #[cfg(feature = "std")]. No std feature or --cfg std anywhere in the workspace. AFAIK, nothing in .cargo/config or std feature in the workspace.

For traits.rs, as written, encrypt/decrypt/aead_encrypt/aead_decrypt methods are compiled out unconditionally.

Should be #[cfg(feature = "std")], and a corresponding "std" feature should be added to crate's Cargo.toml

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Or maybe we go with the more idiomatic #[cfg(feature = "alloc")].
Either way, we'll deal with that when we come to it in a different Issue / PR.

Comment thread crypto/core/src/traits.rs
Comment thread crypto/core/src/traits.rs
Comment thread crypto/core/src/traits.rs
_ => panic!("Unexpected error"),
};

// error case: security strengths too weak and too strong

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same copy-paste bug error seen in our last PR. I think you mean to use key instead of mac_key for your encrypt_out(&mac_key, msg, &mut ct) loop, as we're not actually testing security strength properly here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Umm, yeah. Good point. I seem to have pulled in a whole bunch of commits that are not related to the dyn RNG feature.

Maybe I'll kill this PR and start fresh with only the commits that I intended to include.

// Put any config options here
}

impl TestFrameworkStreamCipher {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Public scaffolding; to-do for later

Comment thread crypto/core/src/traits.rs
const SS_LEN: usize,
>: Sized
{
/// Performs an encapsulation against the given public key.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encaps_rng is a required method on the core KEM trait but breaks external implementors since we do not have a default body. Do we want a default?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this comment. What is a default body? Do we have those anywhere else?

Comment thread crypto/mlkem/src/mlkem.rs
{
/// Run a keygen using the provided RNG implementation.
// Should still be ok in FIPS mode, provided that you're using the FIPS-approved RNG.
fn keygen_from_rng(rng: &mut dyn RNG) -> Result<(PK, SK), KEMError> {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keygen_from_rng calls keygen_from_seed after rng.fill_keymaterial_out(&mut seed), which just reproduces seed bytes verbatim, but doesn't independently validate keygen.

Might want to expand to assert against existing KAT vectors

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean by "independently valitade keygen". What is there to validate?

Comment thread crypto/mldsa/src/mldsa.rs
Comment thread crypto/core/src/traits.rs
Comment thread crypto/core-test-framework/src/fixed_seed_rng.rs
Ok(len)
}

fn security_strength(&self) -> SecurityStrength {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unconditionally setting security strength to 256 bit regardless of how many bytes were written. Semes test-only and current callers all use buffers greater than or equal to 32 bytes, but nothing this assumption might come in handy.

@ounsworth ounsworth Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I mean, this is in the `core-test-framework, and it's an RNG that's not at all R, so if you try to use this for some non-test purpose, I think you're gonna have bigger problems 😝

@officialfrancismendoza

officialfrancismendoza commented Jun 24, 2026

Copy link
Copy Markdown

@ounsworth have a few questions. Noticed there is asymmetry in entropy accounting for encaps_rng.

  • keygen sources its seed via fill_keymaterial_out, which records security strength on KeyMaterial
  • encaps_rng sources its 32-byte message m via rng.next_bytes_out(&mut m), but it does not have strength tracking or gating. HashDRBG80090A::next_bytes_out just zeroes + generate_out.

generate_keymaterial_out, however, sets key type/strength. FIPS 203 doesn't mandate a strength gate on m though. Was this asymmetry between these two deliberate?


@ounsworth for the API, do we want the core KEM and Signature traits to also use keygen_from_rng? So far, it's only on MLKEMTrait and MLDSATrait, while encaps_rng now part of the core KEM trait. Unsure if you wanted to resolve that in this PR or defer until some other item is done first.


@ounsworth add_seed_keymaterial now borrows instead of consuming, since you changed it from impl KeyMaterialTrait to &dyn KeyMaterialTrait. Before, the secret seed was moved in and presumably dropped/zeroized by the callee.

With your change, the caller now owns it as its borrowing the secret. So the function cannot drop/zeroize it as part of consuming it.

Was this intentional? If so, we want to be careful as this overwrites the old behavior of intentionally destroying the secret after use in the old code.

@ounsworth

ounsworth commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

@officialfrancismendoza

keygen sources its seed via fill_keymaterial_out, which records security strength on KeyMaterial
encaps_rng sources its 32-byte message m via rng.next_bytes_out(&mut m), but it does not have strength tracking or gating. HashDRBG80090A::next_bytes_out just zeroes + generate_out.

Ooo! Great observation! No, that was not intentional; I hadn't thought about this carefully enough. Let me think about this more carefully. If FIPS 203 doesn't specify any entropy requirements for m, then I guess we get to make an opinionated choice here?

  • TODO

@ounsworth

Copy link
Copy Markdown
Contributor Author

@officialfrancismendoza

for the API, do we want the core KEM and Signature traits to also use keygen_from_rng? So far, it's only on MLKEMTrait and MLDSATrait, while encaps_rng now part of the core KEM trait. Unsure if you wanted to resolve that in this PR or defer until some other item is done first.

I'm not 100% sure that I'm following this; but I think either way it would end up using the library's default RNG, it's just a question of where we instantiate that? At some point we may want to add a crypto agility layer that takes the "defaults" more dynamically from a config file or env var, and this would be the kind of question to ask at that point, but that's not a feature I'm prepared to put time and effort into right now.

@ounsworth

ounsworth commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Note-to-self: this PR is stacked on top of the symmetric cipher traits branch, not for any good reason, just because David and I were working on both in the same work session and they got entangled.

  • TODO: close this PR and split into two new ones:
    1. just the dyn RNG feature by itself, branched directly off release/0.1.2alpha.
    2. the sym ciphers trait stuff

@ounsworth

ounsworth commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

add_seed_keymaterial now borrows instead of consuming, since you changed it from impl KeyMaterialTrait to &dyn KeyMaterialTrait. Before, the secret seed was moved in and presumably dropped/zeroized by the callee.

With your change, the caller now owns it as its borrowing the secret. So the function cannot drop/zeroize it as part of consuming it.

Was this intentional? If so, we want to be careful as this overwrites the old behavior of intentionally destroying the secret after use in the old code.

Another phenomenal observation. Not intentional.

I'm still wrapping my head around all the niggly implications of Rust's dyn system. Can you even move a dyn object? That's gonna have Sized implications that I also don't fully grok. My intuition is that this behaviour change is unfortunate, but unavoidable when working with dyn, but I could be wrong. Either way I'm sure to learn something. I bet my friend Claude can help me understand the implications here.

  • TODO

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants