Check-in pregenerated WASIp2/3 bindings #828

Open
alexcrichton wants to merge 4 commits into rust-random:master from alexcrichton:checkin-wasip3-bindings

@alexcrichton
Collaborator

This commit updates the backends for the wasm32-wasip{2,3} targets to stop using the wasip2 and wasip3 crates and to use generated bindings directly instead. This avoids the dependency on the wasip2 and wasip3 crates, which transitively depend on wit-bindgen. It also avoids a Cargo bug where, although getrandom never activates some features of wit-bindgen, all the optional dependencies of wit-bindgen are included in downstream Cargo.lock files.

A CI job is added which regenerates the bindings and checks that everything is up to date, enforcing that the in-repo state is fresh at all times. Some small workarounds were necessary in the generated bindings to force the custom sections to be emitted correctly, but this reflects what the upstream crates/bindings are already doing and is necessary to replicate here with checked-in bindings for now.

cc #827

\0asm\x0d\0\x01\0\0\x19\x16wit-component-encoding\x04\0\x07\xb3\x02\x01A\x02\x01\
A\x06\x01B\x05\x01p}\x01@\x01\x03lenw\0\0\x04\0\x10get-random-bytes\x01\x01\x01@\
\0\0w\x04\0\x0eget-random-u64\x01\x02\x03\0\x19wasi:random/random@0.2.10\x05\0\x01\
B\x05\x01p}\x01@\x01\x03lenw\0\0\x04\0\x19get-insecure-random-bytes\x01\x01\x01@\
Contributor

I see a lot of references to unused features within this string. Is it possible to remove them?

Collaborator Author

Not currently, no. Right now the type information can't be minimized. In theory it's possible-ish, but that would require external changes, which would be a relatively difficult endeavor to undertake. None of this will be present in the final artifact, however; it is purely something consumed at link time by wasm-component-ld.

Member

newpavlov left a comment

Thank you for demonstrating how the wasip* crate issue could be resolved!

Unfortunately, the resulting code is significantly more complex than I personally would like. I think we should either keep using the wasip* crates (and tolerate the bloated Cargo.lock while hoping for a future fix of the Cargo bug), or use a simplified hand-written version of the bindings similar to what was suggested in this comment.

Side note: it's a bit sad to see the amount of complexity introduced in new versions of WASI (same with WASM in general). And it's not only a matter of complexity, but also of additional restrictions on supported APIs. I am sure there are good reasons for it, but I liked WASIp1 more with its straightforward syscall-like interface.

unsafe {
    #[cfg(target_arch = "wasm32")]
    #[link(
        wasm_import_module = "wasi:random/random@0.3.0-rc-2026-01-06"
Member

newpavlov Mar 10, 2026

What happens with this binding when a future non-RC version of WASI is released? Would it be backwards-compatible?

Collaborator Author

Yes WASIp3 isn't official yet so all releases are backwards-incompatible. That's already true today with the wasip3 crate where it needs new releases for each RC being made (we're getting close to releasing though...)

Contributor

> Yes WASIp3 isn't official yet so all releases are backwards-incompatible. That's already true today with the wasip3 crate where it needs new releases for each RC being made (we're getting close to releasing though...)

This seems to mean that every time there is an update to wasip3, we'll need to update getrandom to support the new version. Either we'll need to re-generate the generated binding code, or we'll need to increase the version of the wasip3 dependency. Is that right?

Collaborator Author

That's correct, yes. Until wasip3 is stable, that's what's required. This is a chief reason why wasm32-wasip3 is a tier 3 target in Rust with no precompiled binaries: it's not stable yet.

Contributor

I guess the MSRV of the wasip3 dependencies will keep increasing too.

Collaborator Author

Right now there's not really a concept of a supported Rust version for wasip3 since wasip3 isn't supported on stable on any Rust version, so sort of yeah. Once wasip3 is on stable then that means there are no more snapshots or anything and getrandom-the-crate can pick a version and forget about it.

pub fn get_random_u64() -> u64 {
    unsafe {
        #[cfg(target_arch = "wasm32")]
        #[link(wasm_import_module = "wasi:random/random@0.2.10")]
Member

Would #[link(wasm_import_module = "wasi:random/random@0.2.0")] work here if we are to use manual bindings?

Collaborator Author

Yes that'd work, but the type information in the custom section would need to be updated as well (e.g. generate bindings from the original 0.2.0 WITs instead of 0.2.10)

@alexcrichton
Collaborator Author

No problem! Wanted to at least show an example of what things could look like.

> it's a bit sad to see the amount of complexity introduced in new versions of WASI

Well, to be fair, it's all generated code. It isn't intended to be readable or to look as if it were written by a human: generating code like that is exceptionally difficult and doesn't really have all that many benefits in the end. Otherwise the goals here are summarized here, and they are far more ambitious than WASIp1's. We couldn't do about 95% of what we're doing with WASIp1, and doing more things can sometimes increase complexity. Mostly I want to highlight that we're not just making things complicated because we feel like it.

> I liked WASIp1 more with its straightforward syscall-like interface

The main difference is that with WASIp1 there was a handful of syscalls and that's it: there was literally nothing else, nor could there ever really be anything else (no migration path to anything else). With WASIp2 and components, APIs can have any shape and form and any name. Producing a component binary requires knowing this type information, and without the type information all you've got is a core wasm import. It's similar to how in assembly a function might use two input registers, but you don't actually have any idea what's in those registers (pointers? lengths? zero-extended 8-bit integers?). LLVM does not have native support for components, so everything here is the next-best approach, which is to schlep along type information.
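The register analogy can be made concrete with a small sketch. It is purely illustrative and not taken from the PR: both function names are hypothetical, and a byte slice stands in for wasm linear memory. The point is that the same two flat integer arguments admit more than one reading, and nothing in the core signature says which is intended.

```rust
// Two functions that receive the same pair of flat integer arguments,
// as a core wasm import would, but interpret them differently.

/// Reads the arguments as (pointer, length) into a linear memory and sums
/// the addressed bytes. Returns `None` if the range is out of bounds.
pub fn sum_bytes(memory: &[u8], ptr: u32, len: u32) -> Option<u32> {
    let start = ptr as usize;
    let end = start.checked_add(len as usize)?;
    let bytes = memory.get(start..end)?;
    Some(bytes.iter().map(|&b| u32::from(b)).sum())
}

/// Reads the same two arguments as zero-extended 8-bit integers.
pub fn add_u8s(a: u32, b: u32) -> u32 {
    (a & 0xff) + (b & 0xff)
}
```

Given the arguments `(1, 2)`, the first reading sums two bytes starting at offset 1, while the second simply adds 1 and 2. Component type information is what tells the tooling which reading applies.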

If you want you can try deleting the type section information. That would leave you with a WASIp1-like approach of "just define a function and call it". The problem you're then faced with is "who brings along the type information". For WASIp2 it's probably the standard library or something else. You'd end up just praying someone else brought it along though because without that you're unable to generate a component.
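A minimal sketch of that "just define a function and call it" approach, under stated assumptions: the wrapper name `random_u64` and the non-wasm fallback are hypothetical additions so the snippet builds on any host, and they are not part of this PR or of getrandom's API. The import only links into a component if something else supplies the type information for `wasi:random/random`.

```rust
// Hand-written import with no accompanying component type information.
// Only compiled for wasm32; the import module string matches the PR.
#[cfg(target_arch = "wasm32")]
#[link(wasm_import_module = "wasi:random/random@0.2.10")]
unsafe extern "C" {
    #[link_name = "get-random-u64"]
    fn get_random_u64() -> i64;
}

/// Hypothetical safe wrapper around the raw import.
#[cfg(target_arch = "wasm32")]
pub fn random_u64() -> Option<u64> {
    // The core wasm signature returns i64; reinterpret the bits as u64.
    Some(unsafe { get_random_u64() } as u64)
}

/// Off-target stand-in (an assumption of this sketch): it returns `None`
/// rather than pretending to produce randomness.
#[cfg(not(target_arch = "wasm32"))]
pub fn random_u64() -> Option<u64> {
    None
}
```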

@newpavlov
Member

newpavlov commented Mar 10, 2026

> It's similar to how in assembly a function might use two input registers, but you don't actually have any idea what's in those registers

I would expect to have a "base syscall layer" which would have its type information built into the WASI spec (it can be kept in the WIT form), while the component model would be used for extensions and richer abstractions.

Using your analogy, Linux defines syscalls and their ABIs statically without relying on any linker or code generation trickery. Sure, it's nice to have a machine-readable description of syscalls with rich type information, but I don't see why we need to duplicate type information for every syscall in generated binaries.

But I guess it's off-topic, my rant will not change WASIp2/3. :)

> If you want you can try deleting the type section information.

What is a potential failure mode of using:

#[link(wasm_import_module = "wasi:random/random@0.2.10")]
unsafe extern "C" {
    #[link_name = "get-random-u64"]
    fn get_random_u64() -> i64;
}

in a project which does not bring type information from elsewhere? Would it result in a wasm-component-ld failure?

@alexcrichton
Collaborator Author

Yeah, while handwritten bindings such as that will work to generate a core wasm module, linking may fail when wasm-component-ld is invoked without type information. Under the hood, wasm-ld (from LLVM) is invoked and produces a core wasm module which has the import, but when wasm-component-ld tries to turn that into a component it may not know the component type for that function.

This is difficult to see with the WASIp2 target because the type information comes from a number of other sources (e.g. libc, libstd, other crates, etc), so it's sort of tough to get into a situation where there's no type info. An example reproducer is this project compiled with:

$ cargo rustc --target wasm32-wasip2 -- -C link-arg=--wasi-adapter=none
...
   Compiling wat v0.1.0 (/home/alex/code/wat)
error: linking with `wasm-component-ld` failed: exit status: 1
  |
  = note:  "wasm-component-ld" "-flavor" "wasm" "--export" "hi" "-z" "stack-size=1048576" "--stack-first" "--allow-undefined" "--no-demangle" "<24 object files omitted>" "/home/alex/code/wat/target/wasm32-wasip2/debug/deps/{libgetrandom-6cce658c17674b1c,libdlmalloc-ff0aa16cc237f92c,libcfg_if-1c976c13f169ba83,libwit_bindgen-4f31fab50d6d10d1}.rlib" "<sysroot>/lib/rustlib/wasm32-wasip2/lib/{liballoc-*,libcore-*,libcompiler_builtins-*}.rlib" "-L" "/home/alex/code/wat/target/wasm32-wasip2/debug/build/wit-bindgen-352f3720fbcf01d5/out" "-L" "<sysroot>/lib/rustlib/wasm32-wasip2/lib/self-contained" "-o" "/home/alex/code/wat/target/wasm32-wasip2/debug/deps/wat.wasm" "--gc-sections" "--no-entry" "-O0" "--wasi-adapter=none"
  = note: some arguments are omitted. use `--verbose` to show all linker arguments
  = note: error: failed to encode component

          Caused by:
              0: failed to decode world from module
              1: module was not valid
              2: failed to resolve import `wasi:random/random@0.2.10::get-random-u64`
              3: module requires an import interface named `wasi:random/random@0.2.10`


error: could not compile `wat` (lib) due to 1 previous error

Here #![no_std] is used to avoid libstd's type information, cdylib avoids libc's type information, and --wasi-adapter=none avoids the legacy wasip1-to-wasip2 adapter's type information (part of wasm-component-ld). This is a bit of a contrived example, however, and may not reflect anyone's real-world practice.

If y'all want you could certainly try out:

#[cfg_attr(target_env = "p2", link(wasm_import_module = "wasi:random/random@0.2.10"))]
#[cfg_attr(target_env = "p3", link(wasm_import_module = "wasi:random/random@0.3.0-rc-..."))]
unsafe extern "C" {
    #[link_name = "get-random-u64"]
    fn get_random_u64() -> i64;
}

While I can produce a project in which this fails due to missing type information, I can't say that such a project is idiomatic or expected to show up much. I'd leave the cost calculus to y'all: removing deps from downstream lockfiles for niche platforms may be worth more to you than avoiding compile failures in niche scenarios on those platforms.
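If only the `get-random-u64` import is bound, a backend still has to fill arbitrary byte buffers from it. The sketch below shows one way that could look. It is an illustration under assumptions, not the code in this PR: `fill_from_u64` is a hypothetical helper, and the u64 source is passed as a closure so the logic can be exercised off-target with a deterministic stand-in.

```rust
/// Fills `dest` using repeated calls to a u64 source, such as a wrapper
/// around the `get-random-u64` import. Little-endian byte order is an
/// arbitrary choice made here so the behavior is the same on every host.
pub fn fill_from_u64(dest: &mut [u8], mut next_u64: impl FnMut() -> u64) {
    let mut chunks = dest.chunks_exact_mut(8);
    // Fill every complete 8-byte chunk with one u64 each.
    for chunk in &mut chunks {
        chunk.copy_from_slice(&next_u64().to_le_bytes());
    }
    // Fill the remaining 0..=7 bytes from one extra u64, if needed.
    let tail = chunks.into_remainder();
    if !tail.is_empty() {
        let bytes = next_u64().to_le_bytes();
        tail.copy_from_slice(&bytes[..tail.len()]);
    }
}
```

On the real target the closure would call the imported function; a buffer of 11 bytes, for example, would trigger two calls.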

FWIW y'all aren't the only ones who don't want all the rigamarole of everything involved here. Other folks in wasm have also sought out a simpler system/space. I'm personally hesitant to bake WASI into wasm-component-ld, for example, but that may truly be the best way forward. If that were done then you wouldn't need any bindings other than the small bit above.

@newpavlov
Member

What will happen with a manual binding without duplicated type information (e.g. #[link(wasm_import_module = "wasi:random/random@0.2.10")]) after a hypothetical v0.2.11 is released and used by std/wasip2? Is my understanding correct that wasm-component-ld would fail, since it would observe type information only about v0.2.11 symbols, which are theoretically different from those defined with @0.2.10?

@alexcrichton
Collaborator Author

Ah, that's a good point I should mention; it's a special case for a variety of reasons. More or less, it's fine to use 0.2.0, and that will work with whoever brings in type information. Technically any 0.2.N will work, and in the future simply writing 0.2 will work, but we haven't gotten there just yet. In the meantime the N in 0.2.N is basically just informational, so for getrandom I'd recommend just putting 0 there, i.e. rewriting to 0.2.0. That'll work so long as someone else brings in type information.
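The recommendation amounts to rewriting the patch component of a `0.2.N` version to `0`. In practice one would simply write the literal string `wasi:random/random@0.2.0` in the `#[link]` attribute; the helper below is a hypothetical illustration of the rule only, and is not an API of getrandom or wasm-component-ld.

```rust
/// Rewrites the patch component of a `0.2.N` interface version to `0`,
/// e.g. "wasi:random/random@0.2.10" becomes "wasi:random/random@0.2.0".
/// Names without a plain numeric `0.2.N` version are returned unchanged,
/// which includes WASIp3 release candidates.
pub fn normalize_p2_import(module: &str) -> String {
    if let Some((name, version)) = module.rsplit_once('@') {
        if let Some(patch) = version.strip_prefix("0.2.") {
            if !patch.is_empty() && patch.bytes().all(|b| b.is_ascii_digit()) {
                return format!("{name}@0.2.0");
            }
        }
    }
    module.to_string()
}
```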
