Shrink cwasm .wasmtime.{traps,addrmap} sections #13628

Open
alexcrichton wants to merge 2 commits into bytecodealliance:main from alexcrichton:shrink-sections

@alexcrichton

Member

This PR is me scratching an itch I've had for quite some time: the .wasmtime.{traps,addrmap} sections are generally huge and pretty inefficiently encoded. They are inserted into all *.cwasm outputs by default and can often represent over half the size of a compiled module, which I find pretty wasteful. I was curious to throw the problem at an LLM and see if it had recommendations on alternative encoding schemes, and I feel that the result here is pretty understandable and low-complexity while shrinking both of these sections by ~75% relative to their encodings today. The end result is a ~30% size reduction of a libpython.cwasm, from 25M to 17M, which is a relatively large improvement. The new encodings are drop-in replacements for the previous encoding/searching API. They share a general structure but not internals, so the two can still diverge over time if necessary.

This commit scratches an itch I've had for a long time about how we
encode traps into a final `*.cwasm`. The trap section is frequently a
pretty substantial portion of a `*.cwasm`, often around 10-15% of its
size. The goal of this commit is to shrink the size of this section by
at least a factor of two, and it currently shrinks it by ~75%.

The basic problem of this section is that it encodes 5 bytes of
information per trap: the u32 pc offset and the u8 trap code. The
previous encoding used all 5 bytes per trap, but this is generally not
the most efficient method. The other constraint for this section,
however, is that we want O(log N) search time to find the trap code for
a particular trapping offset, which rules out an encoding that requires
a linear scan of the whole section.

The general idea of this new encoding is as follows:

* Split the entire list of traps for a `*.cwasm` into fixed-width
  blocks, here defined as 128 traps-per-block.
* A fixed-width index is created which maps from first-pc-in-block to
  where-block-is-encoded. This index is the O(log N) search.
* Each block is encoded as:
  * First a trap code byte. Currently the most common trap in this block.
  * Next, for each entry in the block,
    `uleb((offset - prev_offset) << 1 | different_trap)` is encoded.
    This enables a delta-encoding of offsets which is the main source of
    compression, and the lowest bit, if present, means that the uleb is
    followed by a trap byte indicating what trap this offset corresponds
    to.
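The block encoding described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `write_uleb` and `encode_block` are hypothetical names, and the sketch assumes that the first delta in a block is taken relative to the block's first pc as stored in the index, which the description above does not spell out.

```rust
/// Append `value` to `out` as an unsigned LEB128 integer.
fn write_uleb(out: &mut Vec<u8>, mut value: u64) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        // More bytes follow, so set the continuation bit.
        out.push(byte | 0x80);
    }
}

/// Encode one block of `(pc offset, trap code)` pairs, sorted by offset.
/// Hypothetical helper: `base` is assumed to be the block's first pc.
fn encode_block(base: u32, traps: &[(u32, u8)]) -> Vec<u8> {
    // Pick the most common trap code in this block as the header byte.
    let mut counts = [0usize; 256];
    for &(_, code) in traps {
        counts[code as usize] += 1;
    }
    let common = (0..=255u8)
        .max_by_key(|&c| counts[c as usize])
        .unwrap_or(0);

    let mut out = vec![common];
    let mut prev = base;
    for &(offset, code) in traps {
        let different = code != common;
        // Widen to u64 so that the shift cannot overflow a u32 delta.
        let delta = u64::from(offset - prev);
        write_uleb(&mut out, (delta << 1) | u64::from(different));
        if different {
            // Low bit set: an explicit trap code byte follows the uleb.
            out.push(code);
        }
        prev = offset;
    }
    out
}
```

For example, `encode_block(0x100, &[(0x100, 3), (0x104, 3), (0x110, 7), (0x190, 3)])` yields 7 bytes, where the previous 5-bytes-per-trap layout would need 20 for the same four traps.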

Overall this gets the original 5-byte-per-trap overhead to roughly 1.5
bytes-per-trap which shaves off ~75% of the size of this section. The
lookup cost for traps is still O(log N), with a slightly higher
constant factor than before because the matching block is decoded
linearly after the index search.
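To make the lookup cost concrete, here is a minimal sketch of a search over this kind of layout: a binary search over the index followed by a bounded scan of one block. The `IndexEntry` struct, the `lookup_trap` function, and the convention that a block ends where the next one starts are all assumptions made for illustration, not the in-tree representation.

```rust
/// Read one unsigned LEB128 integer from `data`, advancing `*pos`.
fn read_uleb(data: &[u8], pos: &mut usize) -> Option<u64> {
    let mut result = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *data.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None; // malformed: too many continuation bytes
        }
        result |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

/// Hypothetical index entry: first pc in a block and where its bytes start.
struct IndexEntry {
    first_pc: u32,
    start: usize,
}

/// Look up the trap code recorded for `pc`, if any.
fn lookup_trap(index: &[IndexEntry], blocks: &[u8], pc: u32) -> Option<u8> {
    // O(log N): find the last block whose first pc is <= `pc`.
    let i = index.partition_point(|e| e.first_pc <= pc).checked_sub(1)?;
    let entry = &index[i];
    // Assumption: a block runs until the next block's start (or end of data).
    let end = index.get(i + 1).map_or(blocks.len(), |e| e.start);
    let block = blocks.get(entry.start..end)?;

    // Bounded linear scan: at most one block's worth of entries.
    let (&common, body) = block.split_first()?;
    let mut pos = 0;
    let mut cur = u64::from(entry.first_pc);
    while pos < body.len() {
        let word = read_uleb(body, &mut pos)?;
        cur = cur.checked_add(word >> 1)?;
        let code = if word & 1 == 1 {
            // Low bit set: the next byte overrides the block's common code.
            let c = *body.get(pos)?;
            pos += 1;
            c
        } else {
            common
        };
        if cur == u64::from(pc) {
            return Some(code);
        }
        if cur > u64::from(pc) {
            return None; // offsets are sorted, so `pc` is not present
        }
    }
    None
}
```

With 128 traps per block, the scan touches at most 128 entries regardless of how many traps the module contains, so only the index search grows with N.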

The 128 traps-per-block factor is relatively arbitrary at this time, but
some analysis showed that it was a relatively good sweet spot of not
being too big while still getting the lion's share of compression
benefits.

This commit mirrors the previous commit for the `.wasmtime.addrmap`
section of binaries. The encoding is similar in structure but the
encoding of each block is slightly different to handle the different
nature of the address map section. Notably, each entry encodes a
pc-delta whose lowest bit indicates whether the entry has a "none"
position or not. If a position is available then it's sleb-encoded as a
delta from the previous position.
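A rough sketch of the per-entry encoding described here is shown below. It is illustrative only: the function names are hypothetical, the polarity of the low bit (set when a position is present) is an assumption, and so is starting the position delta from 0 at the beginning of the encoded run.

```rust
/// Append `value` to `out` as an unsigned LEB128 integer.
fn write_uleb(out: &mut Vec<u8>, mut value: u64) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Append `value` to `out` as a signed LEB128 integer.
fn write_sleb(out: &mut Vec<u8>, mut value: i64) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7; // arithmetic shift preserves the sign
        let sign_bit_set = byte & 0x40 != 0;
        if (value == 0 && !sign_bit_set) || (value == -1 && sign_bit_set) {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Encode `(pc, optional source position)` entries, sorted by pc.
/// Hypothetical helper: `base_pc` is assumed to be the block's first pc.
fn encode_addrmap_entries(base_pc: u32, entries: &[(u32, Option<u32>)]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev_pc = base_pc;
    let mut prev_pos = 0i64; // assumption: position deltas start from 0
    for &(pc, pos) in entries {
        let delta = u64::from(pc - prev_pc);
        // Assumed polarity: low bit set means a position follows.
        write_uleb(&mut out, (delta << 1) | u64::from(pos.is_some()));
        if let Some(pos) = pos {
            let pos = i64::from(pos);
            // Positions can move backwards, hence a signed delta.
            write_sleb(&mut out, pos - prev_pos);
            prev_pos = pos;
        }
        prev_pc = pc;
    }
    out
}
```

The signed delta matters because consecutive machine instructions do not necessarily map to increasing source positions, while pc offsets within a sorted block only ever increase.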

The goal is to compress the 8-bytes-per-entry to ~2 bytes-per-entry
which is largely achieved with this commit. Each entry tends to be
pretty close pc-wise to the previous entry and pretty close source-wise
to the previous entry as well, so both deltas usually fit in a single
byte. Overall this shrinks the `.wasmtime.addrmap` section by ~75%
locally.

In sum, for a `libpython.so` this shaves off 8M of a 25M binary, saving
~30% in total file size between this optimization and the previous one.

cc bytecodealliance#3547 - note though this doesn't close the issue because this only
compresses the section better, it doesn't remove extraneous entries
which won't ever be needed.
@alexcrichton alexcrichton requested a review from a team as a code owner June 12, 2026 22:46
@alexcrichton alexcrichton requested review from pchickey and removed request for a team June 12, 2026 22:46