feat(dwarf): make --dwarf remap work on real DWARF + witness oracle (v0.20.0) by avrabe · Pull Request #209 · pulseengine/meld

avrabe · 2026-05-29T20:08:47Z

Summary

v0.19.0 shipped DwarfHandling::Remap validated on synthetic DWARF. This PR runs the falsification the witness-integration epic (#130) called for — fusing a real Rust component (hello_rust.wasm) with --dwarf remap and checking the output DWARF against real compiler output. That immediately falsified the v0.19.0 remap (it stripped on the real fixture), surfacing four address-handling gaps and gimli's all-or-nothing failure mode. All are now fixed, and a strong end-to-end oracle pins the result.

This is the #3 (witness proof) + the multi-source documentation (#208) slice of the planned v0.20.0 (async #142 follows as v0.21.0).

What the falsification found (and fixed)

Gap	Cause	Fix
Every `low_pc` missed	WebAssembly DWARF `low_pc` is the function body start (locals-decl byte), not the first instruction	locals/prefix region maps linearly (locals preserved verbatim)
End addresses missed	half-open `[low,high)` — `high_pc`/range-end/`end_sequence` point one past the last instruction	map `addr == input_end` → `output_body_end`
Mid-instruction addresses missed	LLVM emits fixed-width zero-padded LEBs for relocatable indices; DWARF rows land inside operand bytes	resolve to the containing operator (exact on boundaries)
Data addresses missed	`DW_OP_addr` linear-memory locations share the address space	pass through unchanged when `>= code_extent` (correct under multi-memory)
One miss stripped all DWARF	`gimli::write::Dwarf::from` is all-or-nothing on `convert_address`	per-address tombstoning — total `convert_address`, `0xFFFF_FFFF` for unmappable code

The guarantee strengthens from "all-or-nothing correct-or-strip" to "every emitted address is correct or an ignored tombstone — never a plausible-but-wrong address."

The oracle (`tests/dwarf_remap_witness.rs`)

Fuses a real Rust component with --dwarf remap, parses the output DWARF with gimli, and asserts every non-tombstone DW_TAG_subprogram low_pc equals some fused function's component-provenance code_range.start — two independently-derived facts (DWARF vs. re-parsed code section) that only line up if the remap math is right. This is the pulseengine/witness code-offset → source-line contract, checked in-tree.

Result on the fixture: all but 9 of 225 functions map exactly; the 9 are dead code meld dropped, correctly tombstoned.

Scope / residual

Multi-DWARF-source fusion still falls back to strip — merging independent DWARF unit sets is blocked by gimli's writer API; analysis + relocator design filed as DWARF Phase 2 follow-up: multi-source DWARF merge (cross-module .debug_* relocator) #208.
gimli added as a dev-dependency (already a normal dep) for the oracle.
dwarf.rs is Tier-5, so this PR gets the Mythos delta scan.
LS-D-1 amended with the v0.20.0 hardening; verified by run_ls_verification.py ([ OK ] LS-D-1).

🤖 Generated with Claude Code

…130) Falsifying the v0.19.0 single-source remap against a real Rust component (hello_rust.wasm) surfaced four address classes the synthetic oracle missed, plus gimli's all-or-nothing failure mode. All fixed: - low_pc is the function BODY START (locals-decl byte), not the first instruction — the locals/prefix region now maps linearly (locals are preserved verbatim). Previously every low_pc underflowed to a miss. - Exclusive-end addresses (high_pc-as-addr, range ends, line end_sequence) map to output_body_end. - LLVM emits fixed-width zero-padded LEBs for relocatable indices, so DWARF rows can land inside operand bytes — these resolve to the CONTAINING operator (exact on boundaries) for correct line attribution. - Linear-memory/data addresses (DW_OP_addr) at/beyond the code extent pass through unchanged (correct under multi-memory fusion). - Per-address tombstoning replaces all-or-nothing strip: gimli's Dwarf::from aborts on a single unmappable address, which on real modules (dead-code gaps) stripped ALL DWARF. convert_address is now total — correct offset where mappable, identity for data/tombstones, 0xFFFF_FFFF for unmappable code. Guarantee strengthens to "every emitted address is correct or an ignored tombstone, never wrong". Witness oracle (tests/dwarf_remap_witness.rs, gimli dev-dep): fuses a real Rust component with --dwarf remap and asserts every non-tombstone subprogram low_pc equals some fused function's component-provenance code_range.start (the pulseengine/witness contract, in-tree). On the fixture all but 9 of 225 functions map exactly; the 9 are dead code. LS-D-1 amended with the v0.20.0 hardening. Multi-source still strips (gimli writer-API blocker, see #208). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

temper-pulseengine · 2026-05-29T20:13:22Z

Automated review for PR #209

pulseengine/meld:feat/dwarf-remap-witness → pulseengine/meld:main

Verdict: 💬 Comment

Summary: The commit introduces a real-DWARF hardening and per-address tombstoning feature in MeldCore. It includes several bug fixes and improvements to address issues found during testing.

Findings: 0 mechanical (rivet) · 1 from local AI model.

Findings (1):

Cargo.toml:10
```
version = "0.20.0"
```
The version number has been updated to reflect the new feature.

Generated by a local AI model and post-validated against a strict JSON contract. Each finding includes the verbatim line being criticised — verify by reading the file at the cited location.

Reviewed at d4b9c34

github-actions · 2026-05-29T20:17:24Z

LS-N verification gate

⚠️ 36/38 verified — 2 missing regression tests

	count
Passed (≥1 test, all green)	36
Failed (≥1 test failure)	0
Missing (no `ls__NN_` test found)	2

_{Approved loss-scenarios.yaml entries are expected to have a

regression test named ls_<letter>_<num>_* (e.g. LS-A-11 →

ls_a_11_*). The gate runs each prefix via cargo test --lib --no-fail-fast and aggregates pass/fail/missing.}

Failed LS entries

(none)

Missing regression tests

LS-R-13
LS-M-6

_{Updated automatically by tools/post_verification_comment.py.

Source of truth: safety/stpa/loss-scenarios.yaml.}

github-actions · 2026-05-29T20:36:52Z

Mythos delta-pass (auto)

✅ NO FINDINGS across 1 Tier-5 file(s)

File	Verdict	Hypothesis
``	✅ NO FINDINGS	—

_{Auto-run via anthropics/claude-code-action@v1

(SHA-pinned) on the touched Tier-5 files, using the

maintainer's Max-plan OAuth token. See

.github/workflows/mythos-auto.yml and

scripts/mythos/discover.md.}

) The Mythos auto-scan flagged that `AddressRemap::translate` could emit a plausible-but-wrong address at a function boundary. Input bodies are contiguous, so one address is both function A's exclusive end (A.input_end) and the next function B's start (B.input_start); the BTreeMap range lookup selects B and returned B.output_body_start, which differs from A.output_body_end when A and B are not adjacent in the fused output (interleaved / reordered functions). A's high_pc-as-addr, range-list end, and end_sequence then mapped to a wrong offset. `translate` now detects the divergent-boundary case (prev.output_body_end != span.output_body_start) and returns None so the caller tombstones the address rather than emit a wrong one. When the two interpretations coincide (input order preserved — the single-source case meld emits) the boundary resolves cleanly. Pinned by `translate_ambiguous_boundary_tombstones_when_reordered`; LS-D-1 amended. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…(Mythos #209) Second Mythos auto-scan finding: AddressRemap::code_extent used the max input_end over REGISTERED spans, so a source module's dropped trailing function (DCE'd, no span) has input addresses exceeding code_extent. convert_address then classified them as linear-memory data and passed them through as stale input-module offsets instead of tombstoning. code_extent is now the input module's full code-section extent (every input function, via module_function_layouts which parses them all), so a dropped function's address stays in the code range and tombstones. Pinned by `code_extent_covers_dropped_trailing_function`; LS-D-1 amended. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

avrabe and others added 2 commits May 30, 2026 05:52

github-actions Bot added the mythos-pass-done Mythos delta-pass completed on Tier-5 file changes; findings (or NO FINDINGS) attached to PR label May 30, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(dwarf): make --dwarf remap work on real DWARF + witness oracle (v0.20.0)#209

feat(dwarf): make --dwarf remap work on real DWARF + witness oracle (v0.20.0)#209
avrabe wants to merge 3 commits into
mainfrom
feat/dwarf-remap-witness

avrabe commented May 29, 2026

Uh oh!

temper-pulseengine Bot commented May 29, 2026

Uh oh!

github-actions Bot commented May 29, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 29, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

avrabe commented May 29, 2026

Summary

What the falsification found (and fixed)

The oracle (tests/dwarf_remap_witness.rs)

Scope / residual

Uh oh!

temper-pulseengine Bot commented May 29, 2026

Automated review for PR #209

Uh oh!

github-actions Bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

LS-N verification gate

Uh oh!

github-actions Bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Mythos delta-pass (auto)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

The oracle (`tests/dwarf_remap_witness.rs`)

github-actions Bot commented May 29, 2026 •

edited

Loading

github-actions Bot commented May 29, 2026 •

edited

Loading