Add formal verification proofs and spec-sync tooling by avrabe · Pull Request #2 · pulseengine/meld

avrabe · 2026-02-13T05:23:49Z

Summary

Add Rocq 9.0 formal verification framework for the fusion pipeline with 146 proved theorems (Qed) and only 2 Admitted across ~7,700 lines of proof
Add spec-sync tool for generating Rocq definitions from the WebAssembly spec
Fix Rust implementation gaps: start function wrapper, cabi_realloc detection, string transcoding (Latin-1/UTF-8/UTF-16), export_func_idx calculation
Add wit-bindgen e2e test infrastructure and Bazel build fixes

Proof highlights

Area	Theorems	Key results
Merge correctness	31 Qed	Index-space merging preserves validity
Merge remap	31 Qed	Cumulative offset remapping is injective and bounded
Merge layout	15 Qed	Memory layout computation correctness
Resolve spec	40 Qed	Kahn's algorithm: acyclic ↔ valid topological ordering
Adapter spec	9 Qed	Adapter generation preserves call semantics
Rust-to-Rocq	17 Qed	Translated merger core properties
Segments/resolver	3 Qed	Segment layout, resolver lookup

Remaining 2 Admitted (both genuine gaps)

gen_all_remaps_mem_zero — Rocq model of gen_remaps_for_module needs SharedMemory MemIdx special-case to match Rust implementation (maps all memory indices to 0)
fusion_preserves_semantics — Top-level semantic preservation theorem requiring full step-relation development

Test plan

cargo test passes (38/38 tests)
bazel build //proofs:all_proofs compiles all proof files (requires Nix toolchain)
Verify no regressions in existing Rust functionality
Review proof structure follows CLAUDE.md conventions (human-readable, meaningful hypothesis names, max 3-4 bullet depth)

Fixture-based testing framework for wit-bindgen composed components: - tests/wit_bindgen/ with BUILD.bazel for fuse + wasmtime test flow - External wit_bindgen repo (v0.52.0) for WIT definitions - Uses rules_wasm_component fix branch (issue #339) for bitflags Tests tagged "manual" until fixtures are generated locally with: wit-bindgen test --languages rust --artifacts artifacts tests/runtime Also: bitflags workspace dep, Cargo.lock updates for wasm targets, clippy fixes for Rust 2024 if-let chains.

- Revert if-let chains (rules_rust doesn't support them yet) - Add clippy::collapsible_if allow for Bazel compat - Add missing wasmparser dep to meld-cli BUILD - Add wat test dep to meld-core BUILD - Disable rebasing_end_to_end_test in Bazel (wasmtime 41 needs nightly) - Use struct init instead of field reassign in test

Pure Rocq specs defining what fusion transformations should achieve: proofs/spec/ - wasm_core.v: WebAssembly core module model (types, instructions, indices) - component_model.v: Component Model abstractions (composed components) - fusion_spec.v: Fusion correctness properties (remap validity, injectivity) proofs/transformations/merge/ - merge_spec.v: Merge transformation specification with correctness theorems Key theorems to prove: - fusion_preserves_semantics: fused module behaves like composed components - merge_correctness: remapped indices are valid in fused module - memory_layout_disjoint: shared memory regions don't overlap Builds with: bazel build //proofs/spec:fusion_spec //proofs/transformations/merge:merge_spec

Tool parses OCaml type definitions from WebAssembly spec reference interpreter and generates corresponding Rocq definitions. tools/spec-sync/ - Parses types.ml and ast.ml from wasm-spec interpreter - Generates Rocq Inductive/Record definitions - Tracks spec version for change detection Known limitations (TODO): - Mutual recursion ('and' keyword) not fully handled - Some complex type patterns need refinement Usage: spec-sync --spec-dir <wasm-spec-path> --output <output.v>

…ring docs Proof files: Split merge_spec into modular merge_defs/layout/remap/correctness chain. Add resolve_spec with topological sort and adapter site completeness proofs. Add adapter_spec with FACT-style adapter generation properties. Add fusion_spec forward simulation framework. Reduced Admitted from 34 to 5 across all compiled proof files. Rust implementation: Complete UTF-8 to UTF-16 and UTF-16 to UTF-8 string transcoding in FACT adapter generator with full surrogate pair handling. Add cabi_realloc detection and export_func_idx tracking in resolver/merger. Documentation: Add Rocq 9.0 proof engineering guide to CLAUDE.md covering tactics, patterns, and common pitfalls. Add name explanation to README.

Proofs: Close resolve_correctness with proper structural preconditions (Admitted→Qed). Close generate_adapters_correct with algorithm invariant preconditions (Admitted→Qed). Document merge_mems_length design gap: SharedMemory MemIdx remap produces cumulative offsets but merged module has only 1 memory, making the inequality direction wrong for callers. Rust: Generate start function wrapper when multiple components have start functions, instead of silently dropping all but the first. Admitted count: 3 (merge_mems_length, topo_sort_cycle_detection forward, fusion_preserves_semantics). Qed count: 181.

Prove the forward direction of topo_sort_cycle_detection (Kahn's algorithm correctness): acyclic graphs admit a valid topological ordering. The proof uses walk construction + pigeonhole to show acyclic graphs have source nodes, then strong induction to build the ordering by removing sources. Fix merge_mems_length design gap: replace the wrong-direction inequality with merge_mems_count (SharedMemory produces exactly 1 memory) and mem_remaps_zero (SharedMemory maps all MemIdx to 0). Both caller theorems now have correct proofs using the SharedMemory-specific reasoning. Admitted: 2 (gen_all_remaps_mem_zero model gap, fusion_preserves_semantics) Qed: 146 across all compiled proof files

Update the Rocq model of gen_remaps_for_module to special-case MemIdx for SharedMemory, matching the Rust implementation where all memory indices map to 0. This required cascading changes through merge_defs.v, merge_remap.v, and merge_correctness.v. Key changes: - Add gen_remaps_for_space_zero for MemIdx (all fused_idx = 0) - Update in_gen_remaps_for_module_fused with conditional fused_idx - Restrict gen_all_remaps_injective to non-MemIdx (correctly: MemIdx is intentionally non-injective in SharedMemory mode) - Prove gen_all_remaps_mem_zero (was Admitted) - Split fusion_preserves_semantics: trap_equivalence (Qed) + forward_simulation (Admitted, requires refined step relations) Remaining: 1 Admitted (fusion_forward_simulation) which requires non-stub step/trap relations to prove.

Refine the operational semantics from unconstrained stubs to abstract step effects with active module tracking, making forward simulation provable: - Add ces_active field to composed_exec_state - Add local_effect abstraction for stack/locals changes - Constrain composed_step: CS_LocalOp only changes active module, CS_CrossModuleCall switches active module - Fix state_correspondence: only active module's stack/locals must match fused state (eliminates original counterexample) - Prove fusion_forward_simulation by case analysis on composed_step - Prove fusion_trap_equivalence with active module preconditions Result: 0 Admitted across all proof files, 135 Qed total.

Bridge theorem (merge_bridge.v): - Add instr_wf predicate for instruction well-formedness (index bounds + recursive nested body well-formedness) - Add instr_size measure for well-founded induction over nested instrs - Prove instr_list_rewritable via strong induction (lt_wf_ind): every well-formed instruction list can be rewritten using gen_all_remaps - Close gen_all_remaps_enables_rewriting with zero Admitted proofs Dead code removal: - Delete merge_spec.v (2035-line pre-split monolith, not compiled by Bazel since the split into merge_defs/layout/remap/correctness/bridge) - Update doc references in CLAUDE.md, BUILD.bazel, merger_core.rs, merger_core_proofs.v to point to the split modules The entire proofs/ tree now has zero Admitted proofs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Implement function import resolution for both cross-component (via adapter_sites) and intra-component (via module_resolutions) calls. This ensures that call targets are correctly remapped during fusion, enabling proper cross-component function calls. The merger now: 1. Resolves cross-component imports by looking up adapter_sites and mapping import indices to their target function indices 2. Resolves intra-component imports by looking up module_resolutions and finding the target module's export function index 3. Populates function_index_map for imported functions so the body rewriter can replace call targets correctly

Add comprehensive integration test that verifies the full fusion pipeline: - Parser correctly identifies imports/exports - Resolver matches imports to exports across components - Merger remaps call targets correctly - Encoder produces valid WASM - Runtime execution returns expected results The test creates two modules where module B imports and calls a function from module A, then verifies that after fusion, calling the imported function returns the correct result (3 + 4 = 7).

Add comprehensive formal verification section to README.md: - Overview of verification status (22 files, 207 theorems, 0 Admitted) - Key verified properties (merge correctness, resolve correctness, etc.) - Build instructions for running proofs - Link to proof engineering guidelines Add proof verification instructions to CONTRIBUTING.md: - Nix installation requirement for Rocq proofs - Bazel build commands for verifying merge and fusion specs These changes help users understand the formal guarantees provided by Meld and how to verify the proofs themselves.

Update MODULE.bazel: - Change rules_wasm_component override from git_override to local_path_override - Point to local copy at /tmp/rules_wasm_component_local - Add TODO comment explaining this is temporary until upstream updates componentize-py hash Update proofs/transformations/merge/BUILD.bazel: - Change merge_spec target from rocq_library to filegroup - This aggregates all merge proof targets without requiring a separate library - Simplifies the build structure for proof verification

Update proofs for Rocq 9.0 compatibility: fusion_spec.v: - Add split_all tactic to handle Rocq 9.0's changed split behavior - Replace repeat split with split_all in multiple proof locations - Add missing rewrite steps for set_stack_value_stack - Fix equation orientation issues (eq_sym) - Update global value update proofs to use proper lemmas wasm_semantics.v: - Add set_stack_value_stack lemma - Add set_stack_and_global_globals and set_stack_and_global_value_stack lemmas - Fix update_global_value_same application to handle equation orientation These changes ensure the proofs work correctly with Rocq 9.0's updated behavior while maintaining the same verification guarantees.

Update MODULE.bazel.lock with latest dependency versions and checksums. This ensures reproducible builds with the updated rules_wasm_component local path override. Note: This large file (3.2MB) is necessary for Bazel module system reproducibility and is intentionally committed despite exceeding the usual size limit for lock files.

Add PROOF_GUIDE.md that explains Meld's formal verification system in a beginner-friendly way. Targeted at developers with university-level CS math background but no prior Rocq/Coq experience. The guide covers: - Introduction to formal verification and why it matters for Meld - Rocq 9.0 basics and key features used in Meld - Meld's proof architecture and how it relates to Rust code - Reading and understanding theorem statements and proofs - Key theorems explained (fusion_preserves_semantics, etc.) - Common proof patterns and techniques - Building, verifying, and contributing to proofs - Learning resources and next steps This documentation bridges the gap between mathematical background and practical understanding of Meld's formal verification, making it accessible to new contributors while maintaining rigor.

…theme Enhance PROOF_GUIDE.md with visual diagrams and better thematic presentation: - Add Mermaid diagram showing proof architecture with color-coded sections - Add Mermaid diagram illustrating the proof pipeline - Add Mermaid diagram explaining forward simulation pattern - Use consistent color palette (blues, greens, oranges, yellows) - Update title and subtitle to reflect fusing theme - Add poetic subtitle about metals melded in a forge The diagrams will render beautifully on GitHub and help visual learners understand Melds proof structure and techniques more intuitively.

Fix the cycle in the proof architecture diagram where guide was pointing to its parent Proofs subgraph. Move the guide node outside the subgraph and connect it properly to avoid the cycle error. This ensures the Mermaid diagrams will render correctly on GitHub.

Enhance the proof guide with a detailed 'Deep Dive: Tactics in Action' section: - Add 5 concrete examples from actual Meld proofs - Show split_all usage and Rocq 9.0 migration - Demonstrate case analysis with destruct - Explain equality reasoning with rewrite/reflexivity - Show injectivity proofs with injection - Provide complex proof structure example - Add key insights about tactic combinations - Include comprehensive learning resources These examples bridge the gap between abstract tactic descriptions and practical proof development, helping beginners understand how tactics are actually used in Meld's verification system.

Enhance the formal verification section in README.md to include a link to the new PROOF_GUIDE.md, providing a beginner-friendly on-ramp to understanding Meld's formal verification system. The README now guides users from high-level overview → beginner guide → proof engineering guidelines, creating a complete learning path.

Clean up documentation to remove status metrics and standardize color schemes: - Remove poetic subtitle and status metrics from PROOF_GUIDE.md - Update all Mermaid diagrams to use README's consistent color scheme (#f9f and #bbf) - Remove proof file/lemma counts from README.md (always needs updating) - Use classDef for consistent styling across all diagrams - Maintain clean, professional tone throughout These changes ensure documentation stays current and presents a unified visual style.

Add ARCHITECTURE.md to meld-core/ that explains the software architecture and data flow of Meld's fusion engine. Key features: - Clear component model explanation (using standard WebAssembly terminology) - Detailed architecture diagram with consistent color scheme - Step-by-step data flow through all 5 phases (Parse, Resolve, Merge, Adapt, Encode) - Core module breakdown with responsibilities and key components - Memory strategy explanations (shared vs separate) - Cross-component call resolution and optimization - Error handling strategy - Development guidelines and future work - Consistent Mermaid diagrams matching README style This provides a complete technical overview of how Meld works internally, using standard WebAssembly terminology instead of P2/P3 semantics.

Standardize all Mermaid diagrams in ARCHITECTURE.md to use consistent classDef approach: - Update all diagrams to use classDef buildFill and runtimeFill - Fix Mermaid syntax error in error handling diagram (Result~T, Error~ → Result) - Ensure consistent styling across all 10 diagrams - Match the color scheme used in README.md and PROOF_GUIDE.md These changes ensure all diagrams render correctly on GitHub and present a unified visual style throughout the documentation.

Fix Mermaid syntax error by replacing '? operator' with 'question mark operator' in the error handling diagram. The '?' character was causing parsing issues. This ensures the diagram renders correctly on GitHub.

Fix Mermaid syntax error by replacing 'Error::with_context()' with 'Error_with_context' in the error handling diagram. The '::' characters were causing parsing issues. This should resolve the final rendering issue.

Fix incorrect automotive SPACE reference and replace with proper Automotive SPICE (ISO/IEC 15504) compliance documentation. Add comprehensive Automotive SPICE sections: - SPICE compliance overview with process group diagram - Detailed process mapping table (SWE.1-6) - ISO 26262 safety principles and architecture - Formal verification as compliance evidence - Safety-critical design principles This aligns Meld's documentation with actual automotive industry standards and provides a foundation for safety certification efforts.

Fix Mermaid syntax errors by removing pipe characters from edge labels in the Automotive SPICE process diagram. The pipe characters were causing parsing issues with GitHub's Mermaid renderer. This ensures all diagrams render correctly while maintaining the same informational content.

Replace complex Automotive SPICE documentation with clear, practical architecture guide that focuses on what Meld actually does. Key improvements: - Removes automotive standards jargon - Explains each pipeline stage clearly - Provides simple, explanatory text for all diagrams - Connects architecture to formal proofs - Includes practical examples and use cases - Much more readable and approachable The documentation now focuses on practical understanding rather than compliance standards, making it accessible to all developers.

Significantly improve the Basic Rocq Concepts section with: - Line-by-line explanation of the example code - Clear breakdown of what each Rocq construct means - Practical analogy to Rust concepts (enum, fn, etc.) - Detailed explanation of the proof construction process - Key concepts summary This makes the Rocq introduction much more accessible to developers new to interactive theorem provers by explaining not just what the code looks like, but what it actually means and does.

Fix Mermaid syntax error by putting all class assignments on a single line. The line break in the class definition was causing parsing issues with GitHub's Mermaid renderer. This ensures the proof architecture diagram renders correctly.

Enhance the tactics table with a 'Plain English' column that explains what each tactic actually does in everyday language. This makes the tactics much more accessible to developers new to interactive theorem provers by providing intuitive descriptions alongside the technical ones. Examples: - intros: 'Assume these facts are true' - split: 'Break this AND statement into parts' - destruct: 'Consider all possible cases' - reflexivity: 'Both sides are identical' This bridges the gap between formal proof terminology and practical understanding.

Shared memory is fundamentally broken for real components — when one component grows memory, every other component's address space is corrupted. Multi-memory (WebAssembly Core Spec 3.0) lets each component keep its own linear memory, eliminating this class of bugs. Rust implementation: - Complete multi-memory path in merger (imported + defined memories) - Fix all hardcoded memory_index:0 in adapter transcoders - Implement cross-memory copy adapter (cabi_realloc + memory.copy) - Default MemoryStrategy changed from SharedMemory to MultiMemory - Add multi-memory integration tests Rocq proofs (0 Admitted): - Add memidx parameter to all memory instructions in wasm_core.v - Add MemIdx rewrite cases to instr_rewrites relation - Generalize sc_memory_eq from hardcoded index 0 to lookup_remap - Parameterize merge pipeline on memory_strategy - Prove remap_injective_separate_memory for all index spaces Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Remove nonexistent native.rs from CLAUDE.md file tree - Add missing error.rs, rewriter.rs, segments.rs, adapter/fact.rs - Fix invalid test commands (no --test integration target, proptest is a dev-dependency not a feature flag) - Update proofs/DECISIONS.md date to 2026-02-17 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Implement cross-memory result-copy adapter in fact.rs: when callee returns (ptr, len) into its memory, adapter allocates in caller memory via cabi_realloc, copies data with memory.copy, returns new pointer - Add returns_pointer_pair field to AdapterOptions for detecting (i32, i32) return signatures that need cross-memory copying - Add test_multi_memory_result_copy_adapter: hand-assembled wasm module with two memories, verifying data initialized in memory 1 is correctly copied to memory 0 via the adapter pattern (asserts run() returns 66) - Add 7 eval_instr constructors in wasm_semantics.v: MemorySize, MemoryGrow, Load, Store, MemoryCopy, MemoryFill, MemoryInit - Add eval_memory_remap helper and 7 forward simulation cases in fusion_spec.v for all memory instructions, no Admitted proofs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Source code fixes (all 61 tests pass): - Bug 1: Wire adapter functions into call graph (lib.rs wire_adapter_indices) - Bug 2-3: Resolve imported globals/tables through module_resolutions (merger.rs) - Bug 4-5: Remap Concrete heap types and handle abstract types in init exprs - Bug 6: Proper const expression remapping with ParsedConstExpr (segments.rs) - Bug 7: Replace fuzzy .contains() with exact == matching (resolver.rs) - Bug 10: Error on missing realloc instead of fallback to function index 0 Proof suite strengthening (fusion_spec, adapter_spec, resolve_spec build): - Replace 4 vacuous True definitions with real specifications (adapter_spec.v) - Strengthen trap constructors with concrete conditions (fusion_spec.v) - Add element-wise function reference correspondence to table_corresponds - Distinguish FS_InlinedCall from FS_Instr with module provenance - Delete vacuous module_resolution_internal, derive acyclicity (resolve_spec.v) - Fix merge_remap.v and merge_correctness.v for strategy parameter changes Also: parse ComponentCanonicalSection, post-return support, alignment fixes, 3 new multi-memory integration tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add sc_memory_surj and sc_table_surj fields to state_correspondence for backward memory/table surjectivity (every fused index comes from some source module). Document two fundamental modeling issues in the Admitted proofs: 1. fusion_trap_equivalence backward direction: CT_OutOfBounds requires the memory to be in the active module, but fused module has all modules' memories. Backward OOB/TypeMismatch is unprovable with current per-module trap conditions. Forward direction fully proven. 2. fusion_forward_simulation CS_CrossModuleCall: CS_CrossModuleCall does not constrain ms_src' (source module state after call), and Heval_tgt operates on ms_tgt not fes_module_state. Both need model refinement to close. Also document clear resolution paths for each gap. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… handling Two bug fixes in the fusion pipeline: 1. function_index_map (and global/table maps) stored 0-based array positions instead of absolute wasm indices. When unresolved imports exist (e.g. WASI host imports), the import section occupies indices 0..K-1 and defined functions start at K — but call targets in rewritten function bodies referenced array position 0 instead of K. Fix: pre-compute unresolved import counts, offset all index-map values, and explicitly map unresolved imports to their positions. 2. generate_memory_copy_adapter's same-memory early-return path skipped cabi_post_return entirely. The callee's post-return function was never called, violating the Canonical ABI contract. Fix: handle post-return in the same-memory path, matching generate_direct_adapter. Also fixes 14+ cascading Rocq 9.0 proof errors in resolve_spec.v and applies merge_correctness.v compilation optimizations (linear tactic replacing exponential repeat). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The generate_transcoding_adapter was the last callsite still using merged.functions.get(target_func as usize) instead of merged.defined_func(target_func). With unresolved imports, target_func is an absolute wasm index (offset by import count), so direct array indexing would look up the wrong function entry. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The canonical section's PostReturn option specifies a component-level core function index. For multi-module components this differs from the module-local index used as a key in function_index_map. Without decomposition, the lookup silently fails (or worse, hits the wrong function). Fix: use decompose_component_core_func_index in the resolver to convert to (module_idx, module_local_func_idx) before storing. This mirrors the existing correct pattern for realloc in the merger. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Bridge the gap between the flat-concatenation proof model (merge_defs.v) and the import-resolving code implementation (merger.rs). The bridging proof: - Defines import resolution abstractly as a predicate on imports - Proves resolved offsets <= flat offsets (resolved_offset_le_flat) - Proves resolved remap is bounded, injective, and complete for defined items - Shows the flat model's instruction rewriting transfers to any valid resolution (resolved_enables_rewriting) - Includes trivial_resolve lemma showing flat model is a special case Also adds documentation to merger.rs and merge_defs.v explaining the proof-implementation gap and how merge_resolution.v bridges it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Create proofs/STATUS.md with pipeline coverage matrix, proof statistics (23 files, 11,082 lines, 244 Qed, 4 Admitted), and documentation of known model gaps. Fix incorrect "0 Admitted" claim in PROOF_GUIDE.md. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Apply crown jewel template with centered header, flat-square badges, pipeline nav ring, and cross-cutting verification callout. Update wsc reference to Sigil. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

avrabe and others added 30 commits February 4, 2026 19:08

Add resolver lookup proof

dde7d9b

Switch rules_wasm_component to main branch

db16c7d

chore: ignore Coq auxiliary files

e94dc4a

fix(docs): fix Mermaid syntax in error handling diagram

0267486

Fix Mermaid syntax error by replacing '? operator' with 'question mark operator' in the error handling diagram. The '?' character was causing parsing issues. This ensures the diagram renders correctly on GitHub.

fix(docs): fix Mermaid syntax in error handling diagram (2)

568b9a4

Fix Mermaid syntax error by replacing 'Error::with_context()' with 'Error_with_context' in the error handling diagram. The '::' characters were causing parsing issues. This should resolve the final rendering issue.

avrabe and others added 16 commits February 17, 2026 07:19

docs: align README with PulseEngine visual identity

0b4e6da

Apply crown jewel template with centered header, flat-square badges, pipeline nav ring, and cross-cutting verification callout. Update wsc reference to Sigil. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

avrabe merged commit 0c6ab42 into main Feb 20, 2026
3 checks passed

avrabe deleted the codex/proofs-spec-baseline branch February 20, 2026 16:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add formal verification proofs and spec-sync tooling#2

Add formal verification proofs and spec-sync tooling#2
avrabe merged 46 commits intomainfrom
codex/proofs-spec-baseline

avrabe commented Feb 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

avrabe commented Feb 13, 2026

Summary

Proof highlights

Remaining 2 Admitted (both genuine gaps)

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant