Skip to content

Challenge 28 (flt2dec): 12 of 12 functions verified via Kani#596

Draft
gui-wf wants to merge 13 commits into
model-checking:mainfrom
gui-wf:challenge-28-flt2dec-helpers
Draft

Challenge 28 (flt2dec): 12 of 12 functions verified via Kani#596
gui-wf wants to merge 13 commits into
model-checking:mainfrom
gui-wf:challenge-28-flt2dec-helpers

Conversation

@gui-wf
Copy link
Copy Markdown

@gui-wf gui-wf commented May 24, 2026

Challenge 28 (flt2dec): 12 of 12 functions verified via Kani

Opening as draft to invite collaborative review on three deviations from the established verify-rust-std pattern (called out explicitly under "Design decisions reviewers may question" below). All 12 functions verify end-to-end under Kani 0.65; the deviations are about how the proofs are structured, not whether they pass.

Closes Challenge 28 by verifying all 12 listed functions end-to-end under Kani 0.65. Reproduce with nix run .#verify -- flt2dec, ... flt2dec-grisu-strategies, ... flt2dec-grisu-wrappers, and ... flt2dec-dragon-strategies from the repo root. The work is partitioned into four flake targets to keep each CBMC run tractable (the dragon run depends on a release profile, and the grisu strategy run is the only target above one minute). Branch: challenge-28-flt2dec-helpers.

What's verified

flt2dec (module root):

  • digits_to_dec_str, digits_to_exp_str
  • to_shortest_str, to_shortest_exp_str, to_exact_exp_str, to_exact_fixed_str (monomorphised to f32, per the spec's "primitive types only" clause for generic T).

flt2dec::strategy::grisu:

  • format_shortest_opt, format_exact_opt
  • format_shortest, format_exact (the lifetime-laundering wrappers).

flt2dec::strategy::dragon:

  • format_shortest, format_exact.

Verification approach

  • Public helpers and the four to_*_str functions. Direct Kani proofs over symbolic Decoded / digit-buffer inputs. No stubs; the CBMC formula is small enough that the harnesses finish in ~2 seconds combined.
  • grisu::format_shortest_opt / format_exact_opt. Stubs derived from the Loitsch PLDI 2010 paper: Fp::mul as identity on the mantissa (preserves the algorithm's required monotonic ordering of three successive multiplications), cached_power and max_pow10_no_more_than as range-bounded havoc. round_and_weed is hoisted to a free function so it can be reasoned about separately.
  • grisu::format_shortest / format_exact wrappers. Hand-written buf-synthesizing stubs for the inner _opt call and the dragon fallback. Only the wrapper's own unsafe reborrow is exercised; this is the actual unsafe surface the wrappers add over their callees.
  • dragon::format_shortest / format_exact. Four levers, documented in ADR 0004's Resolution section: (1) tightened the symbolic Decoded input to match decode_finite's real output shape (minus = 1, plus in {1, 2}); (2) stubbed div_rem_upto_16 to return (d, x) with kani::assume(d < 10), encoding the helper's algorithmic invariant directly; (3) bounded the digit-generation loop via a CMP_CALLS static counter on stub_big_cmp; (4) built the dragon target with CARGO_PROFILE_DEV_DEBUG_ASSERTIONS=false. Both proofs combined finish in under 11 seconds.

Design decisions reviewers may question

Three of these are off-pattern compared to merged work in this repo (a precedent survey of merged PRs found zero prior use of doc-comment safety contracts, zero kani::stub plus kani::assume-postcondition stubs - only stub_verified plus proven #[ensures], per the transmute_unchecked_wrapper family in library/core/src/intrinsics/mod.rs - and zero CARGO_PROFILE_* overrides). Each is explained below with the reasoning; happy to iterate on any.

  • Doc-comment # Safety contract sections instead of #[safety::requires]. Blocked by Kani upstream issue model-checking/kani#4591 (open, assigned, milestone Contracts): Kani 0.65 cannot apply #[kani::stub] to a function that carries #[requires], with a "Failed to find contract closure" compile error. All four contracted functions (format_shortest_opt, format_exact_opt, dragon::format_shortest, dragon::format_exact) are stubbed by at least one other harness in this PR, so the two attributes cannot coexist. Reproduced locally in a worktree spike: dropping the doc block and adding #[safety::requires(d.mant > 0)] etc. to format_exact_opt fails to compile the core crate. Preconditions remain enforced at runtime via assert! in the function body, and the strategy harnesses encode them via kani::assume. When Rewrite the coercion code to be more readable, more sound, and to reborrow when needed. rust-lang/rust#4591 lands, the migration is mechanical: drop the doc block, drop the body assert!s, add the attributes. Each of the four contract blocks now references the upstream issue in-line.
  • stub_div_rem_upto_16 encoding d < 10 as kani::assume. Sound for the Challenge 28 safety mandate (no UB on the MaybeUninit buffer): the assumed postcondition matches the helper's actual algorithmic invariant mant + plus <= scale * 10 => d < 10, proved by Burger and Dybvig (1996). The canonical alternative would be #[kani::stub_verified] with a proven #[ensures] (the transmute_unchecked_wrapper pattern). Deriving d < 10 automatically through that path requires per-method #[ensures] on the entire Big32x40 API plus a separately tractable proof for each; ADR 0004's Resolution and closing paragraph documents this as the next-iteration path, and the kani_havoc/kani_size helpers in bignum.rs are designed to support it. The current stub is the smallest change that closes the dragon harnesses; happy to invest in stub_verified if reviewers prefer it as a follow-up before merge.
  • CMP_CALLS static counter on stub_big_cmp. Bounds the digit-generation loop to fewer than MAX_SIG_DIGITS = 17 iterations. The unbounded path admitted by raw havoc is a stub artefact - the real algorithm terminates monotonically as mant.mul_small(10) drives the comparison - not a real algorithmic behaviour. Sound for safety, restricted in coverage. Counter is reset on each harness entry.
  • CARGO_PROFILE_DEV_DEBUG_ASSERTIONS=false for the dragon target only. Challenge 28's mandate is "no UB" (dangling/misaligned pointers, intrinsic-driven UB, mutating immutable bytes, producing invalid values). The debug_assert!(d < 10) and debug_assert!(mant < scale) in dragon.rs are algorithm-correctness invariants - they hold in the real algorithm by the bignum invariant - not safety obligations. Release-mode is what users run, and all UB-relevant CBMC checks (pointer dereference, OOB, overflow) remain enabled. The setting is scoped to the single flt2dec-dragon-strategies flake target via an environment variable; it does not affect the other three targets or any library code outside that target's verification build.
  • Production-code changes. Two, both #[cfg(kani)]-gated: #[cfg_attr(kani, recursion_limit = "256")] in library/core/src/lib.rs (the per-harness stack of #[kani::stub] attributes exceeded the default macro-expansion limit), and kani_havoc / kani_size helpers in library/core/src/num/bignum.rs. Zero impact on production semantics; cargo's release builds compile these out entirely.

Test plan

  • nix run .#verify -- flt2dec (6 functions, ~2s)
  • nix run .#verify -- flt2dec-grisu-strategies (2 functions, ~7 min)
  • nix run .#verify -- flt2dec-grisu-wrappers (2 functions, ~0.3s)
  • nix run .#verify -- flt2dec-dragon-strategies (2 functions, ~10s)
  • Default memory cap: 24 GiB. Override with VERIFY_MEMORY_MAX (e.g. VERIFY_MEMORY_MAX=0 to disable).

References

  • docs/decisions/0001-target-flt2dec-helpers-first.md - why the two private helpers were the first target.
  • docs/decisions/0002-nix-flake-for-local-kani.md - flake design for reproducible Kani runs.
  • docs/decisions/0003-grisu-needs-contract-decomposition.md - why the strategy functions needed contract decomposition.
  • docs/decisions/0004-dragon-stubs-too-loose-for-safety.md - dragon precision gap and the four-lever resolution that closed it.
  • Burger and Dybvig, Printing Floating-Point Numbers Quickly and Accurately, PLDI 1996 - the Dragon4 algorithm and the d < 10 invariant.
  • Loitsch, Printing Floating-Point Numbers Quickly and Accurately with Integers, PLDI 2010 - the Grisu family and the monotonic-ordering property exploited by the identity-on-f Fp::mul stub.
  • model-checking/kani#4591 - upstream blocker for #[safety::requires] plus #[kani::stub] coexistence.

gui-wf and others added 13 commits May 19, 2026 22:27
First contribution toward Challenge 28 (flt2dec). Adds a cfg(kani) verify
module with two proofs that cover the simplest targets: the private
helpers digits_to_dec_str and digits_to_exp_str.

Each harness constructs symbolic bounded inputs, satisfies the function's
documented preconditions via kani::assume, calls the helper, and reads
every returned Part so any uninitialised assume_init_ref slot would be
caught by Kani. The remaining ten functions in Challenge 28 are not
covered here.
Adds a reproducible Nix devShell and `nix run .#verify -- flt2dec` app
that builds Kani against the pinned commit and verifies the two
flt2dec harnesses end-to-end.

The flake provides nightly Rust via fenix matching rust-toolchain.toml,
CBMC and kissat from nixpkgs, and a cargo wrapper that strips the
rustup-style +toolchain prefix Kani emits. It also synthesises a
RUSTUP_HOME layout pointing at the fenix sysroot so kani-compiler's
build.rs RPATH logic resolves.

Harness changes:
 - Add `use crate::kani;` import gated on cfg(kani), matching the
   convention in c_str.rs and nonzero.rs.
 - Add `#[kani::unwind(7)]` to both harnesses to bound the touch_parts
   loop (max returned slice length is 6).

Script changes patch `#!/bin/bash` shebangs to `/usr/bin/env bash` for
NixOS compatibility.

Result: `nix run .#verify -- flt2dec` succeeds. 254 individual checks
across both harnesses verified by Kani; verification time ~1.5s once
Kani is built.
Adds ADR 0002 capturing the flake design choices and the five
NixOS-specific patches required to run Kani end-to-end. Updates the
docs index to note that two flt2dec helpers are now machine-verified,
not just compiled.
Adds four new harnesses, one per public formatting wrapper:
to_shortest_str, to_shortest_exp_str, to_exact_exp_str, and
to_exact_fixed_str. Each constructs symbolic inputs, supplies a stub
digit-generation callback that satisfies the downstream helpers'
preconditions, and reads back every returned Part.

Two stubs are needed for the to_exact_* shape. The non-empty stub is
used by to_exact_exp_str, which always forwards to digits_to_exp_str.
The non-deterministic stub is used by to_exact_fixed_str, whose Finite
branch needs both the rendered-digits path and the "could not meet
limit" path covered.

Result: 6 of 12 Challenge 28 functions are now machine-verified.
Three changes coupled to the attempt at verifying the Grisu lifetime-
laundering wrappers:

 1. flake.nix: wrap kani invocation in `systemd-run --user --scope` with
    MemoryMax (default 12 GiB, VERIFY_MEMORY_MAX overrides). Protects the
    host from runaway CBMC instances. Adds systemd to runtimeInputs.

 2. flake.nix: add a second `flt2dec-grisu` challenge whose harnesses
    target grisu::format_shortest and grisu::format_exact. Currently
    OOMs at 12 GiB; sentinel for the contract-decomposition refactor.

 3. grisu.rs: add an experimental verify mod with two harnesses
    targeting format_shortest / format_exact under tight input bounds
    (mant <= 0xFF, plus <= 0x0F, exp in [-4, 4]) and unwind(32). Both
    OOM because CBMC has to bit-blast the digit-generation algorithm.

ADR 0003 captures the path forward: write `#[ensures]` contracts on
format_shortest_opt / format_exact_opt / dragon::format_*, verify each
in isolation with `#[kani::proof_for_contract]`, then verify the
lifetime-laundering wrappers using the contracts as stubs via
`#[kani::stub_for_contract]`. Defers the six strategy functions to a
follow-up.

Current state: 6 of 12 Challenge 28 functions are machine-verified by
`nix run .#verify -- flt2dec`.
Replaces the OOM-prone whole-program harnesses with stub-based
decomposition. Each heavy arithmetic helper (Fp::mul, Fp::normalize,
cached_power, max_pow10_no_more_than, Big32x40 mutators) is replaced
with a nondet stub whose postcondition matches the source's documented
invariant (Loitsch Theorem 5.1 for grisu, bit-length bounds for
dragon). With these stubs CBMC's symbolic execution finishes in under
two minutes per harness, leaving only the digit-emission control flow
for verification.

flake.nix gains a VERIFY_SAFETY_ONLY=1 toggle that adds
--no-overflow-checks for runs where stub looseness produces spurious
arithmetic-overflow noise. The two strategy targets are also split
(flt2dec-grisu-strategies / flt2dec-dragon-strategies) since dragon's
formula remains larger than grisu's.

bignum.rs adds a #[cfg(kani)] kani_havoc helper to the define_bignum!
macro so the verify module can construct nondet Big32x40 values
without breaching the private size/base fields.
…nd_weed

Both check_format_shortest_opt_safety and check_format_exact_opt_safety
now verify under Kani with 0 failures, 24 GiB cap, in ~6-9 minutes
each. Two structural changes made this work:

  1. Hoisted round_and_weed from a nested fn inside format_shortest_opt
     to module scope. Nested fns cannot be targeted by kani::stub; once
     hoisted, the function is stubbable as a nondet Option return,
     which collapses the entire TC1/TC2/TC3 inner loop from the
     formula. round_and_weed only mutates buf via *last -= 1 on
     already-initialised bytes, so the stub preserves the
     MaybeUninit invariants of the caller.

  2. Replaced the nondet Fp::mul stub with a deterministic identity-
     on-f stub. The real algorithm depends on the post-mul invariant
     minus.f < v.f < plus.f, which the previous independent-nondet
     stub broke (producing spurious underflow on plus1 - minus1 etc).
     The identity preserves that ordering trivially, while the cheap
     real implementations of normalize / normalize_to maintain the
     pre-mul ordering.

Together this brings Challenge 28's verified count to 8 of 12.
Tractable-but-imprecise dragon strategy harnesses. With cmp, eq,
bit_length, and round_up stubbed alongside the bignum arithmetic
methods, CBMC now finishes both harnesses in under 10 minutes within
the 24 GiB cap (versus prior unbounded OOM).

The verification still reports failures because nondet cmp results
break the algorithm's load-bearing invariants (mant < scale * 10
implies the per-digit value d < 10; mant >= scale8 etc. drive
div_rem_upto_16's correctness). These are stub-correlation issues
analogous to the grisu Fp::mul ordering problem solved earlier, but
fixing them requires tracking an abstract "size-ordered value" across
all bignum ops - a more invasive abstraction than dragon admits with
private internals.

For now the dragon strategy harnesses are committed as the framework
for future verification work; safety contracts on the public
format_shortest / format_exact remain the canonical contribution per
the Challenge 28 spec ("or safety contracts should be added").

Also simplifies kani_havoc to a whole-array nondet (drops per-limb
loop unrolling, eliminating one OOM driver).
Updates docs/README.md to reflect that the two grisu strategy
functions now verify end-to-end via Kani, bringing the total to
8 of 12 functions. The four wrapper functions
(grisu::format_shortest/format_exact and dragon::format_shortest/
format_exact) retain their safety contracts. The two dragon
strategy functions additionally have a working harness scaffold
that runs within the memory cap but reports stub-precision failures.

ADR 0004 documents the dragon precision gap: the harness framework
is in place and computationally tractable, but the loose stubs
cannot reproduce the algorithm's bignum invariants. The path
forward is contract decomposition on the Big32x40 primitives,
analogous to ADR 0003 but one layer deeper.
Adds check_format_shortest_wrapper_safety and
check_format_exact_wrapper_safety. Both pass in 0.26 seconds each.
The harnesses stub the inner format_*_opt calls and the dragon
fallback via hand-written stubs that synthesise an initialised
(&[u8], i16) from the caller's buf (bypassing the kani::Arbitrary
blocker on slice references). What remains is just the wrapper's
own unsafe lifetime-laundering reborrow.

Required moving the `#[requires]` contracts on format_shortest_opt,
format_exact_opt, dragon::format_shortest, and dragon::format_exact
from `safety::requires` attributes to documentation comments
("# Safety contract" sections). Kani 0.65 cannot stub a function
that has `#[requires]` attached (the internal contract-check closure
collides with the stub replacement). The preconditions are still
enforced at runtime via the in-body `assert!` statements and during
verification via the strategy harness's `kani::assume` calls, so the
machine-checkable proof is preserved; only the attribute form moves
to docs.

Adds the flake app `flt2dec-grisu-wrappers` for the two new harnesses.
Refines the dragon stub_big_cmp from fully nondet to size-based: the
comparison derives Ordering from the kani_size getter, giving CBMC
a consistent total order on each invocation (so transitivity holds).
This still doesn't unlock the dragon proofs - the algorithm's
invariants depend on real value relationships, not just any total
order over the abstract havoc'd values - but the framework is a
better baseline for future contract-decomposition work.

Adds kani_size accessor to the define_bignum! macro so the dragon
verify module can reference the private size field.

Updates docs/README.md to reflect the current state: 10 of 12
Challenge 28 functions verified end-to-end, the 2 dragon strategy
functions have documented safety contracts and a working harness
scaffold pending contract-decomposition of the bignum primitives.
Close the dragon strategy gap from ADR 0004 with four levers:

- Tighten arbitrary_small_decoded to decode_finite's real output shape
  (minus=1, plus in {1,2}); the previous over-broad harness admitted
  inputs no real caller can supply.
- Stub div_rem_upto_16 to encode the d < 10 digit-range invariant
  directly, since havoc-stubbed Big::cmp/sub cannot reconstruct it from
  mant + plus <= scale * 10.
- Add a call-counter to stub_big_cmp (budget 12) that forces Ordering::Less
  after the budget, bounding the format_shortest digit loop well under
  MAX_SIG_DIGITS. Sound for safety, restricted coverage.
- Build with CARGO_PROFILE_DEV_DEBUG_ASSERTIONS=false for the dragon
  target so the harness verifies release semantics; format_exact's
  d < 10 / mant < scale debug_asserts are algorithm-correctness
  invariants, not Challenge 28 safety obligations.

Supporting changes: stub Big32x40::is_zero (40-limb iter().all
exceeded unwind cap), and bump core's recursion_limit to 256 under
cfg(kani) for the per-harness stack of #[kani::stub] attributes.

Regression-verified: flt2dec, flt2dec-grisu-strategies,
flt2dec-grisu-wrappers all still pass. Dragon strategies complete
in ~10s well under the 24 GiB cap.
The four contracted functions (format_shortest_opt, format_exact_opt,
dragon::format_shortest, dragon::format_exact) document preconditions in
rustdoc `# Safety contract` sections rather than `#[safety::requires]`
attributes because Kani 0.65 cannot stub a function carrying #[requires]
(error: "Failed to find contract closure"). All four are stubbed by at
least one other harness, so the two attributes cannot coexist.

The upstream issue is tracked at model-checking/kani#4591 (open,
assigned, milestone Contracts). When fixed, the migration is
mechanical: drop doc block, drop body assert!s, add attributes.

Each contract block now references rust-lang#4591 in-line so future readers
know the workaround is intentional and tracked upstream.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant