feat: formal verification framework + StateMetric (bridge Step 3) by rororowyourboat · Pull Request #139 · BlockScience/gds-core

rororowyourboat · 2026-03-28T11:38:35Z

Summary

Phase 1a: Property-based tests for composition algebra (interchange law, associativity, commutativity, identity) — 13 Hypothesis tests
Phase 2: Formal proofs for R1/R2/R3 representability boundaries — R3 undecidability (6 propositions), R1/R2 decidability bounds, Theorem C.3 partition independence
Phase 3: Property-based round-trip testing for OWL export/import — 26 Hypothesis tests covering GDSSpec, SystemIR, CanonicalGDS, VerificationReport, SHACL validation, SPARQL conformance
StateMetric (Paper Assumption 3.2): structural annotation declaring distance functions on state variable subsets — completes bridge Steps 1-3
Research journal, verification plan, and 5 GitHub issues tracking the work
Consolidated research docs into docs/research/
CI reproducibility via Hypothesis derandomize profiles

Test plan

499 gds-framework tests pass (13 algebra PBT + 9 StateMetric tests)
184 gds-owl tests pass (26 round-trip PBT + 1 StateMetric round-trip)
mkdocs build --strict passes
Ruff lint + format clean

…ge code Research artifacts (formal analysis, paper gap, representability, deep research) don't belong in published packages or user guides. Consolidate them under docs/research/ and update all cross-references.

…trip Phase 1a: Hypothesis tests in gds-framework verifying the interchange law, associativity, commutativity, and identity properties of the composition algebra (>>, |) across 200 random block configurations each. Phase 3: Hypothesis strategies generating random GDSSpec instances and round-trip testing through rho -> Turtle RDF -> rho^{-1}, verifying structural fidelity for types, spaces, entities, blocks, wirings, and roles across 100 random specs per property. Also adds the verification plan document and hypothesis as a dev dependency for both packages.

Structured log tracking verification research — motivation, method, outcome, and next steps for each entry. Covers project setup, literature review, and Phase 1a/3 property-based testing results.

Fixes all 7 audit findings: 1. (Critical) stackable_pair excludes shared token from extra ports 2. (Critical) sequential concatenation test now asserts backward ports 3. (Important) identity tests cover both left and right for >> and | 4. (Important) gds_specs min_blocks=3 guarantees at least 1 Policy 5. (Important) removed vacuous lossiness test (covered by fixtures) 6. (Important) wire round-trip checks source/target/space, not just count 7. (Minor) Mechanism import moved to module level Adds deterministic Hypothesis profiles (derandomize=True) activated via HYPOTHESIS_PROFILE=ci env var in CI workflow for repeatable test runs.

R3 undecidability: six proofs by reduction from Rice's theorem and the halting problem, covering f_behav, constraint predicates, admissibility predicates, auto-wiring, construction validation, and scheduling. R1/R2 decidability: constructive witnesses from gds-owl export functions and SHACL shapes (13 R1 shapes, 3 R2 shapes, 7 SPARQL templates). Partition independence: proves G_struct/G_behav coincides with R1+R2/R3 via independent definitions from canonical decomposition vs formal language expressiveness, with argument for non-tautology.

…ision Fixes 8 audit findings: 1. (Critical) Aligned R3 definition to parent doc (Def 2.2): R3 has two sources — undecidability (Rice's) and computational class separation. Introduced R3-undecidable vs R3-separation sub-classifications. 2. (Critical) Auto-wiring (Prop 4) correctly labeled as R3-separation, not undecidable. Proof shows SPARQL expressiveness gap, not Rice's. 3. (Critical) Construction validation (Prop 5) now distinguishes current validators (decidable, R3-separation) from extensible case (arbitrary @model_validator, R3-undecidable). 4. (Important) Theorem C.3 reverse containment: "finite relations" are "R1 or R2" not just "R1". 5. (Important) Forward containment explicitly invokes Rice's theorem as the connecting step from Callable to undecidability. 6. (Important) Consistent SHACL-core qualifier throughout Part B, with note distinguishing SHACL-core from SHACL-SPARQL. 7. (Minor) Removed fragile line-number references from witness table. 8. (Minor) Prop 6 simplified to "not in data model" argument only.

…diction) Replace "computation beyond SPARQL" with intrinsic characterization "computation requiring mutable intermediate state or ordered multi-pass processing" so Definition C.1 is genuinely independent of the formalism. Also adds journal entries 003-004 covering PBT audit fixes, CI reproducibility, and Phase 2 formal proofs with both audit rounds.

… 3c-d) Phase 3c: SHACL conformance tests (30 examples each) validate that exported RDF from random GDSSpec and SystemIR passes structural SHACL shapes. SPARQL blocks_by_role sanity check verifies block names match. Separate test class to avoid pyshacl overhead on every round-trip. Phase 3d: TestDerivedPropertyPreservation verifies that downstream operations produce equivalent results after spec round-trip: - SystemIR: name, block names, wiring count, composition type (50 ex) - CanonicalGDS: boundary/policy/mechanism sets, state variables (50 ex) - VerificationReport: system name, finding check IDs, G-002 expected failures on BoundaryAction + Mechanism only (50 ex) New strategies: system_irs(), specs_with_canonical(), specs_with_report() derive from gds_specs() by composing blocks with >> and calling compile_system/project_canonical/verify. Total: 25 PBT tests (13 spec round-trip + 3 SHACL + 9 derived property).

1. (Important) SPARQL test moved to own TestSPARQLConformance class, no longer gated behind HAS_PYSHACL 2. (Important) G-002 test split: invariant check in TestStrategyInvariants, round-trip version added (test_g002_invariant_survives_round_trip) 3. (Important) Finding check uses Counter for multiplicity, plus len() assertion to catch collapsed duplicates 4. (Important) SystemIR wiring checks source/target content, not just count 5. (Minor) Removed dead else branch (min_blocks=3 guarantees n_blocks > 2) 6. (Minor) Extracted _compose_sequential() helper, used by both system_irs() and specs_with_report()

Step 3 of the bridge proposal: declare a distance function on a subset of state variables. Follows the AdmissibleInputConstraint / TransitionSignature pattern — structural fields are R1, behavioral distance callable is R3 (lossy in serialization). gds-framework: - StateMetric model in constraints.py (name, variables, metric_type, distance callable, description) - GDSSpec.register_state_metric() + validation (_validate_state_metrics) - Exported from gds.__init__ gds-owl: - RDF export: StateMetric class + MetricVariableEntry blank nodes - RDF import: round-trip with distance=None (R3 lossy) - SHACL shape: StateMetricShape (name required) - Round-trip fixture test 9 new framework tests (model, registration, validation). 1 new OWL round-trip fixture test.

rohan added 12 commits March 28, 2026 15:55

refactor: move research docs to docs/research/ to separate from packa…

5a3f1fa

…ge code Research artifacts (formal analysis, paper gap, representability, deep research) don't belong in published packages or user guides. Consolidate them under docs/research/ and update all cross-references.

docs: add formal verification research journal

662b27e

Structured log tracking verification research — motivation, method, outcome, and next steps for each entry. Covers project setup, literature review, and Phase 1a/3 property-based testing results.

docs: add journal entry 005 — Phase 3c-d completion and audit

929ba5f

style: apply ruff format to modified files

7296937

rororowyourboat merged commit 2a01beb into main Mar 28, 2026
10 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: formal verification framework + StateMetric (bridge Step 3)#139

feat: formal verification framework + StateMetric (bridge Step 3)#139
rororowyourboat merged 12 commits intomainfrom
dev

rororowyourboat commented Mar 28, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

rororowyourboat commented Mar 28, 2026

Summary

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant