Skip to content

feat: formal verification framework + StateMetric (bridge Step 3)#139

Merged
rororowyourboat merged 12 commits intomainfrom
dev
Mar 28, 2026
Merged

feat: formal verification framework + StateMetric (bridge Step 3)#139
rororowyourboat merged 12 commits intomainfrom
dev

Conversation

@rororowyourboat
Copy link
Copy Markdown
Collaborator

Summary

  • Phase 1a: Property-based tests for composition algebra (interchange law, associativity, commutativity, identity) — 13 Hypothesis tests
  • Phase 2: Formal proofs for R1/R2/R3 representability boundaries — R3 undecidability (6 propositions), R1/R2 decidability bounds, Theorem C.3 partition independence
  • Phase 3: Property-based round-trip testing for OWL export/import — 26 Hypothesis tests covering GDSSpec, SystemIR, CanonicalGDS, VerificationReport, SHACL validation, SPARQL conformance
  • StateMetric (Paper Assumption 3.2): structural annotation declaring distance functions on state variable subsets — completes bridge Steps 1-3
  • Research journal, verification plan, and 5 GitHub issues tracking the work
  • Consolidated research docs into docs/research/
  • CI reproducibility via Hypothesis derandomize profiles

Test plan

  • 499 gds-framework tests pass (13 algebra PBT + 9 StateMetric tests)
  • 184 gds-owl tests pass (26 round-trip PBT + 1 StateMetric round-trip)
  • mkdocs build --strict passes
  • Ruff lint + format clean

rohan added 12 commits March 28, 2026 15:55
…ge code

Research artifacts (formal analysis, paper gap, representability, deep
research) don't belong in published packages or user guides. Consolidate
them under docs/research/ and update all cross-references.
…trip

Phase 1a: Hypothesis tests in gds-framework verifying the interchange law,
associativity, commutativity, and identity properties of the composition
algebra (>>, |) across 200 random block configurations each.

Phase 3: Hypothesis strategies generating random GDSSpec instances and
round-trip testing through rho -> Turtle RDF -> rho^{-1}, verifying
structural fidelity for types, spaces, entities, blocks, wirings, and
roles across 100 random specs per property.

Also adds the verification plan document and hypothesis as a dev
dependency for both packages.
Structured log tracking verification research — motivation, method,
outcome, and next steps for each entry. Covers project setup, literature
review, and Phase 1a/3 property-based testing results.
Fixes all 7 audit findings:
1. (Critical) stackable_pair excludes shared token from extra ports
2. (Critical) sequential concatenation test now asserts backward ports
3. (Important) identity tests cover both left and right for >> and |
4. (Important) gds_specs min_blocks=3 guarantees at least 1 Policy
5. (Important) removed vacuous lossiness test (covered by fixtures)
6. (Important) wire round-trip checks source/target/space, not just count
7. (Minor) Mechanism import moved to module level

Adds deterministic Hypothesis profiles (derandomize=True) activated via
HYPOTHESIS_PROFILE=ci env var in CI workflow for repeatable test runs.
R3 undecidability: six proofs by reduction from Rice's theorem and the
halting problem, covering f_behav, constraint predicates, admissibility
predicates, auto-wiring, construction validation, and scheduling.

R1/R2 decidability: constructive witnesses from gds-owl export functions
and SHACL shapes (13 R1 shapes, 3 R2 shapes, 7 SPARQL templates).

Partition independence: proves G_struct/G_behav coincides with R1+R2/R3
via independent definitions from canonical decomposition vs formal
language expressiveness, with argument for non-tautology.
…ision

Fixes 8 audit findings:

1. (Critical) Aligned R3 definition to parent doc (Def 2.2): R3 has two
   sources — undecidability (Rice's) and computational class separation.
   Introduced R3-undecidable vs R3-separation sub-classifications.

2. (Critical) Auto-wiring (Prop 4) correctly labeled as R3-separation,
   not undecidable. Proof shows SPARQL expressiveness gap, not Rice's.

3. (Critical) Construction validation (Prop 5) now distinguishes current
   validators (decidable, R3-separation) from extensible case (arbitrary
   @model_validator, R3-undecidable).

4. (Important) Theorem C.3 reverse containment: "finite relations" are
   "R1 or R2" not just "R1".

5. (Important) Forward containment explicitly invokes Rice's theorem as
   the connecting step from Callable to undecidability.

6. (Important) Consistent SHACL-core qualifier throughout Part B, with
   note distinguishing SHACL-core from SHACL-SPARQL.

7. (Minor) Removed fragile line-number references from witness table.

8. (Minor) Prop 6 simplified to "not in data model" argument only.
…diction)

Replace "computation beyond SPARQL" with intrinsic characterization
"computation requiring mutable intermediate state or ordered multi-pass
processing" so Definition C.1 is genuinely independent of the formalism.

Also adds journal entries 003-004 covering PBT audit fixes, CI
reproducibility, and Phase 2 formal proofs with both audit rounds.
… 3c-d)

Phase 3c: SHACL conformance tests (30 examples each) validate that
exported RDF from random GDSSpec and SystemIR passes structural SHACL
shapes. SPARQL blocks_by_role sanity check verifies block names match.
Separate test class to avoid pyshacl overhead on every round-trip.

Phase 3d: TestDerivedPropertyPreservation verifies that downstream
operations produce equivalent results after spec round-trip:
- SystemIR: name, block names, wiring count, composition type (50 ex)
- CanonicalGDS: boundary/policy/mechanism sets, state variables (50 ex)
- VerificationReport: system name, finding check IDs, G-002 expected
  failures on BoundaryAction + Mechanism only (50 ex)

New strategies: system_irs(), specs_with_canonical(), specs_with_report()
derive from gds_specs() by composing blocks with >> and calling
compile_system/project_canonical/verify.

Total: 25 PBT tests (13 spec round-trip + 3 SHACL + 9 derived property).
1. (Important) SPARQL test moved to own TestSPARQLConformance class,
   no longer gated behind HAS_PYSHACL
2. (Important) G-002 test split: invariant check in TestStrategyInvariants,
   round-trip version added (test_g002_invariant_survives_round_trip)
3. (Important) Finding check uses Counter for multiplicity, plus
   len() assertion to catch collapsed duplicates
4. (Important) SystemIR wiring checks source/target content, not just count
5. (Minor) Removed dead else branch (min_blocks=3 guarantees n_blocks > 2)
6. (Minor) Extracted _compose_sequential() helper, used by both
   system_irs() and specs_with_report()
Step 3 of the bridge proposal: declare a distance function on a subset
of state variables. Follows the AdmissibleInputConstraint /
TransitionSignature pattern — structural fields are R1, behavioral
distance callable is R3 (lossy in serialization).

gds-framework:
- StateMetric model in constraints.py (name, variables, metric_type,
  distance callable, description)
- GDSSpec.register_state_metric() + validation (_validate_state_metrics)
- Exported from gds.__init__

gds-owl:
- RDF export: StateMetric class + MetricVariableEntry blank nodes
- RDF import: round-trip with distance=None (R3 lossy)
- SHACL shape: StateMetricShape (name required)
- Round-trip fixture test

9 new framework tests (model, registration, validation).
1 new OWL round-trip fixture test.
@rororowyourboat rororowyourboat merged commit 2a01beb into main Mar 28, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant