feat: formal verification framework + StateMetric (bridge Step 3)#139
Merged
rororowyourboat merged 12 commits intomainfrom Mar 28, 2026
Merged
feat: formal verification framework + StateMetric (bridge Step 3)#139rororowyourboat merged 12 commits intomainfrom
rororowyourboat merged 12 commits intomainfrom
Conversation
added 12 commits
March 28, 2026 15:55
…ge code Research artifacts (formal analysis, paper gap, representability, deep research) don't belong in published packages or user guides. Consolidate them under docs/research/ and update all cross-references.
…trip
Phase 1a: Hypothesis tests in gds-framework verifying the interchange law,
associativity, commutativity, and identity properties of the composition
algebra (>>, |) across 200 random block configurations each.
Phase 3: Hypothesis strategies generating random GDSSpec instances and
round-trip testing through rho -> Turtle RDF -> rho^{-1}, verifying
structural fidelity for types, spaces, entities, blocks, wirings, and
roles across 100 random specs per property.
Also adds the verification plan document and hypothesis as a dev
dependency for both packages.
Structured log tracking verification research — motivation, method, outcome, and next steps for each entry. Covers project setup, literature review, and Phase 1a/3 property-based testing results.
Fixes all 7 audit findings: 1. (Critical) stackable_pair excludes shared token from extra ports 2. (Critical) sequential concatenation test now asserts backward ports 3. (Important) identity tests cover both left and right for >> and | 4. (Important) gds_specs min_blocks=3 guarantees at least 1 Policy 5. (Important) removed vacuous lossiness test (covered by fixtures) 6. (Important) wire round-trip checks source/target/space, not just count 7. (Minor) Mechanism import moved to module level Adds deterministic Hypothesis profiles (derandomize=True) activated via HYPOTHESIS_PROFILE=ci env var in CI workflow for repeatable test runs.
R3 undecidability: six proofs by reduction from Rice's theorem and the halting problem, covering f_behav, constraint predicates, admissibility predicates, auto-wiring, construction validation, and scheduling. R1/R2 decidability: constructive witnesses from gds-owl export functions and SHACL shapes (13 R1 shapes, 3 R2 shapes, 7 SPARQL templates). Partition independence: proves G_struct/G_behav coincides with R1+R2/R3 via independent definitions from canonical decomposition vs formal language expressiveness, with argument for non-tautology.
…ision Fixes 8 audit findings: 1. (Critical) Aligned R3 definition to parent doc (Def 2.2): R3 has two sources — undecidability (Rice's) and computational class separation. Introduced R3-undecidable vs R3-separation sub-classifications. 2. (Critical) Auto-wiring (Prop 4) correctly labeled as R3-separation, not undecidable. Proof shows SPARQL expressiveness gap, not Rice's. 3. (Critical) Construction validation (Prop 5) now distinguishes current validators (decidable, R3-separation) from extensible case (arbitrary @model_validator, R3-undecidable). 4. (Important) Theorem C.3 reverse containment: "finite relations" are "R1 or R2" not just "R1". 5. (Important) Forward containment explicitly invokes Rice's theorem as the connecting step from Callable to undecidability. 6. (Important) Consistent SHACL-core qualifier throughout Part B, with note distinguishing SHACL-core from SHACL-SPARQL. 7. (Minor) Removed fragile line-number references from witness table. 8. (Minor) Prop 6 simplified to "not in data model" argument only.
…diction) Replace "computation beyond SPARQL" with intrinsic characterization "computation requiring mutable intermediate state or ordered multi-pass processing" so Definition C.1 is genuinely independent of the formalism. Also adds journal entries 003-004 covering PBT audit fixes, CI reproducibility, and Phase 2 formal proofs with both audit rounds.
… 3c-d) Phase 3c: SHACL conformance tests (30 examples each) validate that exported RDF from random GDSSpec and SystemIR passes structural SHACL shapes. SPARQL blocks_by_role sanity check verifies block names match. Separate test class to avoid pyshacl overhead on every round-trip. Phase 3d: TestDerivedPropertyPreservation verifies that downstream operations produce equivalent results after spec round-trip: - SystemIR: name, block names, wiring count, composition type (50 ex) - CanonicalGDS: boundary/policy/mechanism sets, state variables (50 ex) - VerificationReport: system name, finding check IDs, G-002 expected failures on BoundaryAction + Mechanism only (50 ex) New strategies: system_irs(), specs_with_canonical(), specs_with_report() derive from gds_specs() by composing blocks with >> and calling compile_system/project_canonical/verify. Total: 25 PBT tests (13 spec round-trip + 3 SHACL + 9 derived property).
1. (Important) SPARQL test moved to own TestSPARQLConformance class, no longer gated behind HAS_PYSHACL 2. (Important) G-002 test split: invariant check in TestStrategyInvariants, round-trip version added (test_g002_invariant_survives_round_trip) 3. (Important) Finding check uses Counter for multiplicity, plus len() assertion to catch collapsed duplicates 4. (Important) SystemIR wiring checks source/target content, not just count 5. (Minor) Removed dead else branch (min_blocks=3 guarantees n_blocks > 2) 6. (Minor) Extracted _compose_sequential() helper, used by both system_irs() and specs_with_report()
Step 3 of the bridge proposal: declare a distance function on a subset of state variables. Follows the AdmissibleInputConstraint / TransitionSignature pattern — structural fields are R1, behavioral distance callable is R3 (lossy in serialization). gds-framework: - StateMetric model in constraints.py (name, variables, metric_type, distance callable, description) - GDSSpec.register_state_metric() + validation (_validate_state_metrics) - Exported from gds.__init__ gds-owl: - RDF export: StateMetric class + MetricVariableEntry blank nodes - RDF import: round-trip with distance=None (R3 lossy) - SHACL shape: StateMetricShape (name required) - Round-trip fixture test 9 new framework tests (model, registration, validation). 1 new OWL round-trip fixture test.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
docs/research/Test plan
mkdocs build --strictpasses