You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
❌ Patch coverage is 96.95122% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.65%. Comparing base (3ee17e6) to head (b23f585). ⚠️ Report is 1 commits behind head on main.
Added src/models/misc/grouping_by_swapping.rs with the GroupingBySwapping satisfaction model, swap-program evaluation, grouping predicate, registry metadata, and canonical example specs.
Registered and re-exported the model in src/models/misc/mod.rs, src/models/mod.rs, and src/lib.rs.
Added src/unit_tests/models/misc/grouping_by_swapping.rs covering construction guards, evaluation, brute-force YES/NO behavior, the issue example, and serialization.
Extended CLI support in problemreductions-cli/src/cli.rs, problemreductions-cli/src/commands/create.rs, problemreductions-cli/src/problem_name.rs, and problemreductions-cli/tests/cli_tests.rs so the model can be created from flags and referred to by name.
Added the paper/tutorial entry in docs/paper/reductions.typ, including the issue example and brute-force command.
Removed the temporary plan file after implementation, per pipeline policy.
Deviations from Plan
Executed the implementation directly in this worktree instead of using subagent-driven-development, because this Codex session was not explicitly authorized to spawn parallel subagents.
The paper text uses plain prose for swap positions rather than inline swap(...) math because Typst rejected that identifier during make paper.
Whitelist: FAIL — this model PR touches files outside the current model whitelist: problemreductions-cli/src/cli.rs, problemreductions-cli/src/commands/create.rs, problemreductions-cli/src/problem_name.rs, problemreductions-cli/tests/cli_tests.rs, and src/lib.rs.
PASS — src/models/misc/mod.rs aggregates grouping_by_swapping::canonical_model_example_specs() into the example DB builder
15
Paper display-name entry
PASS — docs/paper/reductions.typ
16
Paper problem-def block
PASS — docs/paper/reductions.typ
Build Status
make test: PASS
make clippy: PASS
Semantic Review
evaluate() correctness: OK — invalid configs are rejected through apply_swap_program, then the grouped-string predicate is checked on the resulting string.
dims() correctness: OK — vec![string_len; budget] matches the issue’s witness encoding of budget swap slots over 0..string_len.
Size getter consistency: OK — the declared complexity string uses string_len and budget, both provided as inherent getters.
Weight handling: OK — not applicable.
Issue Compliance
#
Check
Status
1
Problem name matches issue
OK
2
Mathematical definition matches
OK
3
Problem type (sat) matches
OK
4
Type parameters match
OK — none
5
Configuration space matches
OK — fixed-length swap program with per-slot domain 0..string_len
6
Feasibility check matches
OK — wrong-length / out-of-range configs evaluate to false
7
Objective function matches
OK — returns true iff the swap program yields grouped blocks
8
Complexity matches
OK — string_len ^ budget
Summary
16/16 structural checklist items passed.
8/8 issue-compliance checks passed.
Deterministic whitelist check failed because the PR includes CLI/root-export files outside the current model whitelist; no blacklisted generated files were committed.
No structural correctness defects found.
Quality Check
Quality Review
Design Principles
DRY: OK — the model, paper entry, and CLI wiring follow existing repo patterns without introducing obvious duplicated logic.
KISS: OK — the swap-program encoding and grouped-string predicate are straightforward and map directly to the linked issue.
HC/LC: OK — problem logic stays in the model module, while CLI and paper integration remain in their usual files.
HCI (CLI changed)
Error messages: OK — pred solve /tmp/grouping-by-swapping-pr758.json fails with an actionable hint to use --solver brute-force.
Discoverability: ISSUE — pred create GroupingBySwapping advertises --budget in the generated parameter list even though the accepted flag is --bound. This comes from the schema-help path using the raw field name from src/models/misc/grouping_by_swapping.rs without a CLI override in problemreductions-cli/src/commands/create.rs.
Consistency: ISSUE — the same help screen shows --budget under “Parameters” but --bound in the example command.
Least surprise: ISSUE — a user following the parameter list will copy the wrong flag name.
Feedback: OK — successful create, show, evaluate, and solve --solver brute-force flows all return usable output.
Test Quality
Naive test detection: ISSUE
problemreductions-cli/tests/cli_tests.rs only checks that --bound appears somewhere in the help stderr for GroupingBySwapping. Because the example line already contains --bound, that test misses the broken parameter list that still says --budget.
Issues
Critical (Must Fix)
None.
Important (Should Fix)
pred create GroupingBySwapping no-flag help advertises the wrong flag name (--budget instead of --bound), which is a real user-facing discoverability bug.
The new CLI regression test is too weak to catch that mismatch, so CI stays green despite the broken parameter list.
Minor (Nice to Have)
The deterministic model-PR whitelist likely needs updating if CLI/root-export edits are expected for new model additions.
Summary
Important: schema-driven pred create GroupingBySwapping help is inconsistent with the real CLI and points users at the wrong flag.
Important: the associated CLI help regression test does not verify the parameter list strongly enough to catch that bug.
Agentic Feature Tests
Feature Test Report: problemreductions CLI
Date: 2026-03-23 Project type: CLI tool Features tested: GroupingBySwapping Profile: ephemeral Use Case: discover the new model from the catalog, inspect it, create the canonical example, and solve it from the CLI Expected Outcome: the new model should be discoverable and the documented create/solve flow should work without contradictory guidance Verdict: fail Critical Issues: 1
Summary
Feature
Discoverable
Setup
Works
Expected Outcome Met
Doc Quality
GroupingBySwapping
partial
yes
partial
partial
inconsistent help
Per-Feature Details
GroupingBySwapping
Role: downstream CLI user validating a newly added model
Use Case: find the model in pred list, inspect it with pred show, create the canonical example, and solve it
Functionality:list, show, direct create, create --example, evaluate, and solve --solver brute-force all worked. Default solve failed with a useful hint to use --solver brute-force.
Expected vs Actual Outcome: partially achieved. The main CLI flow works, but the schema-generated help is inconsistent and misleading.
Blocked steps: none.
Friction points:pred create GroupingBySwapping shows --budget in the parameter list even though the accepted command-line flag is --bound.
Doc suggestions: add a field-name override so the create help prints --bound, and strengthen the CLI help regression test to inspect the parameter list rather than only the example line.
Expected vs Actual Outcome
The expected user path mostly works end-to-end, but the create-help screen contradicts itself. A user who relies on the generated parameter list instead of the example command will be misled.
Issues Found
Important, confirmed: no-flag pred create GroupingBySwapping help advertises --budget instead of the accepted --bound flag.
Suggestions
Map GroupingBySwapping.budget to CLI flag bound in schema-help rendering.
Strengthen the CLI test so it asserts the parameter list advertises --bound and does not advertise --budget.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Add the implementation plan for GroupingBySwapping.
Fixes #440