fix(sos): SOS constraints on masked variables (#688) by FBumann · Pull Request #692 · PyPSA/linopy

FBumann · 2026-05-18T20:01:56Z

Stacked on #690. Fixes #688.

Closes the SOS-on-masked-variables bug at its root rather than guarding against it. The defensive _check_sos_unmasked workaround introduced by #689 is removed; the real fix covers both the direct-API path (Gurobi, Xpress) and the LP file writer.

Root cause

Two related failure modes, both rooted in conflating linopy labels with solver-side positions / names:

Direct API: gm.addSOS(...) / problem.addSOS(...) were given linopy labels as if they were 0-based column positions in the active-variable array. Labels equal positions only when no variable in the model is masked anywhere.
LP writer: sos_to_file emitted variable names directly from the raw label array. A masked entry (label = -1) became x-1, which LP parsers reject or (gurobi LP reader) silently corrupt. The set identifier labels.isel(sos_dim=0) was -1 whenever the first slot was masked.

Fix

Direct API (Gurobi, Xpress): shared _sos_set_positions(labels, weights, label_to_pos) helper drops masked entries and resolves linopy labels to active-variable positions via model.variables.label_index.label_to_pos. Both direct builders route through it; empty sets are skipped.
LP writer (sos_to_file): polars pipeline gets a labels != -1 filter before group_by; per-set id is the max of non-masked labels (always a valid label; fully-masked sets get id -1 and are filtered out). Keeps polars vectorization.

The two paths stay separate by design — LP polars vs direct numpy is the right idiom split for each path's throughput needs; unification would force LP to Python-level iteration with no maintenance win for a ~10-line shared concept.

Coverage

test/test_sos_masked.py (new) — 11 fixture configs (1D/2D × all mask placements × with/without unrelated-var mask) × 2 SOS types × every SOS-capable solver × {direct, lp} io_apis ≈ 110 cases. Three-layer oracle per test: status == "ok" + objective == expected + element-wise solution == expected. Asymmetric objective coefficients break permutation symmetry; analytical optimizer in the fixture provides exact oracles using list-position adjacency for SOS2.
Companion sos2 piecewise NaN-padding test converted to a positive assertion (same y_b = 12.5 oracle as the lp variant).
Existing reformulation-with-masked test moved to test_sos_reformulation.py, reworded — reformulation isn't a "workaround" anymore.

TDD discipline

5 commits, each fix preceded by its failing test:

test(sos): regression matrix for masked SOS variables — drops defensive guard so tests surface the real bug. Without fix: 28 LP-path masked cases fail with Name 'x-1' does not exist.
fix(sos): resolve linopy labels to solver column positions in direct builds — direct-path passes.
fix(sos): filter masked SOS members + robust set id in LP writer — LP-path passes.
test(pwl): convert masked-sos2 NaN-padding test to positive assertion.
test(sos): move masked-reformulation test to test_sos_reformulation.

Test plan

pytest test/test_sos_masked.py — 66 active cases pass, 44 skipped.
Full suite passes.
ruff check, ruff format, mypy linopy/{model,solvers,io}.py clean.

🤖 Generated with Claude Code

Add test/test_sos_masked.py covering SOS-with-masked-variables across: - 1D and 2D SOS variables - All mask placements (none, sos_dim, non_sos_dim, both_dims) plus an optional mask on an unrelated variable (the angle that exposes the label-vs-position bug elsewhere in the model) - SOS1 and SOS2 - All SOS-capable solvers × {direct, lp} io_apis Asymmetric objective coefficients [-1,-2,-3,-4] along the SOS dim break permutation symmetry. The fixture's analytical optimizer (matching list-position adjacency for SOS2) supplies an exact (objective, per-slot solution) oracle so wrong indexing surfaces as a visible divergence, not a permutation-equivalent silent pass. Three-layer oracle per test: 1. status == "ok" (catches OOB raises and LP parser rejections) 2. objective.value == expected (catches silent-wrong-answer when the bug constrains the wrong columns) 3. element-wise solution.values == expected (catches the rare permutation-equivalent case where objective happens to match) Also drop the defensive Model._check_sos_unmasked workaround from master (#689) along with its call sites in Solver._build and to_file. With this commit alone the tests run and surface the real bug; subsequent commits in this PR apply the actual fixes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…builds Both Gurobi and Xpress direct API builds were passing linopy variable labels straight to vendor `addSOS` as if they were 0-based column positions in the solver's variable array. Labels equal positions only when no variable in the model is masked anywhere — with any mask, the SOS members get attached to the wrong (or out-of-range) columns. Introduce a shared `_sos_set_positions(labels, weights, label_to_pos)` helper that drops masked entries (label = -1) along with their weights and maps the survivors to active-variable positions via `model.variables.label_index.label_to_pos`. Both `Gurobi._build_solver_model` and `Xpress._build_solver_model` route through it. Empty SOS sets (all members masked, e.g. when a non-SOS dim mask removes an entire set) are now skipped instead of producing an empty vendor call. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The LP writer was emitting variable names from the raw label array. For a masked SOS member (label = -1) this produced `x-1`, which LP parsers either reject outright (cplex, xpress, mosek) or silently corrupt into a wrong SOS set (gurobi LP reader). The set identifier was computed as `labels.isel(sos_dim=0)`, which is -1 whenever the first slot of the set is masked — both an invalid LP identifier and one that collides across distinct sets that all happen to have a masked first slot. Both are fixed in `sos_to_file`: - Per-set id is now the `max` of non-masked labels along the SOS dim, which is always a valid label (any of the set's surviving members would do; max is the simplest stable choice). Fully-masked sets get id -1 and are filtered out before emission. - Masked member rows are dropped from the polars dataframe before the `group_by` aggregation, so masked members never reach the file. The fix preserves the LP path's polars vectorization (no switch to Python-level iteration) and stays separate from the direct-API helper — the two paths have identical-shape fixes implemented in each path's natural idiom. Also drop the now-obsolete `test_direct_api_raises_on_masked_sos` / `test_lp_writer_raises_on_masked_sos` tests in `test_sos_constraints`: they were guarding the defensive `_check_sos_unmasked` workaround that this PR replaces with the real fix. The same coverage (and much more) is now in `test_sos_masked.py`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The companion sos2 test was guarding the defensive _check_sos_unmasked workaround — asserting that piecewise-with-sos2 NaN-padded breakpoints raised NotImplementedError. With the real #688 fix landed, masked SOS lambdas now flow through both direct-API and LP-writer paths correctly, so the sos2 piecewise solve produces the same answer as the lp variant (y_b = 12.5 on the chord (5,10)→(15,15)). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The "reformulation + masked SOS" integration test lived in test_sos_constraints alongside SOS-validation tests but its concern is the reformulation pipeline (the apply/undo path on a masked input), not the SOS constraint itself. Move it to TestSolveWithReformulation in test_sos_reformulation.py where the other reformulation tests live. Also drop the "documented workaround" framing in the docstring — with #688 fixed, reformulation isn't a workaround for masked SOS anymore, just one of two ways to solve such models. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The SOS handling in Gurobi._build_solver_model and Xpress._build_solver_model was ~22 lines each, ~95% identical: same iteration over model.variables.sos, same 1D vs multi-dim stack/groupby branching, same masked-entry filter, same label→position resolution. Only the vendor addSOS call differed. Extract _iter_sos_sets(model) yielding (sos_type, positions, weights) per active SOS set. Each subclass collapses to a 2-line loop with its vendor call. The shared bug-prone parts (multi-dim groupby, mask filtering, label resolution) live in one place; a future SOS-capable direct-API solver (#683, OETC) plugs in with one addSOS line. Also replace `# type: ignore[assignment]` on sos_type/sos_dim with runtime int()/str() casts. The casts both document the actual runtime types and guard against future changes to what gets stored in var.attrs, with no measurable cost. The remaining type: ignore is on the int() call itself (mypy is overly conservative about xarray's Hashable-typed attrs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

FBumann and others added 5 commits May 18, 2026 21:57

FBumann force-pushed the fix/sos-masked-direct-api branch from 03d4d85 to 70c51ef Compare May 18, 2026 20:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(sos): SOS constraints on masked variables (#688)#692

fix(sos): SOS constraints on masked variables (#688)#692
FBumann wants to merge 6 commits into
refactor/sos-reformulation-methodsfrom
fix/sos-masked-direct-api

FBumann commented May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

FBumann commented May 18, 2026

Root cause

Fix

Coverage

TDD discipline

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant