Skip to content

DefaultSolver classification for all 115 problem types #762

@GiggleLiu

Description

@GiggleLiu

Summary

Classify every problem type by its best available solving strategy, to guide the design of a DefaultSolver dispatch mechanism.

Methodology

For each of the 115 registered problem types, we classify:

  1. ILP — Has an existing witness-capable reduction path to ILP (via pred path <Problem> ILP)
  2. ILP (easy) — No existing path, but a standard ILP formulation exists and could be implemented in <100 lines
  3. ILP (complex) — Known ILP formulation but requires >100 lines (big-M, linearization, subtour elimination, etc.)
  4. Poly — Polynomial-time variant (not NP-hard); a specialized exact algorithm exists
  5. Specialized — Has a known exact algorithm (DP, branch-and-bound) that dominates brute-force
  6. Brute-force — Only practical solver is exhaustive enumeration
  7. PSPACE+ — PSPACE-complete or harder; ILP cannot naturally express the problem

Brute-force scalability depends on configuration space size:

  • Binary (vec![2; n]): 2^n configs → practical up to n ≈ 25
  • k-ary (vec![k; n]): k^n configs → practical up to n ≈ 15 for k=3, n ≈ 10 for k=5
  • Permutation-like (vec![n; n]): n^n configs → practical up to n ≈ 8

Tier 1: Has existing ILP reduction path (32 problems)

These problems already have a witness-capable reduction chain to ILP in the codebase. The DefaultSolver should dispatch to ILPSolver::solve_via_reduction().

Problem ILP Path Hops ILP Variant Config Space
ILP 0 (is ILP) varies
BinPacking 2 bool 2^num_items
CircuitSAT 2 bool 2^num_variables
ClosestVectorProblem 3 bool 2^num_basis_vectors
ConsistencyOfDatabaseFrequencyTables 2 bool large
Factoring 3 bool 2^bits
GraphPartitioning 2 bool 2^num_vertices
HamiltonianCircuit 3 bool n^2 binary
IntegralFlowBundles 2 i32 cap^num_arcs
KColoring (KN,K3,K4,K5) 2 bool k^num_vertices
Knapsack 2 bool 2^num_items
KSatisfiability (KN,K3) 4 bool 2^num_variables
LongestCommonSubsequence 2 bool alpha^bound
LongestPath 2 i32 2^num_edges
MaxCut 5 bool 2^num_vertices
MaximumClique 2 bool 2^num_vertices
MaximumIndependentSet 4 bool 2^num_vertices
MaximumMatching 2 bool 2^num_edges
MaximumSetPacking 3 bool 2^num_sets
MinimumDominatingSet 2 bool 2^num_vertices
MinimumFeedbackVertexSet 2 i32 2^num_vertices
MinimumMultiwayCut 2 bool 2^num_vertices
MinimumSetCovering 2 bool 2^num_sets
MinimumVertexCover 3 bool 2^num_vertices
Partition 3 bool 2^num_elements
QUBO 2 bool 2^num_vars
Satisfiability 3 bool 2^num_variables
SequencingToMinimizeWeightedCompletionTime 2 i32 n^n
SpinGlass 4 bool 2^num_spins
SteinerTree 2 bool 2^num_vertices
SubsetSum 4 bool 2^num_elements
TravelingSalesman 2 bool 2^num_edges

Tier 2: Easy ILP formulation — no existing path (<100 lines) (24 problems)

These should be prioritized for new ReduceTo<ILP> implementations. Each has a textbook ILP formulation.

Problem ILP Sketch Est. Lines
MinimumHittingSet Binary x_e per element; ∀ set S: Σ x_e ≥ 1 (dual of set covering) ~60
ExactCoverBy3Sets Binary x_j per triple; ∀ element: Σ x_j = 1 ~60
NAESatisfiability Binary x_i; ∀ clause: 1 ≤ Σ ≤ |clause|−1 ~70
KClique Binary x_v; Σ x_v = k; non-edge exclusion ~65
MinimumFeedbackArcSet Binary y_a + MTZ ordering (adapt existing FVS→ILP) ~80
MultiprocessorScheduling Binary x_{jm}; assignment + makespan variable ~75
PartiallyOrderedKnapsack Standard knapsack ILP + precedence x_i ≤ x_j ~70
MaximalIS MIS constraints + maximality: x_v + Σ_{u∈N(v)} x_u ≥ 1 ~70
UndirectedTwoCommodityIntegralFlow Integer flow per commodity; conservation + shared capacity ~80
DirectedTwoCommodityIntegralFlow Same as above but directed graph ~80
CapacityAssignment Binary x_{l,c}; assignment + demand satisfaction ~75
ExpectedRetrievalCost Binary x_{r,s}; assignment + cost linearization ~75
MultipleCopyFileAllocation Binary x_{f,n}; access cost + storage constraints ~75
MinimumSumMulticenter Binary x_j, y_{ij}; facility location / p-median ~80
MinMaxMulticenter p-center: binary x_j, y_{ij}; minimax with auxiliary z ~85
ShortestWeightConstrainedPath Binary x_e; flow conservation + weight constraint ~75
PartitionIntoTriangles Binary triangle indicators; each vertex in exactly one ~80
PartitionIntoPathsOfLength2 Binary path selection; vertex covering exactly once ~80
SumOfSquaresPartition Binary x_{ig}; partition + equal sum-of-squares ~80
UndirectedFlowLowerBounds Integer flow; conservation + lower/upper bounds ~70
RectilinearPictureCompression Binary rectangle indicators; pixel coverage ~75
PrecedenceConstrainedScheduling Binary x_{jt}; precedence + resource constraints ~85
SchedulingWithIndividualDeadlines Binary x_{jt}; assignment + deadline constraints ~80
SequencingWithinIntervals Binary x_{jt}; interval feasibility + non-overlap ~80

Tier 3: Complex ILP formulation (>100 lines) (34 problems)

These have known ILP formulations but require significant auxiliary variables, linearization, or constraint generation. DefaultSolver would fall back to brute-force unless a dedicated ILP reduction is implemented.

Problem Why Complex BF Scale
AcyclicPartition Color assignment + topological ordering per color class n^n
BalancedCompleteBipartiteSubgraph Bipartite completeness → quadratic → McCormick linearization 2^n
BicliqueCover Biclique validity constraints need product linearization 2^n
BiconnectivityAugmentation 2-connectivity requires flow-based formulation 2^e
BMF Bilinear W=A·B; product linearization over Boolean semiring 2^(r·c)
BottleneckTravelingSalesman TSP structure + minimax linearization n²·2^n
BoundedComponentSpanningForest Spanning forest + component size bounding (flow/labeling) 3^n
ConsecutiveBlockMinimization Column permutation + block counting auxiliaries n!·n
ConsecutiveOnesMatrixAugmentation Column permutation + augmentation variables n!·n
ConsecutiveOnesSubmatrix Row/column selection + "consecutive" linearization 2^c
DisjointConnectingPaths Multi-commodity flow + vertex-disjointness linking 2^e
FlowShopScheduling Permutation variables + machine sequencing n!
HamiltonianPath Position-based assignment (like TSP, ~130 lines) n²·2^n
IntegralFlowHomologousArcs Flow variables + homologous arc coupling cap^a
IntegralFlowWithMultipliers Modified conservation with multipliers cap^a
IsomorphicSpanningTree Tree selection + isomorphism (hard to linearize) n!
LengthBoundedDisjointPaths Multi-commodity flow + length + disjointness 2^(k·n)
LongestCircuit Position-based (like TSP) + circuit length max n²·2^n
MinimumCutIntoBoundedSets Partition variables + cut weight + size bounds 2^n
MinimumTardinessSequencing Position scheduling + max(0,...) linearization n^n
MixedChinesePostman Edge traversal + Euler tour on mixed graph n²·2^e
OptimalLinearArrangement Permutation + |π(u)−π(v)| absolute value linearization n^n
PaintShop Color assignment + color-change counting 2^n
PartialFeedbackEdgeSet Remove exactly k edges; topological ordering with big-M 2^e
PathConstrainedNetworkFlow Integer flow per allowed path; capacity aggregation cap^p
QuadraticAssignment Permutation + product linearization x_{ij}·x_{kl} n!
ResourceConstrainedScheduling Time-indexed x_{jt} + resource capacity + precedence d^n
RootedTreeArrangement Position variables + parent-child ordering n^n
RootedTreeStorageAssignment Assignment + capacity/tree structure u^u
RuralPostman Edge traversal + connectivity on required edges n²·2^n
SequencingToMinimizeMaximumCumulativeCost Permutation + cumulative cost + minimax n!
SequencingToMinimizeWeightedTardiness Permutation + max(0,...) tardiness linearization n!
SequencingWithReleaseTimesAndDeadlines Release + deadline + non-overlap constraints n^n
ShortestCommonSupersequence Position-based alignment for multiple strings α^L
SparseMatrixCompression Row grouping + compatibility constraints k^r
StackerCrane Directed rural postman + pickup/delivery pairing n²·2^a
SteinerTreeInGraphs Flow-based connectivity for Steiner terminals (~150 lines) 2^e
StringToStringCorrection Edit operation selection + alignment constraints (2s+1)^b
StrongConnectivityAugmentation Arc selection + strong connectivity (flow/cut-based) 2^a
SubgraphIsomorphism Binary mapping + edge preservation n_h^n_p
TimetableDesign 3-index assignment + conflict/availability 2^(c·p·t)

Tier 4: Polynomial-time variants (3 variants)

These have polynomial-time algorithms and should use a specialized solver rather than brute-force or ILP.

Variant Complexity Algorithm
MaximumMatching/SimpleGraph/i32 O(n³) Edmonds' blossom algorithm
KColoring/SimpleGraph/K2 O(n+m) BFS bipartiteness check
KSatisfiability/K2 O(n+m) Implication graph / Tarjan SCC

Note: MaximumMatching already has an ILP path, but its polynomial solver is far more efficient.


Tier 5: Specialized exact algorithms (4 problems)

These have structure that specialized algorithms exploit better than generic ILP.

Problem Algorithm Notes
GroupingBySwapping DP on permutations / sorting networks Adjacent-swap structure
KthBestSpanningTree Iterative Lawler/Yen-style enumeration Needs k iterations of MST
MinimumDummyActivitiesPert Series-parallel decomposition PERT network structure
MultipleChoiceBranching DP on branching programs Tree structure

Tier 6: PSPACE-complete or harder (2 problems)

ILP fundamentally cannot express alternating quantifiers. These require game-tree search or QSAT solvers.

Problem Complexity Class Notes
GeneralizedHex PSPACE-complete Two-player game; alpha-beta / MCTS
QuantifiedBooleanFormulas PSPACE-complete QSAT solver (QCDCL, expansion-based)

Tier 7: Obscure / no natural ILP formulation (11 problems)

Database theory and combinatorial structure problems where ILP is not a natural fit. Brute-force is the only current option.

Problem Domain BF Scale Notes
AdditionalKey Database theory 2^attrs FD closure not easily linearized
BoyceCoddNormalFormViolation Database theory 2^attrs Structural FD analysis
ComparativeContainment Set theory 2^universe Set containment relations
ConjunctiveBooleanQuery Database theory domain^vars Query evaluation
ConjunctiveQueryFoldability Database theory complex Query structural property
ConsecutiveSets Combinatorics α^k · s Ordering/labeling
EnsembleComputation Set systems complex Specialized structure
MinimumCardinalityKey Database theory 2^attrs FD closure checking
PrimeAttributeName Database theory 2^attrs Structural FD property
SetBasis Combinatorics 2^(k·u) Representation constraints
TwoDimensionalConsecutiveSets Combinatorics α^α 2D ordering

Proposed DefaultSolver dispatch

DefaultSolver::solve(problem) → {
    1. If problem is a polynomial-time variant → specialized algorithm
    2. If problem has a reduction path to ILP   → ILPSolver::solve_via_reduction()
    3. If problem has Tier 2 ILP formulation    → (after implementing) ILPSolver
    4. Otherwise                                → BruteForce::solve()
}

Recommended next steps

  1. Implement Tier 2 ILP reductions — 24 problems with textbook formulations. Start with the easiest:

    • MinimumHittingSet → ILP (dual of existing SetCovering→ILP, ~60 lines)
    • ExactCoverBy3Sets → ILP (set partitioning, ~60 lines)
    • NAESatisfiability → ILP (~70 lines)
    • KClique → ILP (adapt existing MaximumClique→ILP, ~65 lines)
    • MinimumFeedbackArcSet → ILP (adapt existing FVS→ILP, ~80 lines)
  2. Add poly-time solvers for MaximumMatching (Edmonds'), 2-Coloring (BFS), 2-SAT (SCC)

  3. Incrementally implement Tier 3 ILP reductions for the most commonly used problems (HamiltonianPath, QuadraticAssignment, FlowShopScheduling, etc.)

  4. Research PSPACE solvers for GeneralizedHex and QBF (game-tree search, QSAT)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    Status

    No status

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions