Fix issue with infinite lower bounds and try to bound free variables in barrier#1001
chris-maes wants to merge 5 commits into NVIDIA:release/26.04 from
Conversation
chris-maes
commented
Mar 30, 2026
- Fixes a long-standing issue where variables with bounds of the form -inf <= x_j <= u_j were not handled correctly by barrier
- We now try to bound free variables (-inf <= x_j <= +inf) in barrier presolve. This is helpful because PSLP can sometimes introduce many free variables.
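The negation step described in the walkthrough below (x' = -x for variables with only an upper bound) can be sketched roughly as follows. This is a hedged illustration with made-up names (`ToyProblem`, `negate_upper_only_vars`), not the actual cuOpt presolve API; in the real pass the corresponding columns of A and Q would be negated as well.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Toy stand-in for the LP data touched by the transform (illustrative only).
struct ToyProblem {
  std::vector<double> lower, upper, obj;
};

// A variable with -inf <= x_j <= u_j is replaced by x_j' = -x_j, giving the
// barrier a finite *lower* bound: -u_j <= x_j' <= +inf. The negated indices
// are recorded so solution uncrushing can flip the sign back.
std::vector<int> negate_upper_only_vars(ToyProblem& p)
{
  const double inf = std::numeric_limits<double>::infinity();
  std::vector<int> negated;
  for (std::size_t j = 0; j < p.lower.size(); ++j) {
    if (p.lower[j] == -inf && p.upper[j] < inf) {
      p.lower[j] = -p.upper[j];  // new finite lower bound
      p.upper[j] = inf;
      p.obj[j]   = -p.obj[j];    // c_j * x_j == (-c_j) * (-x_j)
      negated.push_back(static_cast<int>(j));
    }
  }
  return negated;
}
```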
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
Use the following commands to manage reviews, or the checkboxes below for quick actions.
📝 Walkthrough
Adds a barrier presolve pass that tightens/removes certain free variables and negates variables with only an upper bound (x' = -x), updates the objective, A, and Q accordingly, records negated indices and undoes them in solution uncrushing, introduces CLI file-read timing/logging, and adjusts barrier residual/objective logging and normalization.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ❌ Failed checks (1 warning) | ✅ Passed checks (2)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/dual_simplex/presolve.cpp`:
- Around line 926-944: Guard the divisions by a_ij in the bounding logic by
skipping or avoiding division when |a_ij| is near zero: introduce a small
tolerance (e.g. tol) and before any 1.0 / a_ij operations check if
std::abs(a_ij) > tol; if not, do not update problem.upper[j] or problem.lower[j]
(and do not set bounded) or handle via a safe fallback; apply this check around
the four locations using the same tol and keep the existing conditions involving
lower_inf_i, upper_inf_i, lower_activity_i, upper_activity_i, rhs and bounded so
only numerically safe divisions occur.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: de8c026d-5d75-460d-a9a8-fb066f9cc6f1
📒 Files selected for processing (2)
- cpp/src/dual_simplex/presolve.cpp
- cpp/src/dual_simplex/presolve.hpp
```cpp
if (a_ij > 0) {
  if (lower_inf_i == 1) {
    problem.upper[j] = 1.0 / a_ij * (rhs - lower_activity_i);
    bounded = true;
  }
  if (upper_inf_i == 1) {
    problem.lower[j] = 1.0 / a_ij * (rhs - upper_activity_i);
    bounded = true;
  }
} else if (a_ij < 0) {
  if (lower_inf_i == 1) {
    problem.lower[j] = 1.0 / a_ij * (rhs - lower_activity_i);
    bounded = true;
  }
  if (upper_inf_i == 1) {
    problem.upper[j] = 1.0 / a_ij * (rhs - upper_activity_i);
    bounded = true;
  }
}
```
Guard against near-zero coefficients before division.
The divisions by a_ij at lines 928, 932, 937, and 941 lack a tolerance check. A near-zero coefficient can produce numerically unstable or excessively large bounds, degrading solver performance or causing downstream issues.
🛡️ Proposed fix to add tolerance check

```diff
+ constexpr f_t coeff_tol = 1e-10;
+ if (std::abs(a_ij) < coeff_tol) { continue; }
+
  if (a_ij > 0) {
    if (lower_inf_i == 1) {
      problem.upper[j] = 1.0 / a_ij * (rhs - lower_activity_i);
```

Based on learnings: "Guard numerical stability by guarding near-zero norms in cut-related computations. ... functions ... should also handle near-zero norms to avoid instability."
This is tough. A large bound is ok here, because we are trying to put bounds on variables that are currently infinite. I guess if the bound is so large it might be better to keep it infinite for numerical purposes. But what is considered too large? 1e20? 1e15? 1e16?
@chris-maes, this is a great engineering question. Here's my analysis of the tradeoffs:
Checking the resulting bound vs. checking the coefficient
The two approaches are complementary:
- Coefficient tolerance (`|a_ij| < tol`): simple, but may be overly conservative. If both `a_ij` and `(rhs - activity)` are tiny but in proportion, the computed bound might still be well-scaled.
- Bound magnitude threshold (`|computed_bound| > BIG_M`): more directly addresses the concern. If the derived bound is astronomically large, it's useless to the barrier method and potentially harmful.
For a barrier (interior-point) method the relevant consideration is: using coefficients that span many orders of magnitude is a primary source of numerical challenge in LP solvers. A double-precision floating-point number can only represent 15 precise digits, so round-off errors become inevitable when values span 16 or more orders of magnitude.
Practical threshold
The purpose of this presolve step is to give the barrier a useful finite bound. If the computed bound is, say, 1e20, the logarithmic barrier term -log(u - x) is completely negligible for any x of order 1 — so the variable is still effectively unconstrained from the solver's perspective, providing no benefit.
A practical choice, consistent with how commercial solvers treat "practical infinity":
- `1e20` is a widely used big-M sentinel in optimization. It is large enough that a well-scaled problem would essentially never hit it, but small enough to avoid the worst floating-point pathologies.
- `1e15` is tighter, closer to the double-precision representational limit (~10^15.6), and would keep the barrier terms meaningful.
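As a small numeric illustration (not solver code) of the point above: the barrier gradient contribution of an upper bound u at a point x is 1/(u - x), so for u near 1e20 and x of order 1 the bound is numerically invisible next to gradients of order 1.

```cpp
#include <cassert>

// Gradient contribution of the log-barrier term -log(u - x) at point x.
// For u = 1e20 and x ~ 1, this is ~1e-20: far below double-precision
// round-off relative to unit-scale gradients, so the bound has no effect.
double barrier_gradient(double u, double x) { return 1.0 / (u - x); }
```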
Recommended approach
Rather than (or in addition to) the coefficient check, you could guard with:
```cpp
constexpr f_t big_bound = 1e20;
if (a_ij > 0) {
  if (lower_inf_i == 1) {
    const f_t new_upper = 1.0 / a_ij * (rhs - lower_activity_i);
    if (std::abs(new_upper) < big_bound) {
      problem.upper[j] = new_upper;
      bounded = true;
    }
  }
  if (upper_inf_i == 1) {
    const f_t new_lower = 1.0 / a_ij * (rhs - upper_activity_i);
    if (std::abs(new_lower) < big_bound) {
      problem.lower[j] = new_lower;
      bounded = true;
    }
  }
} else if (a_ij < 0) {
  if (lower_inf_i == 1) {
    const f_t new_lower = 1.0 / a_ij * (rhs - lower_activity_i);
    if (std::abs(new_lower) < big_bound) {
      problem.lower[j] = new_lower;
      bounded = true;
    }
  }
  if (upper_inf_i == 1) {
    const f_t new_upper = 1.0 / a_ij * (rhs - upper_activity_i);
    if (std::abs(new_upper) < big_bound) {
      problem.upper[j] = new_upper;
      bounded = true;
    }
  }
}
```

This way you preserve the semantics of "keep the bound infinite if the derived bound is too large to be useful," which is exactly the intuition you described. The `1e20` value is a reasonable starting point but can be tuned based on the typical scaling of your problems. If you want tighter numerical safety, `1e15` is the more conservative choice given double-precision limits.
🧠 Learnings used
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 599
File: cpp/src/dual_simplex/cuts.cpp:79-99
Timestamp: 2026-01-22T22:55:36.280Z
Learning: Guard numerical stability by guarding near-zero norms in cut-related computations. Specifically, ensure that while cuts with zero norm are prevented from entering the cut pool by existing validation in cpp/src/dual_simplex/cuts.cpp, functions like cut_orthogonality (and related routines in the same directory) should also handle near-zero norms to avoid instability. This guideline applies to all cpp files under cpp/src/dual_simplex (e.g., the dual simplex cuts module) to prevent numerical issues during cut generation and orthogonality calculations.
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 814
File: cpp/src/dual_simplex/branch_and_bound.hpp:154-159
Timestamp: 2026-01-29T04:24:57.790Z
Learning: In files under cpp/src/dual_simplex, document and enforce that mutex_original_lp_ protects original_lp_ specifically during the root cut pass phase. The public API set_new_solution() may be called from external threads during this phase while the main thread adds cuts/slacks to original_lp_. Functions like best_first_thread, exploration_ramp_up, and plunge_from execute after the cut passes complete (after the OpenMP barrier) when original_lp_ is fixed, so reads do not require mutex protection. Apply this guidance to all relevant files (hpp and cpp) in the dual_simplex directory and ensure code reviews verify correct mutex usage timing and safe access patterns around original_lp_.
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 814
File: cpp/src/dual_simplex/cuts.cpp:1278-1278
Timestamp: 2026-01-30T23:34:22.711Z
Learning: In the cuOpt dual simplex solver (cpp/src/dual_simplex/ and related files), slack variable coefficients in the constraint matrix are exactly 1.0 or -1.0. Therefore, exact equality checks like assert(std::abs(lp.A.x[col_start]) == 1.0) are appropriate and should not be flagged as requiring epsilon tolerances. Apply this guidance to all files under cpp/src/dual_simplex that involve slack coefficient validation.
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 986
File: cpp/src/branch_and_bound/branch_and_bound.cpp:8-8
Timestamp: 2026-03-23T11:33:23.998Z
Learning: In this repo (NVIDIA/cuopt), treat nvcc as the supported CUDA toolchain; clang-based compilation/support is not required and may fail/break. During code reviews, do NOT request code changes or add blocking comments for errors that appear only under clang (e.g., header-resolution failures such as 'utilities/determinism_log.hpp not found')—these can be toolchain-related rather than real source issues.
cpp/src/dual_simplex/presolve.hpp (Outdated)

```cpp
folding_info_t<i_t, f_t> folding_info;

std::vector<i_t> new_slacks;
```
Whoops. This line should be removed.
@rg20 to review
```cpp
// The constraints are in the form:
//   sum_j a_j x_j = beta
for (i_t i : constraints_to_check) {
```
I think we can repurpose the bound strengthening code here.
Remove `new_slacks` variable.
/ok to test 4432c27
…ig. Use user objective for relative complementarity gap. Initialize logger so we get feedback when reading large MPS files
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/barrier/barrier.cu`:
- Around line 3493-3497: The complementarity normalization uses
compute_user_objective() which includes lp.obj_constant; remove that constant
from the denominator so an arbitrary objective offset cannot shrink the
relative_complementarity_residual. Change the normalization to use the user
objective without lp.obj_constant (either call a new helper like
compute_user_objective_without_constant or subtract lp.obj_constant from
compute_user_objective(lp, primal_objective) before taking std::abs) when
computing relative_complementarity_residual (and the other occurrence noted),
leaving compute_user_objective() behavior unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 44e4dc72-db04-4343-ab6d-560f43805934
📒 Files selected for processing (3)
- cpp/cuopt_cli.cpp
- cpp/src/barrier/barrier.cu
- cpp/src/dual_simplex/presolve.cpp
✅ Files skipped from review due to trivial changes (1)
- cpp/src/dual_simplex/presolve.cpp
```diff
  f_t relative_primal_residual = primal_residual_norm / (1.0 + norm_b);
  f_t relative_dual_residual = dual_residual_norm / (1.0 + norm_c);
  f_t relative_complementarity_residual =
-   complementarity_residual_norm / (1.0 + std::abs(primal_objective));
+   complementarity_residual_norm /
+   (1.0 + std::abs(compute_user_objective(lp, primal_objective)));
```
Keep the complementarity normalization out of compute_user_objective().
compute_user_objective() in cpp/src/dual_simplex/solve.cpp:101-105 adds lp.obj_constant, but that constant does not change the KKT system. A large objective offset can therefore drive relative_complementarity_residual arbitrarily low and make the optimal/suboptimal exit checks accept an iterate with a still-large complementarity gap.
🛠️ Suggested fix

```diff
- f_t relative_complementarity_residual =
-   complementarity_residual_norm /
-   (1.0 + std::abs(compute_user_objective(lp, primal_objective)));
+ const f_t complementarity_scale = 1.0 + std::abs(primal_objective);
+ f_t relative_complementarity_residual =
+   complementarity_residual_norm / complementarity_scale;
  ...
- relative_complementarity_residual =
-   complementarity_residual_norm /
-   (1.0 + std::abs(compute_user_objective(lp, primal_objective)));
+ const f_t complementarity_scale = 1.0 + std::abs(primal_objective);
+ relative_complementarity_residual =
+   complementarity_residual_norm / complementarity_scale;
```

As per coding guidelines, "Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results".
Also applies to: 3658-3660
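The failure mode the comment describes can be seen with a minimal sketch. The helper name `rel_comp` is illustrative, not the barrier.cu code: the same absolute complementarity gap looks arbitrarily small once a large constant objective offset enters the denominator.

```cpp
#include <cassert>
#include <cmath>

// Relative complementarity residual as normalized in the flagged code:
// the denominator grows with |objective|, so an objective constant that
// does not change the KKT system can still shrink this ratio.
double rel_comp(double comp_norm, double objective)
{
  return comp_norm / (1.0 + std::fabs(objective));
}
```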
🧹 Nitpick comments (1)
cpp/src/dual_simplex/presolve.cpp (1)
861-904: Verify iterative bound propagation handles partial bounds correctly.
The activity computation uses current bounds, which may have been modified by previous constraint iterations; this implements iterative implied bound propagation. However, if a variable receives only one bound (e.g., only the upper bound set at line 931, not the lower), it transitions from counting as infinite to contributing finite activity for subsequent constraints.
This is mathematically correct behavior for bound propagation, but consider whether a single-pass approach might miss tightening opportunities compared to iterating until a fixpoint.
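The worklist/fixpoint idea can be sketched on a toy problem. This is an illustrative skeleton over simple difference constraints, not the presolve.cpp data structures: a single pass in a fixed order can miss the transitive tightening that re-enqueueing affected constraints recovers.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Toy constraint upper[j] <= upper[k] + c (illustrative only).
struct Diff { int j, k; double c; };

// Worklist-based propagation: whenever a bound tightens, re-enqueue every
// constraint that reads the tightened variable, and stop at a fixpoint.
void propagate(std::vector<double>& upper, const std::vector<Diff>& cons)
{
  std::deque<int> work;
  for (int i = 0; i < static_cast<int>(cons.size()); ++i) work.push_back(i);
  while (!work.empty()) {
    const Diff d = cons[work.front()];
    work.pop_front();
    const double cand = upper[d.k] + d.c;
    if (cand < upper[d.j]) {
      upper[d.j] = cand;  // tightened: rows depending on j must be revisited
      for (int i = 0; i < static_cast<int>(cons.size()); ++i)
        if (cons[i].k == d.j) work.push_back(i);
    }
  }
}
```

With constraints u0 <= u1 + 1 and u1 <= u2 + 1 processed in that order, a single pass tightens only u1; the worklist revisits the first constraint and also tightens u0.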
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/dual_simplex/presolve.cpp` around lines 861 - 904, The current single-pass loop over constraints_to_check (inside the block using Arow, problem.lower/problem.upper and variables like lower_activity_i, upper_activity_i, lower_inf_i, upper_inf_i, last_free_i) can miss tightening opportunities because bounds updated earlier in the pass change contribution status for later rows; change the algorithm to iterate until a fixpoint: repeat scanning the relevant constraints (or use a worklist/queue of affected constraints) and recompute lower_activity_i/upper_activity_i for each row whenever a variable bound changes, marking any tightened bound and re-enqueueing its incident constraints, and stop when no bound is tightened in a full pass so all implied bounds are propagated.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d604ec33-375d-4234-9125-cefe3d8439a7
📒 Files selected for processing (1)
cpp/src/dual_simplex/presolve.cpp
/ok to test c7816fa