Skip to content

OCPBUGS-77078: fix(mtu): handle MTU defaults for all HyperShift platforms#2933

Open
jparrill wants to merge 1 commit intoopenshift:masterfrom
jparrill:OCPBUGS-77078
Open

OCPBUGS-77078: fix(mtu): handle MTU defaults for all HyperShift platforms#2933
jparrill wants to merge 1 commit intoopenshift:masterfrom
jparrill:OCPBUGS-77078

Conversation

@jparrill
Copy link
Contributor

@jparrill jparrill commented Mar 13, 2026

Summary

In HyperShift the MTU prober job cannot run because CNO has no access to guest worker nodes. Previously only AWS (9001) and Azure (1500) had hardcoded defaults in probeMTU; all other platforms fell through to the prober job and failed silently. Additionally, the GetDefaultMTU() fallback in fillOVNKubernetesDefaults would return the management cluster's pod network MTU — not the hosted cluster's worker node uplink MTU.

This PR adds proper handling for all three MTU scenarios and guards against the incorrect GetDefaultMTU() fallback in HyperShift.

Related PRs

Fixes

MTU Resolution Flow

                     ┌───────────────────────────┐
                     │  Network CR has MTU set?  │
                     │  (--ovn-kubernetes-mtu)   │
                     └────────┬──────────────────┘
                              │
                    ┌─────────┴─────────┐
                    │ YES               │ NO
                    ▼                   ▼
          ┌─────────────────┐  ┌────────────────────┐
          │ NeedMTUProbe()  │  │ NeedMTUProbe()     │
          │ returns false   │  │ returns true       │
          │                 │  │                    │
          │ FillDefaults    │  │ probeMTU() called  │
          │ preserves value │  └──────┬─────────────┘
          └─────────────────┘         │
                              ┌───────┴────────┐
                              │ HyperShift?    │
                              │(HCP != nil)    │
                              └───┬────────┬───┘
                            YES  │        │  NO
                                 ▼        ▼
                    ┌──────────────┐  ┌──────────────────┐
                    │ Platform     │  │ Deploy prober    │
                    │ defaults:    │  │ job on worker    │
                    │ AWS = 9001   │  │ node to detect   │
                    │ Azure = 1500 │  │ host MTU         │
                    │ Other = 1500 │  └──────────────────┘
                    └──────┬───────┘          │
                           └──────┬───────────┘
                                  ▼
                    ┌──────────────────────────┐
                    │ FillDefaults computes:   │
                    │ tunnel MTU = host MTU    │
                    │   - encap overhead       │
                    │   (100 geneve,           │
                    │    +46 if IPsec)         │
                    └──────────────────────────┘

Three scenarios resolved

  1. User explicitly sets MTU (e.g. --ovn-kubernetes-mtu 1300): NeedMTUProbe returns false, probeMTU is never called, FillDefaults preserves the user value as-is. Works on all platforms.

  2. No user MTU, commercial AWS (jumbo frames): probeMTU returns 9001, FillDefaults computes 9001 - 100 = 8901. Same behavior as before.

  3. No user MTU, ISO/classified AWS or other platforms: Previously broken — prober job couldn't run and GetDefaultMTU() returned the wrong MTU. Now probeMTU returns a safe default (1500 for non-AWS/Azure), and fillOVNKubernetesDefaults guards against the GetDefaultMTU() fallback in HyperShift. Users in ISO regions should set MTU explicitly via --ovn-kubernetes-mtu.

Changes

pkg/controller/operconfig/mtu_probe.go

  • Add default case in HyperShift switch returning defaultHCPMTU = 1500 for all platforms without a known uplink MTU
  • Fix Azure log message that incorrectly said "AWS cluster"
  • Improve godoc explaining the full flow

pkg/controller/operconfig/mtu_probe_test.go

  • Add test cases for all HyperShift platforms: OpenStack, KubeVirt, PowerVS, IBMCloud, GCP, None, BareMetal (Agent)
  • Update "Unknown platform on HyperShift" to expect 1500 instead of reading from configmap

pkg/controller/operconfig/operconfig_controller.go

  • Derive isHyperShift from infraStatus.HostedControlPlane != nil
  • Pass isHyperShift to FillDefaults calls
  • Add comment documenting the MTU flow

pkg/network/ovn_kubernetes.go

  • Add isHyperShift bool parameter to fillOVNKubernetesDefaults
  • Replace if-else with switch (4 cases: previous, probed, hypershift-safe-default, standalone-fallback)
  • Guard against GetDefaultMTU() in HyperShift — it returns the management cluster's pod MTU, not the hosted cluster's worker node uplink

pkg/network/render.go

  • Propagate isHyperShift bool through FillDefaults and fillDefaultNetworkDefaults

pkg/network/ovn_kubernetes_test.go

  • Unified table-driven TestFillOVNKubernetesDefaults covering 5 scenarios:
    • Standalone with probed MTU
    • Standalone with IPsec overhead
    • User explicitly sets MTU (standalone)
    • User explicitly sets MTU (HyperShift)
    • HyperShift with no user MTU and no probe

pkg/network/render_test.go

  • Update fillDefaults helper for new isHyperShift parameter

Test Plan

  • Verify tests compile: CGO_ENABLED=0 GOOS=linux go test -c ./pkg/network/ ./pkg/controller/operconfig/
  • CI runs all unit tests on Linux
  • E2E: Deploy a HyperShift hosted cluster on AWS without --ovn-kubernetes-mtu → MTU should be 8901
  • E2E: Deploy a HyperShift hosted cluster with --ovn-kubernetes-mtu 1300 → MTU should be 1300
  • Verify standalone (non-HyperShift) clusters are unaffected

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • MTU handling improved for hosted control plane (HyperShift) deployments: platform-specific safe defaults are used instead of probing, with clearer logs; non-HyperShift behavior remains unchanged.
    • MTU defaulting now consistently accounts for encapsulation/overhead and falls back to a safe 1500 MTU when appropriate.
  • Tests

    • Expanded coverage with multiple HyperShift platform scenarios, validating the safe-default MTU behavior.

@coderabbitai
Copy link

coderabbitai bot commented Mar 13, 2026

Walkthrough

Threads an isHyperShift flag from controller → render → OVN defaulting and MTU probe, implements HyperShift-specific MTU defaults (platform-specific and a safe HostedControlPlane default), and expands tests to cover multiple HyperShift platforms.

Changes

Cohort / File(s) Summary
MTU probe & tests
pkg/controller/operconfig/mtu_probe.go, pkg/controller/operconfig/mtu_probe_test.go
Add defaultHCPMTU (1500), implement HyperShift platform-specific MTU handling (AWS/Azure explicit, others default to 1500), update logs, and expand HyperShift unit tests to assert the safe default across multiple platforms.
Controller integration
pkg/controller/operconfig/operconfig_controller.go
Set isHyperShift from infraStatus.HostedControlPlane and pass it into network.FillDefaults call sites; add MTU probing behavior comments.
Rendering API
pkg/network/render.go, pkg/network/render_test.go
Change FillDefaults signature to accept isHyperShift bool and propagate it into internal defaulting functions and tests.
OVN Kubernetes defaults & tests
pkg/network/ovn_kubernetes.go, pkg/network/ovn_kubernetes_test.go
Add isHyperShift parameter to fillOVNKubernetesDefaults; restructure MTU resolution to: preserve previous MTU → hostMTU minus encapsulation → HyperShift safe default (1500) minus overhead → fallback to GetDefaultMTU; convert tests to table-driven cases covering standalone and HyperShift scenarios.

Sequence Diagram

sequenceDiagram
    participant Controller as Controller\n(operconfig_controller)
    participant Renderer as Renderer\n(render)
    participant OVN as OVN\n(ovn_kubernetes)
    participant MTUProbe as MTUProbe\n(mtu_probe)

    Controller->>Controller: Determine isHyperShift (HostedControlPlane != nil)
    Controller->>Renderer: FillDefaults(conf, previous, hostMTU, isHyperShift)
    Renderer->>OVN: fillOVNKubernetesDefaults(conf, previous, hostMTU, isHyperShift)

    alt sc.MTU is nil
        OVN->>OVN: Check previous MTU
        alt previous MTU exists
            OVN->>OVN: Use previous MTU
        else hostMTU > 0
            OVN->>OVN: Use hostMTU - encapOverhead
        else isHyperShift == true
            OVN->>OVN: Use safe default (1500) - encapOverhead
        else
            OVN->>MTUProbe: GetDefaultMTU()
            MTUProbe-->>OVN: Return probed MTU
            OVN->>OVN: Use probed MTU or error if invalid
        end
    end

    OVN->>Renderer: Return spec with MTU set
    Renderer->>Controller: Return configured spec
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Structure And Quality ❓ Inconclusive Unable to verify Ginkgo test code quality without access to actual test files and PR contents. Provide specific test file paths, code content, or PR details to assess single responsibility, setup/cleanup, timeouts, assertion messages, and pattern consistency.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and accurately summarizes the main change: adding MTU default handling for all HyperShift platforms, which is the core objective of this PR.
Stable And Deterministic Test Names ✅ Passed All test names in modified files use stable, deterministic static strings without dynamic identifiers.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
📝 Coding Plan
  • Generate coding plan for human review comments

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.3)

level=error msg="Running error: context loading failed: failed to load packages: failed to load packages: failed to load with go/packages: err: exit status 1: stderr: go: inconsistent vendoring in :\n\tgithub.com/Masterminds/semver@v1.5.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/Masterminds/sprig/v3@v3.2.3: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/containernetworking/cni@v0.8.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/ghodss/yaml@v1.0.1-0.20190212211648-25d852aebe32: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/go-bindata/go-bindata@v3.1.2+incompatible: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/onsi/gomega@v1.38.1: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/ope

... [truncated 17231 characters] ...

ired in go.mod, but not marked as explicit in vendor/modules.txt\n\tk8s.io/gengo/v2@v2.0.0-20250922181213-ec3ebc5fd46b: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tk8s.io/kms@v0.34.1: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tk8s.io/kube-aggregator@v0.34.1: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tsigs.k8s.io/randfill@v1.0.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tsigs.k8s.io/structured-merge-diff/v6@v6.3.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\n\tTo ignore the vendor directory, use -mod=readonly or -mod=mod.\n\tTo sync the vendor directory, run:\n\t\tgo mod vendor\n"


Comment @coderabbitai help to get the list of available commands and usage tips.

@jparrill jparrill changed the title fix(mtu): handle MTU defaults for all HyperShift platforms OCPBUGS-77078: fix(mtu): handle MTU defaults for all HyperShift platforms Mar 13, 2026
@openshift-ci-robot openshift-ci-robot added jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Mar 13, 2026
@openshift-ci-robot
Copy link
Contributor

@jparrill: This pull request references Jira Issue OCPBUGS-77078, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira (yli2@redhat.com), skipping review request.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Summary

Fixes OCPBUGS-77078: Hardcoded MTU on hosted cluster nodes in AWS is incompatible with ISO/classified regions.

In HyperShift the MTU prober job cannot run because CNO has no access to guest worker nodes. Previously only AWS (9001) and Azure (1500) had hardcoded defaults in probeMTU; all other platforms fell through to the prober job and failed silently. Additionally, the GetDefaultMTU() fallback in fillOVNKubernetesDefaults would return the management cluster's pod network MTU — not the hosted cluster's worker node uplink MTU.

This PR adds proper handling for all three MTU scenarios and guards against the incorrect GetDefaultMTU() fallback in HyperShift.

Companion HyperShift PR: openshift/hypershift#7942

MTU Resolution Flow

                    ┌─────────────────────────┐
                    │  Network CR has MTU set? │
                    │  (--ovn-kubernetes-mtu)  │
                    └────────┬────────────────┘
                             │
                   ┌─────────┴─────────┐
                   │ YES               │ NO
                   ▼                   ▼
         ┌─────────────────┐  ┌────────────────────┐
         │ NeedMTUProbe()  │  │ NeedMTUProbe()      │
         │ returns false   │  │ returns true         │
         │                 │  │                      │
         │ FillDefaults    │  │ probeMTU() called    │
         │ preserves value │  └──────┬───────────────┘
         └─────────────────┘         │
                             ┌───────┴────────┐
                             │ HyperShift?    │
                             │(HCP != nil)    │
                             └───┬────────┬───┘
                           YES  │        │  NO
                                ▼        ▼
                   ┌──────────────┐  ┌──────────────────┐
                   │ Platform     │  │ Deploy prober    │
                   │ defaults:    │  │ job on worker    │
                   │ AWS = 9001   │  │ node to detect   │
                   │ Azure = 1500 │  │ host MTU         │
                   │ Other = 1500 │  └──────────────────┘
                   └──────┬───────┘          │
                          └──────┬───────────┘
                                 ▼
                   ┌──────────────────────────┐
                   │ FillDefaults computes:   │
                   │ tunnel MTU = host MTU    │
                   │   - encap overhead       │
                   │   (100 geneve,           │
                   │    +46 if IPsec)         │
                   └──────────────────────────┘

Three scenarios resolved

  1. User explicitly sets MTU (e.g. --ovn-kubernetes-mtu 1300): NeedMTUProbe returns false, probeMTU is never called, FillDefaults preserves the user value as-is. Works on all platforms.

  2. No user MTU, commercial AWS (jumbo frames): probeMTU returns 9001, FillDefaults computes 9001 - 100 = 8901. Same behavior as before.

  3. No user MTU, ISO/classified AWS or other platforms: Previously broken — prober job couldn't run and GetDefaultMTU() returned the wrong MTU. Now probeMTU returns a safe default (1500 for non-AWS/Azure), and fillOVNKubernetesDefaults guards against the GetDefaultMTU() fallback in HyperShift. Users in ISO regions should set MTU explicitly via --ovn-kubernetes-mtu.

Changes

pkg/controller/operconfig/mtu_probe.go

  • Add default case in HyperShift switch returning defaultHCPMTU = 1500 for all platforms without a known uplink MTU
  • Fix Azure log message that incorrectly said "AWS cluster"
  • Improve godoc explaining the full flow

pkg/controller/operconfig/mtu_probe_test.go

  • Add test cases for all HyperShift platforms: OpenStack, KubeVirt, PowerVS, IBMCloud, GCP, None, BareMetal (Agent)
  • Update "Unknown platform on HyperShift" to expect 1500 instead of reading from configmap

pkg/controller/operconfig/operconfig_controller.go

  • Derive isHyperShift from infraStatus.HostedControlPlane != nil
  • Pass isHyperShift to FillDefaults calls
  • Add comment documenting the MTU flow

pkg/network/ovn_kubernetes.go

  • Add isHyperShift bool parameter to fillOVNKubernetesDefaults
  • Replace if-else with switch (4 cases: previous, probed, hypershift-safe-default, standalone-fallback)
  • Guard against GetDefaultMTU() in HyperShift — it returns the management cluster's pod MTU, not the hosted cluster's worker node uplink

pkg/network/render.go

  • Propagate isHyperShift bool through FillDefaults and fillDefaultNetworkDefaults

pkg/network/ovn_kubernetes_test.go

  • Unified table-driven TestFillOVNKubernetesDefaults covering 5 scenarios:
  • Standalone with probed MTU
  • Standalone with IPsec overhead
  • User explicitly sets MTU (standalone)
  • User explicitly sets MTU (HyperShift)
  • HyperShift with no user MTU and no probe

pkg/network/render_test.go

  • Update fillDefaults helper for new isHyperShift parameter

Test Plan

  • Verify tests compile: CGO_ENABLED=0 GOOS=linux go test -c ./pkg/network/ ./pkg/controller/operconfig/
  • CI runs all unit tests on Linux
  • E2E: Deploy a HyperShift hosted cluster on AWS without --ovn-kubernetes-mtu → MTU should be 8901
  • E2E: Deploy a HyperShift hosted cluster with --ovn-kubernetes-mtu 1300 → MTU should be 1300
  • Verify standalone (non-HyperShift) clusters are unaffected

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 13, 2026
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 13, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jparrill
Once this PR has been reviewed and has the lgtm label, please assign tssurya for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
pkg/network/ovn_kubernetes_test.go (1)

887-965: Add a HyperShift+IPsec safety-net test case.

The table already covers most branches; adding isHyperShift=true, hostMTU=0, and IPsecConfig enabled would lock in the expected 1500 - (geneve + ipsec) behavior.

➕ Suggested table entry
 		{
 			name: "When HyperShift with no user MTU and no probed MTU it should use safe default minus geneve overhead",
 			setup: func(conf *operv1.NetworkSpec) {
 				conf.DefaultNetwork.OVNKubernetesConfig = nil
 			},
 			hostMTU:      0,
 			isHyperShift: true,
 			expectedMTU:  1400, // 1500 (safe default) - 100 (geneve overhead)
 		},
+		{
+			name: "When HyperShift safety-net is used with IPsec it should include IPsec overhead",
+			setup: func(conf *operv1.NetworkSpec) {
+				conf.DefaultNetwork.OVNKubernetesConfig.IPsecConfig = &operv1.IPsecConfig{}
+			},
+			hostMTU:      0,
+			isHyperShift: true,
+			expectedMTU:  1354, // 1500 - 100 (geneve) - 46 (IPsec)
+		},
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/network/ovn_kubernetes_test.go` around lines 887 - 965, Add a new table
entry in TestFillOVNKubernetesDefaults to cover HyperShift + IPsec: create a
test case where setup sets conf.DefaultNetwork.OVNKubernetesConfig.IPsecConfig =
&operv1.IPsecConfig{} (or ensures it's non-nil), hostMTU = 0, isHyperShift =
true, and expectedMTU = 1354 (1500 - 100 geneve - 46 ipsec); this ensures
fillOVNKubernetesDefaults (called in the test) validates the safe-default minus
combined geneve and IPsec overhead behavior for OVNKubernetesConfig when
OVNKubernetesConfig and IPsecConfig are present on HyperShift.
pkg/network/ovn_kubernetes.go (1)

1085-1091: Extract the HyperShift safe-default MTU to a named constant.

Using a named constant here avoids drift with other MTU default paths and makes future tuning safer.

♻️ Suggested cleanup
+const defaultHyperShiftHostMTU uint32 = 1500
...
 		case isHyperShift:
 			// In HyperShift, GetDefaultMTU() reads the management cluster's pod
 			// network interface, not the hosted cluster's worker node uplink.
 			// This path should not be reached because probeMTU always returns a
 			// valid value for HyperShift, but guard against it as a safety measure.
 			log.Printf("WARNING: No probed MTU available for HyperShift cluster, using safe default of 1500")
-			mtu = uint32(1500) - getOVNEncapOverhead(conf)
+			mtu = defaultHyperShiftHostMTU - getOVNEncapOverhead(conf)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/network/ovn_kubernetes.go` around lines 1085 - 1091, Replace the
hard-coded 1500 literal used for the HyperShift safe-default MTU with a named
constant (e.g., defaultHyperShiftMTU) to avoid drift and make tuning explicit;
in the isHyperShift branch (where probeMTU can be absent) declare the constant
near related MTU defaults, then change the assignment mtu = uint32(1500) -
getOVNEncapOverhead(conf) to use that constant (e.g., mtu =
uint32(defaultHyperShiftMTU) - getOVNEncapOverhead(conf)) and update the
log.Printf message if desired to reference the constant name for clarity; keep
references to isHyperShift, probeMTU, getOVNEncapOverhead, and mtu so the change
is localized and consistent with other MTU default paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pkg/network/ovn_kubernetes_test.go`:
- Around line 887-965: Add a new table entry in TestFillOVNKubernetesDefaults to
cover HyperShift + IPsec: create a test case where setup sets
conf.DefaultNetwork.OVNKubernetesConfig.IPsecConfig = &operv1.IPsecConfig{} (or
ensures it's non-nil), hostMTU = 0, isHyperShift = true, and expectedMTU = 1354
(1500 - 100 geneve - 46 ipsec); this ensures fillOVNKubernetesDefaults (called
in the test) validates the safe-default minus combined geneve and IPsec overhead
behavior for OVNKubernetesConfig when OVNKubernetesConfig and IPsecConfig are
present on HyperShift.

In `@pkg/network/ovn_kubernetes.go`:
- Around line 1085-1091: Replace the hard-coded 1500 literal used for the
HyperShift safe-default MTU with a named constant (e.g., defaultHyperShiftMTU)
to avoid drift and make tuning explicit; in the isHyperShift branch (where
probeMTU can be absent) declare the constant near related MTU defaults, then
change the assignment mtu = uint32(1500) - getOVNEncapOverhead(conf) to use that
constant (e.g., mtu = uint32(defaultHyperShiftMTU) - getOVNEncapOverhead(conf))
and update the log.Printf message if desired to reference the constant name for
clarity; keep references to isHyperShift, probeMTU, getOVNEncapOverhead, and mtu
so the change is localized and consistent with other MTU default paths.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 67632cbd-22a0-46be-b1c6-73980a18680f

📥 Commits

Reviewing files that changed from the base of the PR and between a57b306 and cf71d8d.

📒 Files selected for processing (7)
  • pkg/controller/operconfig/mtu_probe.go
  • pkg/controller/operconfig/mtu_probe_test.go
  • pkg/controller/operconfig/operconfig_controller.go
  • pkg/network/ovn_kubernetes.go
  • pkg/network/ovn_kubernetes_test.go
  • pkg/network/render.go
  • pkg/network/render_test.go

@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 13, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
pkg/network/ovn_kubernetes_test.go (1)

1202-1210: Add a HyperShift+IPsec MTU defaulting test case.

This table now covers HyperShift safe-default MTU, but it does not assert the IPsec overhead path (1500 - 100 - 46), which is part of the new MTU rules.

🧪 Suggested test-case addition
+		{
+			name: "When HyperShift with IPsec and no user/probed MTU it should use safe default minus geneve+IPsec overhead",
+			setup: func(conf *operv1.NetworkSpec) {
+				conf.DefaultNetwork.OVNKubernetesConfig = &operv1.OVNKubernetesConfig{
+					IPsecConfig: &operv1.IPsecConfig{Mode: operv1.IPsecModeFull},
+				}
+			},
+			hostMTU:      0,
+			isHyperShift: true,
+			expectedMTU:  1354, // 1500 - 100 - 46
+		},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/network/ovn_kubernetes_test.go` around lines 1202 - 1210, Add a new table
test case alongside the existing HyperShift case in ovn_kubernetes_test.go to
assert the IPsec overhead path: create a case named something like "When
HyperShift with no user MTU and no probed MTU with IPsec it should use safe
default minus geneve and ipsec overhead" that leaves
DefaultNetwork.OVNKubernetesConfig nil for HyperShift but ensures IPsec is
enabled in the cluster config used by the test, set hostMTU to 0 and
isHyperShift to true, and set expectedMTU to 1354 (1500 safe default − 100
geneve − 46 IPsec); ensure the test harness that computes MTU (the function
exercised by these table cases) picks up the IPsec flag so this path is
validated.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/network/ovn_kubernetes_test.go`:
- Around line 1175-1177: The test's previous function currently returns the same
pointer as conf which aliases the current spec; change it to return an
independent copy to avoid masking defaulting mutations—e.g. in the anonymous
previous func that takes (conf *operv1.NetworkSpec) return conf.DeepCopy() if
operv1.NetworkSpec implements DeepCopy(), otherwise allocate and return a new
operv1.NetworkSpec value copied from *conf (e.g. ns := *conf; return &ns) so
previous no longer points to the same object as conf in the IPsec test case.

---

Nitpick comments:
In `@pkg/network/ovn_kubernetes_test.go`:
- Around line 1202-1210: Add a new table test case alongside the existing
HyperShift case in ovn_kubernetes_test.go to assert the IPsec overhead path:
create a case named something like "When HyperShift with no user MTU and no
probed MTU with IPsec it should use safe default minus geneve and ipsec
overhead" that leaves DefaultNetwork.OVNKubernetesConfig nil for HyperShift but
ensures IPsec is enabled in the cluster config used by the test, set hostMTU to
0 and isHyperShift to true, and set expectedMTU to 1354 (1500 safe default − 100
geneve − 46 IPsec); ensure the test harness that computes MTU (the function
exercised by these table cases) picks up the IPsec flag so this path is
validated.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c6435c97-08ed-480a-87d3-73632557155d

📥 Commits

Reviewing files that changed from the base of the PR and between cf71d8d and afdb5d4.

📒 Files selected for processing (7)
  • pkg/controller/operconfig/mtu_probe.go
  • pkg/controller/operconfig/mtu_probe_test.go
  • pkg/controller/operconfig/operconfig_controller.go
  • pkg/network/ovn_kubernetes.go
  • pkg/network/ovn_kubernetes_test.go
  • pkg/network/render.go
  • pkg/network/render_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/network/ovn_kubernetes.go
  • pkg/network/render_test.go
  • pkg/controller/operconfig/operconfig_controller.go

Comment on lines +1175 to 1177
previous: func(conf *operv1.NetworkSpec) *operv1.NetworkSpec {
return conf
},
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Avoid aliasing previous and conf in the IPsec test case.

previous currently returns the same pointer as conf, which can hide regressions when defaulting mutates the current spec.

💡 Proposed fix
-			previous: func(conf *operv1.NetworkSpec) *operv1.NetworkSpec {
-				return conf
-			},
+			previous: func(conf *operv1.NetworkSpec) *operv1.NetworkSpec {
+				return conf.DeepCopy()
+			},
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
previous: func(conf *operv1.NetworkSpec) *operv1.NetworkSpec {
return conf
},
previous: func(conf *operv1.NetworkSpec) *operv1.NetworkSpec {
return conf.DeepCopy()
},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/network/ovn_kubernetes_test.go` around lines 1175 - 1177, The test's
previous function currently returns the same pointer as conf which aliases the
current spec; change it to return an independent copy to avoid masking
defaulting mutations—e.g. in the anonymous previous func that takes (conf
*operv1.NetworkSpec) return conf.DeepCopy() if operv1.NetworkSpec implements
DeepCopy(), otherwise allocate and return a new operv1.NetworkSpec value copied
from *conf (e.g. ns := *conf; return &ns) so previous no longer points to the
same object as conf in the IPsec test case.

…7078)

In HyperShift the MTU prober job cannot run because CNO has no access
to guest worker nodes. Previously only AWS (9001) and Azure (1500) had
hardcoded defaults; other platforms fell through to the prober and
failed silently.

MTU resolution flow:
1. User sets tunnel MTU (--ovn-kubernetes-mtu) → preserved as-is
2. No user MTU → probeMTU returns platform default (AWS=9001,
   Azure=1500, others=1500), FillDefaults subtracts encap overhead
3. Re-reconcile → FillDefaults carries forward the previous value

Changes:
- probeMTU: add default case returning 1500 for non-AWS/Azure HyperShift
  platforms, fix Azure log message that incorrectly said "AWS cluster"
- fillOVNKubernetesDefaults: accept isHyperShift bool, use switch with
  4 cases (previous, probed, hypershift-safe-default, standalone-fallback)
  to prevent GetDefaultMTU() from returning the management cluster MTU
- FillDefaults/fillDefaultNetworkDefaults: propagate isHyperShift
- Tests: table-driven tests for fillOVNKubernetesDefaults covering all
  scenarios, platform coverage for probeMTU

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Juan Manuel Parrilla Madrid <jparrill@redhat.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
pkg/network/ovn_kubernetes_test.go (1)

1175-1177: ⚠️ Potential issue | 🟡 Minor

Avoid aliasing previous with the spec under test.

Line 1175 returns conf directly, so previous is mutated alongside the object being defaulted. That can hide regressions in the preserved-MTU path; return a copy instead.

💡 Proposed fix
 			previous: func(conf *operv1.NetworkSpec) *operv1.NetworkSpec {
-				return conf
+				return conf.DeepCopy()
 			},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/network/ovn_kubernetes_test.go` around lines 1175 - 1177, The test helper
that sets previous currently returns the same pointer (previous: func(conf
*operv1.NetworkSpec) *operv1.NetworkSpec { return conf }), which aliases the
spec under test and allows mutations to hide regressions; change that function
to return an independent copy of conf (e.g., use conf.DeepCopy() or otherwise
clone the NetworkSpec) so previous is not mutated during defaulting and the
preserved-MTU path is correctly tested.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@pkg/network/ovn_kubernetes_test.go`:
- Around line 1175-1177: The test helper that sets previous currently returns
the same pointer (previous: func(conf *operv1.NetworkSpec) *operv1.NetworkSpec {
return conf }), which aliases the spec under test and allows mutations to hide
regressions; change that function to return an independent copy of conf (e.g.,
use conf.DeepCopy() or otherwise clone the NetworkSpec) so previous is not
mutated during defaulting and the preserved-MTU path is correctly tested.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 85715fb7-b78c-4c76-ba36-c4a05f6f90c1

📥 Commits

Reviewing files that changed from the base of the PR and between afdb5d4 and 07fc2e0.

📒 Files selected for processing (7)
  • pkg/controller/operconfig/mtu_probe.go
  • pkg/controller/operconfig/mtu_probe_test.go
  • pkg/controller/operconfig/operconfig_controller.go
  • pkg/network/ovn_kubernetes.go
  • pkg/network/ovn_kubernetes_test.go
  • pkg/network/render.go
  • pkg/network/render_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/controller/operconfig/mtu_probe.go
  • pkg/controller/operconfig/operconfig_controller.go
  • pkg/network/render.go

@jparrill
Copy link
Contributor Author

jparrill commented Mar 13, 2026

just FWIW, the 26 Security checks were failing before our PR.

@jparrill
Copy link
Contributor Author

/retest-required

1 similar comment
@jparrill
Copy link
Contributor Author

/retest-required

@jparrill
Copy link
Contributor Author

Hey @kyrtapz do you mind review?

}

func fillOVNKubernetesDefaults(conf, previous *operv1.NetworkSpec, hostMTU int) {
func fillOVNKubernetesDefaults(conf, previous *operv1.NetworkSpec, hostMTU int, isHyperShift bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a need for changing this, it should be enough to change the prober logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Humm maybe, the scenario we need to address with this PR is when the prober fails. other ones I think are mostly covered. Maybe we can set a default if the prober fails from the beginning?

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 16, 2026

@jparrill: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/security 07fc2e0 link false /test security
ci/prow/e2e-metal-ipi-ovn-dualstack-bgp 07fc2e0 link true /test e2e-metal-ipi-ovn-dualstack-bgp
ci/prow/e2e-metal-ipi-ovn-dualstack-bgp-local-gw 07fc2e0 link true /test e2e-metal-ipi-ovn-dualstack-bgp-local-gw

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants