Skip to content

OCPBUGS-59176: fix several failing tests in custom-dns jobs#31129

Open
sadasu wants to merge 2 commits intoopenshift:mainfrom
sadasu:bug-59176
Open

OCPBUGS-59176: fix several failing tests in custom-dns jobs#31129
sadasu wants to merge 2 commits intoopenshift:mainfrom
sadasu:bug-59176

Conversation

@sadasu
Copy link
Copy Markdown
Contributor

@sadasu sadasu commented May 5, 2026

Some e2e tests are failing with the job "gcp-custom-dns" for featuregate "GCPClusterHostedDNSInstall" which is promoted to GA in 4.20. In the "custom-dns" cluster OpenShift will start static CoreDNS pods to provide DNS resolution for API, Internal API and Ingress services that are essential for cluster creation. After cluster deployment is completed, the customer will update their external DNS solution with the same assigned LB IP addresses used for the configuration of the internal CoreDNS instance.

The failing tests like http2 and grpc tests use dedicated ingresscontrollers, and gateway also has separated LB and dnsrecord, so the default wildcard created by the new static CoreDNS won't work for those tests.

Make below changes to fix the failing tests:

  • Update makeHTTPClient() DialContext to target LoadBalancer IP address directly when DNS is unmanaged, fixing both http2 and gatewayapi httproute tests.
  • Update grpc Dial() to target LoadBalancer IP address directly when DNS is unmanaged, fixing the grpc test.
  • Extract getLoadBalancerAddress() as a reusable helper from assertGatewayLoadbalancerReady() for use across http2, grpc and gatewayapi tests.
  • Update gatewayapi assertDNSRecordStatus() to check the Published condition per-zone based on whether DNS is managed or unmanaged.
  • Add isDNSManaged() helper that checks the DNSManaged condition on the default ingresscontroller, defaulting to managed=true when the condition is absent.
  • Add getClusterBaseDomainName() and update http2/grpc/h2spec shard ingresscontroller to generate domain based on cluster baseDomain instead of ingress domain (e.g. "e2e-test-xxx.apps.baseDomain"), avoiding custom ingresscontroller DNS overlapping with default wildcard "*.apps.baseDomain".

Summary by CodeRabbit

  • Tests
    • Router connectivity tests now always validate route connectivity using both route hostnames and gateway load‑balancer addresses.
    • DNS record checks distinguish managed vs unmanaged zones and expect published status accordingly.
    • HTTP/2 and gRPC tests can optionally target explicit load‑balancer addresses when DNS is unmanaged for more reliable dialing.
    • Domain resolution now uses the cluster base domain for shard/route lookups.

@openshift-merge-bot
Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci-robot openshift-ci-robot added jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. labels May 5, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@sadasu: This pull request references Jira Issue OCPBUGS-59176, which is invalid:

  • expected the bug to target either version "5.0." or "openshift-5.0.", but it targets "4.20.z" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Some e2e tests are failing with the job "gcp-custom-dns" for featuregate "GCPClusterHostedDNSInstall" which is promoted to GA in 4.20. In the "custom-dns" cluster OpenShift will start static CoreDNS pods to provide DNS resolution for API, Internal API and Ingress services that are essential for cluster creation. After cluster deployment is completed, the customer will update their external DNS solution with the same assigned LB IP addresses used for the configuration of the internal CoreDNS instance.

The failing tests like http2 and grpc tests use dedicated ingresscontrollers, and gateway also has separated LB and dnsrecord, so the default wildcard created by the new static CoreDNS won't work for those tests.

Make below changes to fix the failing tests:

  • Update makeHTTPClient() DialContext to target LoadBalancer IP address directly when DNS is unmanaged, fixing both http2 and gatewayapi httproute tests.
  • Update grpc Dial() to target LoadBalancer IP address directly when DNS is unmanaged, fixing the grpc test.
  • Extract getLoadBalancerAddress() as a reusable helper from assertGatewayLoadbalancerReady() for use across http2, grpc and gatewayapi tests.
  • Update gatewayapi assertDNSRecordStatus() to check the Published condition per-zone based on whether DNS is managed or unmanaged.
  • Add isDNSManaged() helper that checks the DNSManaged condition on the default ingresscontroller, defaulting to managed=true when the condition is absent.
  • Add getClusterBaseDomainName() and update http2/grpc/h2spec shard ingresscontroller to generate domain based on cluster baseDomain instead of ingress domain (e.g. "e2e-test-xxx.apps.baseDomain"), avoiding custom ingresscontroller DNS overlapping with default wildcard "*.apps.baseDomain".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label May 5, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 5, 2026

Walkthrough

Tests and helpers were changed to detect whether cluster DNS is managed, prefer the cluster base domain, resolve load‑balancer addresses only when DNS is unmanaged, and route HTTP/gRPC client connections either via DNS or by dialing the LB address.

Changes

DNS-aware routing, LB resolution, and client dialing

Layer / File(s) Summary
Domain & DNS Detection
test/extended/router/http2.go
Adds getClusterBaseDomainName(oc, timeout) and isDNSManaged(oc, timeout) to obtain the cluster base domain and determine whether the default IngressController reports DNSManaged (defaults to managed if condition absent).
Load Balancer Address Resolution
test/extended/router/gatewayapicontroller.go
Adds getLoadBalancerAddress(oc, serviceName) that polls Service.status.loadBalancer.ingress and returns hostname/IP; refactors assertGatewayLoadbalancerReady to call it.
DNSRecord Verification
test/extended/router/gatewayapicontroller.go
Rewrites assertDNSRecordStatus to query DNS management and poll DNSRecord zones, expecting Published=True when managed and not True when unmanaged.
HTTP Client Dialing
test/extended/router/http2.go, test/extended/router/idle.go
makeHTTPClient(..., lbAddress) added; when lbAddress non-empty, DialContext rewrites connections to lbAddress:originalPort. Call sites updated to pass empty lbAddress where no override is needed.
gRPC Dialing
test/extended/router/grpc-interop/clientconn.go, test/extended/router/grpc-interop.go
DialParams gains Target string; Dial installs a context dialer to connect to Target:<original-port> when set. grpcExecTestCases signature updated to accept lbAddress and passes it to DialParams.
Test Integration / Routing Changes
test/extended/router/gatewayapicontroller.go, test/extended/router/grpc-interop.go, test/extended/router/h2spec.go, test/extended/router/http2.go
Tests now use the cluster base domain, call isDNSManaged() and compute lbAddress only when DNS unmanaged, pass lbAddress into clients; assertHttpRouteConnection updated to accept CLI and gateway service name and is always invoked with gateway service + hostname.
Minor cleanups
test/extended/router/gatewayapicontroller.go
Removed unused crypto/tls import and centralized load balancer address extraction to getLoadBalancerAddress.

Sequence Diagram(s)

sequenceDiagram
    participant Test as Router Test
    participant DNS as DNS API
    participant Ingress as IngressController
    participant LB as Load Balancer Service
    participant Client as HTTP/gRPC Client
    participant Route as Router Endpoint

    Test->>DNS: getClusterBaseDomainName()
    DNS-->>Test: baseDomain

    Test->>Ingress: isDNSManaged()
    Ingress-->>Test: DNSManaged status

    alt DNS unmanaged
        Test->>LB: getLoadBalancerAddress(serviceName)
        LB-->>Test: lbAddress (IP/hostname)
    else DNS managed
        Note over Test: lbAddress = ""
    end

    alt lbAddress provided
        Test->>Client: makeHTTPClient(..., lbAddress) / DialParams.Target=lbAddress
        Note over Client: custom dialer uses lbAddress:port
        Client->>LB: tcp connect to lbAddress:port
        LB-->>Client: connection
    else DNS resolution
        Client->>DNS: resolve route hostname
        DNS-->>Client: route IP
        Client->>Route: connect to route IP:port
        Route-->>Client: connection
    end

    Client->>Route: HTTP/gRPC request
    Route-->>Client: response
    Test->>Client: verify success
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 73.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (11 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically summarizes the main change: fixing failing tests in custom-dns jobs related to a specific bug number.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed All test names are stable and deterministic. Test titles use static descriptive strings with no dynamic content. Dynamic values appear only in test bodies within g.By() statements.
Test Structure And Quality ✅ Passed Tests meet all quality requirements: single responsibility, proper setup/cleanup patterns, comprehensive timeouts, meaningful assertion messages, and codebase consistency.
Microshift Test Compatibility ✅ Passed No new Ginkgo tests added. Only helper functions modified. Existing tests have [apigroup:...] tags protecting them on MicroShift.
Single Node Openshift (Sno) Test Compatibility ✅ Passed This PR modifies existing test helpers and logic only. No new Ginkgo test blocks are added. The check applies only to new tests.
Topology-Aware Scheduling Compatibility ✅ Passed All modified files are e2e test utilities in test/extended/router/, not deployment manifests, operator code, or controllers. No scheduling constraints were added.
Ote Binary Stdout Contract ✅ Passed PR modifies only test files. All logging uses e2e.Logf() (test framework) or returns errors. No process-level functions. All helper functions called from test blocks only.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed PR modifies existing e2e tests, not adding new ones. Code properly handles IPv6 with net.JoinHostPort(). Existing tests have IPv6 detection. No external connectivity required.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 5, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sadasu
Once this PR has been reviewed and has the lgtm label, please assign candita for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot requested review from gcs278 and rikatz May 5, 2026 14:55
@sadasu
Copy link
Copy Markdown
Contributor Author

sadasu commented May 5, 2026

This is basically the fix added via #30131 with rebase conflicts fixed.

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 5, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@sadasu: This pull request references Jira Issue OCPBUGS-59176, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @melvinjoseph86

Details

In response to this:

Some e2e tests are failing with the job "gcp-custom-dns" for featuregate "GCPClusterHostedDNSInstall" which is promoted to GA in 4.20. In the "custom-dns" cluster OpenShift will start static CoreDNS pods to provide DNS resolution for API, Internal API and Ingress services that are essential for cluster creation. After cluster deployment is completed, the customer will update their external DNS solution with the same assigned LB IP addresses used for the configuration of the internal CoreDNS instance.

The failing tests like http2 and grpc tests use dedicated ingresscontrollers, and gateway also has separated LB and dnsrecord, so the default wildcard created by the new static CoreDNS won't work for those tests.

Make below changes to fix the failing tests:

  • Update makeHTTPClient() DialContext to target LoadBalancer IP address directly when DNS is unmanaged, fixing both http2 and gatewayapi httproute tests.
  • Update grpc Dial() to target LoadBalancer IP address directly when DNS is unmanaged, fixing the grpc test.
  • Extract getLoadBalancerAddress() as a reusable helper from assertGatewayLoadbalancerReady() for use across http2, grpc and gatewayapi tests.
  • Update gatewayapi assertDNSRecordStatus() to check the Published condition per-zone based on whether DNS is managed or unmanaged.
  • Add isDNSManaged() helper that checks the DNSManaged condition on the default ingresscontroller, defaulting to managed=true when the condition is absent.
  • Add getClusterBaseDomainName() and update http2/grpc/h2spec shard ingresscontroller to generate domain based on cluster baseDomain instead of ingress domain (e.g. "e2e-test-xxx.apps.baseDomain"), avoiding custom ingresscontroller DNS overlapping with default wildcard "*.apps.baseDomain".

Summary by CodeRabbit

  • Tests
  • Improved gateway and HTTP route validation testing to properly handle managed vs. unmanaged DNS scenarios.
  • Enhanced load balancer address resolution in router connectivity tests.
  • Refactored test infrastructure to use cluster base domain for more accurate domain resolution across different DNS configurations.
  • Better test coverage for gRPC and HTTP/2 routing scenarios with varied DNS management states.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@sadasu
Copy link
Copy Markdown
Contributor Author

sadasu commented May 5, 2026

/jira refresh

@openshift-ci openshift-ci Bot requested a review from melvinjoseph86 May 5, 2026 14:56
@openshift-ci-robot
Copy link
Copy Markdown

@sadasu: This pull request references Jira Issue OCPBUGS-59176, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @melvinjoseph86

Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/extended/router/gatewayapicontroller.go`:
- Around line 406-407: The test currently always runs assertHttpRouteConnection
which for unmanaged-DNS paths uses getLoadBalancerAddress and therefore requires
a LoadBalancer; guard these checks by first verifying the platform supports load
balancers (check loadBalancerSupported) and only call assertHttpRouteConnection
/ getLoadBalancerAddress when loadBalancerSupported is true (while preserving
the test coverage for unmanaged-DNS on platforms where an LB exists); apply the
same guard around the other occurrences referenced (the block around lines
1079-1097) so tests on platforms without LB support skip the connectivity check
instead of hanging.

In `@test/extended/router/http2.go`:
- Around line 644-671: The new isDNSManaged(oc *exutil.CLI, timeout
time.Duration) conflicts with the existing isDNSManaged(oc *exutil.CLI) in
package router; rename the new function (e.g., isDNSManagedWithTimeout or
isDNSManagedTimeout) and update all callers to the new name and signature:
change calls at http2.go:504, grpc-interop.go:333, and gatewayapicontroller.go
lines referenced (around 807 and 1080) to pass the timeout argument and call the
renamed function; ensure the function body and returned types remain the same
and only the identifier and call sites are adjusted.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: d894b75a-2375-4917-96ed-96cf3881c44f

📥 Commits

Reviewing files that changed from the base of the PR and between 954d227 and 867e786.

📒 Files selected for processing (6)
  • test/extended/router/gatewayapicontroller.go
  • test/extended/router/grpc-interop.go
  • test/extended/router/grpc-interop/clientconn.go
  • test/extended/router/h2spec.go
  • test/extended/router/http2.go
  • test/extended/router/idle.go

Comment on lines +406 to 407
assertHttpRouteConnection(oc, gw+"-openshift-default", defaultRoutename)
})
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Unmanaged-DNS path now implicitly requires a LoadBalancer.

This always executes connectivity checks, and for unmanaged DNS it always resolves through getLoadBalancerAddress(...). On platforms where loadBalancerSupported == false, this can hang/fail waiting for an LB that never appears.

Suggested guard (preserve unmanaged-DNS coverage where LB exists)
-assertHttpRouteConnection(oc, gw+"-openshift-default", defaultRoutename)
+if managedDNS || loadBalancerSupported {
+    assertHttpRouteConnection(oc, gw+"-openshift-default", defaultRoutename)
+} else {
+    g.Skip("Skipping HTTPRoute connectivity check: unmanaged DNS and no LoadBalancer support")
+}

Also applies to: 1079-1097

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/extended/router/gatewayapicontroller.go` around lines 406 - 407, The
test currently always runs assertHttpRouteConnection which for unmanaged-DNS
paths uses getLoadBalancerAddress and therefore requires a LoadBalancer; guard
these checks by first verifying the platform supports load balancers (check
loadBalancerSupported) and only call assertHttpRouteConnection /
getLoadBalancerAddress when loadBalancerSupported is true (while preserving the
test coverage for unmanaged-DNS on platforms where an LB exists); apply the same
guard around the other occurrences referenced (the block around lines 1079-1097)
so tests on platforms without LB support skip the connectivity check instead of
hanging.

Comment on lines +644 to +671
func isDNSManaged(oc *exutil.CLI, timeout time.Duration) (bool, error) {
var status operatorv1.ConditionStatus
conditionFound := false

if err := wait.PollUntilContextTimeout(context.Background(), time.Second, timeout, true, func(ctx context.Context) (bool, error) {
ic, err := oc.AdminOperatorClient().OperatorV1().IngressControllers("openshift-ingress-operator").Get(ctx, "default", metav1.GetOptions{})
if err != nil {
e2e.Logf("Failed to get default ingresscontroller: %v, retrying...", err)
return false, nil
}
for _, condition := range ic.Status.Conditions {
if condition.Type == string(operatorv1.DNSManagedIngressConditionType) {
status = condition.Status
conditionFound = true
return true, nil
}
}
// DNSManaged condition not present, assume DNS is managed (default behavior)
e2e.Logf("DNSManaged condition not found on default ingresscontroller, assuming DNS is managed")
return true, nil
}); err != nil {
return false, err
}
if !conditionFound {
return true, nil
}
return status != operatorv1.ConditionFalse, nil
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify duplicate symbol definitions and mixed call signatures.
set -euo pipefail

echo "== Function definitions named isDNSManaged =="
rg -n '^\s*func\s+isDNSManaged\s*\(' test/extended/router

echo
echo "== Call sites of isDNSManaged =="
rg -n '\bisDNSManaged\s*\(' test/extended/router/http2.go test/extended/router/grpc-interop.go test/extended/router/gatewayapicontroller.go

Repository: openshift/origin

Length of output: 998


isDNSManaged is redeclared in package router and will break compilation.

Line 644 introduces func isDNSManaged(oc *exutil.CLI, timeout time.Duration) (bool, error), but test/extended/router/gatewayapicontroller.go:620 already defines func isDNSManaged(oc *exutil.CLI) bool. Go does not support function overloading—both function definitions in the same package cause a redeclaration compile error.

Rename the new function to avoid collision:

-func isDNSManaged(oc *exutil.CLI, timeout time.Duration) (bool, error) {
+func getDefaultIngressDNSManagedStatus(oc *exutil.CLI, timeout time.Duration) (bool, error) {

Update call sites in http2.go:504, grpc-interop.go:333, and gatewayapicontroller.go:807, 1080.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/extended/router/http2.go` around lines 644 - 671, The new
isDNSManaged(oc *exutil.CLI, timeout time.Duration) conflicts with the existing
isDNSManaged(oc *exutil.CLI) in package router; rename the new function (e.g.,
isDNSManagedWithTimeout or isDNSManagedTimeout) and update all callers to the
new name and signature: change calls at http2.go:504, grpc-interop.go:333, and
gatewayapicontroller.go lines referenced (around 807 and 1080) to pass the
timeout argument and call the renamed function; ensure the function body and
returned types remain the same and only the identifier and call sites are
adjusted.

Some e2e tests are failing with the job "gcp-custom-dns" for featuregate
"GCPClusterHostedDNSInstall" which is promoted to GA in 4.20. In the
"custom-dns" cluster OpenShift will start static CoreDNS pods to provide
DNS resolution for API, Internal API and Ingress services that are
essential for cluster creation. After cluster deployment is completed,
the customer will update their external DNS solution with the same
assigned LB IP addresses used for the configuration of the internal
CoreDNS instance.

The failing tests like http2 and grpc tests use dedicated
ingresscontrollers, and gateway also has separated LB and dnsrecord, so
the default wildcard created by the new static CoreDNS won't work for
those tests.

Make below changes to fix the failing tests:

- Update makeHTTPClient() DialContext to target LoadBalancer IP address
  directly when DNS is unmanaged, fixing both http2 and gatewayapi
  httproute tests.
- Update grpc Dial() to target LoadBalancer IP address directly when DNS
  is unmanaged, fixing the grpc test.
- Extract getLoadBalancerAddress() as a reusable helper from
  assertGatewayLoadbalancerReady() for use across http2, grpc and
  gatewayapi tests.
- Update gatewayapi assertDNSRecordStatus() to check the Published
  condition per-zone based on whether DNS is managed or unmanaged.
- Add isDNSManaged() helper that checks the DNSManaged condition on the
  default ingresscontroller, defaulting to managed=true when the
  condition is absent.
- Add getClusterBaseDomainName() and update http2/grpc/h2spec shard
  ingresscontroller to generate domain based on cluster baseDomain
  instead of ingress domain (e.g. "e2e-test-xxx.apps.baseDomain"),
  avoiding custom ingresscontroller DNS overlapping with default
  wildcard "*.apps.baseDomain".
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/extended/router/gatewayapi_upgrade.go`:
- Around line 141-143: The test currently gates connectivity checks behind
t.managedDNS so the unmanaged-DNS code path in assertHttpRouteConnection never
runs; change the condition around the pre- and post-upgrade HTTP checks
(currently checking t.loadBalancerSupported && t.managedDNS) to always run when
t.loadBalancerSupported is true, and call assertHttpRouteConnection with the
appropriate gwServiceName for the unmanaged case (use
gatewayName+"-openshift-default" for managed and pass the gwServiceName argument
when t.managedDNS is false) so both managed and unmanaged DNS paths are
validated; update both occurrences near where assertHttpRouteConnection is
invoked (lines around the current calls that build
gatewayName+"-openshift-default" and the later call that should pass
gwServiceName) accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 5038e62e-a26c-4934-bb58-fc38fb07f58e

📥 Commits

Reviewing files that changed from the base of the PR and between 867e786 and 5b8c8e3.

📒 Files selected for processing (7)
  • test/extended/router/gatewayapi_upgrade.go
  • test/extended/router/gatewayapicontroller.go
  • test/extended/router/grpc-interop.go
  • test/extended/router/grpc-interop/clientconn.go
  • test/extended/router/h2spec.go
  • test/extended/router/http2.go
  • test/extended/router/idle.go
✅ Files skipped from review due to trivial changes (1)
  • test/extended/router/h2spec.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • test/extended/router/grpc-interop/clientconn.go
  • test/extended/router/gatewayapicontroller.go
  • test/extended/router/grpc-interop.go

Comment on lines 141 to +143
if t.loadBalancerSupported && t.managedDNS {
g.By("Verifying HTTP connectivity before upgrade")
assertHttpRouteConnection(t.hostname)
assertHttpRouteConnection(t.oc, t.gatewayName+"-openshift-default", t.hostname)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Unmanaged-DNS connectivity validation is still gated out

Line 141 and Line 185 still require t.managedDNS, so the new unmanaged-DNS path in assertHttpRouteConnection (enabled by passing gwServiceName on Line 143/187) never runs in this upgrade test. This leaves the exact scenario from custom-dns jobs unvalidated here.

Suggested fix
-	if t.loadBalancerSupported && t.managedDNS {
+	if t.loadBalancerSupported {
 		g.By("Verifying HTTP connectivity before upgrade")
 		assertHttpRouteConnection(t.oc, t.gatewayName+"-openshift-default", t.hostname)
 		e2e.Logf("HTTPRoute connectivity verified before upgrade")
 	}
...
-	if t.loadBalancerSupported && t.managedDNS {
+	if t.loadBalancerSupported {
 		g.By("Verifying HTTP connectivity after upgrade")
 		assertHttpRouteConnection(t.oc, t.gatewayName+"-openshift-default", t.hostname)
 	}

Also applies to: 185-187

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/extended/router/gatewayapi_upgrade.go` around lines 141 - 143, The test
currently gates connectivity checks behind t.managedDNS so the unmanaged-DNS
code path in assertHttpRouteConnection never runs; change the condition around
the pre- and post-upgrade HTTP checks (currently checking
t.loadBalancerSupported && t.managedDNS) to always run when
t.loadBalancerSupported is true, and call assertHttpRouteConnection with the
appropriate gwServiceName for the unmanaged case (use
gatewayName+"-openshift-default" for managed and pass the gwServiceName argument
when t.managedDNS is false) so both managed and unmanaged DNS paths are
validated; update both occurrences near where assertHttpRouteConnection is
invoked (lines around the current calls that build
gatewayName+"-openshift-default" and the later call that should pass
gwServiceName) accordingly.

Remove duplicate 'var err error' declaration that caused compilation error.
The err variable is already declared at line 601, so reuse it at line 612.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
test/extended/router/gatewayapicontroller.go (1)

406-406: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Skip the unmanaged-DNS connectivity path when the platform has no LoadBalancer support.

This now falls back to getLoadBalancerAddress(...) for every unmanaged-DNS run. On VSphere/BareMetal/Equinix, loadBalancerSupported is false, so the test waits for a LoadBalancer ingress that will never appear instead of skipping cleanly.

🛠️ Minimal guard
-		assertHttpRouteConnection(oc, gw+"-openshift-default", defaultRoutename)
+		if managedDNS || loadBalancerSupported {
+			assertHttpRouteConnection(oc, gw+"-openshift-default", defaultRoutename)
+		} else {
+			g.Skip("Skipping HTTPRoute connectivity check: unmanaged DNS and no LoadBalancer support")
+		}

Also applies to: 1072-1090

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/extended/router/gatewayapicontroller.go` at line 406, The test calls
assertHttpRouteConnection which triggers getLoadBalancerAddress for the
unmanaged-DNS path and blocks on a LoadBalancer ingress even on platforms
without LoadBalancer support; update the test to check the platform capability
(the loadBalancerSupported flag or equivalent) before invoking the unmanaged-DNS
connectivity path and skip calling assertHttpRouteConnection /
getLoadBalancerAddress when loadBalancerSupported is false (also apply the same
guard to the other occurrences around lines 1072-1090) so the test cleanly skips
the LoadBalancer-dependent checks on VSphere/BareMetal/Equinix.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@test/extended/router/gatewayapicontroller.go`:
- Line 406: The test calls assertHttpRouteConnection which triggers
getLoadBalancerAddress for the unmanaged-DNS path and blocks on a LoadBalancer
ingress even on platforms without LoadBalancer support; update the test to check
the platform capability (the loadBalancerSupported flag or equivalent) before
invoking the unmanaged-DNS connectivity path and skip calling
assertHttpRouteConnection / getLoadBalancerAddress when loadBalancerSupported is
false (also apply the same guard to the other occurrences around lines
1072-1090) so the test cleanly skips the LoadBalancer-dependent checks on
VSphere/BareMetal/Equinix.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 74cc852c-6939-4f81-8656-465de3a70ce0

📥 Commits

Reviewing files that changed from the base of the PR and between 5b8c8e3 and a36edae.

📒 Files selected for processing (1)
  • test/extended/router/gatewayapicontroller.go

@openshift-merge-bot
Copy link
Copy Markdown
Contributor

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 5, 2026

@sadasu: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-vsphere-ovn-upi a36edae link true /test e2e-vsphere-ovn-upi
ci/prow/e2e-aws-csi a36edae link true /test e2e-aws-csi
ci/prow/e2e-metal-ipi-ovn-ipv6 a36edae link true /test e2e-metal-ipi-ovn-ipv6
ci/prow/e2e-vsphere-ovn a36edae link true /test e2e-vsphere-ovn
ci/prow/e2e-aws-ovn-serial-2of2 a36edae link true /test e2e-aws-ovn-serial-2of2

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants