
add networkpolicy e2e #416

Open
zhouying7780 wants to merge 3 commits into openshift:master from zhouying7780:networkpolicy-e2e

Conversation


zhouying7780 commented Mar 12, 2026

Summary by CodeRabbit

  • Tests
    • Added comprehensive end-to-end tests for network policy enforcement, including connectivity validation under varied policy sequences.
    • Added end-to-end tests validating controller-manager network policies across namespaces and reconciliation workflows.
    • Updated test package structure to reflect test organization changes.


coderabbitai bot commented Mar 12, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4e952f25-b2d7-4279-8227-78153060fbfe

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Added two new end-to-end test files that validate NetworkPolicy behavior and controller reconciliation, and changed the package declaration of one existing test file from e2e_test to e2e.

Changes

  • Package Declaration Update (test/e2e/main_test.go): Changed the package declaration from e2e_test to e2e. No other logic changes.
  • Network Policy Enforcement Tests (test/e2e/network_policy_enforcement_test.go): New E2E tests implementing TestGenericNetworkPolicyEnforcement and TestControllerManagerNetworkPolicyEnforcement. Adds helpers for creating hardened agnhost/netexec server and client pods, dual-stack pod IP extraction/formatting, ephemeral connectivity checks (agnhost connect), NetworkPolicy builders (default-deny, ingress-allow, egress-allow), service ClusterIP extraction, and polling/wait helpers for pod readiness and completion.
  • Network Policy Validation & Reconcile Tests (test/e2e/network_policy_test.go): New E2E tests implementing policy inspection and reconciliation checks (TestControllerManagerNetworkPolicies, TestControllerManagerNetworkPolicyReconcile). Captures and validates NetworkPolicy specs across target namespaces, asserts default-deny and allow rules (ports/egress), deletes/patches policies, polls for controller restoration of the original .spec, and adds event polling/logging utilities.
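One detail the dual-stack IP formatting helpers have to get right is IPv6 bracketing: a raw "ip:port" concatenation breaks for IPv6 literals. A minimal sketch of that concern (dialAddr is a hypothetical name; the PR's actual helper may differ):

```go
package main

import (
	"fmt"
	"net"
)

// dialAddr formats an IP and port for a TCP dial. net.JoinHostPort
// adds the square brackets that IPv6 literals require, which naive
// string concatenation would miss in dual-stack connectivity checks.
func dialAddr(ip string, port int32) string {
	return net.JoinHostPort(ip, fmt.Sprint(port))
}

func main() {
	fmt.Println(dialAddr("10.128.0.12", 8443)) // 10.128.0.12:8443
	fmt.Println(dialAddr("fd01::1a", 8443))    // [fd01::1a]:8443
}
```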

Sequence Diagram(s)

sequenceDiagram
    participant Test as Test Runner
    participant K8s as Kubernetes API
    participant NP as NetworkPolicy Controller
    participant Pod as Netexec Pod

    Test->>K8s: Create namespace & deploy netexec pod
    Test->>Pod: Wait for pod readiness
    Pod-->>Test: Pod IPs returned
    Test->>K8s: Apply default-deny NetworkPolicy
    Test->>Pod: Run connectivity checks
    Pod-->>Test: Connectivity denied
    Test->>K8s: Apply ingress-allow NetworkPolicy
    NP->>K8s: Reconcile policies
    Test->>Pod: Run connectivity checks
    Pod-->>Test: Connectivity still denied (egress blocked)
    Test->>K8s: Apply egress-allow NetworkPolicy
    NP->>K8s: Reconcile policies
    Test->>Pod: Run connectivity checks
    Pod-->>Test: Connectivity allowed
    Test->>K8s: Clean up resources
sequenceDiagram
    participant Test as Test Runner
    participant K8s as Kubernetes API
    participant Reconciler as Policy Reconciler
    participant Events as Event Store

    Test->>K8s: Fetch existing NetworkPolicies (capture specs)
    Test->>K8s: Delete or patch NetworkPolicy
    Reconciler->>K8s: Detect change/missing policy
    Reconciler->>K8s: Restore original policy/spec
    K8s-->>Test: Policy restored
    Test->>K8s: Verify spec matches original
    Test->>K8s: Query policy-related events
    Events-->>Test: Return reconciliation events

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 15.91%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
  • Test Structure And Quality: ⚠️ Warning. Tests use Go *testing.T, not Ginkgo, and review-identified issues remain unfixed: an empty serverIPs list silently succeeds, premature PodIPs checks, selector assertions lack value verification, and missing UID checks in the reconciliation tests. Resolution: fix the four issues identified in the review comments, refactor tests for single responsibility, implement consistent cleanup using t.Cleanup() or defer, and verify timeout configurations.

✅ Passed checks (3 passed)

  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'add networkpolicy e2e' is directly related to the main changes: new end-to-end tests for NetworkPolicy enforcement and validation, which are the primary additions in this changeset.
  • Stable And Deterministic Test Names: ✅ Passed. The PR introduces standard Go tests with stable, deterministic function names; the Ginkgo test-name requirement does not apply here.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

openshift-ci bot requested review from mrbesseh and sayan-biswas, March 12, 2026 09:35

coderabbitai bot left a comment


Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/network_policy_enforcement_test.go`:
- Around line 412-417: In waitForPodCompletion, avoid dereferencing
completed.Status.ContainerStatuses[0].State.Terminated without a nil check:
first verify completed.Status.ContainerStatuses is non-empty (already done) and
then ensure completed.Status.ContainerStatuses[0].State != nil and .Terminated
!= nil before reading ExitCode; if Terminated is nil return a clear error
(including the pod name/namespace and the container's current State or Reason)
so pod failures don’t panic and the caller sees why termination info is missing.
- Around line 153-156: The denied-port assertions are probing ports
80/443/6443/9090 against allowedServerIPs (which only has the 8443 listener) so
they never hit a socket; fix by either removing these checks or starting
dedicated listeners bound to allowedServerIPs on each port before the
expectConnectivity calls. Concretely, update the test to (a) create
servers/listeners for ports []int32{80,443,6443,9090} and register their IPs in
the allowedServerIPs set used by expectConnectivity, or (b) delete the loop
using expectConnectivity with operatorLabels/OperatorNamespace for those ports
so probes are only against actual listeners.
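The safe-dereference pattern the first comment asks for can be sketched with simplified local stand-ins for the corev1 container-status types (local names; the real waitForPodCompletion reads corev1.ContainerStatus, where State is a struct and State.Terminated is a pointer):

```go
package main

import "fmt"

// Terminated mirrors corev1.ContainerStateTerminated (simplified).
type Terminated struct{ ExitCode int32 }

// State mirrors corev1.ContainerState: Terminated is nil while the
// container is still waiting or running.
type State struct {
	Terminated *Terminated
	Reason     string
}

// ContainerStatus mirrors corev1.ContainerStatus (simplified).
type ContainerStatus struct{ State State }

// exitCode returns the first container's termination exit code, or a
// descriptive error instead of panicking when termination info is missing.
func exitCode(namespace, pod string, statuses []ContainerStatus) (int32, error) {
	if len(statuses) == 0 {
		return 0, fmt.Errorf("pod %s/%s has no container statuses", namespace, pod)
	}
	st := statuses[0].State
	if st.Terminated == nil {
		return 0, fmt.Errorf("pod %s/%s container not terminated (state: %q)", namespace, pod, st.Reason)
	}
	return st.Terminated.ExitCode, nil
}

func main() {
	done := []ContainerStatus{{State: State{Terminated: &Terminated{ExitCode: 0}}}}
	code, err := exitCode("e2e-np", "client", done)
	fmt.Println(code, err) // 0 <nil>

	running := []ContainerStatus{{State: State{Reason: "Running"}}}
	_, err = exitCode("e2e-np", "client", running)
	fmt.Println(err != nil) // true
}
```

With this shape, a pod that never terminates produces a readable test failure naming the pod and its current state instead of a nil-pointer panic.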

In `@test/e2e/network_policy_test.go`:
- Around line 147-152: The getKubeConfig function currently hardcodes
loadingRules.ExplicitPath to "/home/yinzhou/kubeconfig", which breaks CI and
other developers' machines; remove that hardcoded override and instead let
clientcmd.NewDefaultClientConfigLoadingRules() use its normal search behavior or
set ExplicitPath conditionally from the KUBECONFIG environment variable if
present. Update getKubeConfig to not force a local path (or read
os.Getenv("KUBECONFIG") and only set loadingRules.ExplicitPath when non-empty)
so the client-go default loading rules work across environments.
- Around line 164-176: The helper requireDefaultDenyAll currently only checks an
empty PodSelector and presence of both PolicyTypes but misses validating that
the rule lists are empty; update requireDefaultDenyAll to also assert that
len(policy.Spec.Ingress) == 0 and len(policy.Spec.Egress) == 0 (or
nil-equivalent) so policies with permissive rules like ingress: [{}] or egress:
[{}] will fail; keep the existing checks for PodSelector and PolicyTypes and
include clear t.Errorf messages referencing policy.Namespace and policy.Name
when either rule list is non-empty.
- Around line 103-109: Capture the original NetworkPolicy objects and register a
t.Cleanup immediately after the getNetworkPolicyT calls to restore them before
any destructive operations; specifically, after obtaining
expectedControllerManagerDefaultDeny, expectedControllerManagerPolicy,
expectedRouteControllerManagerDefaultDeny, expectedRouteControllerManagerPolicy,
expectedOperatorDefaultDeny, and expectedOperatorPolicy, add a cleanup that uses
kubeClient (or the existing helper used elsewhere) to re-create/update those
NetworkPolicy resources in util.TargetNamespace,
util.RouteControllerTargetNamespace, and util.OperatorNamespace so tests always
restore the originals even if a t.Fatalf occurs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 08708fc6-c54c-4944-9cb0-304df2d9db48

📥 Commits

Reviewing files that changed from the base of the PR and between 8e04fd3 and af9f97b.

📒 Files selected for processing (3)
  • test/e2e/main_test.go
  • test/e2e/network_policy_enforcement_test.go
  • test/e2e/network_policy_test.go

Comment on lines +103 to +109
t.Log("Capturing expected NetworkPolicy specs")
expectedControllerManagerDefaultDeny := getNetworkPolicyT(t, ctx, kubeClient, util.TargetNamespace, defaultDenyAllPolicyName)
expectedControllerManagerPolicy := getNetworkPolicyT(t, ctx, kubeClient, util.TargetNamespace, controllerManagerPolicyName)
expectedRouteControllerManagerDefaultDeny := getNetworkPolicyT(t, ctx, kubeClient, util.RouteControllerTargetNamespace, defaultDenyAllPolicyName)
expectedRouteControllerManagerPolicy := getNetworkPolicyT(t, ctx, kubeClient, util.RouteControllerTargetNamespace, routeControllerManagerPolicyName)
expectedOperatorDefaultDeny := getNetworkPolicyT(t, ctx, kubeClient, util.OperatorNamespace, defaultDenyAllPolicyName)
expectedOperatorPolicy := getNetworkPolicyT(t, ctx, kubeClient, util.OperatorNamespace, operatorPolicyName)

⚠️ Potential issue | 🔴 Critical

Register rollback cleanup before the destructive steps.

From here on, the test deletes and patches live NetworkPolicy objects in shared operator namespaces. If reconciliation is the regression, the first t.Fatalf leaves those namespaces in a broken state and can poison the rest of the e2e run. Capture these originals and restore them in t.Cleanup before starting the mutations.

Proposed fix
 	expectedControllerManagerDefaultDeny := getNetworkPolicyT(t, ctx, kubeClient, util.TargetNamespace, defaultDenyAllPolicyName)
 	expectedControllerManagerPolicy := getNetworkPolicyT(t, ctx, kubeClient, util.TargetNamespace, controllerManagerPolicyName)
 	expectedRouteControllerManagerDefaultDeny := getNetworkPolicyT(t, ctx, kubeClient, util.RouteControllerTargetNamespace, defaultDenyAllPolicyName)
 	expectedRouteControllerManagerPolicy := getNetworkPolicyT(t, ctx, kubeClient, util.RouteControllerTargetNamespace, routeControllerManagerPolicyName)
 	expectedOperatorDefaultDeny := getNetworkPolicyT(t, ctx, kubeClient, util.OperatorNamespace, defaultDenyAllPolicyName)
 	expectedOperatorPolicy := getNetworkPolicyT(t, ctx, kubeClient, util.OperatorNamespace, operatorPolicyName)
+
+	expectedPolicies := []*networkingv1.NetworkPolicy{
+		expectedControllerManagerDefaultDeny.DeepCopy(),
+		expectedControllerManagerPolicy.DeepCopy(),
+		expectedRouteControllerManagerDefaultDeny.DeepCopy(),
+		expectedRouteControllerManagerPolicy.DeepCopy(),
+		expectedOperatorDefaultDeny.DeepCopy(),
+		expectedOperatorPolicy.DeepCopy(),
+	}
+	t.Cleanup(func() {
+		for _, expected := range expectedPolicies {
+			current, err := kubeClient.NetworkingV1().NetworkPolicies(expected.Namespace).Get(ctx, expected.Name, metav1.GetOptions{})
+			if err == nil {
+				current.Spec = expected.Spec
+				_, _ = kubeClient.NetworkingV1().NetworkPolicies(expected.Namespace).Update(ctx, current, metav1.UpdateOptions{})
+				continue
+			}
+			clone := expected.DeepCopy()
+			clone.ResourceVersion = ""
+			_, _ = kubeClient.NetworkingV1().NetworkPolicies(expected.Namespace).Create(ctx, clone, metav1.CreateOptions{})
+		}
+	})


coderabbitai bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (1)
test/e2e/network_policy_test.go (1)

103-109: ⚠️ Potential issue | 🔴 Critical

Register rollback cleanup before the first delete/patch.

From here on the test mutates live NetworkPolicy objects in shared namespaces. Any t.Fatalf after the first mutation can leave the cluster poisoned for the rest of the e2e run unless the originals are restored in t.Cleanup.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/network_policy_enforcement_test.go`:
- Around line 160-173: The test currently treats an empty server IP list from
serviceClusterIPs as a successful connectivity check because expectConnectivity
silently accepts empty serverIPs; update expectConnectivity (and any helper
wrappers) to validate that the serverIPs/slice parameter is non-empty and fail
fast (e.g., call t.Fatalf or return an error) when empty so DNS checks don't
become no-ops; specifically add a check at the start of expectConnectivity to
reject empty serverIPs with a clear error mentioning the caller context and
ensure callers (where serviceClusterIPs is used) handle or propagate that
failure rather than proceeding silently.
- Around line 68-76: Remove the pre-checks that fatally abort when Status.PodIPs
is empty and instead rely on the existing podIPs(server) fallback to
Status.PodIP; specifically, delete the len(server.Status.PodIPs) == 0 t.Fatalf
guard around retrieving server IPs (and the analogous guard for the client pod)
and replace with calling serverIPs := podIPs(server) (and clientIPs :=
podIPs(client)) followed by a single check that podIPs returned at least one IP
before proceeding and logging; reference the server variable, client variable,
and the podIPs() helper to locate where to change the code.

In `@test/e2e/network_policy_test.go`:
- Around line 51-54: The test currently only asserts label keys; update it to
also assert the expected selector values so the policy points at the correct
pods: add assertions that controllerManagerPolicy's pod selector has
app="openshift-controller-manager-a" and controller-manager="true" (e.g. call a
helper like requirePodSelectorHasLabelValue(t, controllerManagerPolicy, "app",
"openshift-controller-manager-a") and requirePodSelectorHasLabelValue(t,
controllerManagerPolicy, "controller-manager", "true") or extend
requirePodSelectorHasLabel to accept an expected value), and make the same
change for the other policy checked at lines 66-69 so both label keys and values
are validated.
- Around line 258-276: In restoreNetworkPolicy capture the original object's
identity before deleting (e.g., call
client.NetworkingV1().NetworkPolicies(namespace).Get(...) once to read
original.ResourceVersion or original.UID into a variable like
originalRV/originalUID), then delete and change the wait.PollImmediate predicate
to require that a returned current object is a different instance
(current.ResourceVersion != originalRV or current.UID != originalUID) and that
equality.Semantic.DeepEqual(expected.Spec, current.Spec) both hold; treat Get
errors as transient (return false, nil) but do not accept the first read if
identity matches the original so the function only succeeds after a new policy
instance with the expected spec is observed.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f8fc5718-d5cb-49df-9bfd-2593b1497168

📥 Commits

Reviewing files that changed from the base of the PR and between af9f97b and a2564b7.

📒 Files selected for processing (3)
  • test/e2e/main_test.go
  • test/e2e/network_policy_enforcement_test.go
  • test/e2e/network_policy_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/main_test.go

Comment on lines +68 to +76
server, err := kubeClient.CoreV1().Pods(nsName).Get(context.TODO(), serverName, metav1.GetOptions{})
if err != nil {
	t.Fatalf("failed to get server pod: %v", err)
}
if len(server.Status.PodIPs) == 0 {
	t.Fatalf("server pod has no IPs")
}
serverIPs := podIPs(server)
t.Logf("server pod %s/%s ips=%v", nsName, serverName, serverIPs)

⚠️ Potential issue | 🟠 Major

Don't bypass the PodIP fallback.

podIPs() already falls back to Status.PodIP, but these guards Fatalf first when PodIPs is empty. That makes valid single-stack pods fail this test unnecessarily.

🐛 Proposed fix
-	if len(server.Status.PodIPs) == 0 {
-		t.Fatalf("server pod has no IPs")
-	}
 	serverIPs := podIPs(server)
+	if len(serverIPs) == 0 {
+		t.Fatalf("server pod %s/%s has no IPs", nsName, serverName)
+	}
 	t.Logf("server pod %s/%s ips=%v", nsName, serverName, serverIPs)
-	if len(created.Status.PodIPs) == 0 {
-		t.Fatalf("server pod has no IPs")
-	}
-
 	ips := podIPs(created)
+	if len(ips) == 0 {
+		t.Fatalf("server pod %s/%s has no IPs", namespace, name)
+	}
 	t.Logf("server pod %s/%s ips=%v", namespace, name, ips)

Also applies to: 223-231


Comment on lines +160 to +173
t.Log("Verifying operator egress to DNS")
dnsSvc, err := kubeClient.CoreV1().Services("openshift-dns").Get(context.TODO(), "dns-default", metav1.GetOptions{})
if err != nil {
	t.Fatalf("failed to get DNS service: %v", err)
}
dnsIPs := serviceClusterIPs(dnsSvc)
t.Logf("expecting allow from %s to DNS %v:53", util.OperatorNamespace, dnsIPs)
expectConnectivity(t, kubeClient, util.OperatorNamespace, operatorLabels, dnsIPs, 53, true)

t.Log("Verifying controller manager pods egress to DNS")
expectConnectivity(t, kubeClient, util.TargetNamespace, controllerManagerLabels, dnsIPs, 53, true)

t.Log("Verifying route controller manager pods egress to DNS")
expectConnectivity(t, kubeClient, util.RouteControllerTargetNamespace, routeControllerManagerLabels, dnsIPs, 53, true)

⚠️ Potential issue | 🟡 Minor

Reject empty target IP lists.

expectConnectivity silently succeeds when serverIPs is empty. That turns the DNS assertions into a no-op if serviceClusterIPs(dnsSvc) returns no usable address.

💡 Proposed fix
 func expectConnectivity(t *testing.T, kubeClient kubernetes.Interface, namespace string, clientLabels map[string]string, serverIPs []string, port int32, shouldSucceed bool) {
 	t.Helper()
+	if len(serverIPs) == 0 {
+		t.Fatalf("no target IPs available for connectivity check in %s", namespace)
+	}
 
 	for _, ip := range serverIPs {
 		family := "IPv4"

Also applies to: 346-356


Comment on lines +51 to +54
// Note: actual pod selector has two labels: app=openshift-controller-manager-a and controller-manager=true
requirePodSelectorHasLabel(t, controllerManagerPolicy, "app")
requirePodSelectorHasLabel(t, controllerManagerPolicy, "controller-manager")
requireIngressPort(t, controllerManagerPolicy, corev1.ProtocolTCP, 8443)

⚠️ Potential issue | 🟠 Major

Assert the selector values too.

These checks still pass if the policy selector points at the wrong pods with the right keys. The expected values are already documented in the comments, so the test should lock them down.

🎯 Proposed fix
-	requirePodSelectorHasLabel(t, controllerManagerPolicy, "app")
-	requirePodSelectorHasLabel(t, controllerManagerPolicy, "controller-manager")
+	requirePodSelectorLabel(t, controllerManagerPolicy, "app", "openshift-controller-manager-a")
+	requirePodSelectorLabel(t, controllerManagerPolicy, "controller-manager", "true")
@@
-	requirePodSelectorHasLabel(t, routeControllerManagerPolicy, "app")
-	requirePodSelectorHasLabel(t, routeControllerManagerPolicy, "route-controller-manager")
+	requirePodSelectorLabel(t, routeControllerManagerPolicy, "app", "route-controller-manager")
+	requirePodSelectorLabel(t, routeControllerManagerPolicy, "route-controller-manager", "true")

Also applies to: 66-69


Comment on lines +258 to +276
func restoreNetworkPolicy(t *testing.T, ctx context.Context, client kubernetes.Interface, expected *networkingv1.NetworkPolicy) {
	t.Helper()
	namespace := expected.Namespace
	name := expected.Name
	t.Logf("deleting NetworkPolicy %s/%s", namespace, name)
	if err := client.NetworkingV1().NetworkPolicies(namespace).Delete(ctx, name, metav1.DeleteOptions{}); err != nil {
		t.Fatalf("failed to delete NetworkPolicy %s/%s: %v", namespace, name, err)
	}
	err := wait.PollImmediate(5*time.Second, 10*time.Minute, func() (bool, error) {
		current, err := client.NetworkingV1().NetworkPolicies(namespace).Get(ctx, name, metav1.GetOptions{})
		if err != nil {
			return false, nil
		}
		return equality.Semantic.DeepEqual(expected.Spec, current.Spec), nil
	})
	if err != nil {
		t.Fatalf("timed out waiting for NetworkPolicy %s/%s spec to be restored: %v", namespace, name, err)
	}
	t.Logf("NetworkPolicy %s/%s spec restored after delete", namespace, name)
}

⚠️ Potential issue | 🟠 Major

Wait for a new policy instance before calling it restored.

The first poll can still read the original object, and DeepEqual(expected.Spec, current.Spec) will pass immediately. That lets the delete/reconcile check succeed without ever proving the delete was observed.

🐛 Proposed fix
 	err := wait.PollImmediate(5*time.Second, 10*time.Minute, func() (bool, error) {
 		current, err := client.NetworkingV1().NetworkPolicies(namespace).Get(ctx, name, metav1.GetOptions{})
 		if err != nil {
 			return false, nil
 		}
+		if current.UID == expected.UID {
+			return false, nil
+		}
 		return equality.Semantic.DeepEqual(expected.Spec, current.Spec), nil
 	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/network_policy_test.go` around lines 258 - 276: in
restoreNetworkPolicy, capture the original object's identity before deleting
(e.g., call client.NetworkingV1().NetworkPolicies(namespace).Get(...) once to
read original.ResourceVersion or original.UID into a variable like
originalRV/originalUID), then delete and change the wait.PollImmediate predicate
to require that a returned current object is a different instance
(current.ResourceVersion != originalRV or current.UID != originalUID) and that
equality.Semantic.DeepEqual(expected.Spec, current.Spec) both hold; treat Get
errors as transient (return false, nil) but do not accept the first read if
identity matches the original so the function only succeeds after a new policy
instance with the expected spec is observed.

@zhouying7780
Author

/testwith openshift/cluster-openshift-controller-manager-operator/main/e2e-aws #414

@openshift-ci
Contributor

openshift-ci bot commented Mar 18, 2026

@zhouying7780, testwith: could not generate prow job. ERROR:

could not determine ci op config from metadata: got unexpected http 404 status code from configresolver: failed to get config: could not find any config for branch main on repo openshift/cluster-openshift-controller-manager-operator

@gangwgr
Contributor

gangwgr commented Mar 18, 2026

/testwith openshift/cluster-openshift-controller-manager-operator/master/e2e-aws-ovn #414

2 similar comments
@zhouying7780
Author

/testwith openshift/cluster-openshift-controller-manager-operator/master/e2e-aws-ovn #414

@zhouying7780
Author

/testwith openshift/cluster-openshift-controller-manager-operator/master/e2e-aws-ovn #414

@zhouying7780
Author

/testwith openshift/cluster-openshift-controller-manager-operator/master/e2e-aws-operator #414

@zhouying7780
Author

/test e2e-aws-ovn

@zhouying7780
Author

/testwith openshift/cluster-openshift-controller-manager-operator/master/e2e-aws-operator #414

1 similar comment
@zhouying7780
Author

/testwith openshift/cluster-openshift-controller-manager-operator/master/e2e-aws-operator #414

@coderabbitai coderabbitai bot left a comment
♻️ Duplicate comments (4)
test/e2e/network_policy_test.go (3)

262-267: ⚠️ Potential issue | 🟠 Major

Wait for a recreated policy, not just a matching spec.

Deletion is asynchronous, so the first poll can still read the pre-delete object and pass immediately on spec equality. Require a new UID/resource version before calling the policy restored.

Proposed fix
 	err := wait.PollImmediate(5*time.Second, 10*time.Minute, func() (bool, error) {
 		current, err := client.NetworkingV1().NetworkPolicies(namespace).Get(ctx, name, metav1.GetOptions{})
 		if err != nil {
 			return false, nil
 		}
+		if current.UID == expected.UID {
+			return false, nil
+		}
 		return equality.Semantic.DeepEqual(expected.Spec, current.Spec), nil
 	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/network_policy_test.go` around lines 262 - 267, The poll currently
returns success as soon as the fetched policy's Spec equals expected.Spec, which
can incorrectly match the pre-delete object; modify the wait.PollImmediate
lambda used with client.NetworkingV1().NetworkPolicies(...).Get so it also
verifies the policy has been recreated by checking a changed identity (compare
current.UID and/or current.ResourceVersion against
expected.UID/expected.ResourceVersion) before returning true; i.e., only return
true when equality.Semantic.DeepEqual(expected.Spec, current.Spec) is true AND
the UID/resourceVersion differs from the original expected object.

99-105: ⚠️ Potential issue | 🔴 Critical

Register rollback cleanup before mutating shared namespaces.

From here onward the test deletes and patches live NetworkPolicy objects in shared operator namespaces. Any later Fatalf can leave those namespaces broken for the rest of the e2e run unless the originals are restored in t.Cleanup.

Proposed fix
 	expectedControllerManagerDefaultDeny := getNetworkPolicyT(t, ctx, kubeClient, util.TargetNamespace, defaultDenyAllPolicyName)
 	expectedControllerManagerPolicy := getNetworkPolicyT(t, ctx, kubeClient, util.TargetNamespace, controllerManagerPolicyName)
 	expectedRouteControllerManagerDefaultDeny := getNetworkPolicyT(t, ctx, kubeClient, util.RouteControllerTargetNamespace, defaultDenyAllPolicyName)
 	expectedRouteControllerManagerPolicy := getNetworkPolicyT(t, ctx, kubeClient, util.RouteControllerTargetNamespace, routeControllerManagerPolicyName)
 	expectedOperatorDefaultDeny := getNetworkPolicyT(t, ctx, kubeClient, util.OperatorNamespace, defaultDenyAllPolicyName)
 	expectedOperatorPolicy := getNetworkPolicyT(t, ctx, kubeClient, util.OperatorNamespace, operatorPolicyName)
+
+	expectedPolicies := []*networkingv1.NetworkPolicy{
+		expectedControllerManagerDefaultDeny.DeepCopy(),
+		expectedControllerManagerPolicy.DeepCopy(),
+		expectedRouteControllerManagerDefaultDeny.DeepCopy(),
+		expectedRouteControllerManagerPolicy.DeepCopy(),
+		expectedOperatorDefaultDeny.DeepCopy(),
+		expectedOperatorPolicy.DeepCopy(),
+	}
+	t.Cleanup(func() {
+		for _, expected := range expectedPolicies {
+			current, err := kubeClient.NetworkingV1().NetworkPolicies(expected.Namespace).Get(ctx, expected.Name, metav1.GetOptions{})
+			if err == nil {
+				current.Spec = expected.Spec
+				_, _ = kubeClient.NetworkingV1().NetworkPolicies(expected.Namespace).Update(ctx, current, metav1.UpdateOptions{})
+				continue
+			}
+
+			clone := expected.DeepCopy()
+			clone.ResourceVersion = ""
+			_, _ = kubeClient.NetworkingV1().NetworkPolicies(expected.Namespace).Create(ctx, clone, metav1.CreateOptions{})
+		}
+	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/network_policy_test.go` around lines 99 - 105, After capturing the
originals with getNetworkPolicyT (expectedControllerManagerDefaultDeny,
expectedControllerManagerPolicy, expectedRouteControllerManagerDefaultDeny,
expectedRouteControllerManagerPolicy, expectedOperatorDefaultDeny,
expectedOperatorPolicy) register a t.Cleanup immediately to restore those
NetworkPolicy objects if the test fails; implement the cleanup to re-apply or
update each NetworkPolicy in its namespace (use util.TargetNamespace,
util.RouteControllerTargetNamespace, util.OperatorNamespace) via the kubeClient
so any subsequent Fatalf won't leave shared namespaces mutated—place this
t.Cleanup right after the captured expectations and have it call the same
update/create logic that the test uses to revert each captured expected*
variable back to the cluster.

48-52: ⚠️ Potential issue | 🟠 Major

Assert the selector values, not just the keys.

These checks still pass if the policies target the wrong pods. The companion enforcement test already assumes exact values for these labels, so the validation test should lock them down here too.

Proposed fix
-	requirePodSelectorHasLabel(t, controllerManagerPolicy, "controller-manager")
+	requirePodSelectorLabel(t, controllerManagerPolicy, "app", "openshift-controller-manager-a")
+	requirePodSelectorLabel(t, controllerManagerPolicy, "controller-manager", "true")
 	requireIngressPort(t, controllerManagerPolicy, corev1.ProtocolTCP, 8443)
 	logEgressAllowAll(t, controllerManagerPolicy)
@@
-	requirePodSelectorHasLabel(t, routeControllerManagerPolicy, "route-controller-manager")
+	requirePodSelectorLabel(t, routeControllerManagerPolicy, "app", "route-controller-manager")
+	requirePodSelectorLabel(t, routeControllerManagerPolicy, "route-controller-manager", "true")
 	requireIngressPort(t, routeControllerManagerPolicy, corev1.ProtocolTCP, 8443)
 	logEgressAllowAll(t, routeControllerManagerPolicy)

Also applies to: 61-65

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/network_policy_test.go` around lines 48 - 52, The test currently
only verifies label keys for the controller-manager NetworkPolicy, allowing
wrong target pods; update the assertions to check exact label values as well.
Change the calls that use requirePodSelectorHasLabel(t, controllerManagerPolicy,
"controller-manager") to assert the expected value (e.g.,
requirePodSelectorHasLabel(t, controllerManagerPolicy, "controller-manager",
"<expected-value>") or replace with a new helper
requirePodSelectorHasLabelValue(t, controllerManagerPolicy,
"controller-manager", "<expected-value>")), and make the same change for the
other similar block (lines 61-65) so the selector matches the exact label value
assumed by the enforcement test; keep requireIngressPort and other checks
unchanged.
test/e2e/network_policy_enforcement_test.go (1)

68-75: ⚠️ Potential issue | 🟠 Major

Let podIPs() handle single-stack pods before failing.

These guards reject valid pods that only populate Status.PodIP, even though podIPs() already falls back to that field. That makes the e2e dependent on how the cluster reports pod IPs.

Proposed fix
-	if len(server.Status.PodIPs) == 0 {
-		t.Fatalf("server pod has no IPs")
-	}
 	serverIPs := podIPs(server)
+	if len(serverIPs) == 0 {
+		t.Fatalf("server pod %s/%s has no IPs", nsName, serverName)
+	}
 	t.Logf("server pod %s/%s ips=%v", nsName, serverName, serverIPs)
@@
-	if len(created.Status.PodIPs) == 0 {
-		t.Fatalf("server pod has no IPs")
-	}
-
 	ips := podIPs(created)
+	if len(ips) == 0 {
+		t.Fatalf("server pod %s/%s has no IPs", namespace, name)
+	}
 	t.Logf("server pod %s/%s ips=%v", namespace, name, ips)

Also applies to: 223-231

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/network_policy_enforcement_test.go` around lines 68 - 75, The test
currently fails fast if a Pod's Status.PodIPs is empty even though podIPs(pod)
already falls back to Status.PodIP; update the checks around retrieving server
and the other pod (the second block around lines 223-231) to accept single-stack
pods by first attempting to use server.Status.PodIPs and, if empty, falling back
to server.Status.PodIP (same for the other pod variable) instead of immediately
t.Fatalf; rely on podIPs(pod) for the final IP list and only fail if both PodIPs
and PodIP are unset. Ensure you update both occurrences and reference the
podIPs(pod) helper when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@test/e2e/network_policy_enforcement_test.go`:
- Around line 68-75: The test currently fails fast if a Pod's Status.PodIPs is
empty even though podIPs(pod) already falls back to Status.PodIP; update the
checks around retrieving server and the other pod (the second block around lines
223-231) to accept single-stack pods by first attempting to use
server.Status.PodIPs and, if empty, falling back to server.Status.PodIP (same
for the other pod variable) instead of immediately t.Fatalf; rely on podIPs(pod)
for the final IP list and only fail if both PodIPs and PodIP are unset. Ensure
you update both occurrences and reference the podIPs(pod) helper when making the
change.

In `@test/e2e/network_policy_test.go`:
- Around line 262-267: The poll currently returns success as soon as the fetched
policy's Spec equals expected.Spec, which can incorrectly match the pre-delete
object; modify the wait.PollImmediate lambda used with
client.NetworkingV1().NetworkPolicies(...).Get so it also verifies the policy
has been recreated by checking a changed identity (compare current.UID and/or
current.ResourceVersion against expected.UID/expected.ResourceVersion) before
returning true; i.e., only return true when
equality.Semantic.DeepEqual(expected.Spec, current.Spec) is true AND the
UID/resourceVersion differs from the original expected object.
- Around line 99-105: After capturing the originals with getNetworkPolicyT
(expectedControllerManagerDefaultDeny, expectedControllerManagerPolicy,
expectedRouteControllerManagerDefaultDeny, expectedRouteControllerManagerPolicy,
expectedOperatorDefaultDeny, expectedOperatorPolicy) register a t.Cleanup
immediately to restore those NetworkPolicy objects if the test fails; implement
the cleanup to re-apply or update each NetworkPolicy in its namespace (use
util.TargetNamespace, util.RouteControllerTargetNamespace,
util.OperatorNamespace) via the kubeClient so any subsequent Fatalf won't leave
shared namespaces mutated—place this t.Cleanup right after the captured
expectations and have it call the same update/create logic that the test uses to
revert each captured expected* variable back to the cluster.
- Around line 48-52: The test currently only verifies label keys for the
controller-manager NetworkPolicy, allowing wrong target pods; update the
assertions to check exact label values as well. Change the calls that use
requirePodSelectorHasLabel(t, controllerManagerPolicy, "controller-manager") to
assert the expected value (e.g., requirePodSelectorHasLabel(t,
controllerManagerPolicy, "controller-manager", "<expected-value>") or replace
with a new helper requirePodSelectorHasLabelValue(t, controllerManagerPolicy,
"controller-manager", "<expected-value>")), and make the same change for the
other similar block (lines 61-65) so the selector matches the exact label value
assumed by the enforcement test; keep requireIngressPort and other checks
unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 92aaef4a-c820-4d5e-9bb9-1ff0505b82fe

📥 Commits

Reviewing files that changed from the base of the PR and between bc2e905 and c3f8c1e.

📒 Files selected for processing (3)
  • test/e2e/main_test.go
  • test/e2e/network_policy_enforcement_test.go
  • test/e2e/network_policy_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/main_test.go

@zhouying7780
Author

/testwith openshift/cluster-openshift-controller-manager-operator/master/e2e-aws-operator #414

1 similar comment
@zhouying7780
Author

/testwith openshift/cluster-openshift-controller-manager-operator/master/e2e-aws-operator #414

@zhouying7780
Author

/testwith openshift/cluster-openshift-controller-manager-operator/master/e2e-aws-operator #414

@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 2

♻️ Duplicate comments (3)
test/e2e/network_policy_test.go (3)

48-65: ⚠️ Potential issue | 🟠 Major

Assert the full pod selectors for these allow policies.

These checks still pass when the selector has the right keys but points at the wrong pods. Use requirePodSelectorLabel for the expected values on controller-manager / route-controller-manager and the companion app label instead of only checking key presence.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/network_policy_test.go` around lines 48 - 65, The tests currently
call requirePodSelectorHasLabel which only verifies the presence of a label key;
update the assertions for the two allow policies so they assert exact label
key/value pairs instead: replace the requirePodSelectorHasLabel checks for
controller-manager and route-controller-manager with requirePodSelectorLabel
calls that assert both the "controller-manager"/"route-controller-manager" label
value and the companion "app" label value on the NetworkPolicies fetched via
getNetworkPolicyT (referencing controllerManagerPolicyName and
routeControllerManagerPolicyName); keep the rest of the checks (ingress port,
default deny) unchanged.

99-105: ⚠️ Potential issue | 🔴 Critical

Register rollback cleanup before the destructive steps.

Everything below deletes or patches live NetworkPolicy objects in shared namespaces. If any later t.Fatalf fires, the cluster can stay dirty and poison the rest of the e2e run. Capture deep copies here and restore them in t.Cleanup before starting the delete/patch sequence.

Suggested fix
 	expectedRouteControllerManagerPolicy := getNetworkPolicyT(t, ctx, kubeClient, util.RouteControllerTargetNamespace, routeControllerManagerPolicyName)
 	expectedOperatorDefaultDeny := getNetworkPolicyT(t, ctx, kubeClient, util.OperatorNamespace, defaultDenyAllPolicyName)
 	expectedOperatorPolicy := getNetworkPolicyT(t, ctx, kubeClient, util.OperatorNamespace, operatorPolicyName)
+
+	expectedPolicies := []*networkingv1.NetworkPolicy{
+		expectedControllerManagerDefaultDeny.DeepCopy(),
+		expectedControllerManagerPolicy.DeepCopy(),
+		expectedRouteControllerManagerDefaultDeny.DeepCopy(),
+		expectedRouteControllerManagerPolicy.DeepCopy(),
+		expectedOperatorDefaultDeny.DeepCopy(),
+		expectedOperatorPolicy.DeepCopy(),
+	}
+	t.Cleanup(func() {
+		for _, expected := range expectedPolicies {
+			current, err := kubeClient.NetworkingV1().NetworkPolicies(expected.Namespace).Get(ctx, expected.Name, metav1.GetOptions{})
+			if err == nil {
+				current.Spec = expected.Spec
+				_, _ = kubeClient.NetworkingV1().NetworkPolicies(expected.Namespace).Update(ctx, current, metav1.UpdateOptions{})
+				continue
+			}
+			clone := expected.DeepCopy()
+			clone.ResourceVersion = ""
+			_, _ = kubeClient.NetworkingV1().NetworkPolicies(expected.Namespace).Create(ctx, clone, metav1.CreateOptions{})
+		}
+	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/network_policy_test.go` around lines 99 - 105, Before performing
deletes/patches, register a rollback via t.Cleanup that restores the original
NetworkPolicy objects: capture deep copies of the policies returned by
getNetworkPolicyT (expectedControllerManagerDefaultDeny,
expectedControllerManagerPolicy, expectedRouteControllerManagerDefaultDeny,
expectedRouteControllerManagerPolicy, expectedOperatorDefaultDeny,
expectedOperatorPolicy) and in t.Cleanup use the kube client to reapply those
deep-copied NetworkPolicy objects into their original namespaces
(util.TargetNamespace, util.RouteControllerTargetNamespace,
util.OperatorNamespace) so any test failure after this point will restore the
original resources.

266-272: ⚠️ Potential issue | 🟠 Major

Wait for a new policy instance before calling it restored.

The first Get after Delete can still return the pre-delete object, and DeepEqual(expected.Spec, current.Spec) will pass immediately. Require a different UID or resourceVersion before accepting restoration.

Suggested fix
 	err := wait.PollImmediate(5*time.Second, timeout, func() (bool, error) {
 		current, err := client.NetworkingV1().NetworkPolicies(namespace).Get(ctx, name, metav1.GetOptions{})
 		if err != nil {
 			return false, nil
 		}
+		if current.UID == expected.UID {
+			return false, nil
+		}
 		return equality.Semantic.DeepEqual(expected.Spec, current.Spec), nil
 	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/network_policy_test.go` around lines 266 - 272, The wait loop
accepts the first Get after Delete because it only compares specs; capture the
original identity (UID or ResourceVersion) of the pre-delete policy and require
the fetched object to have a different UID or ResourceVersion before treating it
as "restored". Concretely: before deleting the NetworkPolicy, read and save the
current UID or ResourceVersion from
client.NetworkingV1().NetworkPolicies(namespace).Get(...); then in the
wait.PollImmediate anonymous func that calls Get, return true only when
equality.Semantic.DeepEqual(expected.Spec, current.Spec) is true AND current.UID
!= previousUID (or current.ResourceVersion != previousResourceVersion) so you
ignore stale objects and only accept a truly new instance.
🧹 Nitpick comments (1)
test/e2e/network_policy_test.go (1)

335-364: Keep best-effort event logging off the critical path.

When no relevant event is emitted, this helper still spends up to two minutes listing all Events across three namespaces even though the result is only logged. A single snapshot, field-filtered query, or much shorter retry window would avoid adding a fixed tail to every run. As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/network_policy_test.go` around lines 335 - 364, The helper
logNetworkPolicyEvents currently blocks up to 2 minutes by repeatedly listing
all Events across namespaces; change it to a best-effort, non-blocking snapshot
using field selectors and a short/no retry loop. Replace the wait.PollImmediate
loop in logNetworkPolicyEvents with a single (or at most one short retry) call
per namespace to client.CoreV1().Events(namespace).List(...) using a
FieldSelector that filters involvedObject.kind=NetworkPolicy and
involvedObject.name=<policyName> (or combine kind/name when supported), log any
returned events, and return immediately; alternatively run the existing polling
in a goroutine so it doesn't add a fixed tail to callers. Ensure you update
references to events.Items scanning logic and keep t.Helper()/t.Logf usage
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/network_policy_test.go`:
- Around line 293-295: The polling predicate passed to wait.PollImmediate must
not call the fatal test helper getNetworkPolicyT because it aborts the test on
transient Get errors; instead, perform a regular client.Get inside the predicate
(use the same namespace/name used before), handle non-fatal errors by returning
false,nil to let the poll retry, and only compare original.Spec to the
fetched object's Spec using equality.Semantic.DeepEqual before returning
true,nil; keep references to wait.PollImmediate, getNetworkPolicyT (remove usage
in predicate), original.Spec and equality.Semantic.DeepEqual to locate and
replace the code.
- Around line 236-243: The helper logEgressAllowAll currently only logs
presence/absence of the allow-all egress rule but is used as a validation by its
callers; change it to fail the test when the allow-all rule is missing by
replacing the non-failing log path with a test failure (e.g., call t.Fatalf or
t.FailNow) so callers cannot continue on a missing rule; update the function
logEgressAllowAll (and ensure it still calls t.Logf on the success path) and
reference the existing helper hasEgressAllowAll(policy.Spec.Egress) to determine
when to fail.

---

Duplicate comments:
In `@test/e2e/network_policy_test.go`:
- Around line 48-65: The tests currently call requirePodSelectorHasLabel which
only verifies the presence of a label key; update the assertions for the two
allow policies so they assert exact label key/value pairs instead: replace the
requirePodSelectorHasLabel checks for controller-manager and
route-controller-manager with requirePodSelectorLabel calls that assert both the
"controller-manager"/"route-controller-manager" label value and the companion
"app" label value on the NetworkPolicies fetched via getNetworkPolicyT
(referencing controllerManagerPolicyName and routeControllerManagerPolicyName);
keep the rest of the checks (ingress port, default deny) unchanged.
- Around line 99-105: Before performing deletes/patches, register a rollback via
t.Cleanup that restores the original NetworkPolicy objects: capture deep copies
of the policies returned by getNetworkPolicyT
(expectedControllerManagerDefaultDeny, expectedControllerManagerPolicy,
expectedRouteControllerManagerDefaultDeny, expectedRouteControllerManagerPolicy,
expectedOperatorDefaultDeny, expectedOperatorPolicy) and in t.Cleanup use the
kube client to reapply those deep-copied NetworkPolicy objects into their
original namespaces (util.TargetNamespace, util.RouteControllerTargetNamespace,
util.OperatorNamespace) so any test failure after this point will restore the
original resources.
- Around line 266-272: The wait loop accepts the first Get after Delete because
it only compares specs; capture the original identity (UID or ResourceVersion)
of the pre-delete policy and require the fetched object to have a different UID
or ResourceVersion before treating it as "restored". Concretely: before deleting
the NetworkPolicy, read and save the current UID or ResourceVersion from
client.NetworkingV1().NetworkPolicies(namespace).Get(...); then in the
wait.PollImmediate anonymous func that calls Get, return true only when
equality.Semantic.DeepEqual(expected.Spec, current.Spec) is true AND current.UID
!= previousUID (or current.ResourceVersion != previousResourceVersion) so you
ignore stale objects and only accept a truly new instance.

---

Nitpick comments:
In `@test/e2e/network_policy_test.go`:
- Around line 335-364: The helper logNetworkPolicyEvents currently blocks up to
2 minutes by repeatedly listing all Events across namespaces; change it to a
best-effort, non-blocking snapshot using field selectors and a short/no retry
loop. Replace the wait.PollImmediate loop in logNetworkPolicyEvents with a
single (or at most one short retry) call per namespace to
client.CoreV1().Events(namespace).List(...) using a FieldSelector that filters
involvedObject.kind=NetworkPolicy and involvedObject.name=<policyName> (or
combine kind/name when supported), log any returned events, and return
immediately; alternatively run the existing polling in a goroutine so it doesn't
add a fixed tail to callers. Ensure you update references to events.Items
scanning logic and keep t.Helper()/t.Logf usage unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 87ae6e7b-0b27-4b5d-a72c-1a678e99b98e

📥 Commits

Reviewing files that changed from the base of the PR and between c3f8c1e and c185bf7.

📒 Files selected for processing (1)
  • test/e2e/network_policy_test.go

Comment on lines +236 to +243
func logEgressAllowAll(t *testing.T, policy *networkingv1.NetworkPolicy) {
t.Helper()
if hasEgressAllowAll(policy.Spec.Egress) {
t.Logf("networkpolicy %s/%s: egress allow-all rule present", policy.Namespace, policy.Name)
return
}
t.Logf("networkpolicy %s/%s: no egress allow-all rule", policy.Namespace, policy.Name)
}
⚠️ Potential issue | 🟠 Major

Make the egress allow-all check fail when the rule is missing.

All three callers use this as validation, but this helper only logs. A policy can lose its allow-all egress rule and the test still passes.

Suggested fix
-func logEgressAllowAll(t *testing.T, policy *networkingv1.NetworkPolicy) {
+func requireEgressAllowAll(t *testing.T, policy *networkingv1.NetworkPolicy) {
 	t.Helper()
-	if hasEgressAllowAll(policy.Spec.Egress) {
-		t.Logf("networkpolicy %s/%s: egress allow-all rule present", policy.Namespace, policy.Name)
-		return
-	}
-	t.Logf("networkpolicy %s/%s: no egress allow-all rule", policy.Namespace, policy.Name)
+	if !hasEgressAllowAll(policy.Spec.Egress) {
+		t.Errorf("%s/%s: expected egress allow-all rule", policy.Namespace, policy.Name)
+	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/network_policy_test.go` around lines 236 - 243, The helper
logEgressAllowAll currently only logs presence/absence of the allow-all egress
rule but is used as a validation by its callers; change it to fail the test when
the allow-all rule is missing by replacing the non-failing log path with a test
failure (e.g., call t.Fatalf or t.FailNow) so callers cannot continue on a
missing rule; update the function logEgressAllowAll (and ensure it still calls
t.Logf on the success path) and reference the existing helper
hasEgressAllowAll(policy.Spec.Egress) to determine when to fail.

@zhouying7780
Author

/testwith openshift/cluster-openshift-controller-manager-operator/master/e2e-aws-operator #414

@zhouying7780
Author

/testwith openshift/cluster-openshift-controller-manager-operator/master/e2e-aws-operator #414

@zhouying7780
Author

/testwith openshift/cluster-openshift-controller-manager-operator/master/e2e-aws-operator #414

@openshift-ci
Contributor

openshift-ci bot commented Mar 24, 2026

@zhouying7780: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/e2e-aws-operator
Commit: 9218409
Required: true
Rerun command: /test e2e-aws-operator

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@dusk125
Contributor

dusk125 commented Mar 25, 2026

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 25, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 25, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dusk125, zhouying7780
Once this PR has been reviewed and has the lgtm label, please assign ingvagabund for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
