@@ -86,5 +86,21 @@
"source": "openshift:payload:cluster-version-operator",
"lifecycle": "blocking",
"environmentSelector": {}
},
{
"name": "[Jira:\"Cluster Version Operator\"] cluster-version-operator cvo drops invalid conditional edges",
"labels": {
"46422": {},
"ConnectedOnly": {},
"Lifecycle:informing": {},
"Low": {},
"Serial": {}
},
"resources": {
"isolation": {}
},
"source": "openshift:payload:cluster-version-operator",
"lifecycle": "informing",
"environmentSelector": {}
}
]
18 changes: 18 additions & 0 deletions pkg/payload/precondition/clusterversion/upgradeable.go
@@ -180,3 +180,21 @@ func GetCurrentVersion(history []configv1.UpdateHistory) string {
}
return ""
}

// GetCurrentVersionAndImage determines the cluster's current version and image by iterating
// through the provided update history until it finds the first entry whose update State is Completed.
// If no Completed entry is found, the oldest history entry, which records the originally
// installed version, is used. If history is empty, empty strings are returned.
func GetCurrentVersionAndImage(history []configv1.UpdateHistory) (string, string) {
for _, h := range history {
if h.State == configv1.CompletedUpdate {
klog.V(2).Infof("Cluster current version=%s", h.Version)
return h.Version, h.Image
}
}
// Empty history should only occur when this function is called early in startup, before history is populated.
if len(history) != 0 {
return history[len(history)-1].Version, history[len(history)-1].Image
}
return "", ""
}
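The lookup semantics above (history is ordered newest-first, so the last entry is the original install) can be sketched with a stand-in type; `historyEntry` and `currentVersionAndImage` below are hypothetical stand-ins for `configv1.UpdateHistory` and the function in this diff, trimmed to the fields the logic touches:

```go
package main

import "fmt"

// historyEntry is a stand-in for configv1.UpdateHistory, holding only
// the fields the lookup cares about.
type historyEntry struct {
	State   string // "Completed" or "Partial"
	Version string
	Image   string
}

// currentVersionAndImage mirrors GetCurrentVersionAndImage: return the
// first Completed entry (history is newest-first), fall back to the
// oldest entry, or return empty strings when history is empty.
func currentVersionAndImage(history []historyEntry) (string, string) {
	for _, h := range history {
		if h.State == "Completed" {
			return h.Version, h.Image
		}
	}
	if len(history) != 0 {
		last := history[len(history)-1]
		return last.Version, last.Image
	}
	return "", ""
}

func main() {
	history := []historyEntry{
		{State: "Partial", Version: "4.17.1", Image: "quay.io/example@sha256:new"},
		{State: "Completed", Version: "4.17.0", Image: "quay.io/example@sha256:old"},
	}
	v, img := currentVersionAndImage(history)
	fmt.Println(v, img) // the newest Completed entry wins over the in-progress one
}
```

Note the fallback is the tail of the slice, not the head: a cluster mid-first-upgrade with no Completed entry still reports the version it was installed at.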
251 changes: 251 additions & 0 deletions test/cvo/cvo.go
@@ -4,15 +4,26 @@ package cvo

import (
"context"
"fmt"
"strings"
"time"

"github.com/blang/semver/v4"
g "github.com/onsi/ginkgo/v2"
o "github.com/onsi/gomega"

corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/client-go/kubernetes"
"k8s.io/client-go/rest"

ote "github.com/openshift-eng/openshift-tests-extension/pkg/ginkgo"
configv1 "github.com/openshift/api/config/v1"
clientconfigv1 "github.com/openshift/client-go/config/clientset/versioned"

"github.com/openshift/cluster-version-operator/pkg/external"
"github.com/openshift/cluster-version-operator/pkg/payload/precondition/clusterversion"
"github.com/openshift/cluster-version-operator/test/oc"
ocapi "github.com/openshift/cluster-version-operator/test/oc/api"
"github.com/openshift/cluster-version-operator/test/util"
@@ -99,4 +110,244 @@ var _ = g.Describe(`[Jira:"Cluster Version Operator"] cluster-version-operator`,
sccAnnotation := cvoPod.Annotations["openshift.io/scc"]
o.Expect(sccAnnotation).To(o.Equal("hostaccess"), "Expected the 'openshift.io/scc' annotation on pod %s to have the value 'hostaccess', but got %s", cvoPod.Name, sccAnnotation)
})

g.It("cvo drops invalid conditional edges", ote.Informing(), g.Label("46422", "Low", "ConnectedOnly", "Serial"), func() {
⚠️ Potential issue | 🟡 Minor

Test name should follow the required format.

The test name "cvo drops invalid conditional edges" doesn't match the required format [Jira:"Cluster Version Operator"] description. Since the parent Describe already has the Jira prefix, this may be intentional, but the coding guidelines specify this format for individual test names.

As per coding guidelines: "Ensure test names follow the [Jira:\"Cluster Version Operator\"] description format"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/cvo/cvo.go` at line 114, The test case name string in the g.It call (the
test starting with g.It("cvo drops invalid conditional edges", ...)) does not
follow the required format; update the test name to the mandated format by
prefixing it with the Jira tag used in this suite (e.g., `[Jira:"Cluster Version
Operator"] description`) so it reads like `[Jira:"Cluster Version Operator"] cvo
drops invalid conditional edges` (or similar descriptive text) while keeping the
same labels and callback (ote.Informing(), g.Label(...), func()) intact.

ctx := context.Background()
err := util.SkipIfHypershift(ctx, restCfg)
o.Expect(err).NotTo(o.HaveOccurred(), "Failed to determine if cluster is HyperShift")
err = util.SkipIfMicroshift(ctx, restCfg)
o.Expect(err).NotTo(o.HaveOccurred(), "Failed to determine if cluster is MicroShift")
err = util.SkipIfNetworkRestricted(ctx, restCfg, util.FauxinnatiAPIURL)
o.Expect(err).NotTo(o.HaveOccurred(), "Failed to check network connectivity")
Comment on lines +114 to +121

⚠️ Potential issue | 🟠 Major

This test still hard-requires external egress.

Lines 114-121 keep the case labeled ConnectedOnly and skip on util.FauxinnatiAPIURL, even though the code under test only talks to the in-cluster update services created later. In disconnected or egress-restricted jobs, this new path never runs, which undercuts the point of adding in-cluster Cincinnati here. Drop the external gate (and likely the label) or gate only on reachability to url1/url2.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/cvo/cvo.go` around lines 114 - 121, The test g.It("cvo drops invalid
conditional edges", ...) still gates execution on
util.SkipIfNetworkRestricted(ctx, restCfg, util.FauxinnatiAPIURL) and is labeled
"ConnectedOnly", which forces external egress even though the code under test
only uses in-cluster Cincinnati; remove or narrow that external gate by deleting
the util.SkipIfNetworkRestricted call (and the "ConnectedOnly" label) so the
test runs in egress-restricted environments, or replace it with a targeted
reachability check only to the in-cluster URLs (url1/url2) used by the test
logic; update the test invocation and any labels accordingly to reference
g.It(...) and util.FauxinnatiAPIURL/url1/url2 so reviewers can find the change.


ns, err := kubeClient.CoreV1().Namespaces().Create(ctx, &corev1.Namespace{
ObjectMeta: metav1.ObjectMeta{
GenerateName: "test-ns-46422-",
},
}, metav1.CreateOptions{})
o.Expect(err).NotTo(o.HaveOccurred())
defer func() {
_ = kubeClient.CoreV1().Namespaces().Delete(ctx, ns.Name, metav1.DeleteOptions{})
}()

clientconfigv1, err := clientconfigv1.NewForConfig(restCfg)
o.Expect(err).NotTo(o.HaveOccurred())
cv, err := clientconfigv1.ConfigV1().ClusterVersions().Get(ctx, "version", metav1.GetOptions{})
o.Expect(err).NotTo(o.HaveOccurred())
o.Expect(cv).NotTo(o.BeNil())

oldUpstream := cv.Spec.Upstream
oldChannel := cv.Spec.Channel
defer func() {
_, err := util.PatchUpstream(ctx, clientconfigv1, string(oldUpstream), oldChannel)
o.Expect(err).NotTo(o.HaveOccurred())
}()

versionString, image := clusterversion.GetCurrentVersionAndImage(cv.Status.History)
o.Expect(versionString).NotTo(o.BeEmpty())
o.Expect(image).NotTo(o.BeEmpty())

ver, err := semver.Make(versionString)
o.Expect(err).NotTo(o.HaveOccurred())
channel := fmt.Sprintf("candidate-%d.%d", ver.Major, ver.Minor)

nodes := []util.Node{
{Version: versionString, Payload: image, Channel: channel},
}
edges := []util.Edge{}
conditionalEdges := []util.ConditionalEdge{
{
Edge: util.StringEdge{
From: versionString,
To: "",
},
Risks: []util.Risk{
{
Url: "https://bugzilla.redhat.com/show_bug.cgi?id=123456",
Name: "Bug 123456",
Message: "Empty target node",
Rule: map[string]interface{}{"type": "Always"},
},
},
},
}
g.By("create upstream graph template with invalid conditional edges")
buf, err := util.CreateGraphTemplate(nodes, edges, conditionalEdges)
o.Expect(err).NotTo(o.HaveOccurred())

label1 := map[string]string{"app": "test-update-service1"}
g.By("run update service with the graph template")
deployment1, cm1, err := util.RunUpdateService(ctx, kubeClient, ns.Name, buf.String(), label1)
o.Expect(err).NotTo(o.HaveOccurred())
logger.Info(fmt.Sprintf("deployment1: %s, %s, %d", deployment1.Spec.Template.Spec.Containers[0].Image, deployment1.Spec.Template.Spec.Containers[0].Ports[0].Name, deployment1.Spec.Template.Spec.Containers[0].Ports[0].ContainerPort))
defer func() {
if deployment1 != nil {
_ = util.DeleteDeployment(ctx, kubeClient, ns.Name, deployment1.Name)
}
if cm1 != nil {
_ = util.DeleteConfigMap(ctx, kubeClient, ns.Name, cm1.Name)
}
}()

service1, url1, err := util.CreateService(ctx, kubeClient, ns.Name, deployment1, 46422)
o.Expect(err).NotTo(o.HaveOccurred())
logger.Info("Update Service URL: " + url1.String())
logger.Info(fmt.Sprintf("Service: %s, %d, %v", service1.Spec.Ports[0].Name, service1.Spec.Ports[0].Port, service1.Spec.Ports[0].TargetPort))
defer func() { _ = util.CleanupService(ctx, kubeClient, ns.Name, service1.Name) }()

policyName := service1.Name + "-policy"
_, err = util.CreateNetworkPolicy(ctx, kubeClient, ns.Name, policyName, label1)
o.Expect(err).NotTo(o.HaveOccurred())
pollErr := wait.PollUntilContextTimeout(ctx, 5*time.Second, 60*time.Second, true, func(ctx context.Context) (bool, error) {
if res, getErr := util.NetworkRestricted(ctx, restCfg, url1.String()); getErr != nil {
return false, getErr
} else if res {
return false, fmt.Errorf("expected network to be available, but it is not")
}
return true, nil
})
o.Expect(pollErr).NotTo(o.HaveOccurred(), fmt.Sprintf("Failed to verify network connectivity to the update service at %s: %v", url1.String(), pollErr))
defer func() {
_ = util.DeleteNetworkPolicy(ctx, kubeClient, ns.Name, policyName)
}()

g.By("patch upstream with null target node conditional edge")
_, err = util.PatchUpstream(ctx, clientconfigv1, url1.String(), channel)
o.Expect(err).NotTo(o.HaveOccurred())
pollErr = wait.PollUntilContextTimeout(ctx, 5*time.Second, 10*time.Minute, true, func(ctx context.Context) (bool, error) {
cv, err := clientconfigv1.ConfigV1().ClusterVersions().Get(ctx, "version", metav1.GetOptions{})
if err != nil {
return false, err
}
for _, condition := range cv.Status.Conditions {
if condition.Type == configv1.ClusterStatusConditionType("RetrievedUpdates") {
if condition.Status != configv1.ConditionFalse {
return false, nil
}
if !strings.Contains(condition.Message, "no node for conditional update") {
return false, nil
}
return true, nil
}
}
return false, nil
})
o.Expect(pollErr).NotTo(o.HaveOccurred(), fmt.Sprintf("Failed to verify the cluster version condition for the null target node conditional edge: %v", pollErr))
_, err = util.PatchUpstream(ctx, clientconfigv1, string(oldUpstream), oldChannel)
o.Expect(err).NotTo(o.HaveOccurred())
defer func() {
_, err = util.PatchUpstream(ctx, clientconfigv1, string(oldUpstream), oldChannel)
o.Expect(err).NotTo(o.HaveOccurred())
}()
err = util.CleanupService(ctx, kubeClient, ns.Name, service1.Name)
o.Expect(err).NotTo(o.HaveOccurred())
err = util.DeleteDeployment(ctx, kubeClient, ns.Name, deployment1.Name)
o.Expect(err).NotTo(o.HaveOccurred())
err = util.DeleteConfigMap(ctx, kubeClient, ns.Name, cm1.Name)
o.Expect(err).NotTo(o.HaveOccurred())

g.By("prepare new graph file")
targetVersion := fmt.Sprintf("%d.%d.999", ver.Major, ver.Minor)
nodes = append(nodes, util.Node{
Version: targetVersion,
Payload: "quay.io/openshift-release-dev/ocp-release@sha256:fc88d0bf145c81d9c9e5b8a1cfcaa7cbbacfa698f0c3a252fbbdbfbb1b2",
Channel: channel,
})
edges = []util.Edge{}
conditionalEdges = []util.ConditionalEdge{
{
Edge: util.StringEdge{
From: versionString,
To: targetVersion,
},
Risks: []util.Risk{
{
Url: "// example.com",
Name: "InvalidURL",
Message: "Invalid URL.",
Rule: map[string]interface{}{"type": "PromQL", "promql": map[string]interface{}{"promql": "cluster_installer"}},
},
{
Url: "https://bug.example.com/b",
Name: "TypeNull",
Message: "MatchingRules type is null.",
Rule: "{\"type\": \"\"}",
},
{
Url: "https://bug.example.com/c",
Name: "InvalidMatchingRulesType",
Message: "MatchingRules type is invalid, support Always and PromQL.",
Rule: map[string]interface{}{"type": "nonexist", "promql": map[string]interface{}{"promql": "group(cluster_version_available_updates{channel=\"buggy\"})\nor\n0 * group(cluster_version_available_updates{channel!=\"buggy\"})"}},
},
{
Url: "https://bug.example.com/d",
Name: "InvalidPromQLQueryReturnValue",
Message: "PromQL query return value is not supported, support 0 and 1.",
Rule: map[string]interface{}{"type": "PromQL", "promql": map[string]interface{}{"promql": "max(cluster_version)"}},
},
{
Url: "https://bug.example.com/d",
Name: "InvalidPromQLQuery",
Message: "Invalid PromQL Query.",
Rule: map[string]interface{}{"type": "PromQL", "promql": map[string]interface{}{"promql": "cluster_infrastructure_provider{type=~\"VSphere|None\"}"}},
},
},
},
}
g.By("create upstream graph template with multi risks conditional edges")
buf, err = util.CreateGraphTemplate(nodes, edges, conditionalEdges)
o.Expect(err).NotTo(o.HaveOccurred())

g.By("run update service with the multi risks graph template")
label2 := map[string]string{"app": "test-update-service2"}
deployment2, cm2, err := util.RunUpdateService(ctx, kubeClient, ns.Name, buf.String(), label2)
o.Expect(err).NotTo(o.HaveOccurred())
defer func() {
if deployment2 != nil {
_ = util.DeleteDeployment(ctx, kubeClient, ns.Name, deployment2.Name)
}
if cm2 != nil {
_ = util.DeleteConfigMap(ctx, kubeClient, ns.Name, cm2.Name)
}
}()

service2, url2, err := util.CreateService(ctx, kubeClient, ns.Name, deployment2, 46422)
o.Expect(err).NotTo(o.HaveOccurred())
logger.Info("Update Service URL: " + url2.String())
logger.Info(fmt.Sprintf("Service: %s, %d, %v", service2.Spec.Ports[0].Name, service2.Spec.Ports[0].Port, service2.Spec.Ports[0].TargetPort))
pollErr = wait.PollUntilContextTimeout(ctx, 5*time.Second, 60*time.Second, true, func(ctx context.Context) (bool, error) {
if res, getErr := util.NetworkRestricted(ctx, restCfg, url2.String()); getErr != nil {
return false, getErr
} else if res {
return false, fmt.Errorf("expected network to be available, but it is not")
}
return true, nil
})
o.Expect(pollErr).NotTo(o.HaveOccurred(), fmt.Sprintf("Failed to verify network connectivity to the update service at %s: %v", url2.String(), pollErr))
defer func() { _ = util.CleanupService(ctx, kubeClient, ns.Name, service2.Name) }()

g.By("patch upstream with multi risks conditional edge")
_, err = util.PatchUpstream(ctx, clientconfigv1, url2.String(), channel)
o.Expect(err).NotTo(o.HaveOccurred())
pollErr = wait.PollUntilContextTimeout(ctx, 5*time.Second, 10*time.Minute, true, func(ctx context.Context) (bool, error) {
cv, err := clientconfigv1.ConfigV1().ClusterVersions().Get(ctx, "version", metav1.GetOptions{})
if err != nil {
return false, err
}
if cv.Status.AvailableUpdates != nil {
return false, nil
}
for _, condition := range cv.Status.Conditions {
if condition.Type == configv1.ClusterStatusConditionType("RetrievedUpdates") {
if condition.Status != configv1.ConditionTrue {
return false, nil
}
return true, nil
}
}
return true, nil
Comment on lines +333 to +349

⚠️ Potential issue | 🟠 Major

Add verification that malformed conditional edges were dropped and fix success condition logic.

Line 349 returns success when the RetrievedUpdates condition is not found, which doesn't validate the test's core purpose: verifying invalid edges are dropped. The poll should check that ConditionalUpdates is empty and require the RetrievedUpdates condition to actually be present and true. Additionally, line 338's check for AvailableUpdates != nil should explicitly check for length to avoid ambiguity with empty slices.

♻️ Suggested fix
 		pollErr = wait.PollUntilContextTimeout(ctx, 5*time.Second, 10*time.Minute, true, func(ctx context.Context) (bool, error) {
 			cv, err := clientconfigv1.ConfigV1().ClusterVersions().Get(ctx, "version", metav1.GetOptions{})
 			if err != nil {
 				return false, err
 			}
-			if cv.Status.AvailableUpdates != nil {
+			if len(cv.Status.AvailableUpdates) != 0 {
 				return false, nil
 			}
+			if len(cv.Status.ConditionalUpdates) != 0 {
+				return false, nil
+			}
 			for _, condition := range cv.Status.Conditions {
 				if condition.Type == configv1.ClusterStatusConditionType("RetrievedUpdates") {
 					if condition.Status != configv1.ConditionTrue {
 						return false, nil
 					}
 					return true, nil
 				}
 			}
-			return true, nil
+			return false, nil
 		})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/cvo/cvo.go` around lines 333 - 349, The polling logic incorrectly treats
missing RetrievedUpdates as success and uses a nil check for AvailableUpdates;
update the wait.PollUntilContextTimeout callback to first treat any non-empty
AvailableUpdates as not-ready (use len(cv.Status.AvailableUpdates) > 0), then
require that a RetrievedUpdates condition is present and has Status ==
configv1.ConditionTrue, and finally verify that cv.Status.ConditionalUpdates is
empty before returning success; if RetrievedUpdates is not found or
ConditionalUpdates is non-empty, return false,nil so the poll continues. Use the
existing symbols cv, cv.Status.AvailableUpdates, cv.Status.ConditionalUpdates,
and the loop checking condition.Type ==
configv1.ClusterStatusConditionType("RetrievedUpdates") to implement this.

})
o.Expect(pollErr).NotTo(o.HaveOccurred(), fmt.Sprintf("Failed to verify the cluster version condition for the multi risks target node conditional edge: %v", pollErr))
})
})
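The final poll in this test, with the tightening the reviewer suggests, reduces to a predicate over the ClusterVersion status: no available or conditional updates may survive the invalid graph, and `RetrievedUpdates` must be present and True. A minimal sketch with stand-in types — `condition`, `cvStatus`, and `updatesDropped` are hypothetical simplifications of `configv1.ClusterVersionStatus` and its condition type, not the real API:

```go
package main

import "fmt"

// condition is a stand-in for a ClusterVersion status condition.
type condition struct {
	Type   string
	Status string // "True", "False", or "Unknown"
}

// cvStatus holds only the ClusterVersion status fields the poll inspects.
type cvStatus struct {
	AvailableUpdates   []string
	ConditionalUpdates []string
	Conditions         []condition
}

// updatesDropped encodes the tightened success predicate from the review:
// every malformed edge must have been dropped (no available or conditional
// updates remain) and RetrievedUpdates must be present and True. A missing
// condition keeps the poll waiting instead of passing vacuously.
func updatesDropped(st cvStatus) bool {
	if len(st.AvailableUpdates) != 0 || len(st.ConditionalUpdates) != 0 {
		return false
	}
	for _, c := range st.Conditions {
		if c.Type == "RetrievedUpdates" {
			return c.Status == "True"
		}
	}
	return false // condition not found yet: keep polling
}

func main() {
	ok := updatesDropped(cvStatus{
		Conditions: []condition{{Type: "RetrievedUpdates", Status: "True"}},
	})
	fmt.Println(ok)
}
```

Using length checks rather than `!= nil` also sidesteps the nil-versus-empty-slice ambiguity the review calls out on `AvailableUpdates`.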