CONSOLE-5118: Add OLS integration for cluster update workflows #16131

Open
fao89 wants to merge 1 commit into openshift:main from fao89:CONSOLE-5118

Member

fao89 commented Mar 11, 2026

Implements OpenShift Lightspeed AI assistance integration into cluster settings page to provide contextual help during cluster update workflows.

Features:

  • UpdateWorkflowOLSButton component with 4 workflow phase support
  • Precheck: Pre-update validation and readiness assessment
  • Failure: Error analysis and troubleshooting guidance
  • Status: Real-time update progress monitoring assistance
  • Success: Post-update verification and validation help

The integration uses the official lightspeed-console plugin API to open the OLS chatbox with context-aware prompts and cluster data attachments. Each workflow phase provides tailored AI prompts and exports relevant cluster resources (ClusterVersion) as YAML for comprehensive analysis.
https://github.com/openshift/lightspeed-console?tab=readme-ov-file#opening-the-openshift-lightspeed-ui-from-other-console-pages-and-plugins

Summary by CodeRabbit

  • New Features

    • Introduced OpenShift Lightspeed AI assistant integration in cluster settings with contextual "Ask Lightspeed" buttons for update workflows.
    • Buttons appear during precheck, failure, status, and success phases with detailed cluster and update context.
  • Localization

    • Added translations for Lightspeed-related UI labels and button text.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 11, 2026
Contributor

openshift-ci-robot commented Mar 11, 2026

@fao89: This pull request references CONSOLE-5118 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Implements OpenShift Lightspeed AI assistance integration into cluster settings page to provide contextual help during cluster update workflows.

Features:

  • UpdateWorkflowOLSButton component with 4 workflow phase support
  • Precheck: Pre-update validation and readiness assessment
  • Failure: Error analysis and troubleshooting guidance
  • Status: Real-time update progress monitoring assistance
  • Success: Post-update verification and validation help

The integration uses the official lightspeed-console plugin API to open the OLS chatbox with context-aware prompts and cluster data attachments. Each workflow phase provides tailored AI prompts and exports relevant cluster resources (ClusterVersion, operators, nodes, MCPs) as YAML for comprehensive analysis.
https://github.com/openshift/lightspeed-console?tab=readme-ov-file#opening-the-openshift-lightspeed-ui-from-other-console-pages-and-plugins

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from rhamilto and spadgett March 11, 2026 19:01
@openshift-ci openshift-ci bot added component/core Related to console core functionality kind/i18n Indicates issue or PR relates to internationalization or has content that needs to be translated labels Mar 11, 2026
Contributor

openshift-ci bot commented Mar 11, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fao89
Once this PR has been reviewed and has the lgtm label, please assign rawagner for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Contributor

coderabbitai bot commented Mar 11, 2026

📝 Walkthrough

This pull request integrates OpenShift Lightspeed (OLS) support into the cluster settings UI. A new UpdateWorkflowOLSButton component renders only when OLS extensions are available, as indicated by the LIGHTSPEED_CONSOLE flag. The button triggers OLS workflows for four phases (precheck, failure, status, and success) and attaches YAML containing cluster update plans, health data, failure analysis, and verification guidance. Data dependencies are expanded to load the clusterOperators and nodes resources. Five new localization keys are added for button labels across the update scenarios, and telemetry is wired throughout.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding OpenShift Lightspeed (OLS) integration to cluster update workflows, which aligns with the substantial additions to cluster-settings component and localization strings. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

Contributor

coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@frontend/public/components/cluster-settings/cluster-settings.tsx`:
- Around line 1494-1511: The precheck prompt in the case 'precheck' block uses
availableUpdates[0] (targetVersion) which may be undefined or misleading before
a user selects a release; change the logic in the cluster-settings.tsx precheck
handler to stop hard-coding availableUpdates[0]: if there is a user-selected
target (e.g., selectedRelease or selectedUpdate state), use that version
(targetVersion) in the prompt; otherwise construct a version-agnostic prompt
that either lists all candidate versions (map availableUpdates to their .version
strings) or explicitly states that no single target is selected and asks for
guidance across the candidate set; update the same pattern referenced around
lines 1646-1649 as well.
- Around line 979-982: The current render condition treats
ClusterUpdateStatus.ErrorRetrieving like a failure and shows the "Add Failure
OLS Button"; remove ClusterUpdateStatus.ErrorRetrieving from the failure branch
so the button only appears for ClusterUpdateStatus.Failing and
ClusterUpdateStatus.UpdatingAndFailing, and add a separate branch for
ClusterUpdateStatus.ErrorRetrieving that either hides the button or renders a
distinct retrieval-error UI/prompt (create a small conditional branch near the
existing button rendering using the status variable and
ClusterUpdateStatus.ErrorRetrieving to present a dedicated message/component
instead of the failure workflow).
- Around line 1921-1933: The health check for operators wrongly uses
op.status?.conditions?.every(...) which treats unrelated conditions (e.g.,
Upgradeable) as failures; update the logic in the operatorHealth calculation
(and the similar block around allOperatorsHealthy) to explicitly pick conditions
by type ('Available','Progressing','Degraded') from op.status.conditions (use
find or a small lookup) and then evaluate only those three for the healthy
determination, leaving allOperatorsUpdated logic (getClusterOperatorVersion and
getCurrentVersionSafe) unchanged; modify the code paths that compute
operatorHealth.healthy and allOperatorsHealthy to reference these explicit
condition checks for each clusterOperators element.
- Around line 900-909: Gate the ClusterOperator and Node watches behind the
OLS-availability flag so they only mount when the OLS action is actually shown:
keep the hook calls (useK8sWatchResource) in place but pass null (or an explicit
disabled value) instead of the resource descriptor when the existing OLS
availability flag (e.g., canShowOlsAction / isOlsAvailable / whatever boolean
you already compute to decide showing the OLS action) is false; update the two
calls that create clusterOperators/clusterOperatorsLoaded and nodes/nodesLoaded
(currently using ClusterOperatorModel and NodeModel) to use a conditional
argument (shouldWatch ? { isList: true, kind:
referenceForModel(ClusterOperatorModel) } : null) and similarly for NodeModel so
the watches only start when OLS is available.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bcb67d8d-d4a9-420b-b922-e44673df8830

📥 Commits

Reviewing files that changed from the base of the PR and between 21101de and 972b7c0.

📒 Files selected for processing (2)
  • frontend/public/components/cluster-settings/cluster-settings.tsx
  • frontend/public/locales/en/public.json

Comment on lines +900 to +909
// Enhanced data fetching for OLS integration
const [clusterOperators, clusterOperatorsLoaded] = useK8sWatchResource<ClusterOperator[]>({
  isList: true,
  kind: referenceForModel(ClusterOperatorModel),
});

const [nodes, nodesLoaded] = useK8sWatchResource<K8sResourceKind[]>({
  isList: true,
  kind: referenceForModel(NodeModel),
});

⚠️ Potential issue | 🟠 Major

Gate the new cluster-scoped watches behind OLS availability.

These ClusterOperator and Node list watches now start on every visit to Cluster Settings, even when Lightspeed is disabled or the extension is absent. That adds avoidable load on large clusters and widens the RBAC surface for a page that previously did not need these resources just to render. Please only mount the watches when the OLS action can actually be shown.
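
A minimal sketch of the gating idea, assuming the watch hook treats a `null` descriptor as "do not start a watch" (which is what the review's AI-agent prompt proposes). `WatchDescriptor`, `gatedWatch`, and the kind strings below are hypothetical stand-ins for illustration, not the console's actual types or values:

```typescript
// Hypothetical local stand-in for the descriptor object passed to the watch hook.
interface WatchDescriptor {
  isList: boolean;
  kind: string;
}

// Returns a list-watch descriptor only when the OLS action can be shown.
// Returning null is ASSUMED to make the hook skip the watch entirely.
function gatedWatch(shouldWatch: boolean, kind: string): WatchDescriptor | null {
  return shouldWatch ? { isList: true, kind } : null;
}

// The kind strings are illustrative placeholders for referenceForModel(...) results.
const operatorsWatch = gatedWatch(false, 'config.openshift.io~v1~ClusterOperator');
const nodesWatch = gatedWatch(true, 'core~v1~Node');

console.log(operatorsWatch); // null
console.log(nodesWatch); // { isList: true, kind: 'core~v1~Node' }
```

Keeping the hook call unconditional and varying only its argument respects React's rule that hooks are called in the same order on every render.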

Comment on lines +979 to +982
{/* Add Failure OLS Button */}
{(status === ClusterUpdateStatus.Failing ||
  status === ClusterUpdateStatus.UpdatingAndFailing ||
  status === ClusterUpdateStatus.ErrorRetrieving) &&

⚠️ Potential issue | 🟡 Minor

Treat ErrorRetrieving as a separate workflow.

ClusterUpdateStatus.ErrorRetrieving means the console could not fetch update recommendations, not that a cluster upgrade failed. Reusing the failure workflow here will generate misleading prompts and attachments, including cases where the target version is missing. Consider hiding the button for this state or using a dedicated retrieval-error prompt.
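
One way to express the split is a small mapping from status to the UI that should appear. This is only a sketch: `Status` is a local stand-in that lists the three members named in this comment plus a catch-all, and `FailureUi` and `failureUiFor` are hypothetical names, not part of the PR:

```typescript
// Local stand-in for ClusterUpdateStatus; 'Other' covers every status not named here.
type Status = 'Failing' | 'UpdatingAndFailing' | 'ErrorRetrieving' | 'Other';

// Hypothetical outcomes: the failure workflow, a dedicated retrieval-error prompt, or no button.
type FailureUi = 'failure-workflow' | 'retrieval-error' | 'hidden';

function failureUiFor(status: Status): FailureUi {
  switch (status) {
    case 'Failing':
    case 'UpdatingAndFailing':
      // A cluster update is actually failing, so the failure workflow applies.
      return 'failure-workflow';
    case 'ErrorRetrieving':
      // Update recommendations could not be fetched; no upgrade has failed.
      return 'retrieval-error';
    default:
      return 'hidden';
  }
}

console.log(failureUiFor('UpdatingAndFailing')); // failure-workflow
console.log(failureUiFor('ErrorRetrieving')); // retrieval-error
```

Returning 'hidden' for `ErrorRetrieving` instead would implement the other option the comment mentions, hiding the button for that state.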

Comment on lines +1494 to +1511
case 'precheck': {
  const targetVersion = availableUpdates[0]?.version;
  const recentFailures =
    cv.status?.history?.slice(0, 5).filter((h) => h.state !== 'Completed').length || 0;

  return `I'm planning to update my ${environment} OpenShift cluster (${clusterSize}) from ${currentVersion} to ${targetVersion} via ${updateChannel} channel.

Recent update history shows ${recentFailures} non-successful attempts in the last 5 updates.

Please provide comprehensive pre-update guidance including:
1. Specific prerequisites and compatibility checks for this version jump
2. Resource requirements and capacity planning
3. Backup and rollback strategies
4. Risk assessment based on my cluster configuration
5. Estimated update duration and maintenance window planning
6. Any known issues or breaking changes for this update path

Focus on actionable steps I should take before starting the update.`;

⚠️ Potential issue | 🟠 Major

Don't hard-code availableUpdates[0] as the precheck target.

This action is rendered before the user selects a release, but the prompt and attachment pretend a specific target version was chosen. On clusters with multiple supported updates, OLS can analyze the wrong upgrade path and return incorrect prerequisite guidance. Either make the precheck flow version-specific after selection, or keep it version-agnostic and send the full candidate set.

Also applies to: 1646-1649
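
A rough sketch of the version-agnostic option, assuming a selected release may or may not exist when the prompt is built. `Release`, `describePrecheckTarget`, and the version numbers are hypothetical and exist only for illustration:

```typescript
// Hypothetical minimal shape of an available update entry.
interface Release {
  version: string;
}

// Builds the "target" fragment of the precheck prompt without assuming availableUpdates[0].
function describePrecheckTarget(selected: Release | undefined, availableUpdates: Release[]): string {
  if (selected) {
    // The user picked a release, so the prompt can be version-specific.
    return `to ${selected.version}`;
  }
  if (availableUpdates.length === 0) {
    return 'with no update currently available';
  }
  // No selection yet: send the full candidate set instead of guessing one.
  const versions = availableUpdates.map((u) => u.version).join(', ');
  return `to one of the available versions (${versions}); no target is selected yet`;
}

const candidates: Release[] = [{ version: '4.22.1' }, { version: '4.22.2' }];
console.log(describePrecheckTarget(undefined, candidates));
// to one of the available versions (4.22.1, 4.22.2); no target is selected yet
console.log(describePrecheckTarget(candidates[1], candidates)); // to 4.22.2
```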

Comment on lines +1921 to +1933
allOperatorsUpdated: clusterOperators.every(
  (op) => getClusterOperatorVersion(op) === getCurrentVersionSafe(),
),
operatorHealth: {
  total: clusterOperators.length,
  healthy: clusterOperators.filter((op) =>
    op.status?.conditions?.every(
      (c) =>
        (c.type === 'Available' && c.status === 'True') ||
        (c.type === 'Degraded' && c.status === 'False') ||
        (c.type === 'Progressing' && c.status === 'False'),
    ),
  ).length,

⚠️ Potential issue | 🟠 Major

Limit the success health check to the relevant operator conditions.

Using status.conditions.every(...) across the whole condition array treats extra conditions like Upgradeable as failures, so a healthy cluster can be reported as unhealthy after a successful update. Look up Available, Progressing, and Degraded explicitly before computing allOperatorsHealthy and healthy.

Suggested fix
+const isOperatorHealthy = (op: ClusterOperator) => {
+  const available = op.status?.conditions?.find((c) => c.type === 'Available')?.status;
+  const progressing = op.status?.conditions?.find((c) => c.type === 'Progressing')?.status;
+  const degraded = op.status?.conditions?.find((c) => c.type === 'Degraded')?.status;
+  return available === 'True' && progressing === 'False' && degraded === 'False';
+};
+
 ...
-            allOperatorsHealthy: clusterOperators.every((op) =>
-              op.status?.conditions?.every(
-                (c) =>
-                  (c.type === 'Available' && c.status === 'True') ||
-                  (c.type === 'Degraded' && c.status === 'False') ||
-                  (c.type === 'Progressing' && c.status === 'False'),
-              ),
-            ),
+            allOperatorsHealthy: clusterOperators.every(isOperatorHealthy),
 ...
-              healthy: clusterOperators.filter((op) =>
-                op.status?.conditions?.every(
-                  (c) =>
-                    (c.type === 'Available' && c.status === 'True') ||
-                    (c.type === 'Degraded' && c.status === 'False') ||
-                    (c.type === 'Progressing' && c.status === 'False'),
-                ),
-              ).length,
+              healthy: clusterOperators.filter(isOperatorHealthy).length,

Also applies to: 1979-1987
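
To make the difference concrete, here is a small self-contained comparison of the two checks on sample data. The `Condition` and `Operator` interfaces are simplified stand-ins for the console's `ClusterOperator` type, and the sample operators are invented for illustration:

```typescript
interface Condition {
  type: string;
  status: string;
}
interface Operator {
  status?: { conditions?: Condition[] };
}

// The check as written in the PR: every condition must match one of three patterns.
const everyBased = (op: Operator): boolean =>
  op.status?.conditions?.every(
    (c) =>
      (c.type === 'Available' && c.status === 'True') ||
      (c.type === 'Degraded' && c.status === 'False') ||
      (c.type === 'Progressing' && c.status === 'False'),
  ) ?? false;

// The suggested approach: look up the three relevant conditions by type.
const conditionStatus = (op: Operator, type: string): string | undefined =>
  op.status?.conditions?.find((c) => c.type === type)?.status;

const lookupBased = (op: Operator): boolean =>
  conditionStatus(op, 'Available') === 'True' &&
  conditionStatus(op, 'Progressing') === 'False' &&
  conditionStatus(op, 'Degraded') === 'False';

// A healthy operator that also reports an unrelated Upgradeable condition.
const healthyWithExtra: Operator = {
  status: {
    conditions: [
      { type: 'Available', status: 'True' },
      { type: 'Progressing', status: 'False' },
      { type: 'Degraded', status: 'False' },
      { type: 'Upgradeable', status: 'True' },
    ],
  },
};
// An operator that has not reported any conditions yet.
const noConditions: Operator = { status: { conditions: [] } };

console.log(everyBased(healthyWithExtra), lookupBased(healthyWithExtra)); // false true
console.log(everyBased(noConditions), lookupBased(noConditions)); // true false
```

The second line shows a related edge case: `every` on an empty array is vacuously true, so an operator with no conditions would be counted as healthy by the original check.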

Member Author

fao89 commented Mar 12, 2026

/test e2e-gcp-console

Member Author

fao89 commented Mar 12, 2026

CI Status Update

All core builds are passing consistently:

  • ✅ Frontend, backend, analysis, images all PASS
  • ❌ Only e2e-gcp-console failing with infrastructure issues

The e2e failures appear to be environment-related (resource constraints, networking issues, cluster instability) rather than code-related: the test logs show the run starting normally and then hitting infrastructure failures.

Requesting maintainer review for merge consideration given the clean core builds.

/cc @console-maintainers

Implements OpenShift Lightspeed AI assistance integration into cluster
settings page to provide contextual help during cluster update workflows.

Features:
- UpdateWorkflowOLSButton component with 4 workflow phase support
- Precheck: Pre-update validation and readiness assessment
- Failure: Error analysis and troubleshooting guidance
- Status: Real-time update progress monitoring assistance
- Success: Post-update verification and validation help

The integration uses the official lightspeed-console plugin API to open
the OLS chatbox with context-aware prompts and cluster data attachments.
Each workflow phase provides tailored AI prompts and exports relevant
cluster resources (ClusterVersion) as YAML for
comprehensive analysis.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
Signed-off-by: Fabricio Aguiar <fabricio.aguiar@gmail.com>

Contributor

openshift-ci bot commented Mar 13, 2026

@fao89: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 15, 2026
Contributor

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Labels

  • component/core: Related to console core functionality
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • kind/i18n: Indicates issue or PR relates to internationalization or has content that needs to be translated
  • needs-rebase: Indicates a PR cannot be merged because it has merge conflicts with HEAD.

3 participants