Skip to content

feat(SREP-4160): add hcp backup command#869

Merged
openshift-merge-bot[bot] merged 1 commit intoopenshift:masterfrom
Ajpantuso:apantuso/SREP-4160
Mar 20, 2026
Merged

feat(SREP-4160): add hcp backup command#869
openshift-merge-bot[bot] merged 1 commit intoopenshift:masterfrom
Ajpantuso:apantuso/SREP-4160

Conversation

@Ajpantuso
Copy link
Contributor

Summary

Adds an osdctl hcp backup command to abstract the triggering of a velero backup for an HCP cluster.

Testing

This has been tested in integration to produce backups with and without labels/annotations applied.

@coderabbitai
Copy link

coderabbitai bot commented Mar 19, 2026

Walkthrough

Adds an osdctl hcp backup Cobra command and a new cmd/hcp/backup package that resolves HCP clusters, validates an existing Velero schedule in openshift-adp, selects a Velero pod, performs an elevated exec to run velero backup create --from-schedule, and accepts optional labels/annotations.

Changes

Cohort / File(s) Summary
Command & CLI
cmd/hcp/backup/backup.go, cmd/hcp/backup/flags.go, cmd/hcp/backup/description.txt
Adds NewCmdBackup() Cobra entry with embedded long help, required --cluster-id and --reason, repeatable --label/--annotation flags, and wiring for logger/OCM connection and runner execution.
Runner, helpers & printer
cmd/hcp/backup/runner.go
Implements defaultBackupRunner, KubeClient abstraction and production exec path, backup ID regex, sorted label/annotation formatting, schedule validation, Velero pod selection, exec-based velero backup create --from-schedule invocation, and printer abstraction.
Client resolver & builder
cmd/hcp/backup/client_builder.go, cmd/hcp/backup/client_builder.go*
Adds ClusterInfo, ClusterResolver (OCM-based), KubeClientBuilder with WithClusterID/WithElevation options and a production backplane builder that constructs elevated/non-elevated kube clients.
Options & DI
cmd/hcp/backup/options.go
Introduces defaultBackupRunnerConfig, multiple With* configuration options including WithResolver and WithBuilder, and injection points for logger and printer.
Tests
cmd/hcp/backup/runner_test.go
Extensive unit tests covering config defaults and overrides, resolver/builder interactions, schedule validation, pod selection edge-cases, exec behavior and output parsing, flag parsing, and printer behavior using test doubles.
Command registration
cmd/hcp/cmd.go
Registers new backup subcommand under hcp via hcp.AddCommand(backup.NewCmdBackup()).
Documentation
docs/README.md, docs/osdctl_hcp.md, docs/osdctl_hcp_backup.md
Adds and updates docs to document osdctl hcp backup usage, flags, control flow (unprivileged validate then elevated exec), examples, and monitoring steps for created Backup CR.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot requested review from RaphaelBut and Tafhim March 19, 2026 19:49
@Ajpantuso Ajpantuso force-pushed the apantuso/SREP-4160 branch from 3278a89 to 968e45b Compare March 19, 2026 19:54
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
cmd/hcp/backup/runner_test.go (1)

432-442: Use the subtest *testing.T inside execFn.

Several table entries assert inside execFn, but this contract makes those closures capture the parent t from TestRun. With subtests on Line 691 running in parallel, failures get reported on the parent test instead of the individual case. Pass the current subtest t into execFn, or move those assertions into the t.Run body.

Also applies to: 690-716

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/hcp/backup/runner_test.go` around lines 432 - 442, The table-driven tests
define execFn closures that call testing assertions but capture the parent
TestRun's t, causing failures from parallel subtests to be reported on the
parent; update the test harness so each subtest passes its local *testing.T into
the closure (change execFn signature to accept t *testing.T and update each
table entry to use that t) or move all assertions out of execFn into the t.Run
body for each test case; adjust callers of execFn within TestRun (the t.Run
loop) to call execFn(t, ...) and update the execFn-type in the tests slice
accordingly (or relocate assertions) so failures are reported on the specific
subtest.
cmd/hcp/backup/backup.go (1)

33-48: Reuse the existing OCM connection for management-cluster lookup.

utils.GetManagementCluster() opens its own OCM connection, so this path dials OCM twice after Line 34. Passing the already-open ocmConn into the management-cluster lookup would keep the flow on one session and avoid the extra auth/setup round trip.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/hcp/backup/backup.go` around lines 33 - 48, The code opens an OCM
connection (ocmConn) then calls utils.GetManagementCluster(clusterID), which
creates a second OCM connection; change the lookup to reuse the existing
connection by updating utils.GetManagementCluster to accept an existing
connection (e.g., func GetManagementCluster(conn *ocm.Connection, clusterID
string) ...) and modify this call site to: mc, err :=
utils.GetManagementCluster(ocmConn, clusterID). Ensure the new function uses the
provided conn instead of opening a new one and keep the existing defer
ocmConn.Close() as-is.
cmd/hcp/backup/runner.go (1)

161-162: Unused configuration field: VeleroDeployment.

VeleroDeployment is configured but never referenced in the runner logic. Pod discovery relies on VeleroLabelKey/VeleroLabelValue instead. Consider removing the field and its corresponding WithVeleroDeployment option to avoid confusion, or document why it's reserved for future use.

Also applies to: 176-177

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/hcp/backup/runner.go` around lines 161 - 162, The VeleroDeployment config
field is unused in the runner logic (pod discovery uses
VeleroLabelKey/VeleroLabelValue), so either remove the VeleroDeployment field
and its builder option WithVeleroDeployment or clearly document it as reserved;
update the ADPNamespace/VeleroLabelKey/VeleroLabelValue usage to remain
unchanged, delete references to VeleroDeployment and WithVeleroDeployment from
the config struct and option builders (or add a comment on VeleroDeployment
explaining its future-reserved purpose) to prevent confusion.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/hcp/backup/description.txt`:
- Around line 14-17: Help text currently tells users to run "oc get backup
<backup-id> -n openshift-adp" but omits the required step of ensuring their
kubectl/oc context is pointed at the resolved management cluster; update the
description text to instruct users to switch/login to the resolved management
cluster before checking status (for example, "oc config use-context
<management-cluster-context>" or an appropriate "oc login" command for the
management cluster), and insert that step immediately before the example "oc get
backup ..." line so the follow-up flow is complete when using "ocm backplane
elevate".

---

Nitpick comments:
In `@cmd/hcp/backup/backup.go`:
- Around line 33-48: The code opens an OCM connection (ocmConn) then calls
utils.GetManagementCluster(clusterID), which creates a second OCM connection;
change the lookup to reuse the existing connection by updating
utils.GetManagementCluster to accept an existing connection (e.g., func
GetManagementCluster(conn *ocm.Connection, clusterID string) ...) and modify
this call site to: mc, err := utils.GetManagementCluster(ocmConn, clusterID).
Ensure the new function uses the provided conn instead of opening a new one and
keep the existing defer ocmConn.Close() as-is.

In `@cmd/hcp/backup/runner_test.go`:
- Around line 432-442: The table-driven tests define execFn closures that call
testing assertions but capture the parent TestRun's t, causing failures from
parallel subtests to be reported on the parent; update the test harness so each
subtest passes its local *testing.T into the closure (change execFn signature to
accept t *testing.T and update each table entry to use that t) or move all
assertions out of execFn into the t.Run body for each test case; adjust callers
of execFn within TestRun (the t.Run loop) to call execFn(t, ...) and update the
execFn-type in the tests slice accordingly (or relocate assertions) so failures
are reported on the specific subtest.

In `@cmd/hcp/backup/runner.go`:
- Around line 161-162: The VeleroDeployment config field is unused in the runner
logic (pod discovery uses VeleroLabelKey/VeleroLabelValue), so either remove the
VeleroDeployment field and its builder option WithVeleroDeployment or clearly
document it as reserved; update the ADPNamespace/VeleroLabelKey/VeleroLabelValue
usage to remain unchanged, delete references to VeleroDeployment and
WithVeleroDeployment from the config struct and option builders (or add a
comment on VeleroDeployment explaining its future-reserved purpose) to prevent
confusion.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c5787a97-ef8e-45a0-a749-70533e3666aa

📥 Commits

Reviewing files that changed from the base of the PR and between f286d63 and 968e45b.

📒 Files selected for processing (10)
  • cmd/hcp/backup/backup.go
  • cmd/hcp/backup/description.txt
  • cmd/hcp/backup/flags.go
  • cmd/hcp/backup/options.go
  • cmd/hcp/backup/runner.go
  • cmd/hcp/backup/runner_test.go
  • cmd/hcp/cmd.go
  • docs/README.md
  • docs/osdctl_hcp.md
  • docs/osdctl_hcp_backup.md

@Ajpantuso Ajpantuso force-pushed the apantuso/SREP-4160 branch from 968e45b to 54f45c0 Compare March 19, 2026 20:27
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/hcp/backup/backup.go`:
- Around line 71-76: The privileged exec client (created via
newKubeClientForCluster and assigned to execClient) is being created before
schedule validation (runner.Run), causing unnecessary elevated logins; move the
elevated login/execClient creation so it only occurs after runner.Run
successfully validates the schedule (or refactor to create the exec client
lazily inside the runner where needed), ensuring newKubeClientForCluster is
invoked only on success and removing the pre-validation logger.Infof("Logging
into management cluster ... (elevated)...") and execClient creation before
scheduler validation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8441c787-9132-4977-b126-b20f08c6d083

📥 Commits

Reviewing files that changed from the base of the PR and between 968e45b and 54f45c0.

📒 Files selected for processing (10)
  • cmd/hcp/backup/backup.go
  • cmd/hcp/backup/description.txt
  • cmd/hcp/backup/flags.go
  • cmd/hcp/backup/options.go
  • cmd/hcp/backup/runner.go
  • cmd/hcp/backup/runner_test.go
  • cmd/hcp/cmd.go
  • docs/README.md
  • docs/osdctl_hcp.md
  • docs/osdctl_hcp_backup.md
✅ Files skipped from review due to trivial changes (5)
  • docs/osdctl_hcp.md
  • cmd/hcp/cmd.go
  • cmd/hcp/backup/description.txt
  • cmd/hcp/backup/flags.go
  • cmd/hcp/backup/runner.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • cmd/hcp/backup/options.go

Comment on lines +71 to +76
// Privileged client — pod listing and exec
logger.Infof("Logging into management cluster %s (elevated)...", mc.ID())
execClient, err := newKubeClientForCluster(ocmConn, mc.ID(), flags.reason)
if err != nil {
return fmt.Errorf("logging into management cluster %s (elevated): %w", mc.ID(), err)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Delay privileged login until after unprivileged schedule validation.

At Line 72, elevation is performed before runner.Run executes schedule validation, so a missing/invalid schedule still triggers backplane-cluster-admin login. This adds unnecessary privileged access attempts and audit noise. Create the elevated client only after schedule validation succeeds (or make exec client creation lazy inside the runner).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/hcp/backup/backup.go` around lines 71 - 76, The privileged exec client
(created via newKubeClientForCluster and assigned to execClient) is being
created before schedule validation (runner.Run), causing unnecessary elevated
logins; move the elevated login/execClient creation so it only occurs after
runner.Run successfully validates the schedule (or refactor to create the exec
client lazily inside the runner where needed), ensuring newKubeClientForCluster
is invoked only on success and removing the pre-validation logger.Infof("Logging
into management cluster ... (elevated)...") and execClient creation before
scheduler validation.

@Ajpantuso Ajpantuso force-pushed the apantuso/SREP-4160 branch from 54f45c0 to 559e950 Compare March 19, 2026 20:49
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
cmd/hcp/backup/runner.go (1)

13-23: Use local buffers instead of importing cmd/cluster.

This code only needs stdout/stderr collectors. Pulling in github.com/openshift/osdctl/cmd/cluster here creates an unnecessary dependency edge between two command packages.

♻️ Proposed refactor
 import (
+	"bytes"
 	"context"
 	"errors"
 	"fmt"
 	"io"
 	"os"
 	"regexp"
 	"sort"
 	"strings"
 
-	"github.com/openshift/osdctl/cmd/cluster"
 	logrus "github.com/sirupsen/logrus"
 	corev1 "k8s.io/api/core/v1"
 	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
 	"k8s.io/apimachinery/pkg/runtime/schema"
@@
-	stdoutCapture := &cluster.LogCapture{}
-	stderrCapture := &cluster.LogCapture{}
+	var stdoutCapture bytes.Buffer
+	var stderrCapture bytes.Buffer
 
 	err = executor.StreamWithContext(ctx, remotecommand.StreamOptions{
 		Stdin:  nil,
-		Stdout: stdoutCapture,
-		Stderr: stderrCapture,
+		Stdout: &stdoutCapture,
+		Stderr: &stderrCapture,
 		Tty:    false,
 	})
 	if err != nil {
-		return "", fmt.Errorf("executing command in pod %q: %w\nstderr: %s", pod, err, stderrCapture.GetStdOut())
+		return "", fmt.Errorf("executing command in pod %q: %w\nstderr: %s", pod, err, stderrCapture.String())
 	}
 
-	return stdoutCapture.GetStdOut(), nil
+	return stdoutCapture.String(), nil
 }

Also applies to: 121-134

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/hcp/backup/runner.go` around lines 13 - 23, Remove the unnecessary import
of "github.com/openshift/osdctl/cmd/cluster" and replace its stdout/stderr
collectors with local buffers: create bytes.Buffer instances (e.g., stdoutBuf,
stderrBuf) and use them as io.Writers where the code previously passed cluster
package collectors (for example in functions that perform exec to pod or build
remotecommand.StreamOptions / remotecommand.Executor usage such as any
execToPod/runBackup helpers). Ensure you import "bytes" and wire stdoutBuf and
stderrBuf into the remotecommand.StreamOptions or the exec call instead of
referencing cluster package symbols so the command packages no longer depend on
each other.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/hcp/backup/runner.go`:
- Around line 321-338: In findVeleroPod, narrow selection to a non-terminating,
truly ready pod by skipping pods with pod.DeletionTimestamp != nil or
pod.Status.Phase != corev1.PodRunning and by checking pod.Status.Conditions for
a condition with Type == corev1.PodReady and Status == corev1.ConditionTrue;
return the first pod.Name that satisfies these checks (update the loop in
defaultBackupRunner.findVeleroPod accordingly) and add/update unit tests to
include a PodReady condition case so terminating or unready Running pods are not
selected.

In `@docs/osdctl_hcp_backup.md`:
- Around line 27-62: The fenced code blocks for the usage line, Examples,
Options, and Options inherited from parent commands are missing language
identifiers and trigger MD040; update each triple-backtick fence that surrounds
the usage string ("osdctl hcp backup --cluster-id <cluster-id> --reason <reason>
[flags]"), the Examples block, the Options block, and the Options inherited
block to use a language identifier (use "text") so each fence reads ```text at
start.

In `@docs/README.md`:
- Around line 2864-2886: The two fenced code blocks shown (the usage line
starting with "osdctl hcp backup --cluster-id <cluster-id> --reason <reason>
[flags]" and the Flags block listing CLI flags) need language identifiers to
satisfy markdownlint MD040; update the opening fences to include a language such
as "text" or "console" (e.g., change ``` to ```text) so both the usage snippet
and the flags snippet are explicitly labeled.

---

Nitpick comments:
In `@cmd/hcp/backup/runner.go`:
- Around line 13-23: Remove the unnecessary import of
"github.com/openshift/osdctl/cmd/cluster" and replace its stdout/stderr
collectors with local buffers: create bytes.Buffer instances (e.g., stdoutBuf,
stderrBuf) and use them as io.Writers where the code previously passed cluster
package collectors (for example in functions that perform exec to pod or build
remotecommand.StreamOptions / remotecommand.Executor usage such as any
execToPod/runBackup helpers). Ensure you import "bytes" and wire stdoutBuf and
stderrBuf into the remotecommand.StreamOptions or the exec call instead of
referencing cluster package symbols so the command packages no longer depend on
each other.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bdb789b4-7d7c-4c1c-9b8c-7847cc9b33ea

📥 Commits

Reviewing files that changed from the base of the PR and between 54f45c0 and 559e950.

📒 Files selected for processing (10)
  • cmd/hcp/backup/backup.go
  • cmd/hcp/backup/description.txt
  • cmd/hcp/backup/flags.go
  • cmd/hcp/backup/options.go
  • cmd/hcp/backup/runner.go
  • cmd/hcp/backup/runner_test.go
  • cmd/hcp/cmd.go
  • docs/README.md
  • docs/osdctl_hcp.md
  • docs/osdctl_hcp_backup.md
✅ Files skipped from review due to trivial changes (3)
  • cmd/hcp/backup/flags.go
  • cmd/hcp/backup/description.txt
  • docs/osdctl_hcp.md
🚧 Files skipped from review as they are similar to previous changes (3)
  • cmd/hcp/cmd.go
  • cmd/hcp/backup/backup.go
  • cmd/hcp/backup/options.go

Comment on lines +27 to +62
```
osdctl hcp backup --cluster-id <cluster-id> --reason <reason> [flags]
```

### Examples

```
osdctl hcp backup --cluster-id 1abc2def3ghi --reason OHSS-12345
osdctl hcp backup --cluster-id 1abc2def3ghi --reason OHSS-12345 --label env=prod --label incident=OHSS-12345
osdctl hcp backup --cluster-id 1abc2def3ghi --reason OHSS-12345 --annotation owner=sre-team
```

### Options

```
--annotation stringToString Annotation to add to the Velero Backup CR (key=value); may be repeated (default [])
-C, --cluster-id string Internal ID, name, or external ID of the HCP cluster
-h, --help help for backup
--label stringToString Label to add to the Velero Backup CR (key=value); may be repeated (default [])
--reason string Reason for privilege elevation (e.g., OHSS-1234 or PD incident ID)
```

### Options inherited from parent commands

```
--as string Username to impersonate for the operation. User could be a regular user or a service account in a namespace.
--cluster string The name of the kubeconfig cluster to use
--context string The name of the kubeconfig context to use
--insecure-skip-tls-verify If true, the server's certificate will not be checked for validity. This will make your HTTPS connections insecure
--kubeconfig string Path to the kubeconfig file to use for CLI requests.
-o, --output string Valid formats are ['', 'json', 'yaml', 'env']
--request-timeout string The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
-s, --server string The address and port of the Kubernetes API server
--skip-aws-proxy-check aws_proxy Don't use the configured aws_proxy value
-S, --skip-version-check skip checking to see if this is the most recent release
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add language identifiers to the fenced blocks.

Line 27, Line 33, Line 41, and Line 51 are bare fences, so this page currently trips MD040. text is enough for the generated usage/options blocks.

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 27-27: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 33-33: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 41-41: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 51-51: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/osdctl_hcp_backup.md` around lines 27 - 62, The fenced code blocks for
the usage line, Examples, Options, and Options inherited from parent commands
are missing language identifiers and trigger MD040; update each triple-backtick
fence that surrounds the usage string ("osdctl hcp backup --cluster-id
<cluster-id> --reason <reason> [flags]"), the Examples block, the Options block,
and the Options inherited block to use a language identifier (use "text") so
each fence reads ```text at start.

Comment on lines +2864 to +2886
```
osdctl hcp backup --cluster-id <cluster-id> --reason <reason> [flags]
```

#### Flags

```
--annotation stringToString Annotation to add to the Velero Backup CR (key=value); may be repeated (default [])
--as string Username to impersonate for the operation. User could be a regular user or a service account in a namespace.
--cluster string The name of the kubeconfig cluster to use
-C, --cluster-id string Internal ID, name, or external ID of the HCP cluster
--context string The name of the kubeconfig context to use
-h, --help help for backup
--insecure-skip-tls-verify If true, the server's certificate will not be checked for validity. This will make your HTTPS connections insecure
--kubeconfig string Path to the kubeconfig file to use for CLI requests.
--label stringToString Label to add to the Velero Backup CR (key=value); may be repeated (default [])
-o, --output string Valid formats are ['', 'json', 'yaml', 'env']
--reason string Reason for privilege elevation (e.g., OHSS-1234 or PD incident ID)
--request-timeout string The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
-s, --server string The address and port of the Kubernetes API server
--skip-aws-proxy-check aws_proxy Don't use the configured aws_proxy value
-S, --skip-version-check skip checking to see if this is the most recent release
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add language identifiers to the new fenced blocks.

Line 2864 and Line 2870 use bare fences, which is why markdownlint is flagging MD040 on this section. Mark them as text or console so the docs stay lint-clean.

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 2864-2864: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 2870-2870: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/README.md` around lines 2864 - 2886, The two fenced code blocks shown
(the usage line starting with "osdctl hcp backup --cluster-id <cluster-id>
--reason <reason> [flags]" and the Flags block listing CLI flags) need language
identifiers to satisfy markdownlint MD040; update the opening fences to include
a language such as "text" or "console" (e.g., change ``` to ```text) so both the
usage snippet and the flags snippet are explicitly labeled.

@Ajpantuso Ajpantuso force-pushed the apantuso/SREP-4160 branch from 559e950 to 22e8165 Compare March 19, 2026 21:43
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
docs/README.md (1)

2864-2866: ⚠️ Potential issue | 🟡 Minor

Add language identifiers to the new fenced blocks.

These fences are still bare, so markdownlint keeps flagging MD040.

🧰 Minimal fix
-```
+```text
 osdctl hcp backup --cluster-id <cluster-id> --reason <reason> [flags]

@@
- +text
--annotation stringToString Annotation to add to the Velero Backup CR (key=value); may be repeated (default [])
--as string Username to impersonate for the operation. User could be a regular user or a service account in a namespace.
--cluster string The name of the kubeconfig cluster to use
-C, --cluster-id string Internal ID, name, or external ID of the HCP cluster
@@
-s, --server string The address and port of the Kubernetes API server
--skip-aws-proxy-check aws_proxy Don't use the configured aws_proxy value
-S, --skip-version-check skip checking to see if this is the most recent release

Also applies to: 2870-2886

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/README.md` around lines 2864 - 2866, The fenced code blocks showing the
osdctl examples are missing language identifiers causing MD040; change the
opening backticks for the snippet that starts with "osdctl hcp backup
--cluster-id <cluster-id> --reason <reason> [flags]" and the following flags
block (the block that begins with "--annotation stringToString" and ends with
"--skip-version-check") from ``` to ```text, and apply the same replacement for
the additional similar block around lines 2870-2886 so all fenced blocks use
```text.
docs/osdctl_hcp_backup.md (1)

27-29: ⚠️ Potential issue | 🟡 Minor

Add language identifiers to the fenced blocks.

The usage, examples, options, and inherited-options fences are still bare, so this page continues to fail MD040.

🧰 Minimal fix
-```
+```text
 osdctl hcp backup --cluster-id <cluster-id> --reason <reason> [flags]

@@
- +text
osdctl hcp backup --cluster-id 1abc2def3ghi --reason OHSS-12345
osdctl hcp backup --cluster-id 1abc2def3ghi --reason OHSS-12345 --label env=prod --label incident=OHSS-12345
osdctl hcp backup --cluster-id 1abc2def3ghi --reason OHSS-12345 --annotation owner=sre-team

@@
-```
+```text
      --annotation stringToString   Annotation to add to the Velero Backup CR (key=value); may be repeated (default [])
  -C, --cluster-id string           Internal ID, name, or external ID of the HCP cluster
  -h, --help                        help for backup
      --label stringToString        Label to add to the Velero Backup CR (key=value); may be repeated (default [])
      --reason string               Reason for privilege elevation (e.g., OHSS-1234 or PD incident ID)

@@
- +text
--as string Username to impersonate for the operation. User could be a regular user or a service account in a namespace.
--cluster string The name of the kubeconfig cluster to use
--context string The name of the kubeconfig context to use
--insecure-skip-tls-verify If true, the server's certificate will not be checked for validity. This will make your HTTPS connections insecure
--kubeconfig string Path to the kubeconfig file to use for CLI requests.
-o, --output string Valid formats are ['', 'json', 'yaml', 'env']
--request-timeout string The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
-s, --server string The address and port of the Kubernetes API server
--skip-aws-proxy-check aws_proxy Don't use the configured aws_proxy value
-S, --skip-version-check skip checking to see if this is the most recent release

Also applies to: 33-37, 41-47, 51-62

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/osdctl_hcp_backup.md` around lines 27 - 29, Update the fenced code
blocks in docs/osdctl_hcp_backup.md (the usage, examples, options, and
inherited-options blocks) to include a language identifier (use "text") so the
fences become ```text ... ```; specifically change the bare triple-backtick
blocks around the usage line (osdctl hcp backup --cluster-id ...), the example
invocations, the options list, and the inherited options block to use ```text to
satisfy MD040.
🧹 Nitpick comments (4)
cmd/hcp/backup/client_builder.go (1)

115-127: Fail fast when required build options are missing.

Build assumes WithClusterID is present and, for elevated sessions, that WithElevation.Reason is non-empty, but both checks are currently implicit. Returning a local error here is clearer than letting backplane fail later with an empty cluster ID or blank audit reason.

💡 Suggested guard clauses
 import (
 	"context"
+	"errors"
 	"fmt"
+	"strings"
 
 	ocmsdk "github.com/openshift-online/ocm-sdk-go"
 	"github.com/openshift/osdctl/pkg/utils"
 	logrus "github.com/sirupsen/logrus"
 )
@@
 func (b *backplaneClientBuilder) Build(ctx context.Context, opts ...BuildOption) (KubeClient, error) {
 	cfg := &buildConfig{}
 	for _, o := range opts {
 		o.configureBuild(cfg)
 	}
+
+	if strings.TrimSpace(cfg.clusterID) == "" {
+		return nil, errors.New("WithClusterID is required")
+	}
+	if cfg.elevated && strings.TrimSpace(cfg.elevationReason) == "" {
+		return nil, errors.New("WithElevation requires a non-empty Reason")
+	}
 
 	if cfg.elevated {
 		b.logger.Infof("Logging into cluster %s (elevated)...", cfg.clusterID)
 		return newKubeClientForCluster(b.ocmConn, cfg.clusterID, cfg.elevationReason)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/hcp/backup/client_builder.go` around lines 115 - 127, Build currently
assumes required options exist: ensure Build validates cfg.clusterID is
non-empty and, when cfg.elevated is true, that cfg.elevationReason is non-empty;
if either is missing return a clear error instead of calling
newKubeClientForCluster. Add guard clauses at the start of
backplaneClientBuilder.Build to return an error like "missing cluster ID" when
cfg.clusterID == "" and "missing elevation reason" when cfg.elevated &&
cfg.elevationReason == "", referencing Build, buildConfig, cfg.clusterID,
cfg.elevated, cfg.elevationReason and the call sites that invoke
newKubeClientForCluster so callers never receive empty values.
cmd/hcp/backup/runner_test.go (3)

536-539: Add one end-to-end assertion for the exec target.

execFn exposes namespace, pod, and container, but the happy-path cases only inspect cmd. If Run regressed to exec into the wrong pod, namespace, or container, this suite would still pass. Please assert those three values in at least one success case.

Also applies to: 554-559

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/hcp/backup/runner_test.go` around lines 536 - 539, The tests currently
allow execFn to ignore namespace/pod/container, so add an end-to-end assertion
in at least one successful happy-path subtest to verify the values passed into
execFn for namespace, pod, and container; update the execFn closure used in the
success case(s) (the execFn variable and the subtests that call Run) to assert
the expected namespace, pod, and container strings before returning the expected
output, and do the same for the other similar success case block around the
554-559 range so the test fails if Run execs into the wrong
pod/namespace/container.

144-155: Strengthen the logger/printer override assertions.

These checks can stay green even if the injected objects are ignored: Lines 147-154 only assert non-nil, which the defaults already satisfy, and Line 204 uses value equality instead of instance identity. Please store the injected logger/printer in the table and use assert.Same so these tests actually verify the override path.

Also applies to: 203-204

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/hcp/backup/runner_test.go` around lines 144 - 155, The tests currently
only assert non-nil for injected Logger/Printer which can pass with defaults;
update the table-driven tests that use WithLogger and WithPrinter to store the
specific injected instances (e.g., l := logrus.New() and p := &defaultPrinter{w:
&strings.Builder{}}) in the opts entries and then replace
assert.NotNil/reflect-equality checks with assert.Same(t, l, cfg.Logger) and
assert.Same(t, p, cfg.Printer) (also change the other equality assertion around
the Printer at the noted location to assert.Same) so the test verifies instance
identity for WithLogger, WithPrinter, DefaultBackupRunnerOption and
defaultBackupRunnerConfig.

545-549: wantBuildCalls cannot prove the “no login yet” cases.

Here, nil means “skip assertion”, so the resolver/unprivileged/elevated sequencing cases only validate the final error text. A regression that builds clients too early would still pass. Make build-call checking explicit so you can assert zero calls for resolver failure and exact one/two-call sequences for the login-order cases.

One simple way to make zero-call expectations testable
-		wantBuildCalls  []buildConfig // if non-nil, asserts on the Build calls received by the builder
+		wantBuildCalls  *[]buildConfig // nil means "skip assertion"; non-nil may point to an empty slice

...

-			if tt.wantBuildCalls != nil {
-				assert.Equal(t, tt.wantBuildCalls, clientBuilder.calls,
+			if tt.wantBuildCalls != nil {
+				assert.Equal(t, *tt.wantBuildCalls, clientBuilder.calls,
 					"Build calls should match expected clusterID and elevation options")
 			}

Also applies to: 795-825, 914-917

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/hcp/backup/runner_test.go` around lines 545 - 549, The test table's
wantBuildCalls currently uses nil to mean "skip assertion", so tests for "no
login yet" can't assert zero builder calls; change the test harness so
wantBuildCalls is explicit: either make wantBuildCalls a pointer to
[]buildConfig (nil = skip, non-nil = assert exact sequence including zero-length
slice for zero calls) or add a separate bool/int field (e.g., wantBuildCallCount
or expectNoBuildCalls) and update the assertion logic in the runner_test.go test
loop to compare the actual builder calls against wantBuildCalls when non-nil
(treating an empty slice as expect-zero) or against the new count/flag; update
all cases referenced (including the resolver/unprivileged/elevated sequencing
cases and the blocks around the other locations noted) to use the explicit
expectation instead of relying on nil.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/hcp/backup/runner.go`:
- Around line 330-332: The post-run hint always prints "oc get backup ..."
against the user's current kube context; update the three Printf calls in
runner.go (the lines using r.printer.Printf with backupID and
r.cfg.ADPNamespace) to be management-cluster aware: if a resolved management
cluster identifier is available (e.g. r.cfg.ManagementClusterID) print an
explicit context-switch or login hint (for example "oc config use-context
<management-cluster-id>" or "oc login ...") before the oc get command, otherwise
print an additional prerequisite line telling the user to switch/login to the
management cluster; ensure you reference backupID and r.cfg.ADPNamespace in the
same message so the output remains actionable.

---

Duplicate comments:
In `@docs/osdctl_hcp_backup.md`:
- Around line 27-29: Update the fenced code blocks in docs/osdctl_hcp_backup.md
(the usage, examples, options, and inherited-options blocks) to include a
language identifier (use "text") so the fences become ```text ... ```;
specifically change the bare triple-backtick blocks around the usage line
(osdctl hcp backup --cluster-id ...), the example invocations, the options list,
and the inherited options block to use ```text to satisfy MD040.

In `@docs/README.md`:
- Around line 2864-2866: The fenced code blocks showing the osdctl examples are
missing language identifiers causing MD040; change the opening backticks for the
snippet that starts with "osdctl hcp backup --cluster-id <cluster-id> --reason
<reason> [flags]" and the following flags block (the block that begins with
"--annotation stringToString" and ends with "--skip-version-check") from ``` to
```text, and apply the same replacement for the additional similar block around
lines 2870-2886 so all fenced blocks use ```text.

---

Nitpick comments:
In `@cmd/hcp/backup/client_builder.go`:
- Around line 115-127: Build currently assumes required options exist: ensure
Build validates cfg.clusterID is non-empty and, when cfg.elevated is true, that
cfg.elevationReason is non-empty; if either is missing return a clear error
instead of calling newKubeClientForCluster. Add guard clauses at the start of
backplaneClientBuilder.Build to return an error like "missing cluster ID" when
cfg.clusterID == "" and "missing elevation reason" when cfg.elevated &&
cfg.elevationReason == "", referencing Build, buildConfig, cfg.clusterID,
cfg.elevated, cfg.elevationReason and the call sites that invoke
newKubeClientForCluster so callers never receive empty values.

In `@cmd/hcp/backup/runner_test.go`:
- Around line 536-539: The tests currently allow execFn to ignore
namespace/pod/container, so add an end-to-end assertion in at least one
successful happy-path subtest to verify the values passed into execFn for
namespace, pod, and container; update the execFn closure used in the success
case(s) (the execFn variable and the subtests that call Run) to assert the
expected namespace, pod, and container strings before returning the expected
output, and do the same for the other similar success case block around the
554-559 range so the test fails if Run execs into the wrong
pod/namespace/container.
- Around line 144-155: The tests currently only assert non-nil for injected
Logger/Printer which can pass with defaults; update the table-driven tests that
use WithLogger and WithPrinter to store the specific injected instances (e.g., l
:= logrus.New() and p := &defaultPrinter{w: &strings.Builder{}}) in the opts
entries and then replace assert.NotNil/reflect-equality checks with
assert.Same(t, l, cfg.Logger) and assert.Same(t, p, cfg.Printer) (also change
the other equality assertion around the Printer at the noted location to
assert.Same) so the test verifies instance identity for WithLogger, WithPrinter,
DefaultBackupRunnerOption and defaultBackupRunnerConfig.
- Around line 545-549: The test table's wantBuildCalls currently uses nil to
mean "skip assertion", so tests for "no login yet" can't assert zero builder
calls; change the test harness so wantBuildCalls is explicit: either make
wantBuildCalls a pointer to []buildConfig (nil = skip, non-nil = assert exact
sequence including zero-length slice for zero calls) or add a separate bool/int
field (e.g., wantBuildCallCount or expectNoBuildCalls) and update the assertion
logic in the runner_test.go test loop to compare the actual builder calls
against wantBuildCalls when non-nil (treating an empty slice as expect-zero) or
against the new count/flag; update all cases referenced (including the
resolver/unprivileged/elevated sequencing cases and the blocks around the other
locations noted) to use the explicit expectation instead of relying on nil.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 17f3a54b-c41b-4a78-87cb-7408ffd0a24a

📥 Commits

Reviewing files that changed from the base of the PR and between 559e950 and 22e8165.

📒 Files selected for processing (11)
  • cmd/hcp/backup/backup.go
  • cmd/hcp/backup/client_builder.go
  • cmd/hcp/backup/description.txt
  • cmd/hcp/backup/flags.go
  • cmd/hcp/backup/options.go
  • cmd/hcp/backup/runner.go
  • cmd/hcp/backup/runner_test.go
  • cmd/hcp/cmd.go
  • docs/README.md
  • docs/osdctl_hcp.md
  • docs/osdctl_hcp_backup.md
✅ Files skipped from review due to trivial changes (2)
  • docs/osdctl_hcp.md
  • cmd/hcp/backup/description.txt
🚧 Files skipped from review as they are similar to previous changes (3)
  • cmd/hcp/cmd.go
  • cmd/hcp/backup/flags.go
  • cmd/hcp/backup/options.go

Comment on lines +330 to +332
r.printer.Printf("Backup %q triggered successfully.\n", backupID)
r.printer.Printf("To check status, run:\n")
r.printer.Printf("oc get backup %s -n %s\n", backupID, r.cfg.ADPNamespace)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Make the post-run status hint management-cluster aware.

The command never leaves the caller logged into the management cluster, so the printed oc get backup ... follow-up will usually fail against whatever context they already had. The docs added in docs/README.md:2858-2861 and docs/osdctl_hcp_backup.md:21-24 already call out that extra step; the CLI should surface the same prerequisite or the resolved management cluster ID here.

💡 Suggested output tweak
-	r.printer.Printf("To check status, run:\n")
-	r.printer.Printf("oc get backup %s -n %s\n", backupID, r.cfg.ADPNamespace)
+	r.printer.Printf("To check status from a management-cluster context, run:\n")
+	r.printer.Printf("Management cluster: %s\n", clusterInfo.MgmtClusterID)
+	r.printer.Printf("oc get backup %s -n %s\n", backupID, r.cfg.ADPNamespace)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/hcp/backup/runner.go` around lines 330 - 332, The post-run hint always
prints "oc get backup ..." against the user's current kube context; update the
three Printf calls in runner.go (the lines using r.printer.Printf with backupID
and r.cfg.ADPNamespace) to be management-cluster aware: if a resolved management
cluster identifier is available (e.g. r.cfg.ManagementClusterID) print an
explicit context-switch or login hint (for example "oc config use-context
<management-cluster-id>" or "oc login ...") before the oc get command, otherwise
print an additional prerequisite line telling the user to switch/login to the
management cluster; ensure you reference backupID and r.cfg.ADPNamespace in the
same message so the output remains actionable.

@Ajpantuso
Copy link
Contributor Author

/retest-required

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 19, 2026

@Ajpantuso: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Copy link
Member

@MateSaary MateSaary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 20, 2026
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 20, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Ajpantuso, MateSaary

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 20, 2026
@openshift-merge-bot openshift-merge-bot bot merged commit b0c2a8c into openshift:master Mar 20, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants