USHIFT-6797: C2CC Probe remote clusters#6729
Conversation
|
Skipping CI for Draft Pull Request. |
|
@pmtk: This pull request references USHIFT-6797 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: pmtk The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughImplements C2CC probe support: RemoteCluster API/CRD, probe config and deterministic probe IPs, probe runtime and CLI, controller reconciliation and probe deployment, asset/codegen additions, and unit/acceptance tests. ChangesC2CC Probe Feature
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 9 | ❌ 3❌ Failed checks (3 warnings)
✅ Passed checks (9 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (1)
test/suites/c2cc/probe.robot (1)
123-130: ⚡ Quick winConsider extracting
Compute 11th IPto a shared resource.This keyword is duplicated identically in
07-healthcheck.robot(lines 97-104). Since both test files already importc2cc.resource, extracting this to the shared resource file would eliminate duplication and provide a single point of maintenance.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/suites/c2cc/probe.robot` around lines 123 - 130, The keyword Compute 11th IP is duplicated; extract it into the shared resource by moving the entire keyword block (the Robot Framework keyword named "Compute 11th IP" that uses Process.Run Process to call python3 with the ipaddress snippet, checks rc and returns the stripped stdout) into the c2cc.resource file, remove the duplicate definitions from test/suites/c2cc/probe.robot and 07-healthcheck.robot, and ensure both tests continue to import c2cc.resource so they call the single shared Compute 11th IP keyword.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@assets/components/c2cc/deployment.yaml`:
- Around line 54-60: The deployment currently hard-pins c2cc-probe to
master-labeled nodes via nodeSelector node-role.kubernetes.io/master and a
NoSchedule toleration; change this to a topology-aware preference instead:
remove the strict nodeSelector and NoSchedule toleration and replace with a
preferred node affinity (preferredDuringSchedulingIgnoredDuringExecution) that
targets control-plane labels (e.g., node-role.kubernetes.io/master or
node-role.kubernetes.io/control-plane) so pods prefer control-plane nodes but
can run elsewhere for single-node/arbiter variants, and if a toleration is still
required use a non-blocking preference (e.g., avoid NoSchedule) or match the
exact taint/key semantics; update the c2cc deployment
(nodeSelector/tolerations/priorityClassName usage) and add topology-specific CI
to validate placement across single-node and multi-node topologies.
In `@cmd/generate-config/config/config-openapi-spec.json`:
- Around line 206-209: The OpenAPI schema for probeInterval lacks a default and
should match runtime default used in fillDefaults(); update the "probeInterval"
property in the OpenAPI schema (clusterToCluster.probeInterval) to include a
default value of "10s" and keep the existing description/type, so generated docs
and clients reflect the actual default; locate the "probeInterval" entry in the
config-openapi-spec JSON and add the "default": "10s" field.
In `@pkg/config/c2cc_test.go`:
- Around line 864-865: fillDefaults() is an error-returning initializer and the
tests currently ignore its error; update each call (e.g., the cfg.fillDefaults()
invocations around the assertions at the noted locations) to capture the
returned error and assert success (for example assign err := cfg.fillDefaults()
and call require.NoError(t, err) or assert.NoError(t, err)) before continuing to
the assert.Equal checks so tests fail fast on initialization errors.
In `@pkg/controllers/c2cc/probe.go`:
- Around line 85-87: The call to factory.WaitForCacheSync currently ignores its
result; update the startup in pkg/controllers/c2cc/probe.go to check the
returned map[reflect.Type]bool (the result of factory.WaitForCacheSync) and fail
fast if any informer failed to sync: after factory.Start(ctx.Done()) capture the
sync map from factory.WaitForCacheSync(ctx.Done()), inspect its values, and if
any value is false log an error via klog.Errorf (including what failed) and
return/exit the function/manager startup instead of continuing to
klog.Infof("Probe manager running, watching RemoteCluster CRs").
- Around line 67-70: The UpdateFunc is restarting probes on any RemoteCluster
update causing status-update-driven restart loops; change UpdateFunc to accept
both oldObj and newObj, cast them to *microshiftv1alpha1.RemoteCluster, compare
the relevant spec fields (e.g., Spec.ProbeTarget and Spec.ProbeInterval) between
old and new, and only call pm.restartProbe(ctx, rc) when those spec fields
differ (ignore updates that only change Status); this preserves runProbeLoop
behavior and prevents resetting consecutiveFailures to 0 on status-only updates.
---
Nitpick comments:
In `@test/suites/c2cc/probe.robot`:
- Around line 123-130: The keyword Compute 11th IP is duplicated; extract it
into the shared resource by moving the entire keyword block (the Robot Framework
keyword named "Compute 11th IP" that uses Process.Run Process to call python3
with the ipaddress snippet, checks rc and returns the stripped stdout) into the
c2cc.resource file, remove the duplicate definitions from
test/suites/c2cc/probe.robot and 07-healthcheck.robot, and ensure both tests
continue to import c2cc.resource so they call the single shared Compute 11th IP
keyword.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 3633745b-f762-4fbe-9134-34031e2ec1d5
⛔ Files ignored due to path filters (22)
pkg/apis/microshift/v1alpha1/zz_generated.deepcopy.gois excluded by!**/zz_generated*pkg/generated/clientset/versioned/clientset.gois excluded by!**/generated/**pkg/generated/clientset/versioned/fake/clientset_generated.gois excluded by!**/generated/**pkg/generated/clientset/versioned/fake/doc.gois excluded by!**/generated/**pkg/generated/clientset/versioned/fake/register.gois excluded by!**/generated/**pkg/generated/clientset/versioned/scheme/doc.gois excluded by!**/generated/**pkg/generated/clientset/versioned/scheme/register.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/doc.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/fake/doc.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/fake/fake_microshift_client.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/fake/fake_remotecluster.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/generated_expansion.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/microshift_client.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/remotecluster.gois excluded by!**/generated/**pkg/generated/informers/externalversions/factory.gois excluded by!**/generated/**pkg/generated/informers/externalversions/generic.gois excluded by!**/generated/**pkg/generated/informers/externalversions/internalinterfaces/factory_interfaces.gois excluded by!**/generated/**pkg/generated/informers/externalversions/microshift/interface.gois excluded by!**/generated/**pkg/generated/informers/externalversions/microshift/v1alpha1/interface.gois excluded by!**/generated/**pkg/generated/informers/externalversions/microshift/v1alpha1/remotecluster.gois excluded by!**/generated/**pkg/generated/listers/microshift/v1alpha1/expansion_generated.gois excluded by!**/generated/**pkg/generated/listers/microshift/v1alpha1/remotecluster.gois excluded by!**/generated/**
📒 Files selected for processing (30)
Makefileassets/components/c2cc/clusterrole.yamlassets/components/c2cc/clusterrolebinding.yamlassets/components/c2cc/deployment.yamlassets/components/c2cc/namespace.yamlassets/components/c2cc/service.yamlassets/components/c2cc/serviceaccount.yamlassets/crd/microshift.io_remoteclusters.yamlcmd/generate-config/config/config-openapi-spec.jsoncmd/microshift/main.godocs/user/howto_config.mdpackaging/microshift/config.yamlpkg/apis/microshift/v1alpha1/doc.gopkg/apis/microshift/v1alpha1/groupversion_info.gopkg/apis/microshift/v1alpha1/types.gopkg/assets/crd.gopkg/cmd/c2cc_probe.gopkg/config/c2cc.gopkg/config/c2cc_test.gopkg/config/config.gopkg/controllers/c2cc/controller.gopkg/controllers/c2cc/deploy_probe.gopkg/controllers/c2cc/healthcheck.gopkg/controllers/c2cc/healthcheck_test.gopkg/controllers/c2cc/probe.goscripts/auto-rebase/assets.yamlscripts/boilerplate.go.txtscripts/generate-crds.shtest/suites/c2cc/07-healthcheck.robottest/suites/c2cc/probe.robot
| nodeSelector: | ||
| node-role.kubernetes.io/master: "" | ||
| priorityClassName: system-cluster-critical | ||
| tolerations: | ||
| - key: node-role.kubernetes.io/master | ||
| operator: Exists | ||
| effect: NoSchedule |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="assets/components/c2cc/deployment.yaml"
echo "=== File exists? ==="
ls -l "$FILE" || exit 1
echo
echo "=== Lines 1-120 (context) ==="
nl -ba "$FILE" | sed -n '1,120p'
echo
echo "=== Lines 45-75 (scheduling snippet focus) ==="
nl -ba "$FILE" | sed -n '45,75p'
echo
echo "=== Search for scheduling/topology fields in this file ==="
rg -n --hidden --no-ignore-vcs "(nodeSelector|affinity|podAntiAffinity|topologySpreadConstraints|topologyKey|replicas|podDisruptionBudget|PDB|node-role.kubernetes.io/master|control-plane|master|kubernetes.io/hostname)" "$FILE" || true
echo
echo "=== Repo-wide usage of node-role.kubernetes.io/master ==="
rg -n --hidden --no-ignore-vcs "node-role\.kubernetes\.io/master" . || trueRepository: openshift/microshift
Length of output: 266
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="assets/components/c2cc/deployment.yaml"
echo "=== File exists? ==="
ls -l "$FILE" || exit 1
echo
echo "=== Lines 1-120 (context) ==="
awk 'NR>=1 && NR<=120 {printf "%6d %s\n", NR, $0}' "$FILE"
echo
echo "=== Lines 45-75 (scheduling snippet focus) ==="
awk 'NR>=45 && NR<=75 {printf "%6d %s\n", NR, $0}' "$FILE"
echo
echo "=== Search for scheduling/topology fields in this file ==="
rg -n --hidden --no-ignore-vcs "(nodeSelector|affinity|podAntiAffinity|topologySpreadConstraints|topologyKey|replicas|podDisruptionBudget|PDB|node-role.kubernetes.io/master|control-plane|master|kubernetes.io/hostname)" "$FILE" || true
echo
echo "=== Repo-wide usage of node-role.kubernetes.io/master ==="
rg -n --hidden --no-ignore-vcs "node-role\.kubernetes\.io/master" . || trueRepository: openshift/microshift
Length of output: 15624
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="assets/components/c2cc/deployment.yaml"
echo "=== File exists? ==="
ls -l "$FILE" || exit 1
echo
echo "=== Lines 1-120 (context) ==="
awk 'NR>=1 && NR<=120 {printf "%6d %s\n", NR, $0}' "$FILE"
echo
echo "=== Lines 45-75 (scheduling snippet focus) ==="
awk 'NR>=45 && NR<=75 {printf "%6d %s\n", NR, $0}' "$FILE"
echo
echo "=== Search for scheduling/topology fields in this file ==="
rg -n --hidden --no-ignore-vcs "(nodeSelector|affinity|podAntiAffinity|topologySpreadConstraints|topologyKey|replicas|podDisruptionBudget|PDB|node-role.kubernetes.io/master|control-plane|master|kubernetes.io/hostname)" "$FILE" || true
echo
echo "=== Repo-wide usage of node-role.kubernetes.io/master ==="
rg -n --hidden --no-ignore-vcs "node-role\.kubernetes\.io/master" . || trueRepository: openshift/microshift
Length of output: 15624
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Repo-wide: control-plane label usage ==="
rg -n --hidden --no-ignore-vcs "node-role\.kubernetes\.io/control-plane" . || true
echo
echo "=== Repo-wide: arbiter mentions ==="
rg -n --hidden --no-ignore-vcs "arbiter" . || true
echo
echo "=== Repo-wide: SNO mention ==="
rg -n --hidden --no-ignore-vcs "\bSNO\b|single[- ]node|single[- ]node" . || true
echo
echo "=== CI/topology job hints ==="
rg -n --hidden --no-ignore-vcs "topology|topological|spread|DoNotSchedule|hostname|podAntiAffinity" .github . || trueRepository: openshift/microshift
Length of output: 50378
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Repo-wide: control-plane label usage ==="
rg -n --hidden --no-ignore-vcs "node-role\.kubernetes\.io/control-plane" . || true
echo
echo "=== Repo-wide: arbiter mentions ==="
rg -n --hidden --no-ignore-vcs "arbiter" . || true
echo
echo "=== Repo-wide: SNO mention ==="
rg -n --hidden --no-ignore-vcs "\bSNO\b|single[- ]node|single[- ]node" . || true
echo
echo "=== CI/topology job hints ==="
rg -n --hidden --no-ignore-vcs "topology|topological|spread|DoNotSchedule|hostname|podAntiAffinity" .github . || trueRepository: openshift/microshift
Length of output: 50378
Avoid hard pinning c2cc-probe to master-labeled control-plane nodes
In assets/components/c2cc/deployment.yaml (lines 54-60), the workload uses nodeSelector: node-role.kubernetes.io/master: "" plus a NoSchedule toleration for the same taint, forcing placement onto master-labeled nodes only. This bakes in a specific control-plane topology and isn’t topology-aware for single-node/arbiter variants—make placement topology-aware and cover it with topology-specific CI.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@assets/components/c2cc/deployment.yaml` around lines 54 - 60, The deployment
currently hard-pins c2cc-probe to master-labeled nodes via nodeSelector
node-role.kubernetes.io/master and a NoSchedule toleration; change this to a
topology-aware preference instead: remove the strict nodeSelector and NoSchedule
toleration and replace with a preferred node affinity
(preferredDuringSchedulingIgnoredDuringExecution) that targets control-plane
labels (e.g., node-role.kubernetes.io/master or
node-role.kubernetes.io/control-plane) so pods prefer control-plane nodes but
can run elsewhere for single-node/arbiter variants, and if a toleration is still
required use a non-blocking preference (e.g., avoid NoSchedule) or match the
exact taint/key semantics; update the c2cc deployment
(nodeSelector/tolerations/priorityClassName usage) and add topology-specific CI
to validate placement across single-node and multi-node topologies.
There was a problem hiding this comment.
Won't fix — MicroShift is single-node. Every other deployment (service-ca, OVN, CSI, DNS) uses the identical nodeSelector: node-role.kubernetes.io/master + NoSchedule toleration pattern. This is the standard convention, not a topology concern.
There was a problem hiding this comment.
@pmtk, understood — that's a valid convention for MicroShift's single-node topology. I'll defer to the established pattern.
✏️ Learnings added
Learnt from: pmtk
Repo: openshift/microshift PR: 6729
File: assets/components/c2cc/deployment.yaml:54-60
Timestamp: 2026-05-25T11:24:44.503Z
Learning: In MicroShift (openshift/microshift), all system-level deployments (service-ca, OVN, CSI, DNS, c2cc-probe, etc.) use the standard pattern of `nodeSelector: node-role.kubernetes.io/master: ""` plus a `NoSchedule` toleration for `node-role.kubernetes.io/master`. This is intentional and correct — MicroShift is a single-node distribution, so topology-aware scheduling is not applicable. Do not flag this pattern as a topology concern.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
| factory.Start(ctx.Done()) | ||
| factory.WaitForCacheSync(ctx.Done()) | ||
| klog.Infof("Probe manager running, watching RemoteCluster CRs") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify cache sync result is currently ignored.
rg -n -C2 'factory.Start|WaitForCacheSync|HasSynced' pkg/controllers/c2cc/probe.goRepository: openshift/microshift
Length of output: 218
Fail fast if informer cache does not sync. In pkg/controllers/c2cc/probe.go (lines 85-87), the factory.WaitForCacheSync(ctx.Done()) return value is ignored; if the RemoteCluster informer cache never syncs, the manager can start without a valid initial state—handle the sync result (log and exit/return an error) when it fails.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/controllers/c2cc/probe.go` around lines 85 - 87, The call to
factory.WaitForCacheSync currently ignores its result; update the startup in
pkg/controllers/c2cc/probe.go to check the returned map[reflect.Type]bool (the
result of factory.WaitForCacheSync) and fail fast if any informer failed to
sync: after factory.Start(ctx.Done()) capture the sync map from
factory.WaitForCacheSync(ctx.Done()), inspect its values, and if any value is
false log an error via klog.Errorf (including what failed) and return/exit the
function/manager startup instead of continuing to klog.Infof("Probe manager
running, watching RemoteCluster CRs").
There was a problem hiding this comment.
Won't fix — WaitForCacheSync(ctx.Done()) blocks until sync or context cancellation. If ctx is cancelled, the next line <-ctx.Done() exits immediately anyway — both paths converge on the same shutdown behavior.
There was a problem hiding this comment.
@pmtk, fair point — if the context is already cancelled when WaitForCacheSync returns false, the <-ctx.Done() path shuts things down cleanly regardless. Acknowledged.
|
Depends on #6727 |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
assets/components/c2cc/deployment.yaml (1)
35-40: ⚡ Quick winConsider adding a readinessProbe.
The Deployment has a livenessProbe but no readinessProbe. Without it, the Service will route traffic immediately on pod start, potentially before the HTTP server is ready.
♻️ Proposed addition
livenessProbe: httpGet: path: / port: 8080 initialDelaySeconds: 10 periodSeconds: 10 + readinessProbe: + httpGet: + path: / + port: 8080 + initialDelaySeconds: 5 + periodSeconds: 5🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@assets/components/c2cc/deployment.yaml` around lines 35 - 40, Add a readinessProbe alongside the existing livenessProbe so the Service only routes traffic once the pod's HTTP server is ready; mirror the livenessProbe's httpGet (path: /, port: 8080) and set sensible readiness-specific timing (for example readinessProbe with initialDelaySeconds, periodSeconds and optionally successThreshold/failureThreshold) in the same Pod spec where livenessProbe is defined so Kubernetes checks readiness separately from liveness.pkg/controllers/c2cc/deploy_probe.go (1)
26-27: ⚡ Quick winAdd a defensive guard before indexing ServiceNetwork.
Line 26 assumes at least one entry and can panic on empty config; return a clear error instead.
Suggested patch
func (c *C2CCRouteManager) deployProbe(ctx context.Context) error { + if len(c.cfg.Network.ServiceNetwork) == 0 { + return fmt.Errorf("service network is empty") + } _, svcNet, err := net.ParseCIDR(c.cfg.Network.ServiceNetwork[0]) if err != nil { return fmt.Errorf("failed to parse local service network: %w", err) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/controllers/c2cc/deploy_probe.go` around lines 26 - 27, Add a defensive guard before indexing c.cfg.Network.ServiceNetwork: check that len(c.cfg.Network.ServiceNetwork) > 0 and if not, return a clear error (e.g., fmt.Errorf or wrapped error) indicating the ServiceNetwork is empty; only then call net.ParseCIDR(c.cfg.Network.ServiceNetwork[0]) and handle its error as before. This protects the code around net.ParseCIDR and the surrounding function (where c.cfg.Network.ServiceNetwork is used) from panics due to an empty slice.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/controllers/c2cc/healthcheck.go`:
- Around line 65-67: The reconcile path in healthcheckCRManager.reconcile()
writes got.Spec = want.Spec then calls h.client.RemoteClusters().Update(...)
which can fail with 409s if the probe pod updates status between List and
Update; wrap the Update call in a retry-on-conflict loop (use
retry.RetryOnConflict) that re-gets the latest RemoteCluster
(h.client.RemoteClusters().Get), applies the desired Spec (assign want.Spec to
the fetched object's Spec), and retries the Update until success or a
non-conflict error, preserving the existing metav1.UpdateOptions usage.
In `@pkg/controllers/c2cc/probe.go`:
- Around line 177-181: The current probe logic unconditionally sets status.State
to "Healthy" whenever consecutiveFailures < unhealthyThreshold, which hides
errors; change the logic in the probe result handler that updates status.State
(the block using consecutiveFailures, unhealthyThreshold and status.State) to
avoid overwriting the prior state when there are Errors—either preserve the
existing status.State if Errors is non-empty or use an intermediate state such
as "Degraded"/"UnhealthyPending" instead of "Healthy"; ensure you handle the
initial/zero-value state (set to "Healthy" only when there are no Errors and
consecutiveFailures == 0) so the code that writes status.Errors and status.State
remains consistent.
- Line 44: The http.Server instantiation (server := &http.Server{Addr: ":8080",
Handler: mux}) lacks a ReadHeaderTimeout, making it vulnerable to Slowloris;
update the server struct to include a sensible ReadHeaderTimeout (e.g., a few
seconds) alongside Addr and Handler so the server closes connections that stall
while sending headers (modify the &http.Server literal where server is created).
---
Nitpick comments:
In `@assets/components/c2cc/deployment.yaml`:
- Around line 35-40: Add a readinessProbe alongside the existing livenessProbe
so the Service only routes traffic once the pod's HTTP server is ready; mirror
the livenessProbe's httpGet (path: /, port: 8080) and set sensible
readiness-specific timing (for example readinessProbe with initialDelaySeconds,
periodSeconds and optionally successThreshold/failureThreshold) in the same Pod
spec where livenessProbe is defined so Kubernetes checks readiness separately
from liveness.
In `@pkg/controllers/c2cc/deploy_probe.go`:
- Around line 26-27: Add a defensive guard before indexing
c.cfg.Network.ServiceNetwork: check that len(c.cfg.Network.ServiceNetwork) > 0
and if not, return a clear error (e.g., fmt.Errorf or wrapped error) indicating
the ServiceNetwork is empty; only then call
net.ParseCIDR(c.cfg.Network.ServiceNetwork[0]) and handle its error as before.
This protects the code around net.ParseCIDR and the surrounding function (where
c.cfg.Network.ServiceNetwork is used) from panics due to an empty slice.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: ba6f532c-76d3-473b-8672-a7e040dd7d51
⛔ Files ignored due to path filters (22)
pkg/apis/microshift/v1alpha1/zz_generated.deepcopy.gois excluded by!**/zz_generated*pkg/generated/clientset/versioned/clientset.gois excluded by!**/generated/**pkg/generated/clientset/versioned/fake/clientset_generated.gois excluded by!**/generated/**pkg/generated/clientset/versioned/fake/doc.gois excluded by!**/generated/**pkg/generated/clientset/versioned/fake/register.gois excluded by!**/generated/**pkg/generated/clientset/versioned/scheme/doc.gois excluded by!**/generated/**pkg/generated/clientset/versioned/scheme/register.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/doc.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/fake/doc.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/fake/fake_microshift_client.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/fake/fake_remotecluster.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/generated_expansion.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/microshift_client.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/remotecluster.gois excluded by!**/generated/**pkg/generated/informers/externalversions/factory.gois excluded by!**/generated/**pkg/generated/informers/externalversions/generic.gois excluded by!**/generated/**pkg/generated/informers/externalversions/internalinterfaces/factory_interfaces.gois excluded by!**/generated/**pkg/generated/informers/externalversions/microshift/interface.gois excluded by!**/generated/**pkg/generated/informers/externalversions/microshift/v1alpha1/interface.gois excluded by!**/generated/**pkg/generated/informers/externalversions/microshift/v1alpha1/remotecluster.gois excluded by!**/generated/**pkg/generated/listers/microshift/v1alpha1/expansion_generated.gois excluded by!**/generated/**pkg/generated/listers/microshift/v1alpha1/remotecluster.gois excluded by!**/generated/**
📒 Files selected for processing (31)
Makefileassets/components/c2cc/clusterrole.yamlassets/components/c2cc/clusterrolebinding.yamlassets/components/c2cc/deployment.yamlassets/components/c2cc/namespace.yamlassets/components/c2cc/service.yamlassets/components/c2cc/serviceaccount.yamlassets/crd/microshift.io_remoteclusters.yamlcmd/generate-config/config/config-openapi-spec.jsoncmd/microshift/main.godocs/user/howto_config.mdpackaging/microshift/config.yamlpkg/apis/microshift/v1alpha1/doc.gopkg/apis/microshift/v1alpha1/groupversion_info.gopkg/apis/microshift/v1alpha1/types.gopkg/assets/crd.gopkg/cmd/c2cc_probe.gopkg/config/c2cc.gopkg/config/c2cc_test.gopkg/config/config.gopkg/controllers/c2cc/controller.gopkg/controllers/c2cc/deploy_probe.gopkg/controllers/c2cc/healthcheck.gopkg/controllers/c2cc/healthcheck_test.gopkg/controllers/c2cc/probe.goscripts/auto-rebase/assets.yamlscripts/boilerplate.go.txtscripts/generate-crds.shtest/resources/c2cc.resourcetest/suites/c2cc/healthcheck.robottest/suites/c2cc/probe.robot
✅ Files skipped from review due to trivial changes (4)
- pkg/apis/microshift/v1alpha1/doc.go
- assets/components/c2cc/namespace.yaml
- cmd/generate-config/config/config-openapi-spec.json
- docs/user/howto_config.md
| got.Spec = want.Spec | ||
| if _, err := h.client.RemoteClusters().Update(ctx, got, metav1.UpdateOptions{}); err != nil { | ||
| return fmt.Errorf("updating RemoteCluster %q: %w", name, err) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify concurrent status writers and current lack of retry-on-conflict in update path.
rg -nP --type=go 'RemoteClusters\(\)\.UpdateStatus\('
rg -nP --type=go 'healthcheckCRManager\)\s+reconcile|RemoteClusters\(\)\.Update\(|RetryOnConflict'Repository: openshift/microshift
Length of output: 28029
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Inspect the exact code around the update call
sed -n '1,140p' pkg/controllers/c2cc/healthcheck.go
# 2) Find all remotecluster update call sites in this repo (spec or status)
rg -n --type=go "RemoteClusters\(\)\.(Update|UpdateStatus|Patch)" pkg | head -n 200
# 3) Find any other controller that updates RemoteCluster status/fields
rg -n --type=go "RemoteCluster" pkg/controllers/c2cc | head -n 200
rg -n --type=go "RemoteCluster" pkg | head -n 200Repository: openshift/microshift
Length of output: 29154
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '130,280p' pkg/controllers/c2cc/probe.go
# Also inspect imports to see if retry is already used in probe.go
sed -n '1,120p' pkg/controllers/c2cc/probe.goRepository: openshift/microshift
Length of output: 6156
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n --type=go "RetryOnConflict|util/retry|Update\(ctx,.*RemoteClusters" pkg/controllers/c2cc/healthcheck.go
rg -n --type=go "RemoteClusters\(\)\.Update\(" pkg/controllers/c2cc
rg -n --type=go "UpdateStatus\(" pkg/controllers/c2ccRepository: openshift/microshift
Length of output: 46
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show how healthcheck manager + probe are wired
sed -n '1,220p' pkg/controllers/c2cc/controller.go
ls -1 pkg/controllers/c2cc
# Find where healthcheckCRManager is instantiated / reconcile loop is started
rg -n --type=go "newHealthcheckCRManager|healthcheckCRManager|\\.reconcile\\(" pkg/controllers/c2cc
rg -n --type=go "RunProbe\\(" pkg/controllers/c2cc
rg -n --type=go "RunProbe" pkg/controllers | head -n 50Repository: openshift/microshift
Length of output: 8925
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '140,260p' pkg/controllers/c2cc/controller.go
sed -n '1,220p' pkg/controllers/c2cc/deploy_probe.go
rg -n --type=go "RunProbe|deploy_probe|Probe" pkg/controllers/c2cc | head -n 200
# Show full healthcheck.go to confirm Update uses entire object (status included)
sed -n '1,120p' pkg/controllers/c2cc/healthcheck.goRepository: openshift/microshift
Length of output: 13925
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "deployment.yaml|c2cc-probe|probe" components/c2cc -S || true
ls -R components/c2cc || true
# Find the deployment template file referenced by deployProbe
fd -a "deployment.yaml" . | head -n 20
# Print the deployment.yaml relevant bits (replicas, container args/env) without dumping entire file
FILE="$(fd -a "deployment.yaml" . | head -n 1)"
echo "Using $FILE"
rg -n "replicas:|c2cc.*probe|RunProbe|healthcheck-probe|container|args:" "$FILE" -n || true
sed -n '1,220p' "$FILE"Repository: openshift/microshift
Length of output: 6913
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Probe deployment template
ls -la assets/components/c2cc || true
sed -n '1,240p' assets/components/c2cc/deployment.yaml
# 2) RemoteCluster CRD to see if status subresource is enabled
CRD="$(fd -a "microshift.io_remoteclusters.yaml" . | head -n 1)"
echo "CRD path: $CRD"
sed -n '1,220p' "$CRD"
rg -n "subresources:|status:" "$CRD" -n
# 3) Retry constants used in suggested patch
sed -n '1,140p' vendor/k8s.io/client-go/util/retry/util.go
rg -n "DefaultRetry|DefaultBackoff" vendor/k8s.io/client-go/util/retry/util.goRepository: openshift/microshift
Length of output: 9233
Add retry-on-conflict when updating RemoteCluster Spec (avoid 409s from concurrent probe status writes).
healthcheckCRManager.reconcile() does a List then full-object Update() after modifying got.Spec. The probe pod writes RemoteCluster health via UpdateStatus, and the CRD enables the status subresource—so probe status updates can bump resourceVersion between List and Update, causing reconcile to fail early on conflicts.
Suggested patch
import (
"context"
"fmt"
"net"
"strings"
microshiftv1alpha1 "github.com/openshift/microshift/pkg/apis/microshift/v1alpha1"
"github.com/openshift/microshift/pkg/config"
microshiftclient "github.com/openshift/microshift/pkg/generated/clientset/versioned/typed/microshift/v1alpha1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+ "k8s.io/client-go/util/retry"
"k8s.io/klog/v2"
)
@@
- got.Spec = want.Spec
- if _, err := h.client.RemoteClusters().Update(ctx, got, metav1.UpdateOptions{}); err != nil {
+ if err := retry.RetryOnConflict(retry.DefaultRetry, func() error {
+ latest, err := h.client.RemoteClusters().Get(ctx, name, metav1.GetOptions{})
+ if err != nil {
+ return err
+ }
+ latest.Spec = want.Spec
+ _, err = h.client.RemoteClusters().Update(ctx, latest, metav1.UpdateOptions{})
+ return err
+ }); err != nil {
return fmt.Errorf("updating RemoteCluster %q: %w", name, err)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| got.Spec = want.Spec | |
| if _, err := h.client.RemoteClusters().Update(ctx, got, metav1.UpdateOptions{}); err != nil { | |
| return fmt.Errorf("updating RemoteCluster %q: %w", name, err) | |
| if err := retry.RetryOnConflict(retry.DefaultRetry, func() error { | |
| latest, err := h.client.RemoteClusters().Get(ctx, name, metav1.GetOptions{}) | |
| if err != nil { | |
| return err | |
| } | |
| latest.Spec = want.Spec | |
| _, err = h.client.RemoteClusters().Update(ctx, latest, metav1.UpdateOptions{}) | |
| return err | |
| }); err != nil { | |
| return fmt.Errorf("updating RemoteCluster %q: %w", name, err) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/controllers/c2cc/healthcheck.go` around lines 65 - 67, The reconcile path
in healthcheckCRManager.reconcile() writes got.Spec = want.Spec then calls
h.client.RemoteClusters().Update(...) which can fail with 409s if the probe pod
updates status between List and Update; wrap the Update call in a
retry-on-conflict loop (use retry.RetryOnConflict) that re-gets the latest
RemoteCluster (h.client.RemoteClusters().Get), applies the desired Spec (assign
want.Spec to the fetched object's Spec), and retries the Update until success or
a non-conflict error, preserving the existing metav1.UpdateOptions usage.
| w.WriteHeader(http.StatusOK) | ||
| fmt.Fprint(w, "ok") | ||
| }) | ||
| server := &http.Server{Addr: ":8080", Handler: mux} |
There was a problem hiding this comment.
Add ReadHeaderTimeout to prevent Slowloris attacks.
The HTTP server lacks ReadHeaderTimeout, leaving it vulnerable to slow header attacks that can exhaust connections.
🔒 Proposed fix
- server := &http.Server{Addr: ":8080", Handler: mux}
+ server := &http.Server{
+ Addr: ":8080",
+ Handler: mux,
+ ReadHeaderTimeout: 10 * time.Second,
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| server := &http.Server{Addr: ":8080", Handler: mux} | |
| server := &http.Server{ | |
| Addr: ":8080", | |
| Handler: mux, | |
| ReadHeaderTimeout: 10 * time.Second, | |
| } |
🧰 Tools
🪛 golangci-lint (2.12.2)
[medium] 44-44: G112: Potential Slowloris Attack because ReadHeaderTimeout is not configured in the http.Server
(gosec)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/controllers/c2cc/probe.go` at line 44, The http.Server instantiation
(server := &http.Server{Addr: ":8080", Handler: mux}) lacks a ReadHeaderTimeout,
making it vulnerable to Slowloris; update the server struct to include a
sensible ReadHeaderTimeout (e.g., a few seconds) alongside Addr and Handler so
the server closes connections that stall while sending headers (modify the
&http.Server literal where server is created).
| if consecutiveFailures >= unhealthyThreshold { | ||
| status.State = "Unhealthy" | ||
| } else { | ||
| status.State = "Healthy" | ||
| } |
There was a problem hiding this comment.
State shows "Healthy" with errors until threshold is reached.
When a probe fails but consecutiveFailures < unhealthyThreshold, the status is set to "Healthy" with Errors populated. This creates a confusing state where the CR appears healthy but contains error messages. Consider preserving the previous state or introducing an intermediate state.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/controllers/c2cc/probe.go` around lines 177 - 181, The current probe
logic unconditionally sets status.State to "Healthy" whenever
consecutiveFailures < unhealthyThreshold, which hides errors; change the logic
in the probe result handler that updates status.State (the block using
consecutiveFailures, unhealthyThreshold and status.State) to avoid overwriting
the prior state when there are Errors—either preserve the existing status.State
if Errors is non-empty or use an intermediate state such as
"Degraded"/"UnhealthyPending" instead of "Healthy"; ensure you handle the
initial/zero-value state (set to "Healthy" only when there are no Errors and
consecutiveFailures == 0) so the code that writes status.Errors and status.State
remains consistent.
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (2)
pkg/controllers/c2cc/probe.go (1)
44-44:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSet
ReadHeaderTimeouton the probe HTTP server.This keeps the endpoint from being vulnerable to slow-header connection exhaustion.
Suggested patch
- server := &http.Server{Addr: ":8080", Handler: mux} + server := &http.Server{ + Addr: ":8080", + Handler: mux, + ReadHeaderTimeout: 10 * time.Second, + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/controllers/c2cc/probe.go` at line 44, The probe HTTP server instantiation (the http.Server assigned to variable server in probe.go: http.Server{Addr: ":8080", Handler: mux}) lacks a ReadHeaderTimeout which makes it vulnerable to slow-header attacks; set the ReadHeaderTimeout field (e.g., time.Second * 5 or a configurable value) on that http.Server struct so the server closes connections that take too long to send request headers.pkg/controllers/c2cc/healthcheck.go (1)
65-67:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd retry-on-conflict around
RemoteClusterupdates.
Update()can hit 409 conflicts when probe status writers update the same object between list and update.Suggested patch
import ( "context" "fmt" "net" "strings" microshiftv1alpha1 "github.com/openshift/microshift/pkg/apis/microshift/v1alpha1" "github.com/openshift/microshift/pkg/config" microshiftclient "github.com/openshift/microshift/pkg/generated/clientset/versioned/typed/microshift/v1alpha1" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" + "k8s.io/client-go/util/retry" "k8s.io/klog/v2" ) @@ - got.Spec = want.Spec - if _, err := h.client.RemoteClusters().Update(ctx, got, metav1.UpdateOptions{}); err != nil { + if err := retry.RetryOnConflict(retry.DefaultRetry, func() error { + latest, err := h.client.RemoteClusters().Get(ctx, name, metav1.GetOptions{}) + if err != nil { + return err + } + latest.Spec = want.Spec + _, err = h.client.RemoteClusters().Update(ctx, latest, metav1.UpdateOptions{}) + return err + }); err != nil { return fmt.Errorf("updating RemoteCluster %q: %w", name, err) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/controllers/c2cc/healthcheck.go` around lines 65 - 67, The RemoteCluster update can fail with 409 conflicts; wrap the update in a retry-on-conflict loop (use k8s.io/client-go/util/retry.RetryOnConflict) around h.client.RemoteClusters().Update, re-get the latest object inside the retry, set got.Spec = want.Spec each attempt, then call Update(ctx, got, metav1.UpdateOptions{}), and return the final error if retries exhaust; reference the variables/functions got, want, and h.client.RemoteClusters().Update to locate where to apply the loop.
🧹 Nitpick comments (1)
pkg/controllers/c2cc/deploy_probe.go (1)
59-59: ⚡ Quick winLower repetitive success logging to verbose level.
This path runs every reconcile, so
Infofwill generate noisy logs over time.Suggested change
- klog.Infof("C2CC probe assets deployed (probe ClusterIP=%s)", probeIP) + klog.V(4).Infof("C2CC probe assets deployed (probe ClusterIP=%s)", probeIP)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/controllers/c2cc/deploy_probe.go` at line 59, The success log in deploy_probe.go currently uses klog.Infof("C2CC probe assets deployed (probe ClusterIP=%s)", probeIP) which runs every reconcile and is noisy; change this to a verbose-level log (e.g., klog.V(2).Infof or appropriate verbosity) so normal Info logs are not flooded—locate the klog.Infof call in the deploy probe reconciliation (symbol: the logging call in deploy_probe.go that prints probe ClusterIP) and replace it with a klog.V(x).Infof call using an appropriate verbosity level.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@assets/components/c2cc/deployment.yaml`:
- Around line 35-40: Add a readinessProbe alongside the existing livenessProbe
so the Service only routes traffic after the app is ready: add a readinessProbe
entry (mirroring the livenessProbe's httpGet path: / and port: 8080) and
configure sensible timings (e.g., initialDelaySeconds, periodSeconds,
successThreshold/failureThreshold) to gate traffic until the probe server is
listening; update the same Kubernetes manifest section containing livenessProbe
to include readinessProbe.
In `@pkg/controllers/c2cc/probe.go`:
- Around line 61-86: The call to informer.AddEventHandler currently ignores its
returned error, so update the code where informer.AddEventHandler(...) is
invoked to capture the returned error and fail fast on error (e.g., return the
error from the surrounding function or call log.Fatalf/process exit), rather
than silently proceeding; keep the existing handlers (AddFunc calling
pm.startProbe, UpdateFunc calling pm.restartProbe, DeleteFunc calling
pm.stopProbe) intact and reference this change around the
informer.AddEventHandler invocation to ensure event-handler registration errors
are handled and propagated.
In `@test/resources/c2cc.resource`:
- Around line 286-294: The current inline Python command uses n[11] which raises
IndexError for networks with fewer than 12 addresses; modify the command built
in the ${cmd} variable to first compute n = ipaddress.ip_network('${cidr}',
strict=False) and check n.num_addresses > 11 (or >=12) before accessing n[11];
if the check fails, print a clear error message and exit with a non-zero status
so the Process.Run call (python3 -c ${cmd}) returns a failing rc and the Robot
keyword surfaces a readable validation error instead of an IndexError traceback.
---
Duplicate comments:
In `@pkg/controllers/c2cc/healthcheck.go`:
- Around line 65-67: The RemoteCluster update can fail with 409 conflicts; wrap
the update in a retry-on-conflict loop (use
k8s.io/client-go/util/retry.RetryOnConflict) around
h.client.RemoteClusters().Update, re-get the latest object inside the retry, set
got.Spec = want.Spec each attempt, then call Update(ctx, got,
metav1.UpdateOptions{}), and return the final error if retries exhaust;
reference the variables/functions got, want, and
h.client.RemoteClusters().Update to locate where to apply the loop.
In `@pkg/controllers/c2cc/probe.go`:
- Line 44: The probe HTTP server instantiation (the http.Server assigned to
variable server in probe.go: http.Server{Addr: ":8080", Handler: mux}) lacks a
ReadHeaderTimeout which makes it vulnerable to slow-header attacks; set the
ReadHeaderTimeout field (e.g., time.Second * 5 or a configurable value) on that
http.Server struct so the server closes connections that take too long to send
request headers.
---
Nitpick comments:
In `@pkg/controllers/c2cc/deploy_probe.go`:
- Line 59: The success log in deploy_probe.go currently uses klog.Infof("C2CC
probe assets deployed (probe ClusterIP=%s)", probeIP) which runs every reconcile
and is noisy; change this to a verbose-level log (e.g., klog.V(2).Infof or
appropriate verbosity) so normal Info logs are not flooded—locate the klog.Infof
call in the deploy probe reconciliation (symbol: the logging call in
deploy_probe.go that prints probe ClusterIP) and replace it with a
klog.V(x).Infof call using an appropriate verbosity level.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 03ed3a56-44f9-4fcb-accc-ec1e9217620d
⛔ Files ignored due to path filters (1)
pkg/apis/microshift/v1alpha1/zz_generated.deepcopy.gois excluded by!**/zz_generated*
📒 Files selected for processing (26)
assets/components/c2cc/clusterrole.yamlassets/components/c2cc/clusterrolebinding.yamlassets/components/c2cc/deployment.yamlassets/components/c2cc/namespace.yamlassets/components/c2cc/service.yamlassets/components/c2cc/serviceaccount.yamlassets/crd/microshift.io_remoteclusters.yamlcmd/generate-config/config/config-openapi-spec.jsoncmd/microshift/main.godocs/user/howto_config.mdpackaging/microshift/config.yamlpkg/apis/microshift/v1alpha1/types.gopkg/assets/crd.gopkg/cmd/c2cc_probe.gopkg/config/c2cc.gopkg/config/c2cc_test.gopkg/config/config.gopkg/controllers/c2cc/controller.gopkg/controllers/c2cc/deploy_probe.gopkg/controllers/c2cc/healthcheck.gopkg/controllers/c2cc/healthcheck_test.gopkg/controllers/c2cc/probe.goscripts/auto-rebase/assets.yamltest/resources/c2cc.resourcetest/suites/c2cc/healthcheck.robottest/suites/c2cc/probe.robot
✅ Files skipped from review due to trivial changes (2)
- docs/user/howto_config.md
- cmd/generate-config/config/config-openapi-spec.json
| livenessProbe: | ||
| httpGet: | ||
| path: / | ||
| port: 8080 | ||
| initialDelaySeconds: 10 | ||
| periodSeconds: 10 |
There was a problem hiding this comment.
Add a readiness probe for endpoint gating.
Without readiness, the Service can route traffic before the probe server is actually listening, which can cause avoidable startup failures.
Suggested change
livenessProbe:
httpGet:
path: /
port: 8080
initialDelaySeconds: 10
periodSeconds: 10
+ readinessProbe:
+ httpGet:
+ path: /
+ port: 8080
+ initialDelaySeconds: 2
+ periodSeconds: 5🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@assets/components/c2cc/deployment.yaml` around lines 35 - 40, Add a
readinessProbe alongside the existing livenessProbe so the Service only routes
traffic after the app is ready: add a readinessProbe entry (mirroring the
livenessProbe's httpGet path: / and port: 8080) and configure sensible timings
(e.g., initialDelaySeconds, periodSeconds, successThreshold/failureThreshold) to
gate traffic until the probe server is listening; update the same Kubernetes
manifest section containing livenessProbe to include readinessProbe.
| informer.AddEventHandler(cache.ResourceEventHandlerFuncs{ | ||
| AddFunc: func(obj interface{}) { | ||
| if rc, ok := obj.(*microshiftv1alpha1.RemoteCluster); ok { | ||
| pm.startProbe(ctx, rc) | ||
| } | ||
| }, | ||
| UpdateFunc: func(oldObj, newObj interface{}) { | ||
| oldRC, ok1 := oldObj.(*microshiftv1alpha1.RemoteCluster) | ||
| newRC, ok2 := newObj.(*microshiftv1alpha1.RemoteCluster) | ||
| if ok1 && ok2 && (oldRC.Spec.ProbeTarget != newRC.Spec.ProbeTarget || | ||
| oldRC.Spec.ProbeInterval != newRC.Spec.ProbeInterval) { | ||
| pm.restartProbe(ctx, newRC) | ||
| } | ||
| }, | ||
| DeleteFunc: func(obj interface{}) { | ||
| rc, ok := obj.(*microshiftv1alpha1.RemoteCluster) | ||
| if !ok { | ||
| if tombstone, ok := obj.(cache.DeletedFinalStateUnknown); ok { | ||
| rc, _ = tombstone.Obj.(*microshiftv1alpha1.RemoteCluster) | ||
| } | ||
| } | ||
| if rc != nil { | ||
| pm.stopProbe(rc.Name) | ||
| } | ||
| }, | ||
| }) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show relevant section with line numbers
sed -n '1,140p' pkg/controllers/c2cc/probe.go | nl -ba | sed -n '40,120p'
# Confirm signature/usage elsewhere in repo
rg -n "AddEventHandler\(" pkg/controllers/c2cc/probe.go pkg/controllers/c2cc -S || true
rg -n "failed to add RemoteCluster informer handlers" -S . || true
# Show imports to see if fmt is already available
sed -n '1,80p' pkg/controllers/c2cc/probe.go | nl -baRepository: openshift/microshift
Length of output: 107
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Print relevant portion with line numbers (using awk since nl isn't available)
awk 'NR>=40 && NR<=120 {printf "%d\t%s\n", NR, $0}' pkg/controllers/c2cc/probe.go
echo "----"
# Search for AddEventHandler usage in this package
rg -n "AddEventHandler\\(" pkg/controllers/c2cc -S || true
echo "----"
# Search for fmt usage in this file to see if fmt is already imported/used
rg -n "\\bfmt\\." pkg/controllers/c2cc/probe.go -S || true
echo "----"
# Print import block to see if fmt is present
awk 'NR>=1 && NR<=80 {print}' pkg/controllers/c2cc/probe.goRepository: openshift/microshift
Length of output: 5881
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate the generated informer file for RemoteClusters
fd -a "remoteclusters" pkg/generated/informers/externalversions -S || true
fd -a "RemoteClusters" pkg/generated/informers/externalversions -S || true
# Search for the Informer() method definition returning the informer type
rg -n "RemoteClusters\\(\\)\\.?Informer\\(" pkg/generated/informers/externalversions -S
# Show the specific Informer() implementation + returned type
rg -n "func \\(.*\\) Informer\\(\\) cache\\.SharedIndexInformer" pkg/generated/informers/externalversions -S || true
rg -n "func \\(.*\\) Informer\\(\\)" pkg/generated/informers/externalversions -S | rg -n "RemoteClusters" || true
# Find the AddEventHandler method signature in the client-go source if present in module cache
# (best-effort; ignore failures if go env isn't available)
if command -v go >/dev/null 2>&1; then
GOPATH="$(go env GOPATH 2>/dev/null || true)"
if [ -n "$GOPATH" ] && [ -d "$GOPATH/pkg/mod" ]; then
rg -n "func \\(.*\\) AddEventHandler\\(" "$GOPATH/pkg/mod" -S --hidden --no-ignore-vcs 2>/dev/null \
| rg -n "ResourceEventHandler" | head -n 20
fi
fiRepository: openshift/microshift
Length of output: 761
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== In-repo: find SharedIndexInformer interface definition =="
rg -n "type +SharedIndexInformer +interface" -S . || true
echo "== In-repo: find AddEventHandler signature on SharedIndexInformer =="
rg -n "SharedIndexInformer.*AddEventHandler|AddEventHandler\\(handler" -S . || true
echo "== In GOPATH pkg/mod: find SharedIndexInformer interface =="
if command -v go >/dev/null 2>&1; then
GOPATH="$(go env GOPATH 2>/dev/null || true)"
if [ -n "${GOPATH:-}" ] && [ -d "$GOPATH/pkg/mod" ]; then
rg -n "type +SharedIndexInformer +interface" "$GOPATH/pkg/mod" -S --no-mmap 2>/dev/null | head -n 20 || true
else
echo "GOPATH pkg/mod not found"
fi
else
echo "go not available"
fi
echo "== In GOPATH pkg/mod: find AddEventHandler signature =="
if command -v go >/dev/null 2>&1; then
GOPATH="$(go env GOPATH 2>/dev/null || true)"
if [ -n "${GOPATH:-}" ] && [ -d "$GOPATH/pkg/mod" ]; then
rg -n "AddEventHandler\\(.*ResourceEventHandler" "$GOPATH/pkg/mod" -S --no-mmap 2>/dev/null \
| rg -n "SharedIndexInformer|cache\\.SharedIndexInformer|type .*SharedIndexInformer" \
| head -n 50 || true
fi
fi
echo "== Show generated Informer() return type for RemoteClusters =="
sed -n '60,110p' pkg/generated/informers/externalversions/microshift/v1alpha1/remotecluster.goRepository: openshift/microshift
Length of output: 9804
Handle RemoteCluster informer event-handler registration errors
cache.SharedIndexInformer.AddEventHandler(...) returns an (…, error) but pkg/controllers/c2cc/probe.go ignores the error, so probe event handling could be silently disabled. Fail fast if registration fails.
Suggested patch
- informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
+ if _, err := informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
AddFunc: func(obj interface{}) {
if rc, ok := obj.(*microshiftv1alpha1.RemoteCluster); ok {
pm.startProbe(ctx, rc)
}
},
UpdateFunc: func(oldObj, newObj interface{}) {
oldRC, ok1 := oldObj.(*microshiftv1alpha1.RemoteCluster)
newRC, ok2 := newObj.(*microshiftv1alpha1.RemoteCluster)
if ok1 && ok2 && (oldRC.Spec.ProbeTarget != newRC.Spec.ProbeTarget ||
oldRC.Spec.ProbeInterval != newRC.Spec.ProbeInterval) {
pm.restartProbe(ctx, newRC)
}
},
DeleteFunc: func(obj interface{}) {
rc, ok := obj.(*microshiftv1alpha1.RemoteCluster)
if !ok {
if tombstone, ok := obj.(cache.DeletedFinalStateUnknown); ok {
rc, _ = tombstone.Obj.(*microshiftv1alpha1.RemoteCluster)
}
}
if rc != nil {
pm.stopProbe(rc.Name)
}
},
- })
+ }); err != nil {
+ return fmt.Errorf("failed to add RemoteCluster informer handlers: %w", err)
+ }🧰 Tools
🪛 golangci-lint (2.12.2)
[error] 61-61: Error return value of informer.AddEventHandler is not checked
(errcheck)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/controllers/c2cc/probe.go` around lines 61 - 86, The call to
informer.AddEventHandler currently ignores its returned error, so update the
code where informer.AddEventHandler(...) is invoked to capture the returned
error and fail fast on error (e.g., return the error from the surrounding
function or call log.Fatalf/process exit), rather than silently proceeding; keep
the existing handlers (AddFunc calling pm.startProbe, UpdateFunc calling
pm.restartProbe, DeleteFunc calling pm.stopProbe) intact and reference this
change around the informer.AddEventHandler invocation to ensure event-handler
registration errors are handled and propagated.
| Compute 11th IP | ||
| [Documentation] Return the 11th host address in a CIDR (e.g. 10.43.0.0/16 -> 10.43.0.11). | ||
| [Arguments] ${cidr} | ||
| VAR ${cmd}= import ipaddress; n=ipaddress.ip_network('${cidr}', strict=False); print(n[11]) | ||
| ${result}= Process.Run Process python3 -c ${cmd} | ||
| Should Be Equal As Integers ${result.rc} 0 | ||
| ${ip}= Strip String ${result.stdout} | ||
| RETURN ${ip} | ||
|
|
There was a problem hiding this comment.
Add bounds check for small CIDRs.
Accessing n[11] will raise IndexError if the network has fewer than 12 addresses (e.g., a /29 CIDR has only 8). While service CIDRs are typically large enough, adding a validation ensures clear error messages if a small CIDR is passed.
🛡️ Proposed fix to validate network size
Compute 11th IP
[Documentation] Return the 11th host address in a CIDR (e.g. 10.43.0.0/16 -> 10.43.0.11).
[Arguments] ${cidr}
- VAR ${cmd}= import ipaddress; n=ipaddress.ip_network('${cidr}', strict=False); print(n[11])
+ VAR ${cmd}= import ipaddress; n=ipaddress.ip_network('${cidr}', strict=False); assert n.num_addresses > 11, f"CIDR {n} too small"; print(n[11])
${result}= Process.Run Process python3 -c ${cmd}
Should Be Equal As Integers ${result.rc} 0🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/resources/c2cc.resource` around lines 286 - 294, The current inline
Python command uses n[11] which raises IndexError for networks with fewer than
12 addresses; modify the command built in the ${cmd} variable to first compute n
= ipaddress.ip_network('${cidr}', strict=False) and check n.num_addresses > 11
(or >=12) before accessing n[11]; if the check fails, print a clear error
message and exit with a non-zero status so the Process.Run call (python3 -c
${cmd}) returns a failing rc and the Robot keyword surfaces a readable
validation error instead of an IndexError traceback.
The probe pod needs a place to report health state per remote cluster. Add State (enum: NeverProbed/Healthy/Unhealthy with NeverProbed as default), LastSuccessfulProbe, LastProbeTime, and Errors fields to RemoteClusterStatus. Latency statistics are deferred to a later ticket. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The probe pod has a dual role: it listens on :8080 as a probe target for remote clusters, and watches RemoteCluster CRs to actively probe remote probe services. Per-CR goroutines perform HTTP GET at the configured interval and update CR status with state transitions (NeverProbed -> Healthy -> Unhealthy after 3 consecutive failures). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Hidden subcommand invoked by the probe pod deployment as 'microshift c2cc-probe'. Delegates to c2cc.RunProbe() which serves as a probe target and actively probes remote clusters. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Namespace, ServiceAccount, ClusterRole, ClusterRoleBinding, Deployment, and Service for the C2CC probe pod. The Deployment mounts the host microshift binary and runs 'microshift c2cc-probe'. The Service gets a deterministic ClusterIP (11th IP of local service CIDR) so remote clusters can target it as a probe endpoint. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
After the RemoteCluster CRD is established, the controller applies the probe pod manifests (namespace, SA, RBAC, deployment, service). The probe service gets the 11th IP of the local service CIDR as its deterministic ClusterIP so remote clusters can target it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When C2CC config is removed, cleanupAll() now deletes the probe pod namespace (cascading to SA, Deployment, Service), ClusterRole, and ClusterRoleBinding before removing the RemoteCluster CRD. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move deployProbe() into fullReconcile() so the probe deployment is re-applied every 10s. If the deployment is deleted or scaled down at runtime, the next reconcile cycle recreates it. ApplyDeployments is idempotent so this is a no-op when the deployment already matches. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests verify probe namespace, deployment readiness, service ClusterIP assignment, RemoteCluster status transitions, and self-healing after deployment deletion or scale-down. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
UpdateFunc unconditionally restarted probes on every RemoteCluster update, including status updates from the probe loop itself. This created a feedback loop that reset consecutiveFailures to 0 on every tick, preventing clusters from ever reaching the unhealthy threshold. Only restart probes when spec fields actually change. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fillDefaults() returns an error but the tests were ignoring it. Use require.NoError so tests fail fast with a clear message if default initialization breaks. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The keyword was duplicated in both probe.robot and 07-healthcheck.robot. Move it to c2cc.resource which both files already import. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (3)
pkg/controllers/c2cc/healthcheck.go (1)
65-67:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRetry on conflict for spec updates.
The update path can fail on transient
resourceVersionconflicts when status is updated concurrently; retrying avoids reconcile aborts.Suggested patch
import ( "context" "fmt" "net" "strings" microshiftv1alpha1 "github.com/openshift/microshift/pkg/apis/microshift/v1alpha1" "github.com/openshift/microshift/pkg/config" microshiftclient "github.com/openshift/microshift/pkg/generated/clientset/versioned/typed/microshift/v1alpha1" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" + "k8s.io/client-go/util/retry" "k8s.io/klog/v2" ) @@ - got.Spec = want.Spec - if _, err := h.client.RemoteClusters().Update(ctx, got, metav1.UpdateOptions{}); err != nil { + if err := retry.RetryOnConflict(retry.DefaultRetry, func() error { + latest, err := h.client.RemoteClusters().Get(ctx, name, metav1.GetOptions{}) + if err != nil { + return err + } + latest.Spec = want.Spec + _, err = h.client.RemoteClusters().Update(ctx, latest, metav1.UpdateOptions{}) + return err + }); err != nil { return fmt.Errorf("updating RemoteCluster %q: %w", name, err) }#!/bin/bash set -euo pipefail echo "Verify current reconcile update path (no retry loop):" awk 'NR>=60 && NR<=72 {printf "%d:%s\n", NR, $0}' pkg/controllers/c2cc/healthcheck.go echo "Verify concurrent status writer exists:" rg -n --type=go 'RemoteClusters\(\)\.UpdateStatus\(' pkg/controllers/c2cc/probe.go echo "Verify RetryOnConflict is not currently used in healthcheck manager:" rg -n --type=go 'RetryOnConflict' pkg/controllers/c2cc/healthcheck.go || true🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/controllers/c2cc/healthcheck.go` around lines 65 - 67, The update currently sets got.Spec = want.Spec and calls h.client.RemoteClusters().Update(...) which can fail with resourceVersion conflicts; wrap the update in a retry loop using k8s.io/apimachinery/pkg/util/wait.RetryOnConflict: inside the loop re-get the latest RemoteCluster (using h.client.RemoteClusters().Get), set its Spec to want.Spec, then call Update and return on success, propagating non-conflict errors; reference the existing symbols got.Spec = want.Spec, h.client.RemoteClusters().Update, and use RetryOnConflict to implement the retry semantics.pkg/controllers/c2cc/probe.go (2)
61-86:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHandle
AddEventHandlerregistration errors.If handler registration fails, probe management silently never starts correctly.
Suggested patch
- informer.AddEventHandler(cache.ResourceEventHandlerFuncs{ + if _, err := informer.AddEventHandler(cache.ResourceEventHandlerFuncs{ AddFunc: func(obj interface{}) { if rc, ok := obj.(*microshiftv1alpha1.RemoteCluster); ok { pm.startProbe(ctx, rc) } }, @@ DeleteFunc: func(obj interface{}) { @@ }, - }) + }); err != nil { + return fmt.Errorf("failed to add RemoteCluster informer handlers: %w", err) + }#!/bin/bash set -euo pipefail echo "Inspect AddEventHandler call site:" awk 'NR>=58 && NR<=90 {printf "%d:%s\n", NR, $0}' pkg/controllers/c2cc/probe.go echo "Confirm return value is not captured:" rg -n --type=go '^\s*informer\.AddEventHandler\(' pkg/controllers/c2cc/probe.go🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/controllers/c2cc/probe.go` around lines 61 - 86, The AddEventHandler call in probe.go is currently un-checked so failures to register handlers are ignored; modify the code that registers the informer handlers (the informer.AddEventHandler call in the same function that calls pm.startProbe/pm.restartProbe/pm.stopProbe) to capture and handle any registration error—i.e., call the registration variant that returns an error (or wrap the registration in a helper that validates success), and on error log the failure via the controller logger and return the error (or propagate it) so probe management fails fast instead of silently never starting.
44-44:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSet
ReadHeaderTimeouton the probe HTTP server.This leaves the listener exposed to slow-header connection abuse.
Suggested patch
- server := &http.Server{Addr: ":8080", Handler: mux} + server := &http.Server{ + Addr: ":8080", + Handler: mux, + ReadHeaderTimeout: 10 * time.Second, + }#!/bin/bash set -euo pipefail echo "Inspect server construction:" awk 'NR>=39 && NR<=48 {printf "%d:%s\n", NR, $0}' pkg/controllers/c2cc/probe.go echo "Check whether ReadHeaderTimeout is present:" rg -n --type=go 'ReadHeaderTimeout' pkg/controllers/c2cc/probe.go || true🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/controllers/c2cc/probe.go` at line 44, The probe HTTP server is created with server := &http.Server{Addr: ":8080", Handler: mux} and lacks a ReadHeaderTimeout, leaving it vulnerable to slow-header attacks; update the http.Server construction for the probe (the variable named server) to include a ReadHeaderTimeout (e.g., 5s or another suitable duration defined in a constant/config) so the server rejects connections that take too long to send headers, and ensure the chosen timeout is applied where the probe server is started.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@assets/crd/microshift.io_remoteclusters.yaml`:
- Around line 43-47: Add schema-level admission validation for
spec.probeInterval: update the CRD's openAPIV3Schema for the spec.probeInterval
property to include an x-kubernetes-validations rule that enforces a duration
between 1s and 5m (e.g. a CEL expression that parses the duration and asserts
seconds >= 1 && seconds <= 300) and a pattern fallback to only accept valid
duration strings (digits + unit like s, m). Target the spec.probeInterval
property in the CRD schema and include a clear message for rejection.
In `@Makefile`:
- Around line 349-352: The generate-crds Makefile target is currently leaf-only
and not invoked by the broader generate/verify flows; update the top-level
targets (e.g., the generate target and verify/verify-config targets used by make
and make verify) to depend on generate-crds so CRD and client generation runs
automatically—modify the prerequisites for the primary generate target and the
verify/verify-config targets to include generate-crds (references: Makefile
targets "generate-crds", "generate", "verify", "verify-config") so stale
CRDs/clients are caught by the normal make/verify pipelines.
In `@pkg/controllers/c2cc/deploy_probe.go`:
- Around line 26-29: The code accesses c.cfg.Network.ServiceNetwork[0] without
checking length which will panic if the slice is empty; before calling
net.ParseCIDR, add a guard that validates len(c.cfg.Network.ServiceNetwork) > 0
and return a clear error (e.g. "service network not configured") if empty, then
proceed to call net.ParseCIDR using the first element and preserve the existing
error wrapping for ParseCIDR failures (variables involved:
c.cfg.Network.ServiceNetwork, net.ParseCIDR, svcNet, err).
---
Duplicate comments:
In `@pkg/controllers/c2cc/healthcheck.go`:
- Around line 65-67: The update currently sets got.Spec = want.Spec and calls
h.client.RemoteClusters().Update(...) which can fail with resourceVersion
conflicts; wrap the update in a retry loop using
k8s.io/apimachinery/pkg/util/wait.RetryOnConflict: inside the loop re-get the
latest RemoteCluster (using h.client.RemoteClusters().Get), set its Spec to
want.Spec, then call Update and return on success, propagating non-conflict
errors; reference the existing symbols got.Spec = want.Spec,
h.client.RemoteClusters().Update, and use RetryOnConflict to implement the retry
semantics.
In `@pkg/controllers/c2cc/probe.go`:
- Around line 61-86: The AddEventHandler call in probe.go is currently
un-checked so failures to register handlers are ignored; modify the code that
registers the informer handlers (the informer.AddEventHandler call in the same
function that calls pm.startProbe/pm.restartProbe/pm.stopProbe) to capture and
handle any registration error—i.e., call the registration variant that returns
an error (or wrap the registration in a helper that validates success), and on
error log the failure via the controller logger and return the error (or
propagate it) so probe management fails fast instead of silently never starting.
- Line 44: The probe HTTP server is created with server := &http.Server{Addr:
":8080", Handler: mux} and lacks a ReadHeaderTimeout, leaving it vulnerable to
slow-header attacks; update the http.Server construction for the probe (the
variable named server) to include a ReadHeaderTimeout (e.g., 5s or another
suitable duration defined in a constant/config) so the server rejects
connections that take too long to send headers, and ensure the chosen timeout is
applied where the probe server is started.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 928d58d0-49a8-4e1a-b7d7-a07340e7adee
⛔ Files ignored due to path filters (22)
pkg/apis/microshift/v1alpha1/zz_generated.deepcopy.gois excluded by!**/zz_generated*pkg/generated/clientset/versioned/clientset.gois excluded by!**/generated/**pkg/generated/clientset/versioned/fake/clientset_generated.gois excluded by!**/generated/**pkg/generated/clientset/versioned/fake/doc.gois excluded by!**/generated/**pkg/generated/clientset/versioned/fake/register.gois excluded by!**/generated/**pkg/generated/clientset/versioned/scheme/doc.gois excluded by!**/generated/**pkg/generated/clientset/versioned/scheme/register.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/doc.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/fake/doc.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/fake/fake_microshift_client.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/fake/fake_remotecluster.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/generated_expansion.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/microshift_client.gois excluded by!**/generated/**pkg/generated/clientset/versioned/typed/microshift/v1alpha1/remotecluster.gois excluded by!**/generated/**pkg/generated/informers/externalversions/factory.gois excluded by!**/generated/**pkg/generated/informers/externalversions/generic.gois excluded by!**/generated/**pkg/generated/informers/externalversions/internalinterfaces/factory_interfaces.gois excluded by!**/generated/**pkg/generated/informers/externalversions/microshift/interface.gois excluded by!**/generated/**pkg/generated/informers/externalversions/microshift/v1alpha1/interface.gois excluded by!**/generated/**pkg/generated/informers/externalversions/microshift/v1alpha1/remotecluster.gois excluded by!**/generated/**pkg/generated/listers/microshift/v1alpha1/expansion_generated.gois excluded by!**/generated/**pkg/generated/listers/microshift/v1alpha1/remotecluster.gois excluded by!**/generated/**
📒 Files selected for processing (31)
Makefileassets/components/c2cc/clusterrole.yamlassets/components/c2cc/clusterrolebinding.yamlassets/components/c2cc/deployment.yamlassets/components/c2cc/namespace.yamlassets/components/c2cc/service.yamlassets/components/c2cc/serviceaccount.yamlassets/crd/microshift.io_remoteclusters.yamlcmd/generate-config/config/config-openapi-spec.jsoncmd/microshift/main.godocs/user/howto_config.mdpackaging/microshift/config.yamlpkg/apis/microshift/v1alpha1/doc.gopkg/apis/microshift/v1alpha1/groupversion_info.gopkg/apis/microshift/v1alpha1/types.gopkg/assets/crd.gopkg/cmd/c2cc_probe.gopkg/config/c2cc.gopkg/config/c2cc_test.gopkg/config/config.gopkg/controllers/c2cc/controller.gopkg/controllers/c2cc/deploy_probe.gopkg/controllers/c2cc/healthcheck.gopkg/controllers/c2cc/healthcheck_test.gopkg/controllers/c2cc/probe.goscripts/auto-rebase/assets.yamlscripts/boilerplate.go.txtscripts/generate-crds.shtest/resources/c2cc.resourcetest/suites/c2cc/healthcheck.robottest/suites/c2cc/probe.robot
✅ Files skipped from review due to trivial changes (4)
- pkg/apis/microshift/v1alpha1/doc.go
- scripts/auto-rebase/assets.yaml
- docs/user/howto_config.md
- assets/components/c2cc/namespace.yaml
| probeInterval: | ||
| default: 10s | ||
| description: Interval between probe attempts (e.g. "10s", "1m"). | ||
| type: string | ||
| probeTarget: |
There was a problem hiding this comment.
Enforce spec.probeInterval constraints at admission.
probeInterval is documented as 1s–5m, but the CRD schema currently accepts any string. Add schema-level validation so invalid values are rejected early.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@assets/crd/microshift.io_remoteclusters.yaml` around lines 43 - 47, Add
schema-level admission validation for spec.probeInterval: update the CRD's
openAPIV3Schema for the spec.probeInterval property to include an
x-kubernetes-validations rule that enforces a duration between 1s and 5m (e.g. a
CEL expression that parses the duration and asserts seconds >= 1 && seconds <=
300) and a pattern fallback to only accept valid duration strings (digits + unit
like s, m). Target the spec.probeInterval property in the CRD schema and include
a clear message for rejection.
| .PHONY: generate-crds | ||
| generate-crds: | ||
| ./scripts/fetch_tools.sh controller-gen && \ | ||
| ./scripts/generate-crds.sh |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Wire generate-crds into the default generate/verify flow.
Right now this target is leaf-only, so the visible make, make verify, and make verify-config paths won't catch stale CRDs or generated clients after API changes.
Suggested change
-all: generate-config microshift etcd
+all: generate-config generate-crds microshift etcd
...
-verify-config: generate-config
+verify-config: generate-config generate-crds🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Makefile` around lines 349 - 352, The generate-crds Makefile target is
currently leaf-only and not invoked by the broader generate/verify flows; update
the top-level targets (e.g., the generate target and verify/verify-config
targets used by make and make verify) to depend on generate-crds so CRD and
client generation runs automatically—modify the prerequisites for the primary
generate target and the verify/verify-config targets to include generate-crds
(references: Makefile targets "generate-crds", "generate", "verify",
"verify-config") so stale CRDs/clients are caught by the normal make/verify
pipelines.
| _, svcNet, err := net.ParseCIDR(c.cfg.Network.ServiceNetwork[0]) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to parse local service network: %w", err) | ||
| } |
There was a problem hiding this comment.
Guard empty ServiceNetwork before indexing.
Line 26 accesses c.cfg.Network.ServiceNetwork[0] without a length check; an empty list will panic and crash reconciliation.
Suggested fix
func (c *C2CCRouteManager) deployProbe(ctx context.Context) error {
+ if len(c.cfg.Network.ServiceNetwork) == 0 {
+ return fmt.Errorf("service network is empty")
+ }
_, svcNet, err := net.ParseCIDR(c.cfg.Network.ServiceNetwork[0])
if err != nil {
return fmt.Errorf("failed to parse local service network: %w", err)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| _, svcNet, err := net.ParseCIDR(c.cfg.Network.ServiceNetwork[0]) | |
| if err != nil { | |
| return fmt.Errorf("failed to parse local service network: %w", err) | |
| } | |
| if len(c.cfg.Network.ServiceNetwork) == 0 { | |
| return fmt.Errorf("service network is empty") | |
| } | |
| _, svcNet, err := net.ParseCIDR(c.cfg.Network.ServiceNetwork[0]) | |
| if err != nil { | |
| return fmt.Errorf("failed to parse local service network: %w", err) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/controllers/c2cc/deploy_probe.go` around lines 26 - 29, The code accesses
c.cfg.Network.ServiceNetwork[0] without checking length which will panic if the
slice is empty; before calling net.ParseCIDR, add a guard that validates
len(c.cfg.Network.ServiceNetwork) > 0 and return a clear error (e.g. "service
network not configured") if empty, then proceed to call net.ParseCIDR using the
first element and preserve the existing error wrapping for ParseCIDR failures
(variables involved: c.cfg.Network.ServiceNetwork, net.ParseCIDR, svcNet, err).
Summary by CodeRabbit