
OCPEDGE-2379: new test to add an extra worker on TNA cluster#30876

Draft
agullon wants to merge 1 commit into openshift:main from agullon:OCPEDGE-2379

Conversation

@agullon agullon commented Mar 13, 2026

Adding a new test case that creates an extra worker on a TNA cluster as a day-2 operation.

| Phase | Step | Description |
| --- | --- | --- |
| Setup | SETUP-01 | Validate cluster topology is HighlyAvailableArbiter |
| Setup | SETUP-02 | Clean up stale extra worker BMHs from previous runs (30-min timeout) |
| Setup | SETUP-03 | Initialize baremetal test helper |
| Setup | SETUP-04 | Verify extra worker data is available (extraworkers-secret) |
| Test | STEP-01 | Record initial node count and worker MachineConfigPool state |
| Test | STEP-02 | Provision extra worker BMH from extraworkers-secret |
| Test | STEP-03 | Approve pending CSRs for the new worker (kubelet client + serving) |
| Test | STEP-04 | Wait for the new worker node to become Ready |
| Test | STEP-05 | Verify new node membership (node count +1) and worker label |
| Test | STEP-06 | Verify the new node has no pressure conditions (memory, disk, PID) |
| Test | STEP-07 | Verify a pod can be scheduled and run on the new node |
| Test | STEP-08 | Verify the worker MachineConfigPool converges with the new node |
| Test | STEP-09 | Verify the new node's kubelet version matches control-plane nodes |
| Cleanup | CLEANUP-01 | Delete extra worker BMHs and wait for Metal3 deprovisioning (30-min timeout) |
| Cleanup | CLEANUP-02 | Remove stale extra worker Node objects from the cluster |

Example log from a successful run:

$ ./openshift-tests run-test '[sig-node][apigroup:config.openshift.io][OCPFeatureGate:HighlyAvailableArbiter][Serial] Extra worker scaling in HighlyAvailableArbiterMode should deploy an extra worker node that joins the cluster and becomes Ready [Timeout:60m][Slow]'

  I0312 16:33:58.731594   50194 i18n.go:139] Couldn't find translations for C, using default
  I0312 16:33:58.866736   50194 binary.go:78] Found 8524 test specs
  I0312 16:33:58.867806   50194 binary.go:95] 1074 test specs remain, after filtering out k8s
openshift-tests v4.1.0-10870-gfbba1c2
  I0312 16:33:59.705199   50194 test_setup.go:125] Extended test version v4.1.0-10870-gfbba1c2
  I0312 16:33:59.705234   50194 test_context.go:559] Tolerating taints "node-role.kubernetes.io/control-plane" when considering if nodes are ready
  I0312 16:33:59.705589 50194 framework.go:2324] [precondition-check] checking if cluster is MicroShift
  I0312 16:33:59.752027 50194 framework.go:2348] IsMicroShiftCluster: microshift-version configmap not found, not MicroShift
  I0312 16:33:59.752470   50194 binary.go:122] Loaded test configuration: &framework.TestContextType{KubeConfig:"/Users/agullon/workspace/two-node-toolbox/deploy/openshift-clusters/kubeconfig", KubeContext:"", KubeAPIContentType:"application/vnd.kubernetes.protobuf", KubeletRootDir:"/var/lib/kubelet", KubeletConfigDropinDir:"", CertDir:"", Host:"https://api.ostest.test.metalkube.org:6443", BearerToken:"<redacted>", RepoRoot:"../../", ListImages:false, listTests:false, listLabels:false, ListConformanceTests:false, Provider:"skeleton", Tooling:"", timeouts:framework.TimeoutContext{Poll:2000000000, PodStart:300000000000, PodStartShort:120000000000, PodStartSlow:900000000000, PodDelete:300000000000, ClaimProvision:300000000000, DataSourceProvision:300000000000, ClaimProvisionShort:60000000000, ClaimBound:180000000000, PVReclaim:180000000000, PVBound:180000000000, PVCreate:180000000000, PVDelete:300000000000, PVDeleteSlow:1200000000000, SnapshotCreate:300000000000, SnapshotDelete:300000000000, SnapshotControllerMetrics:300000000000, SystemPodsStartup:600000000000, NodeSchedulable:1800000000000, SystemDaemonsetStartup:300000000000, NodeNotReady:180000000000}, CloudConfig:framework.CloudConfig{APIEndpoint:"", ProjectID:"", Zone:"", Zones:[]string{}, Region:"", MultiZone:false, MultiMaster:true, Cluster:"", MasterName:"", NodeInstanceGroup:"", NumNodes:2, ClusterIPRange:"", ClusterTag:"", Network:"", ConfigFile:"", NodeTag:"", MasterTag:"", Provider:framework.NullProvider{}}, KubectlPath:"kubectl", OutputDir:"/tmp", ReportDir:"", ReportPrefix:"", ReportCompleteGinkgo:false, ReportCompleteJUnit:false, Prefix:"e2e", MinStartupPods:-1, EtcdUpgradeStorage:"", EtcdUpgradeVersion:"", GCEUpgradeScript:"", ContainerRuntimeEndpoint:"unix:///run/containerd/containerd.sock", ContainerRuntimeProcessName:"containerd", ContainerRuntimePidFile:"/run/containerd/containerd.pid", SystemdServices:"containerd*", DumpSystemdJournal:false, ImageServiceEndpoint:"", MasterOSDistro:"custom", 
NodeOSDistro:"custom", NodeOSArch:"amd64", VerifyServiceAccount:true, DeleteNamespace:true, DeleteNamespaceOnFailure:true, AllowedNotReadyNodes:-1, CleanStart:false, GatherKubeSystemResourceUsageData:"false", GatherLogsSizes:false, GatherMetricsAfterTest:"false", GatherSuiteMetricsAfterTest:false, MaxNodesToGather:0, IncludeClusterAutoscalerMetrics:false, OutputPrintType:"json", CreateTestingNS:(framework.CreateTestingNSFn)(0x107bda660), DumpLogsOnFailure:true, DisableLogDump:false, LogexporterGCSPath:"", NodeTestContextType:framework.NodeTestContextType{NodeE2E:false, NodeName:"", NodeConformance:false, PrepullImages:false, ImageDescription:"", RuntimeConfig:map[string]string(nil), SystemSpecName:"", RestartKubelet:false, ExtraEnvs:map[string]string(nil), StandaloneMode:false, CriProxyEnabled:false}, ClusterDNSDomain:"cluster.local", NodeKiller:framework.NodeKillerConfig{Enabled:false, FailureRatio:0.01, Interval:60000000000, JitterFactor:60, SimulatedDowntime:600000000000, NodeKillerStopCtx:context.Context(nil), NodeKillerStop:(func())(nil)}, IPFamily:"ipv4", NonblockingTaints:"node-role.kubernetes.io/control-plane", ProgressReportURL:"", SriovdpConfigMapFile:"", SpecSummaryOutput:"", DockerConfigFile:"", E2EDockerConfigFile:"", KubeTestRepoList:"", SnapshotControllerPodName:"", SnapshotControllerHTTPPort:0, RequireDevices:false, EnabledVolumeDrivers:[]string(nil)}
  Running Suite:  - /Users/agullon/workspace/origin
  =================================================
  Random Seed: 1773329638 - will randomize all specs

  Will run 1 of 1 specs
  ------------------------------
  [sig-node][apigroup:config.openshift.io][OCPFeatureGate:HighlyAvailableArbiter][Serial] Extra worker scaling in HighlyAvailableArbiterMode should deploy an extra worker node that joins the cluster and becomes Ready [Timeout:60m][Slow]
  github.com/openshift/origin/test/extended/two_node/tna_extra_worker.go:93
    STEP: Creating a kubernetes client @ 03/12/26 16:33:59.76
  I0312 16:33:59.760839   50194 discovery.go:214] Invalidating discovery information
    STEP: SETUP: Validating cluster topology is HighlyAvailableArbiter @ 03/12/26 16:33:59.76
  I0312 16:33:59.760992 50194 common.go:109] [precondition-check] validating cluster topology is HighlyAvailableArbiter
    STEP: SETUP: Cleaning up stale extra worker BMHs from previous runs @ 03/12/26 16:33:59.809
    STEP: SETUP: Initializing baremetal test helper @ 03/12/26 16:34:00.966
    STEP: SETUP: Ensuring extra worker data is available @ 03/12/26 16:34:00.968
    STEP: STEP-01: Recording initial values for the test @ 03/12/26 16:34:01.014
  I0312 16:34:01.169957 50194 tna_extra_worker.go:132] Initial state: 3 nodes, 0 machines in worker MCP
    STEP: STEP-02: Provisioning extra worker BMH @ 03/12/26 16:34:01.17
    STEP: wait until ostest-extraworker-0 becomes available @ 03/12/26 16:34:01.383
  I0312 16:36:56.543657 50194 tna_extra_worker.go:152] Extra worker BMH ostest-extraworker-0 deployed and reached available state
  I0312 16:36:56.614026 50194 tna_extra_worker.go:344] Patched BMH ostest-extraworker-0 with customDeploy=install_coreos, bootMode=UEFI, rootDeviceHints=/dev/sda, userData=worker-user-data-managed
  I0312 16:36:56.670014 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: available
  I0312 16:37:06.663389 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning
  I0312 16:37:16.663525 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning
  I0312 16:37:26.664017 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning
  I0312 16:37:36.662165 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning
  I0312 16:37:46.675360 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning
  I0312 16:37:56.668122 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning
  I0312 16:38:06.667721 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning
  I0312 16:38:16.666742 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioned
    STEP: STEP-03: Approving pending CSRs for the new worker @ 03/12/26 16:38:16.666
  I0312 16:38:16.666883 50194 csr.go:23] Starting CSR approval monitoring for 10m0s
  I0312 16:41:16.770274 50194 csr.go:39] Approving CSR: csr-gjz7g
  I0312 16:41:16.870699 50194 csr.go:61] Approved CSR csr-gjz7g (total approved: 1)
  I0312 16:41:16.870791 50194 csr.go:69] Approved 1 CSRs this iteration, continuing to monitor (elapsed: 3m0.199850833s)
  I0312 16:42:16.819227 50194 csr.go:39] Approving CSR: csr-6v2fn
  I0312 16:42:16.922483 50194 csr.go:61] Approved CSR csr-6v2fn (total approved: 2)
  I0312 16:42:16.922520 50194 csr.go:69] Approved 1 CSRs this iteration, continuing to monitor (elapsed: 4m0.250732958s)
  I0312 16:42:16.922531 50194 csr.go:74] All 2 expected CSRs approved! (elapsed: 4m0.250745416s)
  I0312 16:42:16.922544 50194 csr.go:80] CSR approval monitoring complete: approved 2 CSRs in 4m0.250758875s
    STEP: STEP-04: Waiting for a new worker node to become Ready @ 03/12/26 16:42:16.922
  I0312 16:42:17.021811 50194 tna_extra_worker.go:185] New worker node extraworker-0 is Ready
    STEP: STEP-05: Verifying new node membership and worker label @ 03/12/26 16:42:17.021
    STEP: STEP-06: Verifying the new node has no pressure conditions @ 03/12/26 16:42:17.121
  I0312 16:42:17.121938 50194 tna_extra_worker.go:222] Node extraworker-0 has no pressure conditions
    STEP: STEP-07: Verifying a pod can be scheduled and run on the new node @ 03/12/26 16:42:17.121
  I0312 16:42:17.234400 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
  I0312 16:42:22.234409 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
  I0312 16:42:27.232393 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
  I0312 16:42:32.235067 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
  I0312 16:42:37.237223 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
  I0312 16:42:42.234168 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
  I0312 16:42:47.235259 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Succeeded
  I0312 16:42:47.294940 50194 tna_extra_worker.go:262] Pod successfully scheduled and ran on node extraworker-0
    STEP: STEP-08: Verifying the worker MachineConfigPool converges with the new node @ 03/12/26 16:42:47.36
  I0312 16:42:47.409102 50194 tna_extra_worker.go:284] Worker MCP: machineCount=1, readyMachineCount=1, updatedMachineCount=1, degradedMachineCount=0
  I0312 16:42:47.409332 50194 tna_extra_worker.go:292] Worker MachineConfigPool converged with 1 machines
    STEP: STEP-09: Verifying the new node's kubelet version matches existing nodes @ 03/12/26 16:42:47.409
  I0312 16:42:47.502612 50194 tna_extra_worker.go:305] Control-plane kubelet versions: map[v1.34.2:true], new worker kubelet version: v1.34.2
  I0312 16:42:47.502692 50194 tna_extra_worker.go:121] Extra worker extraworker-0 joined the cluster successfully
    STEP: CLEANUP: Deleting extra worker BMHs and waiting for Metal3 deprovisioning @ 03/12/26 16:42:47.504
  I0312 16:42:47.557551 50194 tna_extra_worker.go:397] Deleting extraworker BMH ostest-extraworker-0
  I0312 16:42:47.663250 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:43:02.664446 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:43:17.664284 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:43:32.668089 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:43:47.664791 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:44:02.664577 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:44:17.664681 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:44:32.666638 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:44:47.666547 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:45:02.665981 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:45:17.665362 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:45:32.665374 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:45:47.666957 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:46:02.666800 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:46:17.668567 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:46:32.670286 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:46:47.669244 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:47:02.669048 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:47:17.671916 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:47:32.668483 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:47:47.671125 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:48:02.669652 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:48:17.670213 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:48:32.669910 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  I0312 16:48:47.666651 50194 tna_extra_worker.go:410] BMH ostest-extraworker-0 deleted
    STEP: CLEANUP: Removing stale extra worker Node objects from the cluster @ 03/12/26 16:48:47.666
  I0312 16:48:47.763990 50194 tna_extra_worker.go:436] Deleting stale extra worker node extraworker-0
  • [888.047 seconds]
  ------------------------------

  Ran 1 of 1 Specs in 888.047 seconds
  SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
[
  {
    "name": "[sig-node][apigroup:config.openshift.io][OCPFeatureGate:HighlyAvailableArbiter][Serial] Extra worker scaling in HighlyAvailableArbiterMode should deploy an extra worker node that joins the cluster and becomes Ready [Timeout:60m][Slow]",
    "lifecycle": "blocking",
    "duration": 888061,
    "startTime": "2026-03-12 15:33:59.752713 UTC",
    "endTime": "2026-03-12 15:48:47.813882 UTC",
    "result": "passed",
    "output": "  STEP: Creating a kubernetes client @ 03/12/26 16:33:59.76\n  STEP: SETUP: Validating cluster topology is HighlyAvailableArbiter @ 03/12/26 16:33:59.76\nI0312 16:33:59.760992 50194 common.go:109] [precondition-check] validating cluster topology is HighlyAvailableArbiter\n  STEP: SETUP: Cleaning up stale extra worker BMHs from previous runs @ 03/12/26 16:33:59.809\n  STEP: SETUP: Initializing baremetal test helper @ 03/12/26 16:34:00.966\n  STEP: SETUP: Ensuring extra worker data is available @ 03/12/26 16:34:00.968\n  STEP: STEP-01: Recording initial values for the test @ 03/12/26 16:34:01.014\nI0312 16:34:01.169957 50194 tna_extra_worker.go:132] Initial state: 3 nodes, 0 machines in worker MCP\n  STEP: STEP-02: Provisioning extra worker BMH @ 03/12/26 16:34:01.17\n  STEP: wait until ostest-extraworker-0 becomes available @ 03/12/26 16:34:01.383\nI0312 16:36:56.543657 50194 tna_extra_worker.go:152] Extra worker BMH ostest-extraworker-0 deployed and reached available state\nI0312 16:36:56.614026 50194 tna_extra_worker.go:344] Patched BMH ostest-extraworker-0 with customDeploy=install_coreos, bootMode=UEFI, rootDeviceHints=/dev/sda, userData=worker-user-data-managed\nI0312 16:36:56.670014 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: available\nI0312 16:37:06.663389 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:16.663525 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:26.664017 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:36.662165 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:46.675360 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:56.668122 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 
16:38:06.667721 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:38:16.666742 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioned\n  STEP: STEP-03: Approving pending CSRs for the new worker @ 03/12/26 16:38:16.666\nI0312 16:38:16.666883 50194 csr.go:23] Starting CSR approval monitoring for 10m0s\nI0312 16:41:16.770274 50194 csr.go:39] Approving CSR: csr-gjz7g\nI0312 16:41:16.870699 50194 csr.go:61] Approved CSR csr-gjz7g (total approved: 1)\nI0312 16:41:16.870791 50194 csr.go:69] Approved 1 CSRs this iteration, continuing to monitor (elapsed: 3m0.199850833s)\nI0312 16:42:16.819227 50194 csr.go:39] Approving CSR: csr-6v2fn\nI0312 16:42:16.922483 50194 csr.go:61] Approved CSR csr-6v2fn (total approved: 2)\nI0312 16:42:16.922520 50194 csr.go:69] Approved 1 CSRs this iteration, continuing to monitor (elapsed: 4m0.250732958s)\nI0312 16:42:16.922531 50194 csr.go:74] All 2 expected CSRs approved! (elapsed: 4m0.250745416s)\nI0312 16:42:16.922544 50194 csr.go:80] CSR approval monitoring complete: approved 2 CSRs in 4m0.250758875s\n  STEP: STEP-04: Waiting for a new worker node to become Ready @ 03/12/26 16:42:16.922\nI0312 16:42:17.021811 50194 tna_extra_worker.go:185] New worker node extraworker-0 is Ready\n  STEP: STEP-05: Verifying new node membership and worker label @ 03/12/26 16:42:17.021\n  STEP: STEP-06: Verifying the new node has no pressure conditions @ 03/12/26 16:42:17.121\nI0312 16:42:17.121938 50194 tna_extra_worker.go:222] Node extraworker-0 has no pressure conditions\n  STEP: STEP-07: Verifying a pod can be scheduled and run on the new node @ 03/12/26 16:42:17.121\nI0312 16:42:17.234400 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:22.234409 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:27.232393 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 
16:42:32.235067 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:37.237223 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:42.234168 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:47.235259 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Succeeded\nI0312 16:42:47.294940 50194 tna_extra_worker.go:262] Pod successfully scheduled and ran on node extraworker-0\n  STEP: STEP-08: Verifying the worker MachineConfigPool converges with the new node @ 03/12/26 16:42:47.36\nI0312 16:42:47.409102 50194 tna_extra_worker.go:284] Worker MCP: machineCount=1, readyMachineCount=1, updatedMachineCount=1, degradedMachineCount=0\nI0312 16:42:47.409332 50194 tna_extra_worker.go:292] Worker MachineConfigPool converged with 1 machines\n  STEP: STEP-09: Verifying the new node's kubelet version matches existing nodes @ 03/12/26 16:42:47.409\nI0312 16:42:47.502612 50194 tna_extra_worker.go:305] Control-plane kubelet versions: map[v1.34.2:true], new worker kubelet version: v1.34.2\nI0312 16:42:47.502692 50194 tna_extra_worker.go:121] Extra worker extraworker-0 joined the cluster successfully\n  STEP: CLEANUP: Deleting extra worker BMHs and waiting for Metal3 deprovisioning @ 03/12/26 16:42:47.504\nI0312 16:42:47.557551 50194 tna_extra_worker.go:397] Deleting extraworker BMH ostest-extraworker-0\nI0312 16:42:47.663250 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:43:02.664446 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:43:17.664284 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:43:32.668089 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:43:47.664791 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 
16:44:02.664577 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:44:17.664681 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:44:32.666638 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:44:47.666547 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:45:02.665981 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:45:17.665362 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:45:32.665374 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:45:47.666957 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:46:02.666800 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:46:17.668567 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:46:32.670286 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:46:47.669244 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:47:02.669048 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:47:17.671916 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:47:32.668483 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:47:47.671125 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:48:02.669652 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:48:17.670213 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:48:32.669910 50194 tna_extra_worker.go:413] Waiting 
for BMH ostest-extraworker-0 to be deleted...\nI0312 16:48:47.666651 50194 tna_extra_worker.go:410] BMH ostest-extraworker-0 deleted\n  STEP: CLEANUP: Removing stale extra worker Node objects from the cluster @ 03/12/26 16:48:47.666\nI0312 16:48:47.763990 50194 tna_extra_worker.go:436] Deleting stale extra worker node extraworker-0\n"
  }
]


## Summary by CodeRabbit

* **Tests**
  * Added end-to-end test coverage for dynamically adding worker nodes to two-node OpenShift clusters configured with highly available arbiter mode, including validation of bare metal provisioning, node readiness, pod scheduling, machine configuration pool convergence, and kubelet compatibility.


@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will use /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 13, 2026
openshift-ci-robot commented Mar 13, 2026

@agullon: This pull request references OCPEDGE-2379, which is a valid Jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

   STEP: STEP-03: Approving pending CSRs for the new worker @ 03/12/26 16:38:16.666
 I0312 16:38:16.666883 50194 csr.go:23] Starting CSR approval monitoring for 10m0s
 I0312 16:41:16.770274 50194 csr.go:39] Approving CSR: csr-gjz7g
 I0312 16:41:16.870699 50194 csr.go:61] Approved CSR csr-gjz7g (total approved: 1)
 I0312 16:41:16.870791 50194 csr.go:69] Approved 1 CSRs this iteration, continuing to monitor (elapsed: 3m0.199850833s)
 I0312 16:42:16.819227 50194 csr.go:39] Approving CSR: csr-6v2fn
 I0312 16:42:16.922483 50194 csr.go:61] Approved CSR csr-6v2fn (total approved: 2)
 I0312 16:42:16.922520 50194 csr.go:69] Approved 1 CSRs this iteration, continuing to monitor (elapsed: 4m0.250732958s)
 I0312 16:42:16.922531 50194 csr.go:74] All 2 expected CSRs approved! (elapsed: 4m0.250745416s)
 I0312 16:42:16.922544 50194 csr.go:80] CSR approval monitoring complete: approved 2 CSRs in 4m0.250758875s
   STEP: STEP-04: Waiting for a new worker node to become Ready @ 03/12/26 16:42:16.922
 I0312 16:42:17.021811 50194 tna_extra_worker.go:185] New worker node extraworker-0 is Ready
   STEP: STEP-05: Verifying new node membership and worker label @ 03/12/26 16:42:17.021
   STEP: STEP-06: Verifying the new node has no pressure conditions @ 03/12/26 16:42:17.121
 I0312 16:42:17.121938 50194 tna_extra_worker.go:222] Node extraworker-0 has no pressure conditions
   STEP: STEP-07: Verifying a pod can be scheduled and run on the new node @ 03/12/26 16:42:17.121
 I0312 16:42:17.234400 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
 I0312 16:42:22.234409 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
 I0312 16:42:27.232393 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
 I0312 16:42:32.235067 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
 I0312 16:42:37.237223 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
 I0312 16:42:42.234168 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
 I0312 16:42:47.235259 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Succeeded
 I0312 16:42:47.294940 50194 tna_extra_worker.go:262] Pod successfully scheduled and ran on node extraworker-0
   STEP: STEP-08: Verifying the worker MachineConfigPool converges with the new node @ 03/12/26 16:42:47.36
 I0312 16:42:47.409102 50194 tna_extra_worker.go:284] Worker MCP: machineCount=1, readyMachineCount=1, updatedMachineCount=1, degradedMachineCount=0
 I0312 16:42:47.409332 50194 tna_extra_worker.go:292] Worker MachineConfigPool converged with 1 machines
   STEP: STEP-09: Verifying the new node's kubelet version matches existing nodes @ 03/12/26 16:42:47.409
 I0312 16:42:47.502612 50194 tna_extra_worker.go:305] Control-plane kubelet versions: map[v1.34.2:true], new worker kubelet version: v1.34.2
 I0312 16:42:47.502692 50194 tna_extra_worker.go:121] Extra worker extraworker-0 joined the cluster successfully
   STEP: CLEANUP: Deleting extra worker BMHs and waiting for Metal3 deprovisioning @ 03/12/26 16:42:47.504
 I0312 16:42:47.557551 50194 tna_extra_worker.go:397] Deleting extraworker BMH ostest-extraworker-0
 I0312 16:42:47.663250 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:43:02.664446 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:43:17.664284 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:43:32.668089 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:43:47.664791 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:44:02.664577 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:44:17.664681 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:44:32.666638 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:44:47.666547 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:45:02.665981 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:45:17.665362 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:45:32.665374 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:45:47.666957 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:46:02.666800 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:46:17.668567 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:46:32.670286 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:46:47.669244 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:47:02.669048 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:47:17.671916 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:47:32.668483 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:47:47.671125 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:48:02.669652 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:48:17.670213 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:48:32.669910 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:48:47.666651 50194 tna_extra_worker.go:410] BMH ostest-extraworker-0 deleted
   STEP: CLEANUP: Removing stale extra worker Node objects from the cluster @ 03/12/26 16:48:47.666
 I0312 16:48:47.763990 50194 tna_extra_worker.go:436] Deleting stale extra worker node extraworker-0
 • [888.047 seconds]
 ------------------------------

 Ran 1 of 1 Specs in 888.047 seconds
 SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
[
 {
   "name": "[sig-node][apigroup:config.openshift.io][OCPFeatureGate:HighlyAvailableArbiter][Serial] Extra worker scaling in HighlyAvailableArbiterMode should deploy an extra worker node that joins the cluster and becomes Ready [Timeout:60m][Slow]",
   "lifecycle": "blocking",
   "duration": 888061,
   "startTime": "2026-03-12 15:33:59.752713 UTC",
   "endTime": "2026-03-12 15:48:47.813882 UTC",
   "result": "passed",
   "output": "  STEP: Creating a kubernetes client @ 03/12/26 16:33:59.76\n  STEP: SETUP: Validating cluster topology is HighlyAvailableArbiter @ 03/12/26 16:33:59.76\nI0312 16:33:59.760992 50194 common.go:109] [precondition-check] validating cluster topology is HighlyAvailableArbiter\n  STEP: SETUP: Cleaning up stale extra worker BMHs from previous runs @ 03/12/26 16:33:59.809\n  STEP: SETUP: Initializing baremetal test helper @ 03/12/26 16:34:00.966\n  STEP: SETUP: Ensuring extra worker data is available @ 03/12/26 16:34:00.968\n  STEP: STEP-01: Recording initial values for the test @ 03/12/26 16:34:01.014\nI0312 16:34:01.169957 50194 tna_extra_worker.go:132] Initial state: 3 nodes, 0 machines in worker MCP\n  STEP: STEP-02: Provisioning extra worker BMH @ 03/12/26 16:34:01.17\n  STEP: wait until ostest-extraworker-0 becomes available @ 03/12/26 16:34:01.383\nI0312 16:36:56.543657 50194 tna_extra_worker.go:152] Extra worker BMH ostest-extraworker-0 deployed and reached available state\nI0312 16:36:56.614026 50194 tna_extra_worker.go:344] Patched BMH ostest-extraworker-0 with customDeploy=install_coreos, bootMode=UEFI, rootDeviceHints=/dev/sda, userData=worker-user-data-managed\nI0312 16:36:56.670014 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: available\nI0312 16:37:06.663389 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:16.663525 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:26.664017 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:36.662165 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:46.675360 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:56.668122 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 
16:38:06.667721 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:38:16.666742 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioned\n  STEP: STEP-03: Approving pending CSRs for the new worker @ 03/12/26 16:38:16.666\nI0312 16:38:16.666883 50194 csr.go:23] Starting CSR approval monitoring for 10m0s\nI0312 16:41:16.770274 50194 csr.go:39] Approving CSR: csr-gjz7g\nI0312 16:41:16.870699 50194 csr.go:61] Approved CSR csr-gjz7g (total approved: 1)\nI0312 16:41:16.870791 50194 csr.go:69] Approved 1 CSRs this iteration, continuing to monitor (elapsed: 3m0.199850833s)\nI0312 16:42:16.819227 50194 csr.go:39] Approving CSR: csr-6v2fn\nI0312 16:42:16.922483 50194 csr.go:61] Approved CSR csr-6v2fn (total approved: 2)\nI0312 16:42:16.922520 50194 csr.go:69] Approved 1 CSRs this iteration, continuing to monitor (elapsed: 4m0.250732958s)\nI0312 16:42:16.922531 50194 csr.go:74] All 2 expected CSRs approved! (elapsed: 4m0.250745416s)\nI0312 16:42:16.922544 50194 csr.go:80] CSR approval monitoring complete: approved 2 CSRs in 4m0.250758875s\n  STEP: STEP-04: Waiting for a new worker node to become Ready @ 03/12/26 16:42:16.922\nI0312 16:42:17.021811 50194 tna_extra_worker.go:185] New worker node extraworker-0 is Ready\n  STEP: STEP-05: Verifying new node membership and worker label @ 03/12/26 16:42:17.021\n  STEP: STEP-06: Verifying the new node has no pressure conditions @ 03/12/26 16:42:17.121\nI0312 16:42:17.121938 50194 tna_extra_worker.go:222] Node extraworker-0 has no pressure conditions\n  STEP: STEP-07: Verifying a pod can be scheduled and run on the new node @ 03/12/26 16:42:17.121\nI0312 16:42:17.234400 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:22.234409 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:27.232393 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 
16:42:32.235067 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:37.237223 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:42.234168 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:47.235259 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Succeeded\nI0312 16:42:47.294940 50194 tna_extra_worker.go:262] Pod successfully scheduled and ran on node extraworker-0\n  STEP: STEP-08: Verifying the worker MachineConfigPool converges with the new node @ 03/12/26 16:42:47.36\nI0312 16:42:47.409102 50194 tna_extra_worker.go:284] Worker MCP: machineCount=1, readyMachineCount=1, updatedMachineCount=1, degradedMachineCount=0\nI0312 16:42:47.409332 50194 tna_extra_worker.go:292] Worker MachineConfigPool converged with 1 machines\n  STEP: STEP-09: Verifying the new node's kubelet version matches existing nodes @ 03/12/26 16:42:47.409\nI0312 16:42:47.502612 50194 tna_extra_worker.go:305] Control-plane kubelet versions: map[v1.34.2:true], new worker kubelet version: v1.34.2\nI0312 16:42:47.502692 50194 tna_extra_worker.go:121] Extra worker extraworker-0 joined the cluster successfully\n  STEP: CLEANUP: Deleting extra worker BMHs and waiting for Metal3 deprovisioning @ 03/12/26 16:42:47.504\nI0312 16:42:47.557551 50194 tna_extra_worker.go:397] Deleting extraworker BMH ostest-extraworker-0\nI0312 16:42:47.663250 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:43:02.664446 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:43:17.664284 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:43:32.668089 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:43:47.664791 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 
16:44:02.664577 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:44:17.664681 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:44:32.666638 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:44:47.666547 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:45:02.665981 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:45:17.665362 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:45:32.665374 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:45:47.666957 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:46:02.666800 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:46:17.668567 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:46:32.670286 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:46:47.669244 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:47:02.669048 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:47:17.671916 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:47:32.668483 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:47:47.671125 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:48:02.669652 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:48:17.670213 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:48:32.669910 50194 tna_extra_worker.go:413] Waiting 
for BMH ostest-extraworker-0 to be deleted...\nI0312 16:48:47.666651 50194 tna_extra_worker.go:410] BMH ostest-extraworker-0 deleted\n  STEP: CLEANUP: Removing stale extra worker Node objects from the cluster @ 03/12/26 16:48:47.666\nI0312 16:48:47.763990 50194 tna_extra_worker.go:436] Deleting stale extra worker node extraworker-0\n"
 }
]


@coderabbitai

coderabbitai bot commented Mar 13, 2026

Walkthrough

Introduces a new end-to-end test file for extending a two-node OpenShift cluster with an extra worker node under HighlyAvailableArbiter mode. The test covers BMH provisioning, CSR approval, node readiness validation, MCP convergence, and pod scheduling capabilities with comprehensive error handling and logging.
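The "robust cleanup" the walkthrough mentions is the familiar pattern of registering cleanup steps as resources come into existence, so they run even when a later step fails. A stdlib sketch of that phasing using `defer` (the step names mirror the PR's phase table; `runScenario` is a hypothetical stand-in, not code from the test file):

```go
package main

import "fmt"

// runScenario sketches the SETUP / STEP / CLEANUP phasing: cleanups
// are deferred as they become necessary and run in LIFO order after
// the test body, whether it succeeded or not.
func runScenario() (log []string) {
	step := func(name string) { log = append(log, name) }

	step("SETUP: validate topology")
	// Registered first, so it runs last (LIFO): node objects are
	// removed only after the BMH has been deprovisioned.
	defer step("CLEANUP: remove stale Node objects")
	step("STEP: provision BMH")
	defer step("CLEANUP: delete extra worker BMHs")
	step("STEP: wait for node Ready")
	return log
}

func main() {
	for _, line := range runScenario() {
		fmt.Println(line)
	}
}
```

Running it prints the phases in the same order as the log above: setup, test steps, then BMH deletion before stale-node removal.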

Changes

Cohort / File(s): Extra Worker Test — test/extended/two_node/tna_extra_worker.go
Summary: New end-to-end test file (451 lines) covering dynamic worker scaling, including BMH orchestration, CSR handling, node readiness checks, MCP convergence validation, pod scheduling tests, and kubelet version alignment, with robust cleanup and extended timeouts.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

- Docstring Coverage — ⚠️ Warning: docstring coverage is 56.25%, below the required 80.00% threshold. Resolution: write docstrings for the functions that are missing them.
- Test Structure And Quality — ⚠️ Warning: the test violates assertion-message requirement #4, with five missing meaningful failure messages across the recordInitialState, verifyNodeMembership, verifyKubeletVersion, and triggerBMHProvisioning functions. Resolution: add descriptive failure messages to all five assertions: line 129 "failed to list initial nodes", line 200 "failed to list final nodes", line 205 "failed to get new worker node", line 297 "failed to get control-plane nodes", line 339 "failed to marshal BMH patch".
✅ Passed checks (3 passed)
- Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
- Title Check — ✅ Passed: the pull request title clearly summarizes the main change, introducing a new test for adding an extra worker on a TNA cluster, which matches the file addition and test implementation.
- Stable And Deterministic Test Names — ✅ Passed: the test file contains stable, deterministic test names without dynamic content such as node names, timestamps, or UUIDs.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.3)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci
Contributor

openshift-ci bot commented Mar 13, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: agullon
Once this PR has been reviewed and has the lgtm label, please assign eggfoobar for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot

openshift-ci-robot commented Mar 13, 2026

@agullon: This pull request references OCPEDGE-2379 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Adding a new Test Case to create a extra worker on a TNA cluster as a 2 day operation.

Phase Step Description
Setup SETUP-01 Validate cluster topology is HighlyAvailableArbiter
Setup SETUP-02 Clean up stale extra worker BMHs from previous runs (30-min timeout)
Setup SETUP-03 Initialize baremetal test helper
Setup SETUP-04 Verify extra worker data is available (extraworkers-secret)
Test STEP-01 Record initial node count and worker MachineConfigPool state
Test STEP-02 Provision extra worker BMH from extraworkers-secret
Test STEP-03 Approve pending CSRs for the new worker (kubelet client + serving)
Test STEP-04 Wait for the new worker node to become Ready
Test STEP-05 Verify new node membership (node count +1) and worker label
Test STEP-06 Verify the new node has no pressure conditions (memory, disk, PID)
Test STEP-07 Verify a pod can be scheduled and run on the new node
Test STEP-08 Verify the worker MachineConfigPool converges with the new node
Test STEP-09 Verify the new node's kubelet version matches control-plane nodes
Cleanup CLEANUP-01 Delete extra worker BMHs and wait for Metal3 deprovisioning (30-min timeout)
Cleanup CLEANUP-02 Remove stale extra worker Node objects from the cluster

This is a successful log example:

$ ./openshift-tests run-test '[sig-node][apigroup:config.openshift.io][OCPFeatureGate:HighlyAvailableArbiter][Serial] Extra worker scaling in HighlyAvailableArbiterMode should deploy an extra worker node that joins the cluster and becomes Ready [Timeout:60m][Slow]'

 I0312 16:33:58.731594   50194 i18n.go:139] Couldn't find translations for C, using default
 I0312 16:33:58.866736   50194 binary.go:78] Found 8524 test specs
 I0312 16:33:58.867806   50194 binary.go:95] 1074 test specs remain, after filtering out k8s
openshift-tests v4.1.0-10870-gfbba1c2
 I0312 16:33:59.705199   50194 test_setup.go:125] Extended test version v4.1.0-10870-gfbba1c2
 I0312 16:33:59.705234   50194 test_context.go:559] Tolerating taints "node-role.kubernetes.io/control-plane" when considering if nodes are ready
 I0312 16:33:59.705589 50194 framework.go:2324] [precondition-check] checking if cluster is MicroShift
 I0312 16:33:59.752027 50194 framework.go:2348] IsMicroShiftCluster: microshift-version configmap not found, not MicroShift
 I0312 16:33:59.752470   50194 binary.go:122] Loaded test configuration: &framework.TestContextType{KubeConfig:"/Users/agullon/workspace/two-node-toolbox/deploy/openshift-clusters/kubeconfig", KubeContext:"", KubeAPIContentType:"application/vnd.kubernetes.protobuf", KubeletRootDir:"/var/lib/kubelet", KubeletConfigDropinDir:"", CertDir:"", Host:"https://api.ostest.test.metalkube.org:6443", BearerToken:"<redacted>", RepoRoot:"../../", ListImages:false, listTests:false, listLabels:false, ListConformanceTests:false, Provider:"skeleton", Tooling:"", timeouts:framework.TimeoutContext{Poll:2000000000, PodStart:300000000000, PodStartShort:120000000000, PodStartSlow:900000000000, PodDelete:300000000000, ClaimProvision:300000000000, DataSourceProvision:300000000000, ClaimProvisionShort:60000000000, ClaimBound:180000000000, PVReclaim:180000000000, PVBound:180000000000, PVCreate:180000000000, PVDelete:300000000000, PVDeleteSlow:1200000000000, SnapshotCreate:300000000000, SnapshotDelete:300000000000, SnapshotControllerMetrics:300000000000, SystemPodsStartup:600000000000, NodeSchedulable:1800000000000, SystemDaemonsetStartup:300000000000, NodeNotReady:180000000000}, CloudConfig:framework.CloudConfig{APIEndpoint:"", ProjectID:"", Zone:"", Zones:[]string{}, Region:"", MultiZone:false, MultiMaster:true, Cluster:"", MasterName:"", NodeInstanceGroup:"", NumNodes:2, ClusterIPRange:"", ClusterTag:"", Network:"", ConfigFile:"", NodeTag:"", MasterTag:"", Provider:framework.NullProvider{}}, KubectlPath:"kubectl", OutputDir:"/tmp", ReportDir:"", ReportPrefix:"", ReportCompleteGinkgo:false, ReportCompleteJUnit:false, Prefix:"e2e", MinStartupPods:-1, EtcdUpgradeStorage:"", EtcdUpgradeVersion:"", GCEUpgradeScript:"", ContainerRuntimeEndpoint:"unix:///run/containerd/containerd.sock", ContainerRuntimeProcessName:"containerd", ContainerRuntimePidFile:"/run/containerd/containerd.pid", SystemdServices:"containerd*", DumpSystemdJournal:false, ImageServiceEndpoint:"", MasterOSDistro:"custom", 
NodeOSDistro:"custom", NodeOSArch:"amd64", VerifyServiceAccount:true, DeleteNamespace:true, DeleteNamespaceOnFailure:true, AllowedNotReadyNodes:-1, CleanStart:false, GatherKubeSystemResourceUsageData:"false", GatherLogsSizes:false, GatherMetricsAfterTest:"false", GatherSuiteMetricsAfterTest:false, MaxNodesToGather:0, IncludeClusterAutoscalerMetrics:false, OutputPrintType:"json", CreateTestingNS:(framework.CreateTestingNSFn)(0x107bda660), DumpLogsOnFailure:true, DisableLogDump:false, LogexporterGCSPath:"", NodeTestContextType:framework.NodeTestContextType{NodeE2E:false, NodeName:"", NodeConformance:false, PrepullImages:false, ImageDescription:"", RuntimeConfig:map[string]string(nil), SystemSpecName:"", RestartKubelet:false, ExtraEnvs:map[string]string(nil), StandaloneMode:false, CriProxyEnabled:false}, ClusterDNSDomain:"cluster.local", NodeKiller:framework.NodeKillerConfig{Enabled:false, FailureRatio:0.01, Interval:60000000000, JitterFactor:60, SimulatedDowntime:600000000000, NodeKillerStopCtx:context.Context(nil), NodeKillerStop:(func())(nil)}, IPFamily:"ipv4", NonblockingTaints:"node-role.kubernetes.io/control-plane", ProgressReportURL:"", SriovdpConfigMapFile:"", SpecSummaryOutput:"", DockerConfigFile:"", E2EDockerConfigFile:"", KubeTestRepoList:"", SnapshotControllerPodName:"", SnapshotControllerHTTPPort:0, RequireDevices:false, EnabledVolumeDrivers:[]string(nil)}
 Running Suite:  - /Users/agullon/workspace/origin
 =================================================
 Random Seed: 1773329638 - will randomize all specs

 Will run 1 of 1 specs
 ------------------------------
 [sig-node][apigroup:config.openshift.io][OCPFeatureGate:HighlyAvailableArbiter][Serial] Extra worker scaling in HighlyAvailableArbiterMode should deploy an extra worker node that joins the cluster and becomes Ready [Timeout:60m][Slow]
 github.com/openshift/origin/test/extended/two_node/tna_extra_worker.go:93
   STEP: Creating a kubernetes client @ 03/12/26 16:33:59.76
 I0312 16:33:59.760839   50194 discovery.go:214] Invalidating discovery information
   STEP: SETUP: Validating cluster topology is HighlyAvailableArbiter @ 03/12/26 16:33:59.76
 I0312 16:33:59.760992 50194 common.go:109] [precondition-check] validating cluster topology is HighlyAvailableArbiter
   STEP: SETUP: Cleaning up stale extra worker BMHs from previous runs @ 03/12/26 16:33:59.809
   STEP: SETUP: Initializing baremetal test helper @ 03/12/26 16:34:00.966
   STEP: SETUP: Ensuring extra worker data is available @ 03/12/26 16:34:00.968
   STEP: STEP-01: Recording initial values for the test @ 03/12/26 16:34:01.014
 I0312 16:34:01.169957 50194 tna_extra_worker.go:132] Initial state: 3 nodes, 0 machines in worker MCP
   STEP: STEP-02: Provisioning extra worker BMH @ 03/12/26 16:34:01.17
   STEP: wait until ostest-extraworker-0 becomes available @ 03/12/26 16:34:01.383
 I0312 16:36:56.543657 50194 tna_extra_worker.go:152] Extra worker BMH ostest-extraworker-0 deployed and reached available state
 I0312 16:36:56.614026 50194 tna_extra_worker.go:344] Patched BMH ostest-extraworker-0 with customDeploy=install_coreos, bootMode=UEFI, rootDeviceHints=/dev/sda, userData=worker-user-data-managed
 I0312 16:36:56.670014 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: available
 I0312 16:37:06.663389 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning
 I0312 16:37:16.663525 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning
 I0312 16:37:26.664017 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning
 I0312 16:37:36.662165 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning
 I0312 16:37:46.675360 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning
 I0312 16:37:56.668122 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning
 I0312 16:38:06.667721 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning
 I0312 16:38:16.666742 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioned
   STEP: STEP-03: Approving pending CSRs for the new worker @ 03/12/26 16:38:16.666
 I0312 16:38:16.666883 50194 csr.go:23] Starting CSR approval monitoring for 10m0s
 I0312 16:41:16.770274 50194 csr.go:39] Approving CSR: csr-gjz7g
 I0312 16:41:16.870699 50194 csr.go:61] Approved CSR csr-gjz7g (total approved: 1)
 I0312 16:41:16.870791 50194 csr.go:69] Approved 1 CSRs this iteration, continuing to monitor (elapsed: 3m0.199850833s)
 I0312 16:42:16.819227 50194 csr.go:39] Approving CSR: csr-6v2fn
 I0312 16:42:16.922483 50194 csr.go:61] Approved CSR csr-6v2fn (total approved: 2)
 I0312 16:42:16.922520 50194 csr.go:69] Approved 1 CSRs this iteration, continuing to monitor (elapsed: 4m0.250732958s)
 I0312 16:42:16.922531 50194 csr.go:74] All 2 expected CSRs approved! (elapsed: 4m0.250745416s)
 I0312 16:42:16.922544 50194 csr.go:80] CSR approval monitoring complete: approved 2 CSRs in 4m0.250758875s
   STEP: STEP-04: Waiting for a new worker node to become Ready @ 03/12/26 16:42:16.922
 I0312 16:42:17.021811 50194 tna_extra_worker.go:185] New worker node extraworker-0 is Ready
   STEP: STEP-05: Verifying new node membership and worker label @ 03/12/26 16:42:17.021
   STEP: STEP-06: Verifying the new node has no pressure conditions @ 03/12/26 16:42:17.121
 I0312 16:42:17.121938 50194 tna_extra_worker.go:222] Node extraworker-0 has no pressure conditions
   STEP: STEP-07: Verifying a pod can be scheduled and run on the new node @ 03/12/26 16:42:17.121
 I0312 16:42:17.234400 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
 I0312 16:42:22.234409 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
 I0312 16:42:27.232393 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
 I0312 16:42:32.235067 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
 I0312 16:42:37.237223 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
 I0312 16:42:42.234168 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending
 I0312 16:42:47.235259 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Succeeded
 I0312 16:42:47.294940 50194 tna_extra_worker.go:262] Pod successfully scheduled and ran on node extraworker-0
   STEP: STEP-08: Verifying the worker MachineConfigPool converges with the new node @ 03/12/26 16:42:47.36
 I0312 16:42:47.409102 50194 tna_extra_worker.go:284] Worker MCP: machineCount=1, readyMachineCount=1, updatedMachineCount=1, degradedMachineCount=0
 I0312 16:42:47.409332 50194 tna_extra_worker.go:292] Worker MachineConfigPool converged with 1 machines
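The STEP-08 convergence check boils down to comparing the four counters in the log line above. A small sketch of that predicate (field names assumed from the log line, not taken from the MCO API):

```go
package main

import "fmt"

// mcpCounts mirrors the four status counters logged in STEP-08; the real
// test reads them from the worker MachineConfigPool's status.
type mcpCounts struct {
	Machine, Ready, Updated, Degraded int
}

// converged reports whether the pool has settled: every machine is ready,
// every machine runs the latest rendered config, and nothing is degraded.
func converged(c mcpCounts, expectedMachines int) bool {
	return c.Machine == expectedMachines &&
		c.Ready == c.Machine &&
		c.Updated == c.Machine &&
		c.Degraded == 0
}

func main() {
	// the state from the log above: machineCount=1, ready=1, updated=1, degraded=0
	fmt.Println(converged(mcpCounts{Machine: 1, Ready: 1, Updated: 1, Degraded: 0}, 1)) // prints true
}
```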
   STEP: STEP-09: Verifying the new node's kubelet version matches existing nodes @ 03/12/26 16:42:47.409
 I0312 16:42:47.502612 50194 tna_extra_worker.go:305] Control-plane kubelet versions: map[v1.34.2:true], new worker kubelet version: v1.34.2
 I0312 16:42:47.502692 50194 tna_extra_worker.go:121] Extra worker extraworker-0 joined the cluster successfully
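STEP-09's version comparison can be sketched from the log line itself, which collects the control-plane kubelet versions into a set (`map[v1.34.2:true]`) and checks the worker against it; the helper's internals are an assumption:

```go
package main

import "fmt"

// kubeletMatches reproduces the shape of the STEP-09 log line: the
// control-plane versions form a set, and the new worker must be on the
// single version that set contains.
func kubeletMatches(controlPlane map[string]bool, workerVersion string) bool {
	// a mixed-version control plane (e.g. mid-upgrade) is treated as a mismatch
	return len(controlPlane) == 1 && controlPlane[workerVersion]
}

func main() {
	cp := map[string]bool{"v1.34.2": true}
	fmt.Println(kubeletMatches(cp, "v1.34.2")) // prints true
}
```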
   STEP: CLEANUP: Deleting extra worker BMHs and waiting for Metal3 deprovisioning @ 03/12/26 16:42:47.504
 I0312 16:42:47.557551 50194 tna_extra_worker.go:397] Deleting extraworker BMH ostest-extraworker-0
 I0312 16:42:47.663250 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:43:02.664446 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
 I0312 16:43:17.664284 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...
  ... (same message repeated every 15 s until 16:48:32) ...
 I0312 16:48:47.666651 50194 tna_extra_worker.go:410] BMH ostest-extraworker-0 deleted
   STEP: CLEANUP: Removing stale extra worker Node objects from the cluster @ 03/12/26 16:48:47.666
 I0312 16:48:47.763990 50194 tna_extra_worker.go:436] Deleting stale extra worker node extraworker-0
 • [888.047 seconds]
 ------------------------------

 Ran 1 of 1 Specs in 888.047 seconds
 SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
[
 {
   "name": "[sig-node][apigroup:config.openshift.io][OCPFeatureGate:HighlyAvailableArbiter][Serial] Extra worker scaling in HighlyAvailableArbiterMode should deploy an extra worker node that joins the cluster and becomes Ready [Timeout:60m][Slow]",
   "lifecycle": "blocking",
   "duration": 888061,
   "startTime": "2026-03-12 15:33:59.752713 UTC",
   "endTime": "2026-03-12 15:48:47.813882 UTC",
   "result": "passed",
   "output": "  STEP: Creating a kubernetes client @ 03/12/26 16:33:59.76\n  STEP: SETUP: Validating cluster topology is HighlyAvailableArbiter @ 03/12/26 16:33:59.76\nI0312 16:33:59.760992 50194 common.go:109] [precondition-check] validating cluster topology is HighlyAvailableArbiter\n  STEP: SETUP: Cleaning up stale extra worker BMHs from previous runs @ 03/12/26 16:33:59.809\n  STEP: SETUP: Initializing baremetal test helper @ 03/12/26 16:34:00.966\n  STEP: SETUP: Ensuring extra worker data is available @ 03/12/26 16:34:00.968\n  STEP: STEP-01: Recording initial values for the test @ 03/12/26 16:34:01.014\nI0312 16:34:01.169957 50194 tna_extra_worker.go:132] Initial state: 3 nodes, 0 machines in worker MCP\n  STEP: STEP-02: Provisioning extra worker BMH @ 03/12/26 16:34:01.17\n  STEP: wait until ostest-extraworker-0 becomes available @ 03/12/26 16:34:01.383\nI0312 16:36:56.543657 50194 tna_extra_worker.go:152] Extra worker BMH ostest-extraworker-0 deployed and reached available state\nI0312 16:36:56.614026 50194 tna_extra_worker.go:344] Patched BMH ostest-extraworker-0 with customDeploy=install_coreos, bootMode=UEFI, rootDeviceHints=/dev/sda, userData=worker-user-data-managed\nI0312 16:36:56.670014 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: available\nI0312 16:37:06.663389 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:16.663525 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:26.664017 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:36.662165 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:46.675360 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:37:56.668122 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 
16:38:06.667721 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioning\nI0312 16:38:16.666742 50194 tna_extra_worker.go:362] BMH ostest-extraworker-0 provisioning state: provisioned\n  STEP: STEP-03: Approving pending CSRs for the new worker @ 03/12/26 16:38:16.666\nI0312 16:38:16.666883 50194 csr.go:23] Starting CSR approval monitoring for 10m0s\nI0312 16:41:16.770274 50194 csr.go:39] Approving CSR: csr-gjz7g\nI0312 16:41:16.870699 50194 csr.go:61] Approved CSR csr-gjz7g (total approved: 1)\nI0312 16:41:16.870791 50194 csr.go:69] Approved 1 CSRs this iteration, continuing to monitor (elapsed: 3m0.199850833s)\nI0312 16:42:16.819227 50194 csr.go:39] Approving CSR: csr-6v2fn\nI0312 16:42:16.922483 50194 csr.go:61] Approved CSR csr-6v2fn (total approved: 2)\nI0312 16:42:16.922520 50194 csr.go:69] Approved 1 CSRs this iteration, continuing to monitor (elapsed: 4m0.250732958s)\nI0312 16:42:16.922531 50194 csr.go:74] All 2 expected CSRs approved! (elapsed: 4m0.250745416s)\nI0312 16:42:16.922544 50194 csr.go:80] CSR approval monitoring complete: approved 2 CSRs in 4m0.250758875s\n  STEP: STEP-04: Waiting for a new worker node to become Ready @ 03/12/26 16:42:16.922\nI0312 16:42:17.021811 50194 tna_extra_worker.go:185] New worker node extraworker-0 is Ready\n  STEP: STEP-05: Verifying new node membership and worker label @ 03/12/26 16:42:17.021\n  STEP: STEP-06: Verifying the new node has no pressure conditions @ 03/12/26 16:42:17.121\nI0312 16:42:17.121938 50194 tna_extra_worker.go:222] Node extraworker-0 has no pressure conditions\n  STEP: STEP-07: Verifying a pod can be scheduled and run on the new node @ 03/12/26 16:42:17.121\nI0312 16:42:17.234400 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:22.234409 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:27.232393 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 
16:42:32.235067 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:37.237223 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:42.234168 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Pending\nI0312 16:42:47.235259 50194 tna_extra_worker.go:254] Test pod extra-worker-sched-pq9nz phase: Succeeded\nI0312 16:42:47.294940 50194 tna_extra_worker.go:262] Pod successfully scheduled and ran on node extraworker-0\n  STEP: STEP-08: Verifying the worker MachineConfigPool converges with the new node @ 03/12/26 16:42:47.36\nI0312 16:42:47.409102 50194 tna_extra_worker.go:284] Worker MCP: machineCount=1, readyMachineCount=1, updatedMachineCount=1, degradedMachineCount=0\nI0312 16:42:47.409332 50194 tna_extra_worker.go:292] Worker MachineConfigPool converged with 1 machines\n  STEP: STEP-09: Verifying the new node's kubelet version matches existing nodes @ 03/12/26 16:42:47.409\nI0312 16:42:47.502612 50194 tna_extra_worker.go:305] Control-plane kubelet versions: map[v1.34.2:true], new worker kubelet version: v1.34.2\nI0312 16:42:47.502692 50194 tna_extra_worker.go:121] Extra worker extraworker-0 joined the cluster successfully\n  STEP: CLEANUP: Deleting extra worker BMHs and waiting for Metal3 deprovisioning @ 03/12/26 16:42:47.504\nI0312 16:42:47.557551 50194 tna_extra_worker.go:397] Deleting extraworker BMH ostest-extraworker-0\nI0312 16:42:47.663250 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:43:02.664446 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:43:17.664284 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:43:32.668089 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:43:47.664791 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 
16:44:02.664577 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:44:17.664681 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:44:32.666638 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:44:47.666547 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:45:02.665981 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:45:17.665362 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:45:32.665374 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:45:47.666957 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:46:02.666800 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:46:17.668567 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:46:32.670286 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:46:47.669244 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:47:02.669048 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:47:17.671916 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:47:32.668483 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:47:47.671125 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:48:02.669652 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:48:17.670213 50194 tna_extra_worker.go:413] Waiting for BMH ostest-extraworker-0 to be deleted...\nI0312 16:48:32.669910 50194 tna_extra_worker.go:413] Waiting 
for BMH ostest-extraworker-0 to be deleted...\nI0312 16:48:47.666651 50194 tna_extra_worker.go:410] BMH ostest-extraworker-0 deleted\n  STEP: CLEANUP: Removing stale extra worker Node objects from the cluster @ 03/12/26 16:48:47.666\nI0312 16:48:47.763990 50194 tna_extra_worker.go:436] Deleting stale extra worker node extraworker-0\n"
 }
]


## Summary by CodeRabbit

* **Tests**
 * Added end-to-end test coverage for dynamically adding worker nodes to two-node OpenShift clusters configured with highly available arbiter mode, including validation of bare metal provisioning, node readiness, pod scheduling, machine configuration pool convergence, and kubelet compatibility.




coderabbitai bot left a comment


🧹 Nitpick comments (2)
test/extended/two_node/tna_extra_worker.go (2)

48-50: GinkgoRecover() placement is ineffective here.

defer g.GinkgoRecover() is intended to catch panics in goroutines spawned during tests and convert them to Ginkgo failures. At the top level of a Describe block (outside any goroutine), it has no practical effect since the Ginkgo framework already handles panics in the main test flow. Consider removing it or moving it inside any goroutines if spawned.

♻️ Suggested removal
 var _ = g.Describe("[sig-node][apigroup:config.openshift.io][OCPFeatureGate:HighlyAvailableArbiter][Serial] Extra worker scaling in HighlyAvailableArbiterMode", func() {
-	defer g.GinkgoRecover()
-
 	oc := exutil.NewCLIWithoutNamespace("").AsAdmin()
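For context on the reviewer's point: a panic inside a plain goroutine crashes the whole test process instead of failing the spec, which is why `defer g.GinkgoRecover()` belongs at the top of each spawned goroutine rather than at the Describe level. The underlying mechanism, shown framework-free with plain `recover()` (the helper name is illustrative):

```go
package main

import "fmt"

// runAsync launches fn in a goroutine and converts any panic into an error
// on the returned channel. This is the role g.GinkgoRecover() plays at the
// top of a test goroutine, sketched here without the Ginkgo dependency.
func runAsync(fn func()) <-chan error {
	errc := make(chan error, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				errc <- fmt.Errorf("goroutine panicked: %v", r)
				return
			}
			errc <- nil
		}()
		fn()
	}()
	return errc
}

func main() {
	err := <-runAsync(func() { panic("boom") })
	fmt.Println(err) // prints "goroutine panicked: boom"
}
```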

311-374: Hardcoded BMH provisioning values limit portability.

The hardcoded bootMode=UEFI and rootDeviceHints=/dev/sda values (lines 327-330) are specific to dev-scripts VMs as noted in the comment. If this test needs to run on other environments in the future, consider extracting these as configuration or deriving them from BMH hardware inventory.

The current implementation is acceptable for the stated use case, but flagging for awareness.


Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml


📥 Commits

Reviewing files that changed from the base of the PR and between 8ea9a12 and 08225f9.

📒 Files selected for processing (1)
  • test/extended/two_node/tna_extra_worker.go

@agullon agullon marked this pull request as draft March 13, 2026 09:33
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 13, 2026

Labels

do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
