test(e2e): 2TiB quorum-loss + no-reboot recovery release gate (COV-011) #146
Release-gate scenario for the production deadlock fear: quorum loss
under active IO wedging a large volume such that only node reboots
recover. On a 2 TiB zfs-thick volume (2 diskful + auto-tiebreaker,
on-no-quorum=suspend-io) the scenario proves:
- bounded-time create (skip-initial-sync contract for thick zvols)
- secondary-only outage keeps quorum via the witness, IO continues
- losing both the secondary and the witness mid-IO suspends IO
(writer blocks with zero errors, no crash)
- restoring the links resumes the suspended IO and heals the
cluster to all-UpToDate with the seeded marker intact, WITHOUT
any node reboot (boot_id + uptime asserted on every worker)
Outage mechanism is the per-link iptables DROP of the resource's
DRBD mesh port (the drop_pair recipe already proven by
quorum-tiebreaker-no-return and the partition scenarios) — the
strongest node-outage model reachable from a scenario on this
stand, and the only one that breaks kernel replication links while
keeping observability up.
Stand-"big"-specific: SKIPs unless a zfs-thick (providerKind=ZFS)
StoragePool exists on all workers with 2 TiB of headroom.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
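The per-link DROP named in the commit message can be pictured with a short sketch. This is not the repository's drop_pair helper: the function name `drop_pair_rules`, the choice of the INPUT/OUTPUT chains, and the placeholder IP and port are all assumptions made for illustration. The function only prints the four rules (peer as source or destination, port as sport or dport) so it can be run without root.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print (not execute) the four DROP rules for one DRBD link.
# Hypothetical helper: chain choice and rule order are assumptions.
drop_pair_rules() {
  local peer_ip=$1 port=$2
  local side
  for side in sport dport; do
    # Inbound traffic from the peer, matched on this port position.
    echo "iptables -I INPUT -p tcp -s ${peer_ip} --${side} ${port} -j DROP"
    # Outbound traffic to the peer, matched on the same port position.
    echo "iptables -I OUTPUT -p tcp -d ${peer_ip} --${side} ${port} -j DROP"
  done
}

# Placeholder peer address and mesh port, not values from the stand.
drop_pair_rules 10.0.0.12 7000
```

Restoring the link would presumably mean deleting the same four rules, for example by emitting them with `-D` in place of `-I`.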
The COV-011 gate needs the stand-"big" substrate (zfs-thick pool, 2.2T disks) and deliberately SKIPs on the regular CI lanes; without the allowlist entry the runner reclassifies that SKIP as FAIL.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
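The SKIP-to-FAIL reclassification described in this commit can be sketched as below. This is a minimal illustration, not the code in stand/run-scenarios-only.sh: the `classify_result` function, the PASS/SKIP/FAIL strings, and the bare-filename form of the allowlist entry are assumptions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical allowlist; the real one lives in stand/run-scenarios-only.sh.
SKIP_ALLOWLIST=("quorum-loss-2tb-recovery.sh")

# Map a scenario's raw outcome (PASS, SKIP or FAIL) to the verdict the lane reports.
classify_result() {
  local scenario=$1 outcome=$2 allowed
  if [[ "$outcome" != "SKIP" ]]; then
    echo "$outcome"
    return 0
  fi
  for allowed in "${SKIP_ALLOWLIST[@]}"; do
    if [[ "$scenario" == "$allowed" ]]; then
      echo "SKIP"
      return 0
    fi
  done
  # A SKIP from a scenario that is not allowlisted counts as a failure.
  echo "FAIL"
}

classify_result quorum-loss-2tb-recovery.sh SKIP   # prints SKIP
classify_result some-other-scenario.sh SKIP        # prints FAIL
```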
Code Review
This pull request adds a new end-to-end test scenario, quorum-loss-2tb-recovery.sh, to verify that a 2 TiB quorum-loss recovery can occur without requiring a node reboot, and updates the SKIP_ALLOWLIST in stand/run-scenarios-only.sh to include this new scenario. The review feedback correctly identifies critical syntax errors in the bash script related to invalid array expansion default value fallbacks (${BLOCKED_PAIRS[@]:-} and ${kept[@]:-}), which would cause runtime failures.
| local kept=() p | ||
| for p in "${BLOCKED_PAIRS[@]:-}"; do | ||
| [[ "$p" == "$node|$peer_ip" ]] || kept+=("$p") | ||
| done | ||
| BLOCKED_PAIRS=("${kept[@]:-}") | ||
| } |
Using ${BLOCKED_PAIRS[@]:-} and ${kept[@]:-} results in a runtime syntax error in Bash (operand expected (error token is "-")). Since both arrays are explicitly declared (declare -a BLOCKED_PAIRS and local kept), they are guaranteed to be defined. You can safely reference them as ${BLOCKED_PAIRS[@]} and ${kept[@]} without triggering unbound variable errors under set -u.
| - local kept=() p | |
| - for p in "${BLOCKED_PAIRS[@]:-}"; do | |
| - [[ "$p" == "$node|$peer_ip" ]] || kept+=("$p") | |
| - done | |
| - BLOCKED_PAIRS=("${kept[@]:-}") | |
| - } | |
| + local kept=() p | |
| + for p in "${BLOCKED_PAIRS[@]}"; do | |
| + [[ "$p" == "$node|$peer_ip" ]] || kept+=("$p") | |
| + done | |
| + BLOCKED_PAIRS=("${kept[@]}") | |
| + } |
| for p in "${BLOCKED_PAIRS[@]:-}"; do | ||
| [[ -z "$p" ]] && continue |
Using ${BLOCKED_PAIRS[@]:-} results in a runtime syntax error in Bash. Since BLOCKED_PAIRS is explicitly declared as an array, you can safely reference it as ${BLOCKED_PAIRS[@]}.
| - for p in "${BLOCKED_PAIRS[@]:-}"; do | |
| - [[ -z "$p" ]] && continue | |
| + for p in "${BLOCKED_PAIRS[@]}"; do | |
| + [[ -z "$p" ]] && continue |
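The behaviour of `"${arr[@]:-}"` under `set -u` is easy to check in isolation before applying either variant. A minimal sketch follows; the helper names are illustrative and not taken from the scenario. In Bash the `:-` form is the ordinary use-default expansion, so a declared but empty array expands to a single empty word rather than raising an error, which would explain the `[[ -z "$p" ]] && continue` guard in the loop above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Count the words produced by "${arr[@]:-}" for the given elements.
count_with_default() {
  local -a arr=("$@")
  local n=0 p
  for p in "${arr[@]:-}"; do
    n=$((n + 1))
  done
  echo "$n"
}

# Count only non-empty words, mirroring the scenario's empty-string guard.
count_nonempty() {
  local -a arr=("$@")
  local n=0 p
  for p in "${arr[@]:-}"; do
    [[ -z "$p" ]] && continue
    n=$((n + 1))
  done
  echo "$n"
}

count_with_default        # prints 1: an empty array yields one empty placeholder word
count_with_default a b    # prints 2
count_nonempty            # prints 0: the guard skips the placeholder
```

One consequence worth noting: without the guard, a filter loop that copies every non-matching word would carry the empty placeholder into the rebuilt array.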
What
Adds the missing release-gate coverage for blocker COV-011: an e2e scenario (tests/e2e/quorum-loss-2tb-recovery.sh) proving that on a large (2 TiB) zfs-thick-backed volume, losing DRBD quorum under active IO suspends IO per the suspend-io policy and, critically, IO resumes and the cluster heals without any node reboot once quorum returns.
Why
The production fear is a DRBD deadlock where quorum loss under IO wedges volumes such that only node reboots recover. No existing scenario covers the large-volume quorum-loss + recovery leg; this gate pins the full contract end to end.
Scenario contract
On the canonical 2-diskful + auto-tiebreaker shape (quorum=majority, on-no-quorum=suspend-io):
- zfs-thick StoragePool (providerKind ZFS) on all 3 workers plus 2 TiB + 5% of free pool capacity per diskful node; SKIPs elsewhere with a clear message.
- [...] skipInitialSync=true stamp is asserted explicitly).
- boot_id is unchanged (with monotonically increased uptime) on every worker: the explicit no-reboot assert.
Every wait is bounded by an explicit deadline on a concrete condition; there are no blind sleeps on the critical path.
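Two parts of this contract, the deadline-bounded wait and the no-reboot assert, can be sketched as follows. Both helpers (`wait_until`, `assert_no_reboot`) are hypothetical names and not the scenario's own functions. The sketch assumes a Linux node, where `/proc/sys/kernel/random/boot_id` changes on every boot and `/proc/uptime` reports seconds since boot; how the scenario actually samples each worker is not shown here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Poll a condition command until it succeeds or the deadline (seconds) passes.
wait_until() {
  local timeout=$1; shift
  local deadline=$((SECONDS + timeout))
  until "$@"; do
    if (( SECONDS >= deadline )); then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
  done
}

# Compare a node's boot_id and uptime (whole seconds) sampled before and after.
assert_no_reboot() {
  local boot_before=$1 up_before=$2 boot_after=$3 up_after=$4
  [[ "$boot_before" == "$boot_after" ]] || { echo "boot_id changed" >&2; return 1; }
  (( up_after > up_before )) || { echo "uptime did not increase" >&2; return 1; }
}

# Local demonstration, only where the Linux proc files exist.
if [[ -r /proc/sys/kernel/random/boot_id && -r /proc/uptime ]]; then
  boot_id=$(cat /proc/sys/kernel/random/boot_id)
  uptime_s=$(cut -d. -f1 /proc/uptime)
  wait_until 5 test -r /proc/uptime
  # Pretend one second has passed between the two samples.
  assert_no_reboot "$boot_id" "$uptime_s" "$boot_id" "$((uptime_s + 1))"
fi
```

Checking uptime alongside boot_id is a belt-and-braces choice: a reboot resets uptime, so an unchanged boot_id with a smaller uptime would point to a sampling problem rather than a healthy node.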
Outage mechanism
Per-link iptables DROP of the resource's DRBD mesh port (all four src/dst x sport/dport combinations), the drop_pair recipe already proven by quorum-tiebreaker-no-return.sh and the partition scenarios. This is the strongest node-outage model reachable from a scenario on this stand: it breaks the kernel replication links the way a dead node does (no graceful drbdadm disconnect handshake; DRBD must detect the loss via ping-timeout), there is no VM-level kill helper reachable from a scenario, and stopping the satellite pod would not break kernel-level replication anyway. kubectl/API traffic is untouched, so kernel truth stays observable from every node throughout the outage.
Harness wiring
stand/run-scenarios-only.sh SKIP_ALLOWLIST gains the new scenario: it is stand-"big"-specific by design and must SKIP (not FAIL) on the regular CI lanes, which auto-discover the whole suite via make e2e-list.
Validation
- bash -n and shellcheck -x clean (zero findings).
- ./tests/e2e/quorum-loss-2tb-recovery.sh .work/big
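The static checks in the first item can be reproduced with a small wrapper. This is a sketch only: `lint_script` is a hypothetical helper, and it runs against a throwaway script so that it does not depend on the repository layout.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Syntax-check a script with bash -n, then lint it with shellcheck -x when available.
lint_script() {
  local script=$1
  bash -n "$script" || return 1
  if command -v shellcheck >/dev/null 2>&1; then
    shellcheck -x "$script" || return 1
  else
    echo "shellcheck not installed; skipped lint for $script" >&2
  fi
}

# Demonstrate on a temporary script rather than the real scenario.
tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT
printf '#!/usr/bin/env bash\necho ok\n' > "$tmp"
lint_script "$tmp"
```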