deploymentStrategy not configurable — causes data loss with ReadWriteOnce persistent storage #83
Description
Problem
When using the chart with EBS (or any ReadWriteOnce storage) and replicaCount: 1, the default
RollingUpdate deployment strategy causes a data availability failure on every Helm upgrade.
What happens:
- Rolling update starts the new pod before terminating the old one (maxSurge: 1, maxUnavailable: 0 with a single replica)
- The old pod still holds the RWO volume, so the new pod can't attach it and either hangs or starts on a different node without the data
- Sourcebot starts without /data/.sourcebot/repos/ populated, returning errors like:
error: [web-actions] Failed to search commits in repository ...: Error: Git.cwd: cannot change to non-directory "/data/.sourcebot/repos/560"
- All repository data must be re-synced from scratch
This is a silent failure — the pod appears Running but all git operations fail until the next
full reindex completes.
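
For reference, the surge behavior described above follows from the Kubernetes defaults for a Deployment that leaves its strategy unset. The fragment below is an illustrative sketch, not output copied from this chart; the percentages are the Kubernetes defaults, and the comments show how they resolve with a single replica.

```yaml
# Effective Deployment settings when spec.strategy is not set
# (illustrative fragment; not taken from the chart's rendered manifests).
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # rounded up: 1 extra pod may be created
      maxUnavailable: 25%  # rounded down: 0 pods may be unavailable
```

With these values the controller has to bring the new pod to Ready before it may remove the old one, which is exactly the ordering an RWO volume cannot satisfy.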
Why increasing replicaCount doesn't solve this
A natural workaround is running more than one replica. This doesn't work with ReadWriteOnce.
EBS volumes allow only one node to mount the volume at a time. A second pod on a different node
will be permanently stuck in Pending. Even if both pods land on the same node (which the scheduler
can't guarantee), Sourcebot is not designed for concurrent writes to /data/.sourcebot/repos —
two workers racing to git clone or git fetch the same paths would corrupt each other's state.
| Approach | Requirement |
|---|---|
| Switch to EFS (ReadWriteMany) | Requires EFS-backed storage class + Sourcebot handling concurrent repo access safely |
| Separate web from sync worker | Stateless web tier scales; sync worker stays at 1 — architectural change in Sourcebot |
| Stay at 1 replica, fix the update strategy | Set deploymentStrategy.type: Recreate — correct fix for single-replica RWO workloads |
Expected behavior
The chart should expose a deploymentStrategy value so operators can set type: Recreate for
single-replica RWO deployments:

    sourcebot:
      deploymentStrategy:
        type: Recreate

With Recreate, the old pod is terminated before the new one starts, guaranteeing the volume is
released before the new pod tries to attach it.
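
As a sketch of the intended result, assuming the chart passes the value through to the Deployment unchanged, the rendered manifest would contain something like the following. The surrounding fields are abbreviated and illustrative.

```yaml
# Hypothetical rendered Deployment fragment with the override applied.
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 1
  strategy:
    type: Recreate   # old pod is terminated first, releasing the RWO volume
```

The trade-off is a short window of downtime on each upgrade, since no pod is running between termination of the old one and readiness of the new one.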
Suggested change
In the Deployment template, replace the hardcoded (or missing) strategy with:

    strategy:
      {{- toYaml .Values.sourcebot.deploymentStrategy | nindent 4 }}

Add to values.yaml:

    sourcebot:
      # -- Deployment update strategy. For ReadWriteOnce persistent storage, set type: Recreate.
      # See: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy
      deploymentStrategy:
        type: RollingUpdate

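
A slightly more defensive variant of the template change is sketched below. It assumes the same .Values.sourcebot.deploymentStrategy key proposed above and wraps the block in a with guard, so the strategy field is omitted entirely if the value is empty. The placement under spec is an assumption about the chart's Deployment template.

```yaml
# Sketch of the Deployment template fragment (Helm template syntax).
# If deploymentStrategy is empty, no strategy key is rendered and the
# Kubernetes default (RollingUpdate) applies.
spec:
  {{- with .Values.sourcebot.deploymentStrategy }}
  strategy:
    {{- toYaml . | nindent 4 }}
  {{- end }}
```

Keeping the default in values.yaml to type only, with no rollingUpdate sub-key, matters here: Helm merges user values into the defaults, and Kubernetes rejects a Deployment that sets rollingUpdate parameters alongside type: Recreate.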
Environment
- Chart version: 0.1.66
- Storage: AWS EBS (ReadWriteOnce)
- replicaCount: 1
- Kubernetes: EKS