deploymentStrategy not configurable — causes data loss with ReadWriteOnce persistent storage #83

@bberzinskas-tw

Problem

When using the chart with EBS (or any ReadWriteOnce storage) and replicaCount: 1, the default
RollingUpdate deployment strategy causes a data availability failure on every Helm upgrade.

What happens:

  1. The rolling update starts the new pod before terminating the old one (with a single
    replica, the default 25% settings round to maxSurge: 1 and maxUnavailable: 0)
  2. The old pod still holds the RWO volume — the new pod can't attach it and either hangs or starts
    on a different node without the data
  3. Sourcebot starts without /data/.sourcebot/repos/ populated, returning errors like:

error: [web-actions] Failed to search commits in repository ...: Error: Git.cwd: cannot change to non-directory "/data/.sourcebot/repos/560"

  4. All repository data must be re-synced from scratch

This is a silent failure — the pod appears Running but all git operations fail until the next
full reindex completes.
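
For reference, the sequence above follows from what Kubernetes fills in when a Deployment omits spec.strategy. A minimal sketch of the effective fragment, assuming the chart renders no strategy block at all (the issue notes it may instead be hardcoded):

```yaml
# Effective Deployment strategy when spec.strategy is omitted.
# These are the Kubernetes defaults, not values taken from the chart.
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # rounded up: 1 extra pod is allowed with replicas: 1
      maxUnavailable: 25%  # rounded down: 0 pods may be unavailable with replicas: 1
```

With these numbers the controller has to bring the new pod up before it may remove the old one, which is exactly the ordering a ReadWriteOnce volume cannot support across nodes.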

Why increasing replicaCount doesn't solve this

A natural workaround is running more than one replica. This doesn't work with ReadWriteOnce.

EBS volumes allow only one node to mount the volume at a time. A second pod on a different node
will be permanently stuck in Pending. Even if both pods land on the same node (which the scheduler
can't guarantee), Sourcebot is not designed for concurrent writes to /data/.sourcebot/repos:
two workers racing to git clone or git fetch the same paths would corrupt each other's state.

| Approach | Requirement |
|---|---|
| Switch to EFS (ReadWriteMany) | Requires an EFS-backed storage class, plus Sourcebot handling concurrent repo access safely |
| Separate web from sync worker | Stateless web tier scales; sync worker stays at 1. Architectural change in Sourcebot |
| Stay at 1 replica, fix the update strategy | Set deploymentStrategy.type: Recreate, the correct fix for single-replica RWO workloads |

Expected behavior

The chart should expose a deploymentStrategy value so operators can set type: Recreate for
single-replica RWO deployments:

sourcebot:
  deploymentStrategy:
    type: Recreate

With Recreate, the old pod is terminated before the new one starts, guaranteeing the volume is
released before the new pod tries to attach it.
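
As a rough illustration, the rendered Deployment would then carry a fragment like the one below. This is a sketch of the relevant fields only, assuming the chart passes the value through unchanged:

```yaml
# Fragment of the rendered Deployment with the Recreate override applied.
spec:
  replicas: 1
  strategy:
    type: Recreate   # existing pods are terminated before new ones are created
    # No rollingUpdate block here: the Kubernetes API rejects
    # spec.strategy.rollingUpdate when type is Recreate.
```

The trade-off is a short window with no running pod during each upgrade, which is the usual cost of a single-replica workload on a ReadWriteOnce volume.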

Suggested change

In the Deployment template, replace the hardcoded (or missing) strategy with:

  strategy:
    {{- toYaml .Values.sourcebot.deploymentStrategy | nindent 4 }}
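
If the template should also tolerate the value being unset or empty, a guarded variant is one option. This is a sketch using the same value path as above, not the chart's actual template:

```yaml
  # Emit the strategy block only when a value is provided.
  {{- with .Values.sourcebot.deploymentStrategy }}
  strategy:
    {{- toYaml . | nindent 4 }}
  {{- end }}
```

When the key is absent, the strategy field is simply omitted and Kubernetes falls back to its RollingUpdate default, so existing installs render as before.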

Add to values.yaml:

  sourcebot:
    # -- Deployment update strategy. For ReadWriteOnce persistent storage, set type: Recreate.
    # See: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy
    deploymentStrategy:
      type: RollingUpdate
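
One design note on keeping the default to type only: Helm deep-merges user values into chart defaults, so a default that also set rollingUpdate would survive a user override of just the type. The sketch below shows that hypothetical default and the extra line an operator would then need. It illustrates the pitfall and is not part of the proposed change:

```yaml
# Hypothetical chart default to AVOID in values.yaml:
#
#   sourcebot:
#     deploymentStrategy:
#       type: RollingUpdate
#       rollingUpdate:
#         maxSurge: 1
#         maxUnavailable: 0
#
# With that default, an override of only `type: Recreate` would still render
# rollingUpdate, which the API rejects. The operator would have to null it out:
sourcebot:
  deploymentStrategy:
    type: Recreate
    rollingUpdate: null   # removes the key inherited from the chart default
```

With the default proposed above (type only), the plain two-line override is enough.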

Environment

  • Chart version: 0.1.66
  • Storage: AWS EBS (ReadWriteOnce)
  • replicaCount: 1
  • Kubernetes: EKS
