diff --git a/content/en/docs/next/applications/backup-and-recovery.md b/content/en/docs/next/applications/backup-and-recovery.md index 568bf3be..977dc3a8 100644 --- a/content/en/docs/next/applications/backup-and-recovery.md +++ b/content/en/docs/next/applications/backup-and-recovery.md @@ -8,12 +8,11 @@ weight: 4 This guide covers backing up and restoring **Cozystack-managed databases** — Postgres, MariaDB, and ClickHouse — as a tenant user: running one-off and scheduled backups, checking status, and restoring from a backup either in place or into a separate target instance. {{% alert color="info" %}} -**Storage, credentials, and the `BackupClass` are admin-provisioned.** Before you can run a `BackupJob`, an administrator provisions the S3 storage and the per-application credential Secrets your driver expects, and creates the cluster-scoped `BackupClass` you reference. Ask your administrator for: +**Storage and the `BackupClass` are platform-provisioned.** Cozystack ships a single cluster-scoped `BackupClass` named `cozy-default` that covers Postgres, MariaDB, ClickHouse, Etcd, VMInstance, and VMDisk via a per-Kind `strategies[]` array. You reference it by name from `BackupJob` / `Plan` / `RestoreJob` — there is no per-application `BackupClass`, and you do **not** create or supply S3 credentials, endpoints, or paths. The platform projects a credentials Secret (`cozy-backups-creds`) into your tenant namespace automatically right before each BackupJob runs. -- the `BackupClass` name to use for your application Kind (you cannot list `BackupClass` resources under your tenant kubeconfig — they are cluster-scoped); -- confirmation that the per-application credential Secrets exist in your namespace for every managed-DB application you want to back up. +If your administrator has created additional sibling `BackupClass` resources (different retention, different storage, etc.), ask for the name and substitute it for `cozy-default` in the examples below. 
`BackupClass` is cluster-scoped, so you cannot list it under a tenant kubeconfig — your administrator will tell you which names are valid. -Admins follow the [Managed Application Backup Configuration]({{% ref "/docs/next/operations/services/managed-app-backup-configuration" %}}) guide. +Admins follow the [Backup Classes]({{% ref "/docs/next/operations/services/backup-classes" %}}) guide. {{% /alert %}} {{% alert color="warning" %}} @@ -28,8 +27,8 @@ For backups that include the application's Helm release, CRs, and PVC snapshots ## Prerequisites -- A `BackupClass` name handed to you by your administrator (for example, `postgres-data-backup` for a `Postgres` application). -- An existing managed-DB application (`Postgres`, `MariaDB`, or `ClickHouse`) in your tenant namespace. +- A `BackupClass` name. On a default install this is `cozy-default`, which covers `Postgres`, `MariaDB`, `ClickHouse`, and `Etcd`. If your administrator has created a sibling class, substitute that name everywhere below. +- An existing managed-DB application (`Postgres`, `MariaDB`, `ClickHouse`, or `Etcd`) in your tenant namespace. - `kubectl` and a tenant kubeconfig with the `tenant--admin` role. The examples below assume `tenant-user` for the tenant namespace; substitute your own. @@ -51,7 +50,7 @@ spec: apiGroup: apps.cozystack.io kind: Postgres name: my-postgres - backupClassName: postgres-data-backup + backupClassName: cozy-default ``` ```bash @@ -62,7 +61,7 @@ kubectl -n tenant-user describe backupjob my-postgres-adhoc When the `BackupJob` reaches `phase: Succeeded`, the driver creates a `Backup` object with the same name. That name is what you reference when restoring. -Replace `Postgres` / `postgres-data-backup` with `MariaDB` / `mariadb-data-backup` or `ClickHouse` / `clickhouse-data-backup` for the other drivers. 
+Replace `Postgres` with `MariaDB`, `ClickHouse`, or `Etcd` for the other drivers — the `BackupClass` (`cozy-default`) is the same; the platform-shipped class binds a strategy for every supported Kind. ### Scheduled backup @@ -79,7 +78,7 @@ spec: apiGroup: apps.cozystack.io kind: Postgres name: my-postgres - backupClassName: postgres-data-backup + backupClassName: cozy-default schedule: type: cron cron: "0 */6 * * *" # every 6 hours @@ -110,7 +109,7 @@ kubectl -n tenant-user describe backupjob my-postgres-adhoc kubectl -n tenant-user get events --field-selector involvedObject.name=my-postgres-adhoc ``` -If `status.message` does not pinpoint the failure, hand the `BackupJob` name to your administrator and they will inspect the operator-native CR the driver created (see [Tenant escalation: driver-side diagnostics]({{% ref "/docs/next/operations/services/managed-app-backup-configuration#tenant-escalation-driver-side-diagnostics" %}}) in the admin guide). +If `status.message` does not pinpoint the failure, hand the `BackupJob` name to your administrator and they will inspect the operator-native CR the driver created (see [Backup Classes]({{% ref "/docs/next/operations/services/backup-classes" %}}) in the admin guide). ## Restore in place @@ -201,10 +200,10 @@ kubectl -n tenant-user describe backupjob my-postgres-adhoc kubectl -n tenant-user get events --field-selector involvedObject.name=my-postgres-adhoc ``` -If those do not explain the failure, the next layer of diagnostics lives on the operator-native CR the driver created (`cnpg.io/Backup`, `k8s.mariadb.com/Backup`, or the ClickHouse strategy `Pod` logs). These resources are not reachable under the tenant kubeconfig — hand the `BackupJob` name to your administrator and they will follow [Tenant escalation: driver-side diagnostics]({{% ref "/docs/next/operations/services/managed-app-backup-configuration#tenant-escalation-driver-side-diagnostics" %}}). 
+If those do not explain the failure, the next layer of diagnostics lives on the operator-native CR the driver created (`cnpg.io/Backup`, `k8s.mariadb.com/Backup`, or the ClickHouse strategy `Pod` logs). These resources are not reachable under the tenant kubeconfig — hand the `BackupJob` name to your administrator and they will follow [Backup Classes]({{% ref "/docs/next/operations/services/backup-classes" %}}). ## See also -- [Managed Application Backup Configuration]({{% ref "/docs/next/operations/services/managed-app-backup-configuration" %}}) — how administrators define strategies and `BackupClass` resources. +- [Backup Classes]({{% ref "/docs/next/operations/services/backup-classes" %}}) — how administrators define strategies and `BackupClass` resources. - [Backup and Recovery (VMs)]({{% ref "/docs/next/virtualization/backup-and-recovery" %}}) — the parallel guide for VMInstance / VMDisk backups (HelmRelease + CRs + PVC snapshots). -- [Velero Backup Configuration]({{% ref "/docs/next/operations/services/velero-backup-configuration" %}}) — administrator setup for the Velero-driven VM backups. +- [Backup Classes]({{% ref "/docs/next/operations/services/backup-classes" %}}) — administrator setup for the Velero-driven VM backups. diff --git a/content/en/docs/next/applications/clickhouse.md b/content/en/docs/next/applications/clickhouse.md index df315859..b049199f 100644 --- a/content/en/docs/next/applications/clickhouse.md +++ b/content/en/docs/next/applications/clickhouse.md @@ -21,7 +21,7 @@ It is used for online analytical processing (OLAP). ### How to restore backup from S3 {{% alert color="warning" %}} -**Backups: prefer the `BackupClass` flow.** `backup.enabled` and the S3 fields (`s3Region`, `s3Bucket`, `endpoint`, `s3PathOverride`, `s3AccessKey`/`s3SecretKey` or `s3CredentialsSecret`) are still required — they materialise the in-pod `clickhouse-backup` sidecar that the Altinity backup strategy talks to. 
However, `backup.schedule`, `backup.cleanupStrategy`, and `backup.resticPassword` (which drive the legacy chart-managed CronJob doing dump + restic, and the matching restic restore flow documented below) are **superseded** by the Cozystack backups framework: define a `BackupClass` + `Altinity` strategy once, then drive scheduled backups via `Plan` and restores via `RestoreJob`. See [Application Backup and Recovery]({{% ref "/docs/next/applications/backup-and-recovery" %}}) (tenant guide) and [Managed Application Backup Configuration]({{% ref "/docs/next/operations/services/managed-app-backup-configuration" %}}) (admin setup). +**Backups: prefer the `BackupClass` flow.** `backup.enabled` and the S3 fields (`s3Region`, `s3Bucket`, `endpoint`, `s3PathOverride`, `s3AccessKey`/`s3SecretKey` or `s3CredentialsSecret`) are still required — they materialise the in-pod `clickhouse-backup` sidecar that the Altinity backup strategy talks to. However, `backup.schedule`, `backup.cleanupStrategy`, and `backup.resticPassword` (which drive the legacy chart-managed CronJob doing dump + restic, and the matching restic restore flow documented below) are **superseded** by the Cozystack backups framework: define a `BackupClass` + `Altinity` strategy once, then drive scheduled backups via `Plan` and restores via `RestoreJob`. See [Application Backup and Recovery]({{% ref "/docs/next/applications/backup-and-recovery" %}}) (tenant guide) and [Backup Classes]({{% ref "/docs/next/operations/services/backup-classes" %}}) (admin setup). {{% /alert %}} 1. 
Find the snapshot: diff --git a/content/en/docs/next/applications/mariadb.md b/content/en/docs/next/applications/mariadb.md index d7926735..5d5e9111 100644 --- a/content/en/docs/next/applications/mariadb.md +++ b/content/en/docs/next/applications/mariadb.md @@ -111,7 +111,7 @@ more details: ### Backup parameters {{% alert color="warning" %}} -**The chart-level `backup.*` values documented below are deprecated.** The legacy `mariadb-dump` + `restic` flow is superseded by the Cozystack backups framework: define a `BackupClass` + `MariaDB` strategy once, then drive backups via `BackupJob` / `Plan` and restores via `RestoreJob`. Existing tenants with `backup.enabled=true` continue to render the legacy resources unchanged. See [Application Backup and Recovery]({{% ref "/docs/next/applications/backup-and-recovery" %}}) (tenant guide) and [Managed Application Backup Configuration]({{% ref "/docs/next/operations/services/managed-app-backup-configuration" %}}) (admin setup). +**The chart-level `backup.*` values documented below are deprecated.** The legacy `mariadb-dump` + `restic` flow is superseded by the Cozystack backups framework: define a `BackupClass` + `MariaDB` strategy once, then drive backups via `BackupJob` / `Plan` and restores via `RestoreJob`. Existing tenants with `backup.enabled=true` continue to render the legacy resources unchanged. See [Application Backup and Recovery]({{% ref "/docs/next/applications/backup-and-recovery" %}}) (tenant guide) and [Backup Classes]({{% ref "/docs/next/operations/services/backup-classes" %}}) (admin setup). 
{{% /alert %}} | Name | Description | Type | Value | diff --git a/content/en/docs/next/applications/postgres.md b/content/en/docs/next/applications/postgres.md index f5d9a008..7a42b414 100644 --- a/content/en/docs/next/applications/postgres.md +++ b/content/en/docs/next/applications/postgres.md @@ -28,7 +28,7 @@ This managed service is controlled by the CloudNativePG operator, ensuring effic ## Operations {{% alert color="warning" %}} -**Backups: prefer the `BackupClass` flow.** The chart-level `backup.*` values documented below still configure the Barman object store and S3 credentials that backups read from, but the chart-emitted `ScheduledBackup` and the `bootstrap`-based recovery flow have been **superseded** by the Cozystack backups framework: define a `BackupClass` + `CNPG` strategy once, then drive scheduled backups via `Plan` and restores via `RestoreJob`. See [Application Backup and Recovery]({{% ref "/docs/next/applications/backup-and-recovery" %}}) (tenant guide) and [Managed Application Backup Configuration]({{% ref "/docs/next/operations/services/managed-app-backup-configuration" %}}) (admin setup). +**Backups: prefer the `BackupClass` flow.** The chart-level `backup.*` values documented below still configure the Barman object store and S3 credentials that backups read from, but the chart-emitted `ScheduledBackup` and the `bootstrap`-based recovery flow have been **superseded** by the Cozystack backups framework: define a `BackupClass` + `CNPG` strategy once, then drive scheduled backups via `Plan` and restores via `RestoreJob`. See [Application Backup and Recovery]({{% ref "/docs/next/applications/backup-and-recovery" %}}) (tenant guide) and [Backup Classes]({{% ref "/docs/next/operations/services/backup-classes" %}}) (admin setup). 
{{% /alert %}} ### How to enable backups diff --git a/content/en/docs/next/kubernetes/backups-with-velero-addon.md b/content/en/docs/next/kubernetes/backups-with-velero-addon.md index 3a6db7ea..195c6990 100644 --- a/content/en/docs/next/kubernetes/backups-with-velero-addon.md +++ b/content/en/docs/next/kubernetes/backups-with-velero-addon.md @@ -10,7 +10,7 @@ The `velero` addon of the [Managed Kubernetes]({{% ref "/docs/next/kubernetes" % {{% alert color="info" %}} This guide is for the **tenant-side** Velero addon, which runs inside a tenant Kubernetes cluster and is operated by the tenant user. -For the platform-level Velero used by cluster administrators to back up `VMInstance`/`VMDisk` resources from the management cluster, see [Velero Backup Configuration]({{% ref "/docs/next/operations/services/velero-backup-configuration" %}}). +For the platform-level Velero used by cluster administrators to back up `VMInstance`/`VMDisk` resources from the management cluster, see [Backup Classes]({{% ref "/docs/next/operations/services/backup-classes" %}}). 
{{% /alert %}} ## What the addon installs @@ -172,4 +172,4 @@ The same pattern restores into a **different** tenant Kubernetes cluster as well - [Managed Kubernetes — `addons.velero` parameters]({{% ref "/docs/next/kubernetes#parameters" %}}) - [Buckets and Users]({{% ref "/docs/next/operations/services/object-storage/buckets" %}}) -- [Velero Backup Configuration (platform admin)]({{% ref "/docs/next/operations/services/velero-backup-configuration" %}}) +- [Backup Classes (platform admin)]({{% ref "/docs/next/operations/services/backup-classes" %}}) diff --git a/content/en/docs/next/operations/services/backup-classes.md b/content/en/docs/next/operations/services/backup-classes.md new file mode 100644 index 00000000..2d34565f --- /dev/null +++ b/content/en/docs/next/operations/services/backup-classes.md @@ -0,0 +1,196 @@ +--- +title: "Backup Classes" +linkTitle: "Backup Classes" +description: "Default cozy-default BackupClass and the parameters tenants and admins can tune." +weight: 31 +--- + + +Cozystack ships a single platform-managed `BackupClass` named `cozy-default`. It is provisioned automatically when the `backupstrategy-controller` package is installed and references the system-managed S3 bucket `cozy-backups` in the `tenant-root` namespace. + +Tenants reference `cozy-default` from `BackupJob`, `Plan`, and `RestoreJob` resources — they do **not** supply S3 credentials, endpoints, or paths. The platform projects the system-managed credentials Secret into the tenant namespace per BackupJob (or, for long-lived references like Velero's `BackupStorageLocation`, into a fixed list of system namespaces on a periodic tick), and the default strategy templates encode `/` into every S3 path so two tenants with the same application name never collide. 
+ +## Supported applications + +### Bound by `cozy-default` (work out-of-the-box) + +| Application Kind | Driver | Strategy CR | +|----------------------------------|--------------------------------------|----------------------------------------------------------------------------| +| `apps.cozystack.io/Postgres` | CloudNativePG (barman) | `strategy.backups.cozystack.io/CNPG` `cozy-default-cnpg` | +| `apps.cozystack.io/MariaDB` | mariadb-operator dump | `strategy.backups.cozystack.io/MariaDB` `cozy-default-mariadb` | +| `apps.cozystack.io/ClickHouse` | Altinity `clickhouse-backup` sidecar | `strategy.backups.cozystack.io/Altinity` `cozy-default-altinity` | +| `apps.cozystack.io/Etcd` | etcd-operator snapshot | `strategy.backups.cozystack.io/Etcd` `cozy-default-etcd` | +| `apps.cozystack.io/VMInstance` | Velero + kubevirt-velero-plugin | `strategy.backups.cozystack.io/Velero` `cozy-default-velero-vminstance` | +| `apps.cozystack.io/VMDisk` | Velero | `strategy.backups.cozystack.io/Velero` `cozy-default-velero-vmdisk` | + +### Shipped but NOT bound (admin opt-in required) + +| Application Kind | Driver | Strategy CR | +|----------------------------------|--------------------------------------|----------------------------------------------------------------------------| +| `apps.cozystack.io/FoundationDB` | FoundationDB operator backup_agent | `strategy.backups.cozystack.io/FoundationDB` `cozy-default-foundationdb` | + +The FoundationDB strategy CR is rendered by the chart so admins can reference it from a custom BackupClass once the operator-side plumbing (mounting `cozy-backups-creds` into the `cozy-foundationdb-operator` Deployment) is wired manually. See "FoundationDB caveat" below. 
+ +### Endpoint format per driver + +Different operators expect different endpoint shapes; the strategy templates rendered by `backupstrategy-controller` adapt the single `backupStorage.endpoint` value (a full URL like `http://seaweedfs-s3.tenant-root.svc:8333`) to each consumer's contract: + +| Driver | Strategy template field | Form | +|--------|-------------------------|------| +| CNPG (Postgres) | `barmanObjectStore.endpointURL` | full URL (scheme preserved) | +| Etcd | `destination.s3.endpoint` | full URL (scheme preserved) | +| MariaDB | `storage.s3.endpoint` | bare host:port (scheme stripped); `tls.enabled` derived from the scheme | +| FoundationDB | `blobStoreConfiguration.accountName` + `urlParameters.secure_connection` | bare host:port + derived secure flag | +| Velero | `BackupStorageLocation.spec.config.s3Url` | full URL (scheme preserved) | +| ClickHouse sidecar | `S3_ENDPOINT` env | bare host:port (from projected Secret) | + +The projected `cozy-backups-creds.endpoint` key is **stripped of scheme** so chart-emitted sidecars (ClickHouse) consume it directly. Drivers that need the full URL pull from `backupStorage.endpoint` in chart values, not from the Secret. + +VM-driven (Velero) backups land in the same `cozy-backups` bucket under the `velero/` prefix. A `BackupStorageLocation` named `cozy-default` is shipped by the `backupstrategy-controller` chart (`packages/system/backupstrategy-controller/templates/velero-bsl.yaml`) so endpoint/bucket/region come from the same `backupStorage` values block used by Strategy CRs and the projector. + +### FoundationDB caveat + +The strategy CR `cozy-default-foundationdb` is shipped, but it is **not** bound by `cozy-default` yet. Restore runs `fdbrestore` from inside the `cozy-foundationdb-operator` Deployment, which does not yet mount `cozy-backups-creds`. 
Until the operator deployment is updated to mount the projected Secret, FDB platform-default restore silently fails — admins who need it today should keep using a per-app `Bucket` plus a custom `BackupClass`, or wire the credentials file into the operator deployment themselves. + +**Cleanup gotcha (zombie backup_agent).** Unlike CNPG/MariaDB/Altinity (one-shot operator-side Backup CRs), the FoundationDB driver creates a `foundationdb.org/FoundationDBBackup` CR that drives a **long-lived** `backup_agent` Deployment streaming continuously to S3. Deleting a Cozystack `Backup` (e.g. via retention sweeping) does NOT stop that Deployment — the agent keeps writing until the next BackupJob's `stopOtherFoundationDBBackups` call swaps it out, until an admin invokes `examples/backups/foundationdb/cleanup.sh`, or until the operator-side CR is deleted by hand. If a tenant deletes their last Cozystack Backup and never submits another BackupJob, the agent pods will continue running indefinitely and accumulate S3 PUTs. This is intentional today (the driver has no RBAC verb to stop the operator-side CR on Cozystack-Backup deletion) but admins should be aware of it. + +## ClickHouse: opt-in to the system bucket + +The `clickhouse-backup` sidecar runs inside the ClickHouse Pod itself, so the Helm chart is what wires its S3 credentials. Existing tenants on the legacy `backup.s3*` values continue to work unchanged. To switch a release onto the platform bucket, set: + +```yaml +backup: + enabled: true + useSystemBucket: true +``` + +When `useSystemBucket: true`: + +- The chart-emitted `-backup-s3` Secret is no longer rendered. +- The sidecar consumes `cozy-backups-creds` (projected by the platform). +- `S3_PATH` is set to `/` so two tenants with the same ClickHouse release name never share a prefix. + +`s3Region`, `s3Bucket`, `endpoint`, `s3AccessKey`, `s3SecretKey`, and `s3CredentialsSecret` are ignored in this mode. 
+ +## Inspecting the defaults + +```bash +kubectl get backupclasses +kubectl get backupclass cozy-default -o yaml +kubectl -n tenant-root get bucket cozy-backups +kubectl -n tenant-root get secret bucket-cozy-backups-system-credentials +kubectl -n cozy-velero get backupstoragelocation cozy-default +``` + +The bucket lives in `tenant-root` and is provisioned through the `apps.cozystack.io/Bucket` CR. The system-managed credentials Secret never leaves that namespace. The backupstrategy-controller projects a copy under the name `cozy-backups-creds` into a tenant namespace right before each BackupJob runs, and refreshes the same Secret in `cozy-velero` (and any other namespace listed in `backupStorage.systemNamespaces`) on a 1-minute tick. The projected Secret carries multiple key formats so each driver finds what it needs in one place: + +| Key | Consumer | +|-----------------------------------------------|-------------------------------------------| +| `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` | CNPG, MariaDB, Etcd | +| `accessKey` / `secretKey` (plus `bucketName`, `endpoint`, `region`) | ClickHouse sidecar | +| `cloud` | Velero (AWS credentials file format) | +| `blob_credentials.json` | FoundationDB backup_agent | + +### Bootstrap window + +On a fresh-cluster install, the Velero `BackupStorageLocation` `cozy-default` is rendered before the credentials projector has had a chance to copy `cozy-backups-creds` into `cozy-velero`. The BSL reports `Unavailable` until the projector's first synchronous round completes (which happens immediately when the `backupstrategy-controller` Pod becomes Ready — typically tens of seconds after `helm install` returns, not minutes). Velero rejects new `Backup` AND `Restore` requests against `storageLocation: cozy-default` during that window. Plan VM backup automation accordingly, or wait for `kubectl -n cozy-velero get bsl cozy-default -o jsonpath='{.status.phase}' = Available` before submitting backups. 
+ +**Note on controller restarts.** The BSL flickers `Unavailable` on every `backupstrategy-controller` pod restart while the projector replays its first synchronous round. The window is short (single-digit seconds) but operators who alert on BSL availability should suppress alerts during the controller's `kube_pod_container_status_restarts_total{container=backupstrategy-controller}` events or use a longer evaluation window than the projector tick (60s). + +### Cozy-default Bucket bootstrap + +`cozy-default` ships an `apps.cozystack.io/Bucket cozy-backups` CR in `tenant-root`, which the bucket-application chart turns into a `BucketClaim`; the COSI driver then assigns the real S3 bucket name and writes it to the BucketClaim's `.status.bucketName`. The strategy templates and the Velero BSL all read that real bucket name (Helm `lookup` against the BucketClaim). On a fresh install the BucketClaim takes a short reconcile cycle to populate its status — until it does, the strategy templates render empty and only the `Bucket` CR + `BackupClass` are present in the cluster. Flux re-renders the HelmRelease on its standard interval (default 10 minutes), at which point the populated BucketClaim status causes the missing strategy templates to materialise. + +If you need the BackupClass functional immediately (e.g. an e2e), trigger a Flux reconcile (`flux reconcile helmrelease backupstrategy-controller`) once you see `kubectl get bucketclaim -n tenant-root bucket-cozy-backups -o jsonpath='{.status.bucketName}'` non-empty. + +### Observability + +The credentials projector emits two Prometheus counters labelled by `namespace` (and `reason` for failures): + +- `cozystack_backup_credentials_projection_successes_total` +- `cozystack_backup_credentials_projection_failures_total` + +Alert on `rate(failures_total) > 0` or `absent_over_time(successes_total[10m])` to catch a stale BSL credential or a malformed source Secret without log scraping. 
+ +## Admin overrides for `cozy-default` + +`cozy-default` is rendered by the `backupstrategy-controller` chart and owned by Flux's helm-controller. **Direct `kubectl edit backupclass cozy-default` is overwritten on the next helm reconcile** — the same applies to its companion `strategy.backups.cozystack.io/*` CRs (`cozy-default-cnpg`, `cozy-default-etcd`, `cozy-default-mariadb`, `cozy-default-altinity`, `cozy-default-foundationdb`, the two `cozy-default-velero-*`). The supported override path is the cozystack `Package` CR, which lets admins inject Helm values into platform components: + +```yaml +apiVersion: cozystack.io/v1alpha1 +kind: Package +metadata: + name: cozystack.cozystack-platform +spec: + components: + backupstrategy-controller: + values: + backupStorage: + provisionBucket: true # default; set false for external S3 + bucketName: cozy-backups # apps.cozystack.io/Bucket release name + endpoint: http://seaweedfs-s3.tenant-root.svc.cozy.local:8333 + region: us-east-1 + forcePathStyle: true + systemSecretName: bucket-cozy-backups-system-credentials + systemNamespaces: + - cozy-velero +``` + +| Knob | Effect | +|---|---| +| `provisionBucket` | Toggle creation of the in-cluster `apps.cozystack.io/Bucket` CR. Set `false` for external S3 (see [Disabling the platform-managed bucket](#disabling-the-platform-managed-bucket)). | +| `bucketName` | K8s name of the Bucket CR + lookup key for the COSI BucketClaim. The actual S3 bucket name is the COSI-assigned UUID, surfaced through `BucketClaim.status.bucketName`. | +| `bucketNameOverride` | Escape hatch for offline `helm template` renders — bypasses the live-cluster BucketClaim lookup. Leave empty in production. | +| `endpoint` | S3 endpoint baked into every default strategy CR + the Velero BSL. Switching to `https://` silently enables TLS in the MariaDB strategy — ensure the CA bundle is reachable to the relevant operator/driver Pods before flipping it. 
| +| `region` | Re-projected into `cozy-backups-creds` on the next reconcile. Pod-restart required for chart-emitted clients consuming the region via env (ClickHouse sidecar today). | +| `forcePathStyle` | Path-style addressing; SeaweedFS S3 requires it, AWS S3 typically doesn't. | +| `systemSecretName` | Name of the human-friendly Secret produced by the Bucket app (or pre-created manually for external S3). The projector also accepts the raw COSI Secret format. | +| `systemNamespaces` | Namespaces where the controller eagerly projects `cozy-backups-creds` (Velero BSL, FDB operator). Tenants are projected lazily during BackupJob reconcile. | + +When the override needs to go beyond storage coordinates — different retention, different driver→Kind binding, multi-region split — create a **sibling BackupClass** with a unique name (anything but `cozy-default`). Sibling BackupClasses live outside the chart, are admin-owned, and Flux will not touch them. Tenants opt in by setting `backupClassName: ` on their `BackupJob`s. + +## Tuning via a custom BackupClass + +The defaults aim at a reasonable middle (30-day retention, gzip compression where applicable). To override for a specific tenant or workload, create your own `BackupClass` pointing at the same strategy CRs but with tweaked `parameters`, or a fresh strategy CR. Common knobs: + +- **CNPG strategy**: `barmanObjectStore.retentionPolicy`, `data.compression`, `wal.compression`. +- **MariaDB strategy**: `compression`, `maxRetention`, `databases[]`. +- **Altinity strategy**: tune the `clickhouse-backup` sidecar via `backup.*` values on the ClickHouse release; the strategy Pod is a thin HTTP client. +- **FoundationDB strategy**: `snapshotPeriodSeconds`, `agentCount`, `urlParameters[]`. +- **Velero strategy (VMInstance / VMDisk)**: `ttl`, `includedResources[]`, `excludedResources[]`. +- **Etcd strategy**: today the strategy is path-only; combine with `Plan.spec.retentionPolicy` for trim cadence. 
+ +The system-managed credentials Secret is the **only** way for in-cluster strategies to reach `cozy-backups`. Do not embed access keys in `BackupClass.parameters` — the security model relies on Secret references, and `parameters` end up in `Backup.status.underlyingResources`, which tenants can read. + +## Disabling the platform-managed bucket + +If a deployment runs against an external S3 (no SeaweedFS), set `backupStorage.provisionBucket: false` in the `backupstrategy-controller` values and create the source credentials Secret in `tenant-root` manually (flat-key format: `accessKey` / `secretKey` / `endpoint` / `bucketName`; or the raw COSI `BucketInfo` JSON). Update `backupStorage.endpoint`, `backupStorage.region`, and (for VM backups) the chart's Velero BSL settings to point at the external S3. + +## Upgrade notes from chart-managed backups + +> **Postgres `backup.enabled: true` with placeholder credentials no longer renders `barmanObjectStore` on upgrade.** +> +> The pre-v1.4 defaults for `backup.s3AccessKey` / `backup.s3SecretKey` in `packages/apps/postgres/values.yaml` were the literal `""` / `""` placeholders, so the Postgres chart still rendered `spec.backup.barmanObjectStore` on the `cnpg.io/Cluster` (with junk credentials, `archive_command` failing at runtime). After v1.4 those defaults are empty strings and the chart NO LONGER renders the backup block at all when the placeholders are unmodified. Tenants on the legacy chart-managed flow who relied on those placeholders see their `barmanObjectStore` disappear from the live `Cluster` on `helm upgrade`. Action — pick one: +> +> - **Move to the platform flow (recommended).** Set `backup.useSystemBucket: true`; the chart leaves `barmanObjectStore` unset and the CNPG backup driver SSA-patches it onto the live `Cluster` at first BackupJob time. No tenant-side keys required. 
+> - **Stay on the legacy chart-managed flow.** Supply real `backup.s3AccessKey` / `backup.s3SecretKey` (or a pre-existing `backup.s3CredentialsSecret.name`); the chart renders `barmanObjectStore` exactly as before. +> +> The same `useSystemBucket` opt-in applies to ClickHouse — see [ClickHouse: opt-in to the system bucket](#clickhouse-opt-in-to-the-system-bucket). When `useSystemBucket: true` is set on ClickHouse, the legacy `-backup` CronJob, credential Secret, and backup script are no longer rendered (they are mutually exclusive with the platform flow); migrate scheduled backups to a `backups.cozystack.io/Plan` against `cozy-default`. + +## Tenant workflow + +Tenants only ever see the BackupClass name. Typical apply: + +```yaml +apiVersion: backups.cozystack.io/v1alpha1 +kind: BackupJob +metadata: + name: ad-hoc + namespace: tenant-acme +spec: + backupClassName: cozy-default + applicationRef: + apiGroup: apps.cozystack.io + kind: Postgres + name: orders-db +``` diff --git a/content/en/docs/next/operations/services/managed-app-backup-configuration.md b/content/en/docs/next/operations/services/managed-app-backup-configuration.md deleted file mode 100644 index 77333722..00000000 --- a/content/en/docs/next/operations/services/managed-app-backup-configuration.md +++ /dev/null @@ -1,337 +0,0 @@ ---- -title: "Managed Application Backup Configuration" -linkTitle: "Managed Application Backup Configuration" -description: "Configure strategies and BackupClasses for logical data backups of managed databases (Postgres, MariaDB, ClickHouse)." -weight: 31 ---- - -This guide is for **cluster administrators** who configure backup strategies for Cozystack-managed database applications: Postgres, MariaDB, and ClickHouse. Once strategies and `BackupClass` resources are in place, tenants run backups and restores by creating [BackupJob, Plan, and RestoreJob]({{% ref "/docs/next/applications/backup-and-recovery" %}}) resources with no further admin action. 
- -{{% alert color="info" %}} -This page covers **data-only** backups driven by each operator's native backup mechanism (CloudNativePG barman, mariadb-operator dumps, Altinity `clickhouse-backup`). The `apps.cozystack.io/*` CR, its `HelmRelease`, chart values, and operator-managed Secrets are **not** captured by these strategies. - -For backups that bundle Helm release + CRs + PVC snapshots (used by VMInstance / VMDisk), see [Velero Backup Configuration]({{% ref "/docs/next/operations/services/velero-backup-configuration" %}}). -{{% /alert %}} - -## Prerequisites - -- Administrator access to the Cozystack (management) cluster. -- The `backup-controller` and `backupstrategy-controller` components are installed and running. -- S3-compatible storage reachable from the management cluster — either the in-cluster SeaweedFS provisioned via the `Bucket` application, or any external S3 endpoint. -- The corresponding upstream operator is deployed for each application Kind you want to back up: CloudNativePG, mariadb-operator, or ClickHouse operator. These ship with Cozystack by default. - -## How a managed-application strategy works - -The flow on every `BackupJob`: - -1. A tenant creates a `BackupJob` (or a `Plan` that materialises one on a cron) that references a `BackupClass` and an `apps.cozystack.io/` application. -2. The core backup controller resolves the `BackupClass` and matches the application Kind to a driver-specific `strategy.backups.cozystack.io/` strategy. -3. The driver renders its strategy template against the live application object (`.Application`) and the BackupClass parameters (`.Parameters`), then creates the operator-native backup CR (`Backup` for mariadb, an HTTP call against the in-pod sidecar for ClickHouse, a barman-driven snapshot in `cnpg.io` for Postgres). -4. On success the driver creates a Cozystack `Backup` artefact in the same namespace; `RestoreJob` resources reference that artefact later. 
- -`BackupClass` is **cluster-scoped**: a single instance covers every tenant namespace. - -{{% alert color="info" %}} -Tenant users cannot list `BackupClass` resources under their kubeconfig (cluster-scoped resources are not reachable through the tenant `RoleBinding`). Once you create a `BackupClass`, **publish its name to tenants out-of-band** — in the platform handbook, in the ticket that onboards their application, or in your internal Slack channel. Tenants reference the name verbatim in `BackupJob.spec.backupClassName`. -{{% /alert %}} - -## Per-driver setup - -The strategies below are written for the in-cluster SeaweedFS `Bucket` application. If you use external S3 storage, drop the `endpointCA` / TLS sections and point the endpoint at your provider. - -### Postgres (CNPG strategy) - -The CNPG driver delegates to CloudNativePG's native barman backup. Each `BackupJob` is a barman snapshot streamed to S3; `RestoreJob` recreates the `cnpg.io/Cluster` from the archive. - -Create the strategy: - -```yaml -apiVersion: strategy.backups.cozystack.io/v1alpha1 -kind: CNPG -metadata: - name: postgres-data-cnpg-strategy -spec: - template: - serverName: "{{ .Application.metadata.name }}" - barmanObjectStore: - destinationPath: "s3://REPLACE_WITH_COSI_BUCKET_NAME/{{ .Application.metadata.name }}/" - endpointURL: "https://REPLACE_WITH_S3_ENDPOINT" - retentionPolicy: "30d" - endpointCA: - secretRef: - name: "{{ .Application.metadata.name }}-cnpg-backup-ca" - key: "ca.crt" - s3Credentials: - secretRef: - name: "{{ .Application.metadata.name }}-cnpg-backup-creds" - data: - compression: gzip - wal: - compression: gzip -``` - -Bind the application Kind: - -```yaml -apiVersion: backups.cozystack.io/v1alpha1 -kind: BackupClass -metadata: - name: postgres-data-backup -spec: - strategies: - - application: - apiGroup: apps.cozystack.io - kind: Postgres - strategyRef: - apiGroup: strategy.backups.cozystack.io - kind: CNPG - name: postgres-data-cnpg-strategy -``` - -Per-application 
Secrets the tenant must provision in the application namespace: - -| Secret | Keys | Purpose | -|---|---|---| -| `-cnpg-backup-creds` | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` | S3 credentials consumed by barman | -| `-cnpg-backup-ca` *(only for self-signed endpoints)* | `ca.crt` | CA bundle the barman client trusts | - -Drop the `endpointCA` block in the strategy when your S3 endpoint has a publicly-trusted certificate. - -### MariaDB - -The MariaDB driver delegates to [mariadb-operator](https://github.com/mariadb-operator/mariadb-operator). Backups materialise as `k8s.mariadb.com/v1alpha1 Backup` CRs (logical `mariadb-dump`); restores materialise as `Restore` CRs that `mariadb-import` the dump back into the live database. - -Create the strategy: - -```yaml -apiVersion: strategy.backups.cozystack.io/v1alpha1 -kind: MariaDB -metadata: - name: mariadb-data-strategy -spec: - template: - storage: - s3: - bucket: "REPLACE_WITH_COSI_BUCKET_NAME" - endpoint: "REPLACE_WITH_S3_ENDPOINT" - prefix: "{{ .Application.metadata.name }}/" - accessKeyIdSecretKeyRef: - name: "{{ .Application.metadata.name }}-mariadb-backup-creds" - key: "AWS_ACCESS_KEY_ID" - secretAccessKeySecretKeyRef: - name: "{{ .Application.metadata.name }}-mariadb-backup-creds" - key: "AWS_SECRET_ACCESS_KEY" - tls: - enabled: true - caSecretKeyRef: - name: "{{ .Application.metadata.name }}-mariadb-backup-ca" - key: "ca.crt" - compression: gzip -``` - -The `endpoint` is **path-style without scheme** (e.g. `seaweedfs-s3..svc:8333` for the default in-cluster SeaweedFS — substitute the namespace where SeaweedFS is deployed in your environment). Drop the `tls` block entirely when the endpoint serves a publicly-trusted certificate. 
- -Bind the application Kind: - -```yaml -apiVersion: backups.cozystack.io/v1alpha1 -kind: BackupClass -metadata: - name: mariadb-data-backup -spec: - strategies: - - application: - apiGroup: apps.cozystack.io - kind: MariaDB - strategyRef: - apiGroup: strategy.backups.cozystack.io - kind: MariaDB - name: mariadb-data-strategy -``` - -Per-application Secrets the tenant must provision in the application namespace: - -| Secret | Keys | Purpose | -|---|---|---| -| `-mariadb-backup-creds` | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` | S3 credentials consumed by mariadb-operator | -| `-mariadb-backup-ca` *(only for self-signed endpoints)* | `ca.crt` | CA bundle for TLS verification | - -{{% alert color="info" %}} -The chart-level `backup.*` block in `apps.cozystack.io/MariaDB` (the legacy `mariadb-dump` + `restic` path) is **deprecated** in favour of this BackupClass flow. Existing tenants with `backup.enabled=true` continue to render the legacy resources unchanged. -{{% /alert %}} - -### ClickHouse (Altinity strategy) - -The Altinity driver does **not** template a backup CR. It renders a small `PodTemplateSpec` that runs `curl + jq` against the in-pod [`clickhouse-backup`](https://github.com/Altinity/clickhouse-backup) HTTP API (port 7171) provided by a sidecar inside every `chi-*` Pod. - -{{% alert color="warning" %}} -The Altinity strategy **requires** `backup.enabled=true` on every ClickHouse application instance — that flag is what materialises the in-pod sidecar and the `clickhouse--backup-api-auth` Secret the strategy authenticates with. Unlike MariaDB, ClickHouse's chart-level `backup.*` block is **not** deprecated; the BackupClass flow piggybacks on the same sidecar. -{{% /alert %}} - -Create the strategy. 
The `template` is a `PodTemplateSpec` driving the sidecar; for the full reference template (with the shell script that POSTs `create_remote` / `restore_remote` and polls the action log) see [`examples/backups/clickhouse/01-create-strategy.sh`](https://github.com/cozystack/cozystack/blob/main/examples/backups/clickhouse/01-create-strategy.sh) in the cozystack repo. - -```yaml -apiVersion: strategy.backups.cozystack.io/v1alpha1 -kind: Altinity -metadata: - name: clickhouse-data-altinity-strategy -spec: - template: - spec: - restartPolicy: Never - containers: - - name: ch-backup-client - image: alpine:3.19 - env: - - name: API_USERNAME - valueFrom: - secretKeyRef: - name: clickhouse-{{ .Release.Name }}-backup-api-auth - key: username - - name: API_PASSWORD - valueFrom: - secretKeyRef: - name: clickhouse-{{ .Release.Name }}-backup-api-auth - key: password - command: ["/bin/sh", "-c"] - args: - # See examples/backups/clickhouse/01-create-strategy.sh for the - # full script: branches on .Mode (backup|restore) and either - # POSTs /backup/create_remote or /backup/restore_remote/, - # then polls /backup/actions for terminal status. - - | - # ... (truncated; see linked example) -``` - -Bind the application Kind. No parameters are required — the strategy template addresses the sidecar by deterministic Pod DNS and reads S3 credentials from the chart-emitted `-backup-s3` Secret directly. 
- -```yaml -apiVersion: backups.cozystack.io/v1alpha1 -kind: BackupClass -metadata: - name: clickhouse-data-backup -spec: - strategies: - - application: - apiGroup: apps.cozystack.io - kind: ClickHouse - strategyRef: - apiGroup: strategy.backups.cozystack.io - kind: Altinity - name: clickhouse-data-altinity-strategy -``` - -## Apply and verify - -Apply the strategy and `BackupClass` manifests: - -```bash -kubectl apply -f .yaml -kubectl apply -f .yaml -``` - -List the resources: - -```bash -kubectl get cnpgs.strategy.backups.cozystack.io -kubectl get mariadbs.strategy.backups.cozystack.io -kubectl get altinities.strategy.backups.cozystack.io -kubectl get backupclasses -``` - -Each strategy should report no error conditions; each `BackupClass` should list the strategy entries you defined. - -## Tenant onboarding - -Tenant users cannot create `Secret` objects under the standard Cozystack RBAC, and they cannot read `Bucket`-emitted credential Secrets. Before a tenant can run their first `BackupJob`, an administrator must provision per-tenant storage and the per-application credential Secrets each driver expects. Perform these steps once per managed-DB application the tenant wants to back up. Examples use `tenant-user` for the tenant namespace and `my-postgres` / `my-mariadb` / `my-clickhouse` for the application name — substitute as appropriate. - -### Provision the storage Bucket - -If the tenant does not have external S3 coordinates, provision an in-cluster `Bucket` in their namespace: - -```yaml -apiVersion: apps.cozystack.io/v1alpha1 -kind: Bucket -metadata: - name: db-backups - namespace: tenant-user -spec: - users: - backup: - readonly: false -``` - -```bash -kubectl apply -f bucket.yaml -kubectl -n tenant-user wait hr/bucket-db-backups --for=condition=ready --timeout=300s -``` - -The `Bucket` controller materialises a `bucket--backup` Secret in the namespace carrying a `BucketInfo` JSON blob — the S3 endpoint, bucket name, and access keys come from there. 
- -### Read the bucket credentials - -Run this once per shell session. Every per-driver block below reuses `$ACCESS_KEY`, `$SECRET_KEY`, and `/tmp/bucket.json`: - -```bash -kubectl -n tenant-user get secret bucket-db-backups-backup \ - -o jsonpath='{.data.BucketInfo}' | base64 -d > /tmp/bucket.json -ACCESS_KEY=$(jq -r .spec.secretS3.accessKeyID /tmp/bucket.json) -SECRET_KEY=$(jq -r .spec.secretS3.accessSecretKey /tmp/bucket.json) -``` - -### Create per-application credential Secrets - -Each driver expects per-application credential Secrets in the application namespace — the strategy templates reference them by name (`{{ .Application.metadata.name }}-...`). - -#### Postgres (CNPG) - -Project the credentials in the keys CNPG's barman client expects: - -```bash -kubectl -n tenant-user create secret generic my-postgres-cnpg-backup-creds \ - --from-literal=AWS_ACCESS_KEY_ID="$ACCESS_KEY" \ - --from-literal=AWS_SECRET_ACCESS_KEY="$SECRET_KEY" -``` - -When the S3 endpoint uses a self-signed certificate (the SeaweedFS default), also create a CA Secret: - -```bash -kubectl -n tenant-user create secret generic my-postgres-cnpg-backup-ca \ - --from-file=ca.crt=/path/to/ca.crt -``` - -#### MariaDB - -```bash -kubectl -n tenant-user create secret generic my-mariadb-mariadb-backup-creds \ - --from-literal=AWS_ACCESS_KEY_ID="$ACCESS_KEY" \ - --from-literal=AWS_SECRET_ACCESS_KEY="$SECRET_KEY" -``` - -For self-signed endpoints, add `my-mariadb-mariadb-backup-ca` carrying `ca.crt` the same way. - -#### ClickHouse - -No extra Secret is needed for the BackupClass flow. The Altinity strategy reads S3 credentials from the chart-emitted `-backup-s3` Secret directly. Make sure `backup.enabled: true` is set on every ClickHouse application instance the tenant wants to back up, and that the `backup.*` block in the application values carries the bucket coordinates (see the [ClickHouse application reference]({{% ref "/docs/next/applications/clickhouse" %}})). 
- -## Handing off to tenants - -Tenants run backups and restores against the `BackupClass` names you created above using `BackupJob`, `Plan`, and `RestoreJob` resources. Walk them through the [Application Backup and Recovery]({{% ref "/docs/next/applications/backup-and-recovery" %}}) guide; they do not need admin permissions to operate against an existing `BackupClass`. Before pointing them at the guide: - -- Communicate the available `BackupClass` names (tenants cannot list them — cluster-scoped resources are not reachable through the tenant `RoleBinding`). -- Ensure that for every managed application the tenant wants to back up, the per-application credential Secret described in [Tenant onboarding](#tenant-onboarding) already exists in their namespace. - -## Tenant escalation: driver-side diagnostics - -When a tenant's `BackupJob` or `RestoreJob` ends in `phase: Failed` and the `status.message` does not pinpoint the cause, the tenant cannot inspect operator-native CRs themselves — their RBAC excludes `cnpg.io`, `k8s.mariadb.com`, and the `pods/log` subresource. Run these commands on their behalf, using the `BackupJob` name they hand you: - -```bash -# Postgres (CloudNativePG) -kubectl -n tenant-user get backups.cnpg.io -# MariaDB -kubectl -n tenant-user get backups.k8s.mariadb.com,restores.k8s.mariadb.com -# ClickHouse — the strategy runs as a one-shot Pod that talks to the in-pod sidecar -kubectl -n tenant-user logs -l backups.cozystack.io/owned-by.BackupJobName=my-clickhouse-adhoc -``` - -For ClickHouse archive purges, the tenant cannot reach the in-pod `clickhouse-backup` sidecar HTTP API directly; on their request, exec into the ClickHouse pod and call `DELETE /backup//remote` against the local sidecar (the chart-emitted `clickhouse--backup-api-auth` Secret carries the credentials). 
diff --git a/content/en/docs/next/operations/services/velero-backup-configuration.md b/content/en/docs/next/operations/services/velero-backup-configuration.md deleted file mode 100644 index 6b59fc5d..00000000 --- a/content/en/docs/next/operations/services/velero-backup-configuration.md +++ /dev/null @@ -1,319 +0,0 @@ ---- -title: "Velero Backup Configuration" -linkTitle: "Velero Backup Configuration" -description: "Configure backup storage, strategies, and BackupClasses for cluster backups (for cluster administrators)." -weight: 30 ---- - -This guide is for **cluster administrators** who configure the backup infrastructure in Cozystack: S3 storage, Velero locations, backup **strategies**, and **BackupClasses**. Tenant users then use existing BackupClasses to create [BackupJobs and Plans]({{% ref "/docs/next/virtualization/backup-and-recovery" %}}). - -{{% alert color="info" %}} -This page covers **Velero-driven** backups that bundle the application HelmRelease, CRs, and PVC snapshots — the model used for VMInstance / VMDisk. For data-only backups of managed databases (Postgres, MariaDB, ClickHouse, FoundationDB) driven by each operator's native mechanism, see [Managed Application Backup Configuration]({{% ref "/docs/next/operations/services/managed-app-backup-configuration" %}}). -{{% /alert %}} - -## Prerequisites - -- Administrator access to the Cozystack (management) cluster. -- S3-compatible storage: if you want to store backups in Cozy you need enable SeaweedFS and create a Bucket or can use another external S3 service. -- Enable disabled by default component `cozystack.velero` in `bundles.enabledPackages` of the [Platform Package]({{% ref "/docs/next/operations/configuration/platform-package" %}}). And for **tenant clusters**, set `spec.addons.velero.enabled` to `true` in the `Kubernetes` resource. - -## 1. 
Set up storage credentials and configuration - -Create the following resources in the **management cluster** in the `cozy-velero` namespace so that Velero can store backups and volume snapshots. - -### 1.1 Create a secret with S3 credentials - -```yaml -apiVersion: v1 -kind: Secret -metadata: - name: s3-credentials - namespace: cozy-velero -type: Opaque -stringData: - cloud: | - [default] - aws_access_key_id= - aws_secret_access_key= - - services = seaweed-s3 - [services seaweed-s3] - s3 = - endpoint_url = https://s3.tenant-name.cozystack.example.com -``` - -### 1.2 Configure BackupStorageLocation - -This resource defines where Velero stores backups (S3 bucket). - -```yaml -apiVersion: velero.io/v1 -kind: BackupStorageLocation -metadata: - name: default - namespace: cozy-velero -spec: - provider: aws - objectStorage: - bucket: - config: - checksumAlgorithm: '' - profile: "default" - s3ForcePathStyle: "true" - s3Url: https://s3.tenant-name.cozystack.example.com - credential: - name: s3-credentials - key: cloud -``` - -`BUCKET_NAME` can be found with: -```bash -kubectl get bucketclaim -A -o custom-columns=NAME:.metadata.name,NAMESPACE:.metadata.namespace,BUCKET_NAME:.status.bucketName,READY:.status.bucketReady -``` - -See [BackupStorageLocation](https://velero.io/docs/v1.17/api-types/backupstoragelocation/) in the Velero docs. - -Check that creation was successful: -```bash -k get BackupStorageLocation -n cozy-velero -``` - -Output should be similar to: -```bash -NAME PHASE LAST VALIDATED AGE DEFAULT -default Available 5s 3d9h true -``` - -### 1.3 Configure VolumeSnapshotLocation - -This resource defines the configuration for volume snapshots. 
- -```yaml -apiVersion: velero.io/v1 -kind: VolumeSnapshotLocation -metadata: - name: default - namespace: cozy-velero -spec: - provider: aws - credential: - name: s3-credentials - key: cloud - config: - region: "us-west-2" - profile: "default" -``` - -See [VolumeSnapshotLocation](https://velero.io/docs/v1.17/api-types/volumesnapshotlocation/) in the Velero docs. - -## 2. Define a backup strategy - -A **strategy** describes [Velero Backup](https://velero.io/docs/v1.17/api-types/backup/) template. It is a reusable template referenced by BackupClasses. - -In a strategy you define: - -- **Scope**: namespaces and resources (e.g. a tenant namespace or resources by label). -- **Volume handling**: whether to snapshot volumes and use `snapshotMoveData`. -- **Retention**: default backup TTL. - -Check the CRD group, version, and kind in your cluster: - -```bash -kubectl get crd | grep -i backup -kubectl explain --recursive -``` - -Example strategy for VMInstance (includes all VM resources and attached volumes): - -```yaml -apiVersion: strategy.backups.cozystack.io/v1alpha1 -kind: Velero -metadata: - name: vminstance-strategy -spec: - template: - restoreSpec: - existingResourcePolicy: update - includedNamespaces: - - '{{ .Application.metadata.namespace }}' - orLabelSelectors: - - matchLabels: - app.kubernetes.io/instance: 'vm-instance-{{ .Application.metadata.name }}' - - matchLabels: - apps.cozystack.io/application.kind: '{{ .Application.kind }}' - apps.cozystack.io/application.name: '{{ .Application.metadata.name }}' - includedResources: - - helmreleases.helm.toolkit.fluxcd.io - - virtualmachines.kubevirt.io - - virtualmachineinstances.kubevirt.io - - pods - - persistentvolumeclaims - - configmaps - - secrets - - controllerrevisions.apps - includeClusterResources: false - excludedResources: - - datavolumes.cdi.kubevirt.io - - spec: - includedNamespaces: - - '{{ .Application.metadata.namespace }}' - orLabelSelectors: - - matchLabels: - app.kubernetes.io/instance: 
'vm-instance-{{ .Application.metadata.name }}' - - matchLabels: - apps.cozystack.io/application.kind: '{{ .Application.kind }}' - apps.cozystack.io/application.name: '{{ .Application.metadata.name }}' - includedResources: - - helmreleases.helm.toolkit.fluxcd.io - - virtualmachines.kubevirt.io - - virtualmachineinstances.kubevirt.io - - pods - - datavolumes.cdi.kubevirt.io - - persistentvolumeclaims - - configmaps - - secrets - - controllerrevisions.apps - includeClusterResources: false - storageLocation: '{{ .Parameters.backupStorageLocationName }}' - volumeSnapshotLocations: - - '{{ .Parameters.backupStorageLocationName }}' - snapshotVolumes: true - snapshotMoveData: true - ttl: 720h0m0s - itemOperationTimeout: 24h0m0s -``` - -Example strategy for VMDisk (disk and its volume only): - -```yaml -apiVersion: strategy.backups.cozystack.io/v1alpha1 -kind: Velero -metadata: - name: vmdisk-strategy -spec: - template: - restoreSpec: - existingResourcePolicy: update - includedNamespaces: - - '{{ .Application.metadata.namespace }}' - orLabelSelectors: - - matchLabels: - app.kubernetes.io/instance: 'vm-disk-{{ .Application.metadata.name }}' - - matchLabels: - apps.cozystack.io/application.kind: '{{ .Application.kind }}' - apps.cozystack.io/application.name: '{{ .Application.metadata.name }}' - includedResources: - - helmreleases.helm.toolkit.fluxcd.io - - persistentvolumeclaims - - configmaps - includeClusterResources: false - - spec: - includedNamespaces: - - '{{ .Application.metadata.namespace }}' - orLabelSelectors: - - matchLabels: - app.kubernetes.io/instance: 'vm-disk-{{ .Application.metadata.name }}' - - matchLabels: - apps.cozystack.io/application.kind: '{{ .Application.kind }}' - apps.cozystack.io/application.name: '{{ .Application.metadata.name }}' - includedResources: - - helmreleases.helm.toolkit.fluxcd.io - - persistentvolumeclaims - - configmaps - includeClusterResources: false - storageLocation: '{{ .Parameters.backupStorageLocationName }}' - 
volumeSnapshotLocations: - - '{{ .Parameters.backupStorageLocationName }}' - snapshotVolumes: true - snapshotMoveData: true - ttl: 720h0m0s - itemOperationTimeout: 24h0m0s -``` - -Template variables (`{{ .Application.* }}` and `{{ .Parameters.* }}`) are resolved from the ApplicationRef in the BackupJob/Plan and the parameters defined in the BackupClass. - -Don't forget to apply it into management cluster: - -```bash -kubectl apply -f velero-backup-strategy.yaml -``` - -## 3. Create a BackupClass - -A **BackupClass** binds a strategy to applications, you can define some Parameters - -Verify the BackupClass CRD in your cluster: - -```bash -kubectl get backupclasses -kubectl explain backupclasses.spec --recursive -``` - -```yaml -apiVersion: backups.cozystack.io/v1alpha1 -kind: BackupClass -metadata: - name: velero -spec: - strategies: - - strategyRef: - apiGroup: strategy.backups.cozystack.io - kind: Velero - name: vminstance-strategy - application: - kind: VMInstance - apiGroup: apps.cozystack.io - parameters: - backupStorageLocationName: default - - strategyRef: - apiGroup: strategy.backups.cozystack.io - kind: Velero - name: vmdisk-strategy - application: - kind: VMDisk - apiGroup: apps.cozystack.io - parameters: - backupStorageLocationName: default -``` - -Apply and list: - -```bash -kubectl apply -f backupclass.yaml -kubectl get backupclasses -``` - -## 4. How users run backups - -Once strategies and BackupClasses are in place, **tenant users** can run backups without touching Velero or storage configuration: - -- **One-off backup**: create a [BackupJob]({{% ref "/docs/next/virtualization/backup-and-recovery#one-off-backup" %}}) that references a BackupClass. -- **Scheduled backups**: create a [Plan]({{% ref "/docs/next/virtualization/backup-and-recovery#scheduled-backup" %}}) with a cron schedule and a BackupClass reference. 
- -Direct use of Velero CRDs (`Backup`, `Schedule`, `Restore`) remains available for advanced or recovery scenarios: - -```bash -kubectl get backup.velero.io -n cozy-velero -kubectl get schedule.velero.io -n cozy-velero -kubectl get restores.velero.io -n cozy-velero -``` - -If the [Velero CLI](https://velero.io/docs/v1.17/basic-install/#install-the-cli) is installed, you can also run: - -```bash -velero -n cozy-velero backup get -velero -n cozy-velero schedule get -velero -n cozy-velero restore get -``` - -To inspect the Velero logs, use the following command: - -```bash -kubectl logs -n cozy-velero -l app.kubernetes.io/name=velero --tail=100 -``` - -## 5. Restore from a backup - -Once strategies and BackupClasses are in place, tenant users can restore from a backup using **RestoreJob** resources. See the [Backup and Recovery]({{% ref "/docs/next/virtualization/backup-and-recovery" %}}) guide for restore instructions covering VMInstance and VMDisk in-place restores. diff --git a/content/en/docs/next/virtualization/backup-and-recovery.md b/content/en/docs/next/virtualization/backup-and-recovery.md index 3f43c300..4821c378 100644 --- a/content/en/docs/next/virtualization/backup-and-recovery.md +++ b/content/en/docs/next/virtualization/backup-and-recovery.md @@ -8,7 +8,7 @@ aliases: - /docs/next/kubernetes/backup-and-recovery --- -Cluster backup **strategies** and **BackupClasses** are configured by cluster administrators. If your tenant does not have a BackupClass yet, ask your administrator to follow the [Velero Backup Configuration]({{% ref "/docs/next/operations/services/velero-backup-configuration" %}}) guide to set up storage, strategies, and BackupClasses. +Cluster backup **strategies** and **BackupClasses** are configured by cluster administrators. 
If your tenant does not have a BackupClass yet, ask your administrator to follow the [Backup Classes]({{% ref "/docs/next/operations/services/backup-classes" %}}) guide to set up storage, strategies, and BackupClasses. This guide covers backing up and restoring **VMInstance** and **VMDisk** resources as a tenant user: running one-off and scheduled backups, checking backup status, and restoring from a backup using RestoreJobs. @@ -35,11 +35,15 @@ kubectl get backupclasses Example output: ``` -NAME AGE -velero 14m +NAME AGE +cozy-default 14m ``` -Use the BackupClass name when creating a BackupJob or Plan. +`cozy-default` is the platform-shipped BackupClass; its `strategies[]` array binds the Velero driver for both `VMInstance` and `VMDisk`. Use this name when creating a BackupJob or Plan, or substitute a sibling class name if your administrator has created one. + +{{% alert color="info" %}} +**Fresh-cluster bootstrap window.** On a fresh-cluster install, the Velero `BackupStorageLocation` `cozy-default` reports `Unavailable` for tens of seconds after `helm install` returns, until the platform's credentials projector lands `cozy-backups-creds` into `cozy-velero`. Velero rejects new `Backup` and `Restore` requests against `storageLocation: cozy-default` during that window. If a BackupJob you submit fails immediately with a Velero error referencing storage, wait and retry, or ask your administrator to confirm that `kubectl -n cozy-velero get bsl cozy-default -o jsonpath='{.status.phase}'` prints `Available`. See the [Backup Classes admin guide]({{% ref "/docs/next/operations/services/backup-classes" %}}) for details.
+{{% /alert %}} ## Back up a VMInstance @@ -60,7 +64,7 @@ spec: apiGroup: apps.cozystack.io kind: VMInstance name: my-vm - backupClassName: velero + backupClassName: cozy-default ``` Apply it and watch the status: @@ -88,7 +92,7 @@ spec: apiGroup: apps.cozystack.io kind: VMInstance name: my-vm - backupClassName: velero + backupClassName: cozy-default schedule: cron: "0 2 * * *" # Every day at 02:00 ``` @@ -108,7 +112,7 @@ Each scheduled run creates a BackupJob (and, on success, a Backup object) named You can back up a VMDisk independently — for example, to capture a specific disk without the VM configuration. {{% alert color="info" %}} -The BackupClass must include a strategy for `VMDisk`. Ask your administrator to add one if it is missing (see [Velero Backup Configuration]({{% ref "/docs/next/operations/services/velero-backup-configuration" %}})). +The BackupClass must include a strategy for `VMDisk`. Ask your administrator to add one if it is missing (see [Backup Classes]({{% ref "/docs/next/operations/services/backup-classes" %}})). {{% /alert %}} ```yaml @@ -122,7 +126,7 @@ spec: apiGroup: apps.cozystack.io kind: VMDisk name: my-disk - backupClassName: velero + backupClassName: cozy-default ``` Apply and check status: diff --git a/content/en/docs/v1.4/applications/backup-and-recovery.md b/content/en/docs/v1.4/applications/backup-and-recovery.md index 610289dd..cc472743 100644 --- a/content/en/docs/v1.4/applications/backup-and-recovery.md +++ b/content/en/docs/v1.4/applications/backup-and-recovery.md @@ -8,12 +8,11 @@ weight: 4 This guide covers backing up and restoring **Cozystack-managed databases** — Postgres, MariaDB, and ClickHouse — as a tenant user: running one-off and scheduled backups, checking status, and restoring from a backup either in place or into a separate target instance. 
{{% alert color="info" %}} -**Storage, credentials, and the `BackupClass` are admin-provisioned.** Before you can run a `BackupJob`, an administrator provisions the S3 storage and the per-application credential Secrets your driver expects, and creates the cluster-scoped `BackupClass` you reference. Ask your administrator for: +**Storage and the `BackupClass` are platform-provisioned.** Cozystack ships a single cluster-scoped `BackupClass` named `cozy-default` that covers Postgres, MariaDB, ClickHouse, Etcd, VMInstance, and VMDisk via a per-Kind `strategies[]` array. You reference it by name from `BackupJob` / `Plan` / `RestoreJob` — there is no per-application `BackupClass`, and you do **not** create or supply S3 credentials, endpoints, or paths. The platform projects a credentials Secret (`cozy-backups-creds`) into your tenant namespace automatically right before each BackupJob runs. -- the `BackupClass` name to use for your application Kind (you cannot list `BackupClass` resources under your tenant kubeconfig — they are cluster-scoped); -- confirmation that the per-application credential Secrets exist in your namespace for every managed-DB application you want to back up. +If your administrator has created additional sibling `BackupClass` resources (different retention, different storage, etc.), ask for the name and substitute it for `cozy-default` in the examples below. `BackupClass` is cluster-scoped, so you cannot list it under a tenant kubeconfig — your administrator will tell you which names are valid. -Admins follow the [Managed Application Backup Configuration]({{% ref "/docs/v1.4/operations/services/managed-app-backup-configuration" %}}) guide. +Admins follow the [Backup Classes]({{% ref "/docs/v1.4/operations/services/backup-classes" %}}) guide. 
{{% /alert %}} {{% alert color="warning" %}} @@ -28,8 +27,8 @@ For backups that include the application's Helm release, CRs, and PVC snapshots ## Prerequisites -- A `BackupClass` name handed to you by your administrator (for example, `postgres-data-backup` for a `Postgres` application). -- An existing managed-DB application (`Postgres`, `MariaDB`, or `ClickHouse`) in your tenant namespace. +- A `BackupClass` name. On a default install this is `cozy-default`, which covers `Postgres`, `MariaDB`, `ClickHouse`, and `Etcd`. If your administrator has created a sibling class, substitute that name everywhere below. +- An existing managed-DB application (`Postgres`, `MariaDB`, `ClickHouse`, or `Etcd`) in your tenant namespace. - `kubectl` and a tenant kubeconfig with the `tenant--admin` role. The examples below assume `tenant-user` for the tenant namespace; substitute your own. @@ -51,7 +50,7 @@ spec: apiGroup: apps.cozystack.io kind: Postgres name: my-postgres - backupClassName: postgres-data-backup + backupClassName: cozy-default ``` ```bash @@ -62,7 +61,7 @@ kubectl -n tenant-user describe backupjob my-postgres-adhoc When the `BackupJob` reaches `phase: Succeeded`, the driver creates a `Backup` object with the same name. That name is what you reference when restoring. -Replace `Postgres` / `postgres-data-backup` with `MariaDB` / `mariadb-data-backup` or `ClickHouse` / `clickhouse-data-backup` for the other drivers. +Replace `Postgres` with `MariaDB`, `ClickHouse`, or `Etcd` for the other drivers — the `BackupClass` (`cozy-default`) is the same; the platform-shipped class binds a strategy for every supported Kind. 
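
As a minimal sketch of that substitution, a one-off backup of a MariaDB application might look like the manifest below. It assumes an application named `my-mariadb` in the `tenant-user` namespace; both names are placeholders, and the `BackupJob` name `my-mariadb-adhoc` is hypothetical. Only the `kind` and `name` under `applicationRef` differ from the Postgres example.

```yaml
apiVersion: backups.cozystack.io/v1alpha1
kind: BackupJob
metadata:
  name: my-mariadb-adhoc         # hypothetical name; the resulting Backup object reuses it
  namespace: tenant-user         # placeholder tenant namespace
spec:
  applicationRef:
    apiGroup: apps.cozystack.io
    kind: MariaDB                # the only Kind change from the Postgres example
    name: my-mariadb             # assumed application name
  backupClassName: cozy-default  # same class for every supported Kind
```

If your administrator has pointed you at a sibling class, only `backupClassName` changes.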
### Scheduled backup @@ -79,7 +78,7 @@ spec: apiGroup: apps.cozystack.io kind: Postgres name: my-postgres - backupClassName: postgres-data-backup + backupClassName: cozy-default schedule: type: cron cron: "0 */6 * * *" # every 6 hours @@ -110,7 +109,7 @@ kubectl -n tenant-user describe backupjob my-postgres-adhoc kubectl -n tenant-user get events --field-selector involvedObject.name=my-postgres-adhoc ``` -If `status.message` does not pinpoint the failure, hand the `BackupJob` name to your administrator and they will inspect the operator-native CR the driver created (see [Tenant escalation: driver-side diagnostics]({{% ref "/docs/v1.4/operations/services/managed-app-backup-configuration#tenant-escalation-driver-side-diagnostics" %}}) in the admin guide). +If `status.message` does not pinpoint the failure, hand the `BackupJob` name to your administrator and they will inspect the operator-native CR the driver created (see [Backup Classes]({{% ref "/docs/v1.4/operations/services/backup-classes" %}}) in the admin guide). ## Restore in place @@ -201,10 +200,10 @@ kubectl -n tenant-user describe backupjob my-postgres-adhoc kubectl -n tenant-user get events --field-selector involvedObject.name=my-postgres-adhoc ``` -If those do not explain the failure, the next layer of diagnostics lives on the operator-native CR the driver created (`cnpg.io/Backup`, `k8s.mariadb.com/Backup`, or the ClickHouse strategy `Pod` logs). These resources are not reachable under the tenant kubeconfig — hand the `BackupJob` name to your administrator and they will follow [Tenant escalation: driver-side diagnostics]({{% ref "/docs/v1.4/operations/services/managed-app-backup-configuration#tenant-escalation-driver-side-diagnostics" %}}). +If those do not explain the failure, the next layer of diagnostics lives on the operator-native CR the driver created (`cnpg.io/Backup`, `k8s.mariadb.com/Backup`, or the ClickHouse strategy `Pod` logs). 
These resources are not reachable under the tenant kubeconfig — hand the `BackupJob` name to your administrator and they will follow [Backup Classes]({{% ref "/docs/v1.4/operations/services/backup-classes" %}}). ## See also -- [Managed Application Backup Configuration]({{% ref "/docs/v1.4/operations/services/managed-app-backup-configuration" %}}) — how administrators define strategies and `BackupClass` resources. +- [Backup Classes]({{% ref "/docs/v1.4/operations/services/backup-classes" %}}) — how administrators define strategies and `BackupClass` resources. - [Backup and Recovery (VMs)]({{% ref "/docs/v1.4/virtualization/backup-and-recovery" %}}) — the parallel guide for VMInstance / VMDisk backups (HelmRelease + CRs + PVC snapshots). -- [Velero Backup Configuration]({{% ref "/docs/v1.4/operations/services/velero-backup-configuration" %}}) — administrator setup for the Velero-driven VM backups. +- [Backup Classes]({{% ref "/docs/v1.4/operations/services/backup-classes" %}}) — administrator setup for the Velero-driven VM backups. diff --git a/content/en/docs/v1.4/kubernetes/backups-with-velero-addon.md b/content/en/docs/v1.4/kubernetes/backups-with-velero-addon.md index 445af080..edeed5e5 100644 --- a/content/en/docs/v1.4/kubernetes/backups-with-velero-addon.md +++ b/content/en/docs/v1.4/kubernetes/backups-with-velero-addon.md @@ -10,7 +10,7 @@ The `velero` addon of the [Managed Kubernetes]({{% ref "/docs/v1.4/kubernetes" % {{% alert color="info" %}} This guide is for the **tenant-side** Velero addon, which runs inside a tenant Kubernetes cluster and is operated by the tenant user. -For the platform-level Velero used by cluster administrators to back up `VMInstance`/`VMDisk` resources from the management cluster, see [Velero Backup Configuration]({{% ref "/docs/v1.4/operations/services/velero-backup-configuration" %}}). 
+For the platform-level Velero used by cluster administrators to back up `VMInstance`/`VMDisk` resources from the management cluster, see [Backup Classes]({{% ref "/docs/v1.4/operations/services/backup-classes" %}}). {{% /alert %}} ## What the addon installs @@ -172,4 +172,4 @@ The same pattern restores into a **different** tenant Kubernetes cluster as well - [Managed Kubernetes — `addons.velero` parameters]({{% ref "/docs/v1.4/kubernetes#parameters" %}}) - [Buckets and Users]({{% ref "/docs/v1.4/operations/services/object-storage/buckets" %}}) -- [Velero Backup Configuration (platform admin)]({{% ref "/docs/v1.4/operations/services/velero-backup-configuration" %}}) +- [Backup Classes (platform admin)]({{% ref "/docs/v1.4/operations/services/backup-classes" %}}) diff --git a/content/en/docs/v1.4/operations/services/backup-classes.md b/content/en/docs/v1.4/operations/services/backup-classes.md new file mode 100644 index 00000000..2d34565f --- /dev/null +++ b/content/en/docs/v1.4/operations/services/backup-classes.md @@ -0,0 +1,196 @@ +--- +title: "Backup Classes" +linkTitle: "Backup Classes" +description: "Default cozy-default BackupClass and the parameters tenants and admins can tune." +weight: 31 +--- + + +Cozystack ships a single platform-managed `BackupClass` named `cozy-default`. It is provisioned automatically when the `backupstrategy-controller` package is installed and references the system-managed S3 bucket `cozy-backups` in the `tenant-root` namespace. + +Tenants reference `cozy-default` from `BackupJob`, `Plan`, and `RestoreJob` resources — they do **not** supply S3 credentials, endpoints, or paths. The platform projects the system-managed credentials Secret into the tenant namespace per BackupJob (or, for long-lived references like Velero's `BackupStorageLocation`, into a fixed list of system namespaces on a periodic tick), and the default strategy templates encode `/` into every S3 path so two tenants with the same application name never collide. 
+ +## Supported applications + +### Bound by `cozy-default` (work out-of-the-box) + +| Application Kind | Driver | Strategy CR | +|----------------------------------|--------------------------------------|----------------------------------------------------------------------------| +| `apps.cozystack.io/Postgres` | CloudNativePG (barman) | `strategy.backups.cozystack.io/CNPG` `cozy-default-cnpg` | +| `apps.cozystack.io/MariaDB` | mariadb-operator dump | `strategy.backups.cozystack.io/MariaDB` `cozy-default-mariadb` | +| `apps.cozystack.io/ClickHouse` | Altinity `clickhouse-backup` sidecar | `strategy.backups.cozystack.io/Altinity` `cozy-default-altinity` | +| `apps.cozystack.io/Etcd` | etcd-operator snapshot | `strategy.backups.cozystack.io/Etcd` `cozy-default-etcd` | +| `apps.cozystack.io/VMInstance` | Velero + kubevirt-velero-plugin | `strategy.backups.cozystack.io/Velero` `cozy-default-velero-vminstance` | +| `apps.cozystack.io/VMDisk` | Velero | `strategy.backups.cozystack.io/Velero` `cozy-default-velero-vmdisk` | + +### Shipped but NOT bound (admin opt-in required) + +| Application Kind | Driver | Strategy CR | +|----------------------------------|--------------------------------------|----------------------------------------------------------------------------| +| `apps.cozystack.io/FoundationDB` | FoundationDB operator backup_agent | `strategy.backups.cozystack.io/FoundationDB` `cozy-default-foundationdb` | + +The FoundationDB strategy CR is rendered by the chart so admins can reference it from a custom BackupClass once the operator-side plumbing (mounting `cozy-backups-creds` into the `cozy-foundationdb-operator` Deployment) is wired manually. See "FoundationDB caveat" below. 
+ +### Endpoint format per driver + +Different operators expect different endpoint shapes; the strategy templates rendered by `backupstrategy-controller` adapt the single `backupStorage.endpoint` value (a full URL like `http://seaweedfs-s3.tenant-root.svc:8333`) to each consumer's contract: + +| Driver | Strategy template field | Form | +|--------|-------------------------|------| +| CNPG (Postgres) | `barmanObjectStore.endpointURL` | full URL (scheme preserved) | +| Etcd | `destination.s3.endpoint` | full URL (scheme preserved) | +| MariaDB | `storage.s3.endpoint` | bare host:port (scheme stripped); `tls.enabled` derived from the scheme | +| FoundationDB | `blobStoreConfiguration.accountName` + `urlParameters.secure_connection` | bare host:port + derived secure flag | +| Velero | `BackupStorageLocation.spec.config.s3Url` | full URL (scheme preserved) | +| ClickHouse sidecar | `S3_ENDPOINT` env | bare host:port (from projected Secret) | + +The projected `cozy-backups-creds.endpoint` key is **stripped of scheme** so chart-emitted sidecars (ClickHouse) consume it directly. Drivers that need the full URL pull from `backupStorage.endpoint` in chart values, not from the Secret. + +VM-driven (Velero) backups land in the same `cozy-backups` bucket under the `velero/` prefix. A `BackupStorageLocation` named `cozy-default` is shipped by the `backupstrategy-controller` chart (`packages/system/backupstrategy-controller/templates/velero-bsl.yaml`) so endpoint/bucket/region come from the same `backupStorage` values block used by Strategy CRs and the projector. + +### FoundationDB caveat + +The strategy CR `cozy-default-foundationdb` is shipped, but it is **not** bound by `cozy-default` yet. Restore runs `fdbrestore` from inside the `cozy-foundationdb-operator` Deployment, which does not yet mount `cozy-backups-creds`. 
Until the operator deployment is updated to mount the projected Secret, FDB platform-default restore silently fails — admins who need it today should keep using a per-app `Bucket` plus a custom `BackupClass`, or wire the credentials file into the operator deployment themselves. + +**Cleanup gotcha (zombie backup_agent).** Unlike CNPG/MariaDB/Altinity (one-shot operator-side Backup CRs), the FoundationDB driver creates a `foundationdb.org/FoundationDBBackup` CR that drives a **long-lived** `backup_agent` Deployment streaming continuously to S3. Deleting a Cozystack `Backup` (e.g. via retention sweeping) does NOT stop that Deployment — the agent keeps writing until the next BackupJob's `stopOtherFoundationDBBackups` call swaps it out, until an admin invokes `examples/backups/foundationdb/cleanup.sh`, or until the operator-side CR is deleted by hand. If a tenant deletes their last Cozystack Backup and never submits another BackupJob, the agent pods will continue running indefinitely and accumulate S3 PUTs. This is intentional today (the driver has no RBAC verb to stop the operator-side CR on Cozystack-Backup deletion) but admins should be aware of it. + +## ClickHouse: opt-in to the system bucket + +The `clickhouse-backup` sidecar runs inside the ClickHouse Pod itself, so the Helm chart is what wires its S3 credentials. Existing tenants on the legacy `backup.s3*` values continue to work unchanged. To switch a release onto the platform bucket, set: + +```yaml +backup: + enabled: true + useSystemBucket: true +``` + +When `useSystemBucket: true`: + +- The chart-emitted `-backup-s3` Secret is no longer rendered. +- The sidecar consumes `cozy-backups-creds` (projected by the platform). +- `S3_PATH` is set to `/` so two tenants with the same ClickHouse release name never share a prefix. + +`s3Region`, `s3Bucket`, `endpoint`, `s3AccessKey`, `s3SecretKey`, and `s3CredentialsSecret` are ignored in this mode. 
+ +## Inspecting the defaults + +```bash +kubectl get backupclasses +kubectl get backupclass cozy-default -o yaml +kubectl -n tenant-root get bucket cozy-backups +kubectl -n tenant-root get secret bucket-cozy-backups-system-credentials +kubectl -n cozy-velero get backupstoragelocation cozy-default +``` + +The bucket lives in `tenant-root` and is provisioned through the `apps.cozystack.io/Bucket` CR. The system-managed credentials Secret never leaves that namespace. The backupstrategy-controller projects a copy under the name `cozy-backups-creds` into a tenant namespace right before each BackupJob runs, and refreshes the same Secret in `cozy-velero` (and any other namespace listed in `backupStorage.systemNamespaces`) on a 1-minute tick. The projected Secret carries multiple key formats so each driver finds what it needs in one place: + +| Key | Consumer | +|-----------------------------------------------|-------------------------------------------| +| `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` | CNPG, MariaDB, Etcd | +| `accessKey` / `secretKey` (plus `bucketName`, `endpoint`, `region`) | ClickHouse sidecar | +| `cloud` | Velero (AWS credentials file format) | +| `blob_credentials.json` | FoundationDB backup_agent | + +### Bootstrap window + +On a fresh-cluster install, the Velero `BackupStorageLocation` `cozy-default` is rendered before the credentials projector has had a chance to copy `cozy-backups-creds` into `cozy-velero`. The BSL reports `Unavailable` until the projector's first synchronous round completes (which happens immediately when the `backupstrategy-controller` Pod becomes Ready — typically tens of seconds after `helm install` returns, not minutes). Velero rejects new `Backup` AND `Restore` requests against `storageLocation: cozy-default` during that window. Plan VM backup automation accordingly, or wait for `kubectl -n cozy-velero get bsl cozy-default -o jsonpath='{.status.phase}' = Available` before submitting backups. 
+ +**Note on controller restarts.** The BSL flickers `Unavailable` on every `backupstrategy-controller` pod restart while the projector replays its first synchronous round. The window is short (single-digit seconds) but operators who alert on BSL availability should suppress alerts during the controller's `kube_pod_container_status_restarts_total{container=backupstrategy-controller}` events or use a longer evaluation window than the projector tick (60s). + +### Cozy-default Bucket bootstrap + +`cozy-default` ships an `apps.cozystack.io/Bucket cozy-backups` CR in `tenant-root`, which the bucket-application chart turns into a `BucketClaim`; the COSI driver then assigns the real S3 bucket name and writes it to the BucketClaim's `.status.bucketName`. The strategy templates and the Velero BSL all read that real bucket name (Helm `lookup` against the BucketClaim). On a fresh install the BucketClaim takes a short reconcile cycle to populate its status — until it does, the strategy templates render empty and only the `Bucket` CR + `BackupClass` are present in the cluster. Flux re-renders the HelmRelease on its standard interval (default 10 minutes), at which point the populated BucketClaim status causes the missing strategy templates to materialise. + +If you need the BackupClass functional immediately (e.g. an e2e), trigger a Flux reconcile (`flux reconcile helmrelease backupstrategy-controller`) once you see `kubectl get bucketclaim -n tenant-root bucket-cozy-backups -o jsonpath='{.status.bucketName}'` non-empty. + +### Observability + +The credentials projector emits two Prometheus counters labelled by `namespace` (and `reason` for failures): + +- `cozystack_backup_credentials_projection_successes_total` +- `cozystack_backup_credentials_projection_failures_total` + +Alert on `rate(failures_total) > 0` or `absent_over_time(successes_total[10m])` to catch a stale BSL credential or a malformed source Secret without log scraping. 
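
The two alert conditions above can be sketched as a Prometheus-style rules file. This is an illustration under stated assumptions, not a shipped rule set: the group name, alert names, the 5-minute rate window, the `for` durations, and the `severity` labels are all invented for the example. Only the two metric names and their `namespace` / `reason` labels come from this page.

```yaml
# Hypothetical rules file; adapt names, windows, and severities to your stack.
groups:
  - name: cozystack-backup-credentials   # assumed group name
    rules:
      # Fires when the projector records failures for any namespace/reason pair.
      - alert: BackupCredentialsProjectionFailing
        expr: sum by (namespace, reason) (rate(cozystack_backup_credentials_projection_failures_total[5m])) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Credentials projection into {{ $labels.namespace }} is failing ({{ $labels.reason }})"
      # Fires when no sample of the success counter exists in the last 10 minutes.
      - alert: BackupCredentialsProjectionAbsent
        expr: absent_over_time(cozystack_backup_credentials_projection_successes_total[10m])
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "No successful credentials projection has been reported for 10 minutes"
```

Keeping the `for` duration longer than the 60-second projector tick avoids paging on a single missed round, such as the brief window after a controller restart.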
+ +## Admin overrides for `cozy-default` + +`cozy-default` is rendered by the `backupstrategy-controller` chart and owned by Flux's helm-controller. **Direct `kubectl edit backupclass cozy-default` is overwritten on the next helm reconcile** — the same applies to its companion `strategy.backups.cozystack.io/*` CRs (`cozy-default-cnpg`, `cozy-default-etcd`, `cozy-default-mariadb`, `cozy-default-altinity`, `cozy-default-foundationdb`, the two `cozy-default-velero-*`). The supported override path is the cozystack `Package` CR, which lets admins inject Helm values into platform components: + +```yaml +apiVersion: cozystack.io/v1alpha1 +kind: Package +metadata: + name: cozystack.cozystack-platform +spec: + components: + backupstrategy-controller: + values: + backupStorage: + provisionBucket: true # default; set false for external S3 + bucketName: cozy-backups # apps.cozystack.io/Bucket release name + endpoint: http://seaweedfs-s3.tenant-root.svc.cozy.local:8333 + region: us-east-1 + forcePathStyle: true + systemSecretName: bucket-cozy-backups-system-credentials + systemNamespaces: + - cozy-velero +``` + +| Knob | Effect | +|---|---| +| `provisionBucket` | Toggle creation of the in-cluster `apps.cozystack.io/Bucket` CR. Set `false` for external S3 (see [Disabling the platform-managed bucket](#disabling-the-platform-managed-bucket)). | +| `bucketName` | K8s name of the Bucket CR + lookup key for the COSI BucketClaim. The actual S3 bucket name is the COSI-assigned UUID, surfaced through `BucketClaim.status.bucketName`. | +| `bucketNameOverride` | Escape hatch for offline `helm template` renders — bypasses the live-cluster BucketClaim lookup. Leave empty in production. | +| `endpoint` | S3 endpoint baked into every default strategy CR + the Velero BSL. Switching to `https://` silently enables TLS in the MariaDB strategy — ensure the CA bundle is reachable to the relevant operator/driver Pods before flipping it. 
| +| `region` | Re-projected into `cozy-backups-creds` on the next reconcile. Pod-restart required for chart-emitted clients consuming the region via env (ClickHouse sidecar today). | +| `forcePathStyle` | Path-style addressing; SeaweedFS S3 requires it, AWS S3 typically doesn't. | +| `systemSecretName` | Name of the human-friendly Secret produced by the Bucket app (or pre-created manually for external S3). The projector also accepts the raw COSI Secret format. | +| `systemNamespaces` | Namespaces where the controller eagerly projects `cozy-backups-creds` (Velero BSL, FDB operator). Tenants are projected lazily during BackupJob reconcile. | + +When the override needs to go beyond storage coordinates — different retention, different driver→Kind binding, multi-region split — create a **sibling BackupClass** with a unique name (anything but `cozy-default`). Sibling BackupClasses live outside the chart, are admin-owned, and Flux will not touch them. Tenants opt in by setting `backupClassName: ` on their `BackupJob`s. + +## Tuning via a custom BackupClass + +The defaults aim at a reasonable middle (30-day retention, gzip compression where applicable). To override for a specific tenant or workload, create your own `BackupClass` pointing at the same strategy CRs but with tweaked `parameters`, or a fresh strategy CR. Common knobs: + +- **CNPG strategy**: `barmanObjectStore.retentionPolicy`, `data.compression`, `wal.compression`. +- **MariaDB strategy**: `compression`, `maxRetention`, `databases[]`. +- **Altinity strategy**: tune the `clickhouse-backup` sidecar via `backup.*` values on the ClickHouse release; the strategy Pod is a thin HTTP client. +- **FoundationDB strategy**: `snapshotPeriodSeconds`, `agentCount`, `urlParameters[]`. +- **Velero strategy (VMInstance / VMDisk)**: `ttl`, `includedResources[]`, `excludedResources[]`. +- **Etcd strategy**: today the strategy is path-only; combine with `Plan.spec.retentionPolicy` for trim cadence. 
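
A sibling class of this kind could be sketched as follows. The class name `postgres-long-retention` and the strategy name `postgres-long-retention-cnpg` are hypothetical, and the strategy CR is assumed to have been created by the administrator beforehand with the tuned CNPG knobs listed above. The `strategies[]` layout mirrors the `BackupClass` manifests from the earlier per-driver guide, so check it against the CRD installed in your cluster before applying.

```yaml
apiVersion: backups.cozystack.io/v1alpha1
kind: BackupClass
metadata:
  name: postgres-long-retention          # hypothetical; anything but cozy-default
spec:
  strategies:
    - application:
        apiGroup: apps.cozystack.io
        kind: Postgres
      strategyRef:
        apiGroup: strategy.backups.cozystack.io
        kind: CNPG
        name: postgres-long-retention-cnpg   # hypothetical admin-created strategy CR
```

Because the class lives outside the chart, Flux leaves it alone. Tenants opt in by putting its name in `backupClassName` on their `BackupJob` or `Plan`.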
+ +The system-managed credentials Secret is the **only** way for in-cluster strategies to reach `cozy-backups`. Do not embed access keys in `BackupClass.parameters` — the security model relies on Secret references, and `parameters` end up in `Backup.status.underlyingResources`, which tenants can read. + +## Disabling the platform-managed bucket + +If a deployment runs against an external S3 (no SeaweedFS), set `backupStorage.provisionBucket: false` in the `backupstrategy-controller` values and create the source credentials Secret in `tenant-root` manually (flat-key format: `accessKey` / `secretKey` / `endpoint` / `bucketName`; or the raw COSI `BucketInfo` JSON). Update `backupStorage.endpoint`, `backupStorage.region`, and (for VM backups) the chart's Velero BSL settings to point at the external S3. + +## Upgrade notes from chart-managed backups + +> **Postgres `backup.enabled: true` with placeholder credentials no longer renders `barmanObjectStore` on upgrade.** +> +> The pre-v1.4 defaults for `backup.s3AccessKey` / `backup.s3SecretKey` in `packages/apps/postgres/values.yaml` were the literal `""` / `""` placeholders, so the Postgres chart still rendered `spec.backup.barmanObjectStore` on the `cnpg.io/Cluster` (with junk credentials, `archive_command` failing at runtime). After v1.4 those defaults are empty strings and the chart NO LONGER renders the backup block at all when the placeholders are unmodified. Tenants on the legacy chart-managed flow who relied on those placeholders see their `barmanObjectStore` disappear from the live `Cluster` on `helm upgrade`. Action — pick one: +> +> - **Move to the platform flow (recommended).** Set `backup.useSystemBucket: true`; the chart leaves `barmanObjectStore` unset and the CNPG backup driver SSA-patches it onto the live `Cluster` at first BackupJob time. No tenant-side keys required. 
+> - **Stay on the legacy chart-managed flow.** Supply real `backup.s3AccessKey` / `backup.s3SecretKey` (or a pre-existing `backup.s3CredentialsSecret.name`); the chart renders `barmanObjectStore` exactly as before. +> +> The same `useSystemBucket` opt-in applies to ClickHouse — see [ClickHouse: opt-in to the system bucket](#clickhouse-opt-in-to-the-system-bucket). When `useSystemBucket: true` is set on ClickHouse, the legacy `-backup` CronJob, credential Secret, and backup script are no longer rendered (they are mutually exclusive with the platform flow); migrate scheduled backups to a `backups.cozystack.io/Plan` against `cozy-default`. + +## Tenant workflow + +Tenants only ever see the BackupClass name. Typical apply: + +```yaml +apiVersion: backups.cozystack.io/v1alpha1 +kind: BackupJob +metadata: + name: ad-hoc + namespace: tenant-acme +spec: + backupClassName: cozy-default + applicationRef: + apiGroup: apps.cozystack.io + kind: Postgres + name: orders-db +``` diff --git a/content/en/docs/v1.4/operations/services/managed-app-backup-configuration.md b/content/en/docs/v1.4/operations/services/managed-app-backup-configuration.md deleted file mode 100644 index 45cca86d..00000000 --- a/content/en/docs/v1.4/operations/services/managed-app-backup-configuration.md +++ /dev/null @@ -1,337 +0,0 @@ ---- -title: "Managed Application Backup Configuration" -linkTitle: "Managed Application Backup Configuration" -description: "Configure strategies and BackupClasses for logical data backups of managed databases (Postgres, MariaDB, ClickHouse)." -weight: 31 ---- - -This guide is for **cluster administrators** who configure backup strategies for Cozystack-managed database applications: Postgres, MariaDB, and ClickHouse. Once strategies and `BackupClass` resources are in place, tenants run backups and restores by creating [BackupJob, Plan, and RestoreJob]({{% ref "/docs/v1.4/applications/backup-and-recovery" %}}) resources with no further admin action. 
- -{{% alert color="info" %}} -This page covers **data-only** backups driven by each operator's native backup mechanism (CloudNativePG barman, mariadb-operator dumps, Altinity `clickhouse-backup`). The `apps.cozystack.io/*` CR, its `HelmRelease`, chart values, and operator-managed Secrets are **not** captured by these strategies. - -For backups that bundle Helm release + CRs + PVC snapshots (used by VMInstance / VMDisk), see [Velero Backup Configuration]({{% ref "/docs/v1.4/operations/services/velero-backup-configuration" %}}). -{{% /alert %}} - -## Prerequisites - -- Administrator access to the Cozystack (management) cluster. -- The `backup-controller` and `backupstrategy-controller` components are installed and running. -- S3-compatible storage reachable from the management cluster — either the in-cluster SeaweedFS provisioned via the `Bucket` application, or any external S3 endpoint. -- The corresponding upstream operator is deployed for each application Kind you want to back up: CloudNativePG, mariadb-operator, or ClickHouse operator. These ship with Cozystack by default. - -## How a managed-application strategy works - -The flow on every `BackupJob`: - -1. A tenant creates a `BackupJob` (or a `Plan` that materialises one on a cron) that references a `BackupClass` and an `apps.cozystack.io/` application. -2. The core backup controller resolves the `BackupClass` and matches the application Kind to a driver-specific `strategy.backups.cozystack.io/` strategy. -3. The driver renders its strategy template against the live application object (`.Application`) and the BackupClass parameters (`.Parameters`), then creates the operator-native backup CR (`Backup` for mariadb, an HTTP call against the in-pod sidecar for ClickHouse, a barman-driven snapshot in `cnpg.io` for Postgres). -4. On success the driver creates a Cozystack `Backup` artefact in the same namespace; `RestoreJob` resources reference that artefact later. 
- -`BackupClass` is **cluster-scoped**: a single instance covers every tenant namespace. - -{{% alert color="info" %}} -Tenant users cannot list `BackupClass` resources under their kubeconfig (cluster-scoped resources are not reachable through the tenant `RoleBinding`). Once you create a `BackupClass`, **publish its name to tenants out-of-band** — in the platform handbook, in the ticket that onboards their application, or in your internal Slack channel. Tenants reference the name verbatim in `BackupJob.spec.backupClassName`. -{{% /alert %}} - -## Per-driver setup - -The strategies below are written for the in-cluster SeaweedFS `Bucket` application. If you use external S3 storage, drop the `endpointCA` / TLS sections and point the endpoint at your provider. - -### Postgres (CNPG strategy) - -The CNPG driver delegates to CloudNativePG's native barman backup. Each `BackupJob` is a barman snapshot streamed to S3; `RestoreJob` recreates the `cnpg.io/Cluster` from the archive. - -Create the strategy: - -```yaml -apiVersion: strategy.backups.cozystack.io/v1alpha1 -kind: CNPG -metadata: - name: postgres-data-cnpg-strategy -spec: - template: - serverName: "{{ .Application.metadata.name }}" - barmanObjectStore: - destinationPath: "s3://REPLACE_WITH_COSI_BUCKET_NAME/{{ .Application.metadata.name }}/" - endpointURL: "https://REPLACE_WITH_S3_ENDPOINT" - retentionPolicy: "30d" - endpointCA: - secretRef: - name: "{{ .Application.metadata.name }}-cnpg-backup-ca" - key: "ca.crt" - s3Credentials: - secretRef: - name: "{{ .Application.metadata.name }}-cnpg-backup-creds" - data: - compression: gzip - wal: - compression: gzip -``` - -Bind the application Kind: - -```yaml -apiVersion: backups.cozystack.io/v1alpha1 -kind: BackupClass -metadata: - name: postgres-data-backup -spec: - strategies: - - application: - apiGroup: apps.cozystack.io - kind: Postgres - strategyRef: - apiGroup: strategy.backups.cozystack.io - kind: CNPG - name: postgres-data-cnpg-strategy -``` - -Per-application 
Secrets the tenant must provision in the application namespace: - -| Secret | Keys | Purpose | -|---|---|---| -| `-cnpg-backup-creds` | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` | S3 credentials consumed by barman | -| `-cnpg-backup-ca` *(only for self-signed endpoints)* | `ca.crt` | CA bundle the barman client trusts | - -Drop the `endpointCA` block in the strategy when your S3 endpoint has a publicly-trusted certificate. - -### MariaDB - -The MariaDB driver delegates to [mariadb-operator](https://github.com/mariadb-operator/mariadb-operator). Backups materialise as `k8s.mariadb.com/v1alpha1 Backup` CRs (logical `mariadb-dump`); restores materialise as `Restore` CRs that `mariadb-import` the dump back into the live database. - -Create the strategy: - -```yaml -apiVersion: strategy.backups.cozystack.io/v1alpha1 -kind: MariaDB -metadata: - name: mariadb-data-strategy -spec: - template: - storage: - s3: - bucket: "REPLACE_WITH_COSI_BUCKET_NAME" - endpoint: "REPLACE_WITH_S3_ENDPOINT" - prefix: "{{ .Application.metadata.name }}/" - accessKeyIdSecretKeyRef: - name: "{{ .Application.metadata.name }}-mariadb-backup-creds" - key: "AWS_ACCESS_KEY_ID" - secretAccessKeySecretKeyRef: - name: "{{ .Application.metadata.name }}-mariadb-backup-creds" - key: "AWS_SECRET_ACCESS_KEY" - tls: - enabled: true - caSecretKeyRef: - name: "{{ .Application.metadata.name }}-mariadb-backup-ca" - key: "ca.crt" - compression: gzip -``` - -The `endpoint` is **path-style without scheme** (e.g. `seaweedfs-s3..svc:8333` for the default in-cluster SeaweedFS — substitute the namespace where SeaweedFS is deployed in your environment). Drop the `tls` block entirely when the endpoint serves a publicly-trusted certificate. 
- -Bind the application Kind: - -```yaml -apiVersion: backups.cozystack.io/v1alpha1 -kind: BackupClass -metadata: - name: mariadb-data-backup -spec: - strategies: - - application: - apiGroup: apps.cozystack.io - kind: MariaDB - strategyRef: - apiGroup: strategy.backups.cozystack.io - kind: MariaDB - name: mariadb-data-strategy -``` - -Per-application Secrets the tenant must provision in the application namespace: - -| Secret | Keys | Purpose | -|---|---|---| -| `-mariadb-backup-creds` | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` | S3 credentials consumed by mariadb-operator | -| `-mariadb-backup-ca` *(only for self-signed endpoints)* | `ca.crt` | CA bundle for TLS verification | - -{{% alert color="info" %}} -The chart-level `backup.*` block in `apps.cozystack.io/MariaDB` (the legacy `mariadb-dump` + `restic` path) is **deprecated** in favour of this BackupClass flow. Existing tenants with `backup.enabled=true` continue to render the legacy resources unchanged. -{{% /alert %}} - -### ClickHouse (Altinity strategy) - -The Altinity driver does **not** template a backup CR. It renders a small `PodTemplateSpec` that runs `curl + jq` against the in-pod [`clickhouse-backup`](https://github.com/Altinity/clickhouse-backup) HTTP API (port 7171) provided by a sidecar inside every `chi-*` Pod. - -{{% alert color="warning" %}} -The Altinity strategy **requires** `backup.enabled=true` on every ClickHouse application instance — that flag is what materialises the in-pod sidecar and the `clickhouse--backup-api-auth` Secret the strategy authenticates with. Unlike MariaDB, ClickHouse's chart-level `backup.*` block is **not** deprecated; the BackupClass flow piggybacks on the same sidecar. -{{% /alert %}} - -Create the strategy. 
The `template` is a `PodTemplateSpec` driving the sidecar; for the full reference template (with the shell script that POSTs `create_remote` / `restore_remote` and polls the action log) see [`examples/backups/clickhouse/01-create-strategy.sh`](https://github.com/cozystack/cozystack/blob/main/examples/backups/clickhouse/01-create-strategy.sh) in the cozystack repo. - -```yaml -apiVersion: strategy.backups.cozystack.io/v1alpha1 -kind: Altinity -metadata: - name: clickhouse-data-altinity-strategy -spec: - template: - spec: - restartPolicy: Never - containers: - - name: ch-backup-client - image: alpine:3.19 - env: - - name: API_USERNAME - valueFrom: - secretKeyRef: - name: clickhouse-{{ .Release.Name }}-backup-api-auth - key: username - - name: API_PASSWORD - valueFrom: - secretKeyRef: - name: clickhouse-{{ .Release.Name }}-backup-api-auth - key: password - command: ["/bin/sh", "-c"] - args: - # See examples/backups/clickhouse/01-create-strategy.sh for the - # full script: branches on .Mode (backup|restore) and either - # POSTs /backup/create_remote or /backup/restore_remote/, - # then polls /backup/actions for terminal status. - - | - # ... (truncated; see linked example) -``` - -Bind the application Kind. No parameters are required — the strategy template addresses the sidecar by deterministic Pod DNS and reads S3 credentials from the chart-emitted `-backup-s3` Secret directly. 
- -```yaml -apiVersion: backups.cozystack.io/v1alpha1 -kind: BackupClass -metadata: - name: clickhouse-data-backup -spec: - strategies: - - application: - apiGroup: apps.cozystack.io - kind: ClickHouse - strategyRef: - apiGroup: strategy.backups.cozystack.io - kind: Altinity - name: clickhouse-data-altinity-strategy -``` - -## Apply and verify - -Apply the strategy and `BackupClass` manifests: - -```bash -kubectl apply -f .yaml -kubectl apply -f .yaml -``` - -List the resources: - -```bash -kubectl get cnpgs.strategy.backups.cozystack.io -kubectl get mariadbs.strategy.backups.cozystack.io -kubectl get altinities.strategy.backups.cozystack.io -kubectl get backupclasses -``` - -Each strategy should report no error conditions; each `BackupClass` should list the strategy entries you defined. - -## Tenant onboarding - -Tenant users cannot create `Secret` objects under the standard Cozystack RBAC, and they cannot read `Bucket`-emitted credential Secrets. Before a tenant can run their first `BackupJob`, an administrator must provision per-tenant storage and the per-application credential Secrets each driver expects. Perform these steps once per managed-DB application the tenant wants to back up. Examples use `tenant-user` for the tenant namespace and `my-postgres` / `my-mariadb` / `my-clickhouse` for the application name — substitute as appropriate. - -### Provision the storage Bucket - -If the tenant does not have external S3 coordinates, provision an in-cluster `Bucket` in their namespace: - -```yaml -apiVersion: apps.cozystack.io/v1alpha1 -kind: Bucket -metadata: - name: db-backups - namespace: tenant-user -spec: - users: - backup: - readonly: false -``` - -```bash -kubectl apply -f bucket.yaml -kubectl -n tenant-user wait hr/bucket-db-backups --for=condition=ready --timeout=300s -``` - -The `Bucket` controller materialises a `bucket--backup` Secret in the namespace carrying a `BucketInfo` JSON blob — the S3 endpoint, bucket name, and access keys come from there. 
- -### Read the bucket credentials - -Run this once per shell session. Every per-driver block below reuses `$ACCESS_KEY`, `$SECRET_KEY`, and `/tmp/bucket.json`: - -```bash -kubectl -n tenant-user get secret bucket-db-backups-backup \ - -o jsonpath='{.data.BucketInfo}' | base64 -d > /tmp/bucket.json -ACCESS_KEY=$(jq -r .spec.secretS3.accessKeyID /tmp/bucket.json) -SECRET_KEY=$(jq -r .spec.secretS3.accessSecretKey /tmp/bucket.json) -``` - -### Create per-application credential Secrets - -Each driver expects per-application credential Secrets in the application namespace — the strategy templates reference them by name (`{{ .Application.metadata.name }}-...`). - -#### Postgres (CNPG) - -Project the credentials in the keys CNPG's barman client expects: - -```bash -kubectl -n tenant-user create secret generic my-postgres-cnpg-backup-creds \ - --from-literal=AWS_ACCESS_KEY_ID="$ACCESS_KEY" \ - --from-literal=AWS_SECRET_ACCESS_KEY="$SECRET_KEY" -``` - -When the S3 endpoint uses a self-signed certificate (the SeaweedFS default), also create a CA Secret: - -```bash -kubectl -n tenant-user create secret generic my-postgres-cnpg-backup-ca \ - --from-file=ca.crt=/path/to/ca.crt -``` - -#### MariaDB - -```bash -kubectl -n tenant-user create secret generic my-mariadb-mariadb-backup-creds \ - --from-literal=AWS_ACCESS_KEY_ID="$ACCESS_KEY" \ - --from-literal=AWS_SECRET_ACCESS_KEY="$SECRET_KEY" -``` - -For self-signed endpoints, add `my-mariadb-mariadb-backup-ca` carrying `ca.crt` the same way. - -#### ClickHouse - -No extra Secret is needed for the BackupClass flow. The Altinity strategy reads S3 credentials from the chart-emitted `-backup-s3` Secret directly. Make sure `backup.enabled: true` is set on every ClickHouse application instance the tenant wants to back up, and that the `backup.*` block in the application values carries the bucket coordinates (see the [ClickHouse application reference]({{% ref "/docs/v1.4/applications/clickhouse" %}})). 
- -## Handing off to tenants - -Tenants run backups and restores against the `BackupClass` names you created above using `BackupJob`, `Plan`, and `RestoreJob` resources. Walk them through the [Application Backup and Recovery]({{% ref "/docs/v1.4/applications/backup-and-recovery" %}}) guide; they do not need admin permissions to operate against an existing `BackupClass`. Before pointing them at the guide: - -- Communicate the available `BackupClass` names (tenants cannot list them — cluster-scoped resources are not reachable through the tenant `RoleBinding`). -- Ensure that for every managed application the tenant wants to back up, the per-application credential Secret described in [Tenant onboarding](#tenant-onboarding) already exists in their namespace. - -## Tenant escalation: driver-side diagnostics - -When a tenant's `BackupJob` or `RestoreJob` ends in `phase: Failed` and the `status.message` does not pinpoint the cause, the tenant cannot inspect operator-native CRs themselves — their RBAC excludes `cnpg.io`, `k8s.mariadb.com`, and the `pods/log` subresource. Run these commands on their behalf, using the `BackupJob` name they hand you: - -```bash -# Postgres (CloudNativePG) -kubectl -n tenant-user get backups.cnpg.io -# MariaDB -kubectl -n tenant-user get backups.k8s.mariadb.com,restores.k8s.mariadb.com -# ClickHouse — the strategy runs as a one-shot Pod that talks to the in-pod sidecar -kubectl -n tenant-user logs -l backups.cozystack.io/owned-by.BackupJobName=my-clickhouse-adhoc -``` - -For ClickHouse archive purges, the tenant cannot reach the in-pod `clickhouse-backup` sidecar HTTP API directly; on their request, exec into the ClickHouse pod and call `DELETE /backup//remote` against the local sidecar (the chart-emitted `clickhouse--backup-api-auth` Secret carries the credentials). 
diff --git a/content/en/docs/v1.4/operations/services/velero-backup-configuration.md b/content/en/docs/v1.4/operations/services/velero-backup-configuration.md deleted file mode 100644 index b141d2be..00000000 --- a/content/en/docs/v1.4/operations/services/velero-backup-configuration.md +++ /dev/null @@ -1,319 +0,0 @@ ---- -title: "Velero Backup Configuration" -linkTitle: "Velero Backup Configuration" -description: "Configure backup storage, strategies, and BackupClasses for cluster backups (for cluster administrators)." -weight: 30 ---- - -This guide is for **cluster administrators** who configure the backup infrastructure in Cozystack: S3 storage, Velero locations, backup **strategies**, and **BackupClasses**. Tenant users then use existing BackupClasses to create [BackupJobs and Plans]({{% ref "/docs/v1.4/virtualization/backup-and-recovery" %}}). - -{{% alert color="info" %}} -This page covers **Velero-driven** backups that bundle the application HelmRelease, CRs, and PVC snapshots — the model used for VMInstance / VMDisk. For data-only backups of managed databases (Postgres, MariaDB, ClickHouse, FoundationDB) driven by each operator's native mechanism, see [Managed Application Backup Configuration]({{% ref "/docs/v1.4/operations/services/managed-app-backup-configuration" %}}). -{{% /alert %}} - -## Prerequisites - -- Administrator access to the Cozystack (management) cluster. -- S3-compatible storage: to store backups in Cozystack, enable SeaweedFS and create a Bucket; alternatively, use an external S3 service. -- The `cozystack.velero` component, which is disabled by default, enabled in `bundles.enabledPackages` of the [Platform Package]({{% ref "/docs/v1.4/operations/configuration/platform-package" %}}). For **tenant clusters**, also set `spec.addons.velero.enabled` to `true` in the `Kubernetes` resource. - -## 1.
Set up storage credentials and configuration - -Create the following resources in the **management cluster** in the `cozy-velero` namespace so that Velero can store backups and volume snapshots. - -### 1.1 Create a secret with S3 credentials - -```yaml -apiVersion: v1 -kind: Secret -metadata: - name: s3-credentials - namespace: cozy-velero -type: Opaque -stringData: - cloud: | - [default] - aws_access_key_id= - aws_secret_access_key= - - services = seaweed-s3 - [services seaweed-s3] - s3 = - endpoint_url = https://s3.tenant-name.cozystack.example.com -``` - -### 1.2 Configure BackupStorageLocation - -This resource defines where Velero stores backups (S3 bucket). - -```yaml -apiVersion: velero.io/v1 -kind: BackupStorageLocation -metadata: - name: default - namespace: cozy-velero -spec: - provider: aws - objectStorage: - bucket: - config: - checksumAlgorithm: '' - profile: "default" - s3ForcePathStyle: "true" - s3Url: https://s3.tenant-name.cozystack.example.com - credential: - name: s3-credentials - key: cloud -``` - -`BUCKET_NAME` can be found with: -```bash -kubectl get bucketclaim -A -o custom-columns=NAME:.metadata.name,NAMESPACE:.metadata.namespace,BUCKET_NAME:.status.bucketName,READY:.status.bucketReady -``` - -See [BackupStorageLocation](https://velero.io/docs/v1.17/api-types/backupstoragelocation/) in the Velero docs. - -Check that the resource was created successfully: -```bash -kubectl get BackupStorageLocation -n cozy-velero -``` - -Output should be similar to: -```bash -NAME PHASE LAST VALIDATED AGE DEFAULT -default Available 5s 3d9h true -``` - -### 1.3 Configure VolumeSnapshotLocation - -This resource defines the configuration for volume snapshots.
- -```yaml -apiVersion: velero.io/v1 -kind: VolumeSnapshotLocation -metadata: - name: default - namespace: cozy-velero -spec: - provider: aws - credential: - name: s3-credentials - key: cloud - config: - region: "us-west-2" - profile: "default" -``` - -See [VolumeSnapshotLocation](https://velero.io/docs/v1.17/api-types/volumesnapshotlocation/) in the Velero docs. - -## 2. Define a backup strategy - -A **strategy** describes [Velero Backup](https://velero.io/docs/v1.17/api-types/backup/) template. It is a reusable template referenced by BackupClasses. - -In a strategy you define: - -- **Scope**: namespaces and resources (e.g. a tenant namespace or resources by label). -- **Volume handling**: whether to snapshot volumes and use `snapshotMoveData`. -- **Retention**: default backup TTL. - -Check the CRD group, version, and kind in your cluster: - -```bash -kubectl get crd | grep -i backup -kubectl explain --recursive -``` - -Example strategy for VMInstance (includes all VM resources and attached volumes): - -```yaml -apiVersion: strategy.backups.cozystack.io/v1alpha1 -kind: Velero -metadata: - name: vminstance-strategy -spec: - template: - restoreSpec: - existingResourcePolicy: update - includedNamespaces: - - '{{ .Application.metadata.namespace }}' - orLabelSelectors: - - matchLabels: - app.kubernetes.io/instance: 'vm-instance-{{ .Application.metadata.name }}' - - matchLabels: - apps.cozystack.io/application.kind: '{{ .Application.kind }}' - apps.cozystack.io/application.name: '{{ .Application.metadata.name }}' - includedResources: - - helmreleases.helm.toolkit.fluxcd.io - - virtualmachines.kubevirt.io - - virtualmachineinstances.kubevirt.io - - pods - - persistentvolumeclaims - - configmaps - - secrets - - controllerrevisions.apps - includeClusterResources: false - excludedResources: - - datavolumes.cdi.kubevirt.io - - spec: - includedNamespaces: - - '{{ .Application.metadata.namespace }}' - orLabelSelectors: - - matchLabels: - app.kubernetes.io/instance: 
'vm-instance-{{ .Application.metadata.name }}' - - matchLabels: - apps.cozystack.io/application.kind: '{{ .Application.kind }}' - apps.cozystack.io/application.name: '{{ .Application.metadata.name }}' - includedResources: - - helmreleases.helm.toolkit.fluxcd.io - - virtualmachines.kubevirt.io - - virtualmachineinstances.kubevirt.io - - pods - - datavolumes.cdi.kubevirt.io - - persistentvolumeclaims - - configmaps - - secrets - - controllerrevisions.apps - includeClusterResources: false - storageLocation: '{{ .Parameters.backupStorageLocationName }}' - volumeSnapshotLocations: - - '{{ .Parameters.backupStorageLocationName }}' - snapshotVolumes: true - snapshotMoveData: true - ttl: 720h0m0s - itemOperationTimeout: 24h0m0s -``` - -Example strategy for VMDisk (disk and its volume only): - -```yaml -apiVersion: strategy.backups.cozystack.io/v1alpha1 -kind: Velero -metadata: - name: vmdisk-strategy -spec: - template: - restoreSpec: - existingResourcePolicy: update - includedNamespaces: - - '{{ .Application.metadata.namespace }}' - orLabelSelectors: - - matchLabels: - app.kubernetes.io/instance: 'vm-disk-{{ .Application.metadata.name }}' - - matchLabels: - apps.cozystack.io/application.kind: '{{ .Application.kind }}' - apps.cozystack.io/application.name: '{{ .Application.metadata.name }}' - includedResources: - - helmreleases.helm.toolkit.fluxcd.io - - persistentvolumeclaims - - configmaps - includeClusterResources: false - - spec: - includedNamespaces: - - '{{ .Application.metadata.namespace }}' - orLabelSelectors: - - matchLabels: - app.kubernetes.io/instance: 'vm-disk-{{ .Application.metadata.name }}' - - matchLabels: - apps.cozystack.io/application.kind: '{{ .Application.kind }}' - apps.cozystack.io/application.name: '{{ .Application.metadata.name }}' - includedResources: - - helmreleases.helm.toolkit.fluxcd.io - - persistentvolumeclaims - - configmaps - includeClusterResources: false - storageLocation: '{{ .Parameters.backupStorageLocationName }}' - 
volumeSnapshotLocations: - - '{{ .Parameters.backupStorageLocationName }}' - snapshotVolumes: true - snapshotMoveData: true - ttl: 720h0m0s - itemOperationTimeout: 24h0m0s -``` - -Template variables (`{{ .Application.* }}` and `{{ .Parameters.* }}`) are resolved from the ApplicationRef in the BackupJob/Plan and the parameters defined in the BackupClass. - -Apply the strategy to the management cluster: - -```bash -kubectl apply -f velero-backup-strategy.yaml -``` - -## 3. Create a BackupClass - -A **BackupClass** binds strategies to application Kinds and can pass parameters to each strategy. - -Verify the BackupClass CRD in your cluster: - -```bash -kubectl get backupclasses -kubectl explain backupclasses.spec --recursive -``` - -```yaml -apiVersion: backups.cozystack.io/v1alpha1 -kind: BackupClass -metadata: - name: velero -spec: - strategies: - - strategyRef: - apiGroup: strategy.backups.cozystack.io - kind: Velero - name: vminstance-strategy - application: - kind: VMInstance - apiGroup: apps.cozystack.io - parameters: - backupStorageLocationName: default - - strategyRef: - apiGroup: strategy.backups.cozystack.io - kind: Velero - name: vmdisk-strategy - application: - kind: VMDisk - apiGroup: apps.cozystack.io - parameters: - backupStorageLocationName: default -``` - -Apply and list: - -```bash -kubectl apply -f backupclass.yaml -kubectl get backupclasses -``` - -## 4. How users run backups - -Once strategies and BackupClasses are in place, **tenant users** can run backups without touching Velero or storage configuration: - -- **One-off backup**: create a [BackupJob]({{% ref "/docs/v1.4/virtualization/backup-and-recovery#one-off-backup" %}}) that references a BackupClass. -- **Scheduled backups**: create a [Plan]({{% ref "/docs/v1.4/virtualization/backup-and-recovery#scheduled-backup" %}}) with a cron schedule and a BackupClass reference.
- -Direct use of Velero CRDs (`Backup`, `Schedule`, `Restore`) remains available for advanced or recovery scenarios: - -```bash -kubectl get backup.velero.io -n cozy-velero -kubectl get schedule.velero.io -n cozy-velero -kubectl get restores.velero.io -n cozy-velero -``` - -If the [Velero CLI](https://velero.io/docs/v1.17/basic-install/#install-the-cli) is installed, you can also run: - -```bash -velero -n cozy-velero backup get -velero -n cozy-velero schedule get -velero -n cozy-velero restore get -``` - -To inspect the Velero logs, use the following command: - -```bash -kubectl logs -n cozy-velero -l app.kubernetes.io/name=velero --tail=100 -``` - -## 5. Restore from a backup - -Once strategies and BackupClasses are in place, tenant users can restore from a backup using **RestoreJob** resources. See the [Backup and Recovery]({{% ref "/docs/v1.4/virtualization/backup-and-recovery" %}}) guide for restore instructions covering VMInstance and VMDisk in-place restores. diff --git a/content/en/docs/v1.4/virtualization/backup-and-recovery.md b/content/en/docs/v1.4/virtualization/backup-and-recovery.md index 93696943..2f1250ea 100644 --- a/content/en/docs/v1.4/virtualization/backup-and-recovery.md +++ b/content/en/docs/v1.4/virtualization/backup-and-recovery.md @@ -8,7 +8,7 @@ aliases: - /docs/v1.4/kubernetes/backup-and-recovery --- -Cluster backup **strategies** and **BackupClasses** are configured by cluster administrators. If your tenant does not have a BackupClass yet, ask your administrator to follow the [Velero Backup Configuration]({{% ref "/docs/v1.4/operations/services/velero-backup-configuration" %}}) guide to set up storage, strategies, and BackupClasses. +Cluster backup **strategies** and **BackupClasses** are configured by cluster administrators. 
If your tenant does not have a BackupClass yet, ask your administrator to follow the [Backup Classes]({{% ref "/docs/v1.4/operations/services/backup-classes" %}}) guide to set up storage, strategies, and BackupClasses. This guide covers backing up and restoring **VMInstance** and **VMDisk** resources as a tenant user: running one-off and scheduled backups, checking backup status, and restoring from a backup using RestoreJobs. @@ -35,11 +35,15 @@ kubectl get backupclasses Example output: ``` -NAME AGE -velero 14m +NAME AGE +cozy-default 14m ``` -Use the BackupClass name when creating a BackupJob or Plan. +`cozy-default` is the platform-shipped BackupClass; its `strategies[]` array binds the Velero driver for both `VMInstance` and `VMDisk`. Use this name when creating a BackupJob or Plan, or substitute a sibling class name if your administrator has created one. + +{{% alert color="info" %}} +**Fresh-cluster bootstrap window.** On a fresh-cluster install, the Velero `BackupStorageLocation` `cozy-default` reports `Unavailable` for tens of seconds after `helm install` returns, until the platform's credentials projector lands `cozy-backups-creds` into `cozy-velero`. Velero rejects new `Backup` and `Restore` requests against `storageLocation: cozy-default` during that window. If a BackupJob you submit fails immediately with a Velero error referencing storage, wait and retry, or ask your administrator to confirm that `kubectl -n cozy-velero get bsl cozy-default -o jsonpath='{.status.phase}'` prints `Available`. See the [Backup Classes admin guide]({{% ref "/docs/v1.4/operations/services/backup-classes" %}}) for details.
+{{% /alert %}} ## Back up a VMInstance @@ -60,7 +64,7 @@ spec: apiGroup: apps.cozystack.io kind: VMInstance name: my-vm - backupClassName: velero + backupClassName: cozy-default ``` Apply it and watch the status: @@ -88,7 +92,7 @@ spec: apiGroup: apps.cozystack.io kind: VMInstance name: my-vm - backupClassName: velero + backupClassName: cozy-default schedule: cron: "0 2 * * *" # Every day at 02:00 ``` @@ -108,7 +112,7 @@ Each scheduled run creates a BackupJob (and, on success, a Backup object) named You can back up a VMDisk independently — for example, to capture a specific disk without the VM configuration. {{% alert color="info" %}} -The BackupClass must include a strategy for `VMDisk`. Ask your administrator to add one if it is missing (see [Velero Backup Configuration]({{% ref "/docs/v1.4/operations/services/velero-backup-configuration" %}})). +The BackupClass must include a strategy for `VMDisk`. Ask your administrator to add one if it is missing (see [Backup Classes]({{% ref "/docs/v1.4/operations/services/backup-classes" %}})). {{% /alert %}} ```yaml @@ -122,7 +126,7 @@ spec: apiGroup: apps.cozystack.io kind: VMDisk name: my-disk - backupClassName: velero + backupClassName: cozy-default ``` Apply and check status: