Skip to content

fix(main): add support peer-auto-tls#330

Merged
Andrey Kolkov (androndo) merged 1 commit into
mainfrom
fix/peer-auto-tls
Jun 12, 2026
Merged

fix(main): add support peer-auto-tls#330
Andrey Kolkov (androndo) merged 1 commit into
mainfrom
fix/peer-auto-tls

Conversation

@androndo

@androndo Andrey Kolkov (androndo) commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

Summary by CodeRabbit

  • New Features

    • EtcdMember supports a peer auto‑TLS mode and cluster adoption preserves legacy peer-auto-TLS.
    • Migration output now surfaces security warnings per-resource and provides a post‑apply security summary.
  • Documentation

    • Added documentation describing peer auto‑TLS detection, migration behavior, and operational implications.
  • Bug Fixes

    • Narrowed TLS peer validation error messaging for the exactly-one peer rule.
  • Tests

    • Added tests covering peer auto‑TLS rendering, adoption, and precedence.

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Warning

Review limit reached

@androndo, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 14 minutes and 45 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more credits in the billing tab to continue.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9cc34636-706c-4156-87c6-1f26bd08cf38

📥 Commits

Reviewing files that changed from the base of the PR and between f4afff4 and 109bfcc.

📒 Files selected for processing (12)
  • api/v1alpha2/cel_validation_test.go
  • api/v1alpha2/etcdmember_types.go
  • charts/etcd-operator/crd-bases/etcd-operator.cozystack.io_etcdmembers.yaml
  • cmd/etcd-migrate/main.go
  • cmd/etcd-migrate/output.go
  • controllers/etcdmember_controller.go
  • controllers/etcdmember_controller_test.go
  • controllers/helpers.go
  • docs/migration.md
  • internal/migrate/adopt.go
  • internal/migrate/adopt_test.go
  • internal/migrate/plan.go
📝 Walkthrough

Walkthrough

Adds peer auto-TLS support: new EtcdMemberTLS.PeerAutoTLS and CRD field, detection of legacy --peer-auto-tls during adoption with SecurityWarnings, propagation into adopted members, pod rendering to emit --peer-auto-tls, tests, CLI output, and documentation updates.

Changes

Peer Auto-TLS for Legacy Adoption

Layer / File(s) Summary
API types and schema updates
api/v1alpha2/etcdmember_types.go, charts/etcd-operator/crd-bases/..., api/v1alpha2/cel_validation_test.go
EtcdMemberTLS adds PeerAutoTLS boolean field marked as operator-managed annotation plumbing and mutually exclusive with PeerSecretRef; CRD schema adds matching spec.tls.peerAutoTLS property; CEL validation test narrowed to reference only spec.tls.peer.secretRef in exactly-one rule assertion.
TLS detection and pod building
controllers/helpers.go, controllers/etcdmember_controller.go
New reserved annotation constant AnnPeerAutoTLS enables legacy cluster-level peer auto-TLS detection; clusterPeerScheme returns HTTPS when either typed peer TLS or legacy annotation is present (with annotation ignored when typed TLS exists); memberPeerScheme treats member PeerAutoTLS as equivalent to PeerSecretRef for HTTPS selection; deriveMemberTLS propagates cluster-level auto-TLS to members when no explicit peer secret is configured; buildPod adds switch case for peer auto-TLS appending --peer-auto-tls without peer cert/key/CA flags or secret volume mount.
Controller unit tests
controllers/etcdmember_controller_test.go
TestBuildPod_PeerAutoTLS validates HTTPS peer listener with --peer-auto-tls flag, absence of BYO peer TLS flags, and no peer secret volume; TestDeriveMemberTLS extended with peerAutoTLS expectation field and test cases covering legacy annotation projection and precedence where explicit SecretRef overrides annotation; assertions now verify got.PeerAutoTLS matches expected value.
Legacy adoption detection and projection
internal/migrate/adopt.go
BuildAdoption detects legacy clusters with HTTPS peer URLs but no peer TLS configuration, setting reserved cluster annotation and appending security warning for unauthenticated peer encryption; deriveAdoptedMemberTLS reads cluster annotation and propagates PeerAutoTLS to adopted members alongside existing secret-ref and client-TLS projection modes.
Adoption test coverage
internal/migrate/adopt_test.go
TestBuildAdoption_PeerAutoTLS validates auto-TLS translation path setting cluster annotation and member PeerAutoTLS with security warning, and BYO peer-TLS path propagating legacy secret into spec.tls.peer.secretRef without warning.
Migration output and security warnings
internal/migrate/plan.go, cmd/etcd-migrate/output.go, cmd/etcd-migrate/main.go
ResourcePlan adds SecurityWarnings field for adoption downgrades; render prints per-plan security warnings with ⚠️ SECURITY: prefix immediately after action header; renderSecuritySummary re-surfaces adopted-plan warnings in post-apply summary; main wires renderSecuritySummary invocation after adoption stats.
Migration documentation
docs/migration.md
New "Peer auto-TLS" section details legacy --peer-auto-tls detection during migration, carry-forward via reserved cluster annotation, precedence rules with explicit spec.tls.peer.* config, security posture (encrypted but NOT authenticated), and off-ramp options (delete-and-recreate or manual rolling restart with temporary no-quorum window).

Sequence Diagram

sequenceDiagram
  participant Migrate as etcd-migrate
  participant BuildAdopt as BuildAdoption
  participant LiveCluster as Live Member Facts
  participant Plan as ResourcePlan
  participant Render as Migration Output

  Migrate->>BuildAdopt: start adoption detection
  BuildAdopt->>LiveCluster: scan member peer URLs for "https://"
  LiveCluster-->>BuildAdopt: found HTTPS peer URL
  BuildAdopt->>Plan: set AnnPeerAutoTLS annotation
  BuildAdopt->>Plan: append SecurityWarning ("peer-auto-tls")
  Migrate->>Render: render dry-run plan
  Render->>Render: print per-plan "⚠️  SECURITY:" warnings
  Migrate->>Migrate: --apply (adopt)
  Migrate->>Render: render post-apply summary
  Render->>Render: re-surface SecurityWarnings for adopted plans
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

In burrows where the cluster nodes play,
A rabbit nudges flags to show the way,
"Auto-TLS" whispers through legacy trees,
Encrypted footsteps, but trust on its knees,
I hop and note warnings—tread careful, please. 🐇

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(main): add support peer-auto-tls' accurately summarizes the main change: adding support for legacy peer auto-TLS mode in the etcd operator.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/peer-auto-tls

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for etcd's --peer-auto-tls (AutoTLS) configuration to enable peer TLS with self-signed certificates and no shared CA, which is particularly useful for adopting legacy clusters. The changes include updating API types, adding CEL validation rules for mutual exclusion, modifying the controller to apply the --peer-auto-tls flag, and implementing migration logic to detect and carry forward this setting from legacy clusters. The review feedback suggests improving test assertions in cel_validation_test.go to check for the full validation error message and adding a CEL validation rule on EtcdMemberTLS to enforce the mutual exclusion of peerAutoTLS and peerSecretRef.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread api/v1alpha2/cel_validation_test.go
Comment thread api/v1alpha2/cel_validation_test.go Outdated
Comment thread api/v1alpha2/etcdmember_types.go Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
charts/etcd-operator/crd-bases/etcd-operator.cozystack.io_etcdmembers.yaml (1)

1339-1345: ⚡ Quick win

Consider adding CEL validation for mutual exclusivity.

While EtcdMembers are internal resources created by the EtcdCluster controller (and validation happens at the cluster level), adding a schema-level CEL rule here would provide defense-in-depth against misconfiguration. The documentation mentions mutual exclusivity with PeerSecretRef, but nothing prevents both fields from being set if a user directly creates an EtcdMember.

🛡️ Optional CEL validation to enforce mutual exclusivity

Add this x-kubernetes-validations rule under the tls property (around line 1363):

                type: object
                x-kubernetes-validations:
                - message: spec.tls.peerAutoTLS and spec.tls.peerSecretRef are mutually exclusive
                  rule: '!(has(self.peerAutoTLS) && self.peerAutoTLS && has(self.peerSecretRef))'

Also consider updating the description to clarify this mode is primarily intended for legacy cluster adoption where --peer-auto-tls was already in use, as the security trade-off (unauthenticated peers) makes it unsuitable for new deployments.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@charts/etcd-operator/crd-bases/etcd-operator.cozystack.io_etcdmembers.yaml`
around lines 1339 - 1345, Add a CEL validation to the tls schema to enforce
mutual exclusivity between peerAutoTLS and peerSecretRef: update the tls
property (where peerAutoTLS and peerSecretRef are defined) to include an
x-kubernetes-validations entry that fails when both are present/true,
referencing the symbols peerAutoTLS and peerSecretRef; also update the
peerAutoTLS description to clarify it’s for legacy/--peer-auto-tls use and
highlights the unauthenticated peer security trade-off.
api/v1alpha2/cel_validation_test.go (1)

485-513: ⚡ Quick win

Add test case for autoTLS + certManager combination.

The test covers autoTLS alone (accepted) and autoTLS + secretRef (rejected), but does not verify that autoTLS + certManager is also rejected by the exactly-one-of rule. Without this coverage, a faulty CEL rule that only validates secretRef/autoTLS pairs could pass.

📝 Add autoTLS + certManager test case
 	if !strings.Contains(err.Error(), "exactly one of spec.tls.peer.secretRef") {
 		t.Fatalf("error did not mention peer mutual exclusion: %v", err)
 	}
+
+	// autoTLS + certManager — rejected.
+	badCM := validCluster("tls-peer-autotls-certmanager")
+	badCM.Spec.TLS = &lll.EtcdClusterTLS{Peer: &lll.PeerTLS{
+		AutoTLS:     &lll.PeerAutoTLS{},
+		CertManager: &lll.PeerCertManagerTLS{IssuerRef: lll.IssuerReference{Name: "my-peer-ca"}},
+	}}
+	err = k8s.Create(ctx, badCM)
+	if err == nil {
+		_ = k8s.Delete(ctx, badCM)
+		t.Fatalf("apiserver accepted peer.autoTLS + peer.certManager; expected rejection")
+	}
+	if !strings.Contains(err.Error(), "exactly one of spec.tls.peer.secretRef") {
+		t.Fatalf("error did not mention peer mutual exclusion for certManager combination: %v", err)
+	}
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/v1alpha2/cel_validation_test.go` around lines 485 - 513, Extend
TestCEL_TLSPeerAutoTLS to include a case that constructs a cluster with
Spec.TLS.Peer having both AutoTLS set (lll.PeerAutoTLS{}) and the cert-manager
option populated (e.g., Spec.TLS.Peer.CertManager = &lll.PeerCertManager{...}),
attempt to create it via k8s.Create(ctx, ...), assert creation is rejected (err
!= nil), and assert the error message mentions the same "exactly one of
spec.tls.peer..." mutual-exclusion text; update TestCEL_TLSPeerAutoTLS (and
reuse validCluster, ok/bad variables) so the new branch mirrors the existing
autoTLS+secretRef negative test but uses the CertManager field instead of
SecretRef.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/v1alpha2/etcdcluster_types.go`:
- Around line 216-228: Update the top-level comment for the peer TLS
configuration (the struct/field describing spec.tls.peer) to mention the new
unauthenticated autoTLS mode and its security implications: explicitly state
that spec.tls.peer can be mTLS (authenticated via SecretRef/CertManager) or
autoTLS (PeerAutoTLS/AutoTLS) which provides encryption without peer
authentication, and note mutual exclusivity with SecretRef/CertManager and that
autoTLS is insecure for untrusted networks; ensure the same wording/clarity is
applied to the CRD description that documents spec.tls.peer so both the
etcdcluster_types.go peer TLS doc and the generated CRD docs align with the
AutoTLS/PeerAutoTLS warning.

In `@internal/migrate/adopt.go`:
- Around line 188-213: The current logic unconditionally sets
cluster.Spec.TLS.Peer.AutoTLS when any member has an https:// PeerURL, which
incorrectly assumes a uniform peer mode; update the adoption logic (the block
using peerTLSDeclared and iterating members) to first scan all members' PeerURL
values and detect mixed schemes (both http:// and https://). If mixed schemes
are found, do not set spec.tls.peer.autoTLS and instead surface a clear adoption
error (e.g. append to plan.Errors or return a failure) so deriveAdoptedMemberTLS
and buildPod are not forced into a cluster-wide --peer-auto-tls mode; only set
spec.tls.peer.autoTLS when every observed PeerURL is https://. Ensure
plan.Warnings remains unchanged for the pure-https path and reference
deriveAdoptedMemberTLS and buildPod in the error message so callers know why
adoption was rejected.

---

Nitpick comments:
In `@api/v1alpha2/cel_validation_test.go`:
- Around line 485-513: Extend TestCEL_TLSPeerAutoTLS to include a case that
constructs a cluster with Spec.TLS.Peer having both AutoTLS set
(lll.PeerAutoTLS{}) and the cert-manager option populated (e.g.,
Spec.TLS.Peer.CertManager = &lll.PeerCertManager{...}), attempt to create it via
k8s.Create(ctx, ...), assert creation is rejected (err != nil), and assert the
error message mentions the same "exactly one of spec.tls.peer..."
mutual-exclusion text; update TestCEL_TLSPeerAutoTLS (and reuse validCluster,
ok/bad variables) so the new branch mirrors the existing autoTLS+secretRef
negative test but uses the CertManager field instead of SecretRef.

In `@charts/etcd-operator/crd-bases/etcd-operator.cozystack.io_etcdmembers.yaml`:
- Around line 1339-1345: Add a CEL validation to the tls schema to enforce
mutual exclusivity between peerAutoTLS and peerSecretRef: update the tls
property (where peerAutoTLS and peerSecretRef are defined) to include an
x-kubernetes-validations entry that fails when both are present/true,
referencing the symbols peerAutoTLS and peerSecretRef; also update the
peerAutoTLS description to clarify it’s for legacy/--peer-auto-tls use and
highlights the unauthenticated peer security trade-off.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4c9a5582-be3f-4638-a1f9-ed410799d6a1

📥 Commits

Reviewing files that changed from the base of the PR and between d7dcf6a and 10b0bd1.

📒 Files selected for processing (11)
  • api/v1alpha2/cel_validation_test.go
  • api/v1alpha2/etcdcluster_types.go
  • api/v1alpha2/etcdmember_types.go
  • api/v1alpha2/zz_generated.deepcopy.go
  • charts/etcd-operator/crd-bases/etcd-operator.cozystack.io_etcdclusters.yaml
  • charts/etcd-operator/crd-bases/etcd-operator.cozystack.io_etcdmembers.yaml
  • controllers/etcdmember_controller.go
  • controllers/etcdmember_controller_test.go
  • controllers/helpers.go
  • internal/migrate/adopt.go
  • internal/migrate/adopt_test.go

Comment thread api/v1alpha2/etcdcluster_types.go Outdated
Comment thread internal/migrate/adopt.go

@lllamnyp Timofei Larkin (lllamnyp) left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The capability is the right one — adopting a legacy --peer-auto-tls cluster needs replacement/scaled members to speak the same insecure peer mode, or they can't rejoin. Three changes before merge; the first is an API-shape consistency point, the other two are about making an insecure mode impossible to adopt unknowingly.

1. Don't expose this on the typed spec — make it a reserved annotation

spec.tls.peer.autoTLS adds a user-settable, first-class, insecure peer-TLS mode sitting right next to secretRef and certManager. Its only sanctioned use is legacy adoption (the field doc says so), yet nothing stops someone picking it for a brand-new cluster and getting an unauthenticated peer plane. "Migration-only" is enforced by a comment, not the API.

This is the same call we made for headlessServiceName / dataDirSubPath: migration-only knobs a user should never choose belong on a reserved annotation the operator interprets, not the typed spec. The mechanics differ here — peer-auto-tls is cluster-wide and must propagate to every member the operator creates (no shared CA ⇒ a replacement must also run --peer-auto-tls), so it doesn't self-wipe like those two. So the right home is a reserved cluster-level annotation, e.g. etcd-operator.cozystack.io/peer-auto-tls, that:

  • the cluster controller reads (in deriveMemberTLS, in place of spec.tls.peer.autoTLS) and propagates to every member it builds;
  • buildPod consumes (the existing internal member-level signal is fine as operator-managed plumbing) to render --peer-auto-tls;
  • takes effect only when no explicit peer secretRef/certManager is set, and is ignored/superseded otherwise (precedence has to live in code now, see below).

Then drop PeerAutoTLS from the public types and revert the CEL rule back to the two-way has(secretRef) != has(certManager) — auto-tls is no longer part of the typed union.

Acknowledged tradeoff (and it's the point): as an annotation you can't CEL-validate exclusivity or hard-prevent a user from setting it. That's acceptable — an undocumented reserved key is far less of a footgun than a discoverable typed autoTLS option, and it keeps the public TLS API to the two real, secure modes.

2. Document it in docs/migration.md

This PR ships an entire insecure peer mode plus migration carry-forward with no doc coverage (migration.md doesn't mention peer-auto-tls at all). Add a subsection under the TLS material covering:

  • what --peer-auto-tls is — per-member self-signed certs, no shared CA, encrypted but not authenticated (any TLS-capable workload reaching :2380 can peer);
  • that adoption detects and carries it forward (now via the reserved annotation) so replacement/scale keeps working — and why (no CA exists to do real mTLS, a plaintext-peer replacement could never join);
  • that the only off-ramp to real mTLS is delete-and-recreate (or the careful manual rolling restart, given persistent storage — strict-mTLS and auto-tls members can't peer, so there's a brief no-quorum window at the cutover);
  • that this is a distinct scenario from the explicit-mTLS SAN coverage issue already documented — don't conflate them.

3. Make the warning loud

The adopt.go warning is correct in content but renders identically to routine warning: lines in the plan and is easy to miss. For a security-posture downgrade, elevate it: a distinct marker (e.g. a SECURITY: / callout), and re-surface it in the post---apply summary, not only in the pre-apply plan — so an operator cannot finish a migration without seeing that they just adopted an unauthenticated peer plane. (Update its spec.tls.peer.autoTLS references to the new annotation while you're there.)

Andrey Kolkov (androndo) added a commit that referenced this pull request Jun 12, 2026
Address review feedback (PR #330): an unauthenticated peer plane must not
be a discoverable, CEL-blessed option for new clusters.

- Drop spec.tls.peer.autoTLS (and the PeerAutoTLS type); revert the PeerTLS
  CEL rule to the two-way has(secretRef) != has(certManager) union.
- Carry legacy --peer-auto-tls forward on a cluster-level reserved annotation
  etcd-operator.cozystack.io/peer-auto-tls instead. deriveMemberTLS reads it
  (superseded by an explicit peer secretRef/certManager) and propagates it to
  every member it builds; clusterPeerScheme treats it as https.
- etcd-migrate stamps the annotation and raises it as a SecurityWarning,
  rendered with a loud marker in the plan AND re-surfaced in the post-apply
  summary so an unauthenticated peer plane can't be adopted unnoticed.
- Document the mode, carry-forward and off-ramp in docs/migration.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label Jun 12, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
internal/migrate/adopt.go (1)

201-215: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Reject mixed peer URL schemes before inferring cluster-wide auto-TLS.

The past review comment is still valid. This code unconditionally sets AnnPeerAutoTLS="true" after seeing the first https:// peer URL, but does not verify that ALL members use the same scheme. If the live cluster has mixed schemes (some http://, some https://), the annotation forces every future replacement or scale-out member into https + --peer-auto-tls mode (via deriveAdoptedMemberTLSbuildPod), which cannot interoperate with the http:// peers that were also adopted.

Scan all members first, detect mixed schemes, and block adoption with a clear error when schemes are inconsistent.

🛡️ Suggested guard (from past review)
 	peerTLSDeclared := cluster.Spec.TLS != nil && cluster.Spec.TLS.Peer != nil
 	if !peerTLSDeclared {
+		sawHTTP := false
+		sawHTTPS := false
 		for _, m := range members {
-			if strings.HasPrefix(m.PeerURL, "https://") {
-				if cluster.Annotations == nil {
-					cluster.Annotations = map[string]string{}
-				}
-				cluster.Annotations[controllers.AnnPeerAutoTLS] = "true"
-				plan.SecurityWarnings = append(plan.SecurityWarnings, fmt.Sprintf(
-					"cluster runs etcd --peer-auto-tls (member %q advertises %s; no peerSecret in the legacy spec): carried forward via the reserved %s annotation so members keep interoperating across replacement/scale. "+
-						"The peer plane is encrypted but NOT authenticated (no shared CA) — any TLS-capable workload that reaches :2380 can peer. Move to real mTLS (spec.tls.peer.secretRef or certManager) when you can; that is a delete-and-recreate since spec.tls is immutable.",
-					m.Name, m.PeerURL, controllers.AnnPeerAutoTLS))
-				break
-			}
+			switch {
+			case strings.HasPrefix(m.PeerURL, "https://"):
+				sawHTTPS = true
+			case strings.HasPrefix(m.PeerURL, "http://"):
+				sawHTTP = true
+			}
+		}
+		if sawHTTP && sawHTTPS {
+			plan.Errors = append(plan.Errors,
+				"legacy cluster advertises mixed peer URL schemes; adoption only supports a single cluster-wide peer mode")
+		} else if sawHTTPS {
+			if cluster.Annotations == nil {
+				cluster.Annotations = map[string]string{}
+			}
+			cluster.Annotations[controllers.AnnPeerAutoTLS] = "true"
+			plan.SecurityWarnings = append(plan.SecurityWarnings, "...")
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/migrate/adopt.go` around lines 201 - 215, The code currently sets
controllers.AnnPeerAutoTLS="true" after seeing the first member with PeerURL
starting "https://" (in the peerTLSDeclared check over members), which can force
--peer-auto-tls for future pods that cannot interoperate with any existing
http:// peers; instead, scan all members first to detect their schemes (iterate
members and record whether any PeerURL starts with "https://" and whether any
starts with "http://"), and if both schemes are present append a clear
adoption-blocking error to plan (or return an error) indicating mixed peer URL
schemes and refuse to set the annotation; only when all discovered members
consistently use https should you set
cluster.Annotations[controllers.AnnPeerAutoTLS]="true" and add the security
warning (the involved code paths include deriveAdoptedMemberTLS and buildPod
which will assume auto-TLS).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@internal/migrate/adopt.go`:
- Around line 201-215: The code currently sets controllers.AnnPeerAutoTLS="true"
after seeing the first member with PeerURL starting "https://" (in the
peerTLSDeclared check over members), which can force --peer-auto-tls for future
pods that cannot interoperate with any existing http:// peers; instead, scan all
members first to detect their schemes (iterate members and record whether any
PeerURL starts with "https://" and whether any starts with "http://"), and if
both schemes are present append a clear adoption-blocking error to plan (or
return an error) indicating mixed peer URL schemes and refuse to set the
annotation; only when all discovered members consistently use https should you
set cluster.Annotations[controllers.AnnPeerAutoTLS]="true" and add the security
warning (the involved code paths include deriveAdoptedMemberTLS and buildPod
which will assume auto-TLS).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3e080970-618e-4837-a5f6-63c7bb0755bb

📥 Commits

Reviewing files that changed from the base of the PR and between 10b0bd1 and 4dab563.

📒 Files selected for processing (12)
  • api/v1alpha2/cel_validation_test.go
  • api/v1alpha2/etcdmember_types.go
  • charts/etcd-operator/crd-bases/etcd-operator.cozystack.io_etcdmembers.yaml
  • cmd/etcd-migrate/main.go
  • cmd/etcd-migrate/output.go
  • controllers/etcdmember_controller.go
  • controllers/etcdmember_controller_test.go
  • controllers/helpers.go
  • docs/migration.md
  • internal/migrate/adopt.go
  • internal/migrate/adopt_test.go
  • internal/migrate/plan.go
💤 Files with no reviewable changes (1)
  • api/v1alpha2/cel_validation_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • controllers/etcdmember_controller.go
  • api/v1alpha2/etcdmember_types.go
  • internal/migrate/adopt_test.go
  • controllers/etcdmember_controller_test.go
  • charts/etcd-operator/crd-bases/etcd-operator.cozystack.io_etcdmembers.yaml
  • controllers/helpers.go

Signed-off-by: Andrey Kolkov <androndo@gmail.com>

@lllamnyp Timofei Larkin (lllamnyp) left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All three change-request items are addressed, cleanly. Approving.

  • Off the typed spec → reserved annotation. spec.tls.peer.autoTLS / the PeerAutoTLS struct are gone from the public types (the PeerTLS CEL rule is back to the two-way secretRef != certManager), replaced by the reserved cluster annotation etcd-operator.cozystack.io/peer-auto-tls. Precedence is enforced in code: clusterPeerAutoTLS returns false whenever any explicit spec.tls.peer is set, so real mTLS always wins and the annotation only applies when there's no typed peer TLS.
  • Documented. New docs/migration.md "Peer auto-TLS" section — what it is (per-member self-signed, no shared CA, encrypted-not-authenticated), that adoption carries it forward via the annotation to keep replacement/scale working and why, and that the off-ramp to real mTLS is delete-and-recreate.
  • Loud warning. A distinct SecurityWarnings channel (separate from routine Warnings), rendered inline as ⚠️ SECURITY: and re-surfaced after --apply via renderSecuritySummary — so an unauthenticated-peer adoption can't be missed.

What I checked beyond that:

  • Peer-URL scheme consistency (the main correctness risk): both memberPeerScheme and clusterPeerScheme return https for the auto-tls case, and the seed / pending / existing peer URLs all route through clusterPeerScheme, so advertised URLs, --initial-cluster entries, and the listener in buildPod agree. A mismatch here would prevent quorum; it's consistent.
  • No nil-deref in buildPod — the new switch takes the PeerAutoTLS branch before peerTLS, and the PeerSecretRef deref only runs when it's non-nil.
  • Mutual exclusion of PeerAutoTLS vs PeerSecretRef holds by construction in both deriveMemberTLS and deriveAdoptedMemberTLS.
  • Detection stamps the annotation only when no peer TLS is declared and a live member advertises https:// — matching the legacy operator's unconditional---peer-auto-tls-unless-PeerSecret behavior.
  • Build, full test suite, and make manifests generate (no drift) all pass.

One non-blocking, cosmetic nit: api/v1alpha2/cel_validation_test.go:480 loosened its assertion from the full CEL message to a prefix. The rule message in etcdcluster_types.go is unchanged and still carries the full string, so the original full-message match would still pass — this loosening is unrelated to the feature and weakens the test slightly for no gain. Worth reverting to the full-message assertion, but not a blocker.

LGTM.

@androndo Andrey Kolkov (androndo) merged commit 7d20b1a into main Jun 12, 2026
9 of 10 checks passed
@androndo Andrey Kolkov (androndo) deleted the fix/peer-auto-tls branch June 12, 2026 11:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api-change bugfix controllers documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants