docs(replication): document AP consistency model for Push/PushDelta (PILOT-280)#21

Open
matthew-pilot wants to merge 1 commit into main from openclaw/pilot-280-20260530-080345

Conversation

@matthew-pilot
Collaborator

What failed

The rendezvous replication Push() and PushDelta() methods use fire-and-forget broadcast: mutations are committed to the local WAL, but the primary returns to the caller without waiting for any standby to acknowledge them. If the primary crashes before a standby has received the latest deltas, those mutations are lost. This tradeoff was undocumented, leaving operators and clients unable to reason about data-loss risk on failover.
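To make the failure mode concrete, here is a minimal, self-contained sketch of fire-and-forget broadcast. It is not the code in replication/replication.go: the `Primary`, `Standby`, and `Delta` types, the bounded inbox, and the `LostOnFailover` helper are all hypothetical, and a full inbox stands in for a standby that is slow or partitioned.

```go
package main

import "fmt"

// Delta is a hypothetical replicated mutation (illustrative only).
type Delta struct {
	Seq  int
	Data string
}

// Standby models a replica with a bounded inbox. The real standby
// transport is not shown in this PR; this is an assumption.
type Standby struct {
	inbox   chan Delta
	applied []Delta
}

func NewStandby(buf int) *Standby {
	return &Standby{inbox: make(chan Delta, buf)}
}

// Drain applies every delta currently queued in the inbox.
func (s *Standby) Drain() {
	for {
		select {
		case d := <-s.inbox:
			s.applied = append(s.applied, d)
		default:
			return
		}
	}
}

// Primary commits locally first, then broadcasts without waiting for acks.
type Primary struct {
	wal      []Delta
	standbys []*Standby
}

// PushDelta appends to the local log and returns immediately.
// A delta that cannot be queued for a standby is dropped, not retried.
func (p *Primary) PushDelta(data string) Delta {
	d := Delta{Seq: len(p.wal) + 1, Data: data}
	p.wal = append(p.wal, d)
	for _, s := range p.standbys {
		select {
		case s.inbox <- d:
		default: // standby slow or partitioned: the send is abandoned
		}
	}
	return d
}

// LostOnFailover counts mutations the primary accepted that the
// standby never applied, i.e. what disappears if the primary dies now.
func LostOnFailover(p *Primary, s *Standby) int {
	return len(p.wal) - len(s.applied)
}

func main() {
	s := NewStandby(1)
	p := &Primary{standbys: []*Standby{s}}
	p.PushDelta("a")
	p.PushDelta("b") // inbox already full, so this delta never reaches the standby
	s.Drain()
	fmt.Println("lost on failover:", LostOnFailover(p, s))
}
```

The caller of `PushDelta` sees success for both writes, yet promoting the standby would surface only the first one. That gap is the data-loss window this PR documents.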

Why this fix

Document the AP (Available / Partition-Tolerant) consistency model explicitly in the replication package doc comment. This is consistent with the actual runtime behavior — the primary remains available under partition at the cost of potential data loss on failover. The documentation cites the specific mechanisms (Push/PushDelta semantics, 1 s replicaPushInterval).

Sync-replication mode (primary waits for ≥1 standby ack before acknowledging the caller) is noted as not-yet-implemented, with a reference to PILOT-280 for tracking.

Verification

  • go build ./... — clean
  • go vet ./... — clean
  • go test ./... — all packages pass
  • Change is comment-only (1 file, +18 lines), no runtime behavior change

Closes PILOT-280

…PILOT-280)

The replication Manager uses fire-and-forget broadcast: Push() and
PushDelta() commit writes locally first, then send snapshots/deltas
to all standbys without waiting for acknowledgment.  If the primary
crashes before a standby receives the latest deltas, mutations are
lost.  This is intentional AP design — the rendezvous stays available
under partition at the cost of potential data loss on failover.

This commit documents the tradeoff explicitly in the package doc
comment.  Sync-replication mode (primary waits for ≥1 standby ack)
is not yet implemented — see PILOT-280 for discussion.

Closes PILOT-280
@codecov

codecov Bot commented May 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@matthew-pilot
Collaborator Author

Matthew PR Status — #21

Title: docs(replication): document AP consistency model for Push/PushDelta (PILOT-280)
Status: OPEN | Mergeable: MERGEABLE
Author: @matthew-pilot (matthew-pilot bot)
Created: 2026-05-30T08:04:33Z
Branch: openclaw/pilot-280-20260530-080345 -> main
Changes: +18/-0 across 1 file

Tickets

Labels

None

Files Changed

  • replication/replication.go (+18/-0)

Next Actions

  • Explain: command /pr explain #21 — detailed analysis
  • Canary retry: command /pr retry-canary #21 (if CI failed)
  • Fix & update: command /pr fix #21 <instructions>
  • Rebase: command /pr rebase #21
  • Close: command /pr close #21 <reason>

Auto-generated status check by matthew-pr-worker

@matthew-pilot
Collaborator Author

Matthew PR Explain — #21

What this PR does

docs(replication): document AP consistency model for Push/PushDelta (PILOT-280)

Scope

  • Files: 1 file
  • Delta: +18/-0 lines
  • Labels: none
  • Mergeable: MERGEABLE

Tickets

Files

  • replication/replication.go (+18/-0)

Review Notes

  • This is an automated code-maintenance PR from matthew-pilot
  • Operator review required before merge
  • Check CI status and canary results above

Auto-generated explain by matthew-pr-worker
