PG doc: add note on statistics and snapshotting in PG considerations by martykulma · Pull Request #35804 · MaterializeInc/materialize

martykulma · 2026-03-31T13:45:53Z

Adds a section to the PostgreSQL source considerations that highlights the relationship between parallel snapshotting, snapshot progress reporting in the Console, and up-to-date PostgreSQL table statistics.

github-actions · 2026-03-31T13:46:04Z

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

The PR title is descriptive and will make sense in the git log.
This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

def-

Thanks!

kay-kim

left a suggestion (feel free to ignore if I missed the point) and a question.

kay-kim · 2026-03-31T15:11:44Z

doc/user/data/postgres_source_details.yml

+    using ranges of
+    [`CTID`](https://www.postgresql.org/docs/current/ddl-system-columns.html#DDL-SYSTEM-COLUMNS-CTID).
+    Materialize uses [estimates](https://www.postgresql.org/docs/current/row-estimation-examples.html)
+    for the amount of data and rows that will be read. Missing or stale statistics will result in


Not sure but maybe?

The PostgreSQL source performs parallel snapshotting of tables by distributing rows among workers using ranges of [`CTID`](https://www.postgresql.org/docs/current/ddl-system-columns.html#DDL-SYSTEM-COLUMNS-CTID). Materialize uses [PostgreSQL statistics to estimate](https://www.postgresql.org/docs/current/row-estimation-examples.html) the amount of data and number of rows to read. Missing or stale statistics can result in uneven work distribution, reducing snapshot performance. They can also cause incorrect snapshot progress reporting in the Console. To avoid this situation, ensure statistics are up to date by running the PostgreSQL `ANALYZE` command before creating the source in Materialize.

Also, do you think we should mention this `ANALYZE` step in the actual PostgreSQL ingest tutorials?

Are the tutorials you're referencing the vendor specific steps (e.g. https://preview.materialize.com/materialize/35804/ingest-data/postgres/alloydb/)? I put it in considerations as that section appears for each of the vendors. If there's another tutorials area, seems like we may want to put it there as well!
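
A toy sketch of the uneven-distribution point in the suggested wording above. It assumes a simplified model that is not taken from Materialize's code: each worker gets an equal range of table pages computed from an estimated page count (such as a stale `relpages` value might give), and the last worker's range is open-ended so that no rows are missed. The names `split_page_ranges` and `pages_per_worker` are hypothetical.

```python
from typing import List, Optional, Tuple

PageRange = Tuple[int, Optional[int]]  # (start_page, end_page); None means open-ended


def split_page_ranges(estimated_pages: int, workers: int) -> List[PageRange]:
    """Split pages [0, estimated_pages) into one contiguous range per worker.

    Assumption for illustration: the last range has no upper bound, so pages
    beyond the estimate are still read by the last worker.
    """
    step = max(1, -(-estimated_pages // workers))  # ceiling division
    ranges: List[PageRange] = []
    for i in range(workers):
        start = i * step
        end = None if i == workers - 1 else (i + 1) * step
        ranges.append((start, end))
    return ranges


def pages_per_worker(ranges: List[PageRange], actual_pages: int) -> List[int]:
    """Count how many real pages fall into each worker's range."""
    counts = []
    for start, end in ranges:
        upper = actual_pages if end is None else min(end, actual_pages)
        counts.append(max(0, upper - start))
    return counts


# Fresh statistics: the estimate matches the real table size.
fresh = pages_per_worker(split_page_ranges(1000, 4), actual_pages=1000)
# Stale statistics: the table grew 10x since statistics were last collected.
stale = pages_per_worker(split_page_ranges(100, 4), actual_pages=1000)

print(fresh)  # [250, 250, 250, 250]
print(stale)  # [25, 25, 25, 925]
```

Under this model, the stale estimate leaves one worker reading nearly the whole table, which is the kind of skew that running `ANALYZE` before creating the source is meant to avoid.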

PG doc: add note on statistics and snapshotting in PG considerations

2e72144

martykulma force-pushed the maz-pg-src-doc-parallel-snapshot branch from 5aa9b79 to 2e72144 Compare March 31, 2026 13:49

martykulma marked this pull request as ready for review March 31, 2026 14:00

martykulma requested a review from a team as a code owner March 31, 2026 14:00

def- approved these changes Mar 31, 2026

kay-kim reviewed Mar 31, 2026

Reword from Kay

35c9ce5

martykulma wants to merge 2 commits into MaterializeInc:main from martykulma:maz-pg-src-doc-parallel-snapshot
