feat: per-DataFile.first_row_id for v3 row lineage #2579

Open
Shekharrajak wants to merge 4 commits into apache:main from Shekharrajak:feature/row-lineage-phase1-datafile-first-row-id

@Shekharrajak
Contributor

Which issue does this PR close?

What changes are included in this PR?

Closes the writer + reader gap for v3 row lineage: every ADDED DataFile in a v3 data manifest now gets a first_row_id stamped on write, and foreign-written manifests get one inherited on read. Combined with the existing TableMetadata.next_row_id / ManifestFile.first_row_id plumbing, this makes iceberg-rust spec-compliant for v3 row-id assignment end-to-end.

Java reference: ManifestReader.idAssigner (per-file inheritance) + SnapshotProducer row-id seeding.
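
For readers unfamiliar with the inheritance rule, here is a minimal sketch of per-file row-id assignment. It is not the code in this PR: `FileEntry` and `assign_first_row_ids` are hypothetical names, the struct is a simplified stand-in for a manifest entry's data file, and entry status handling (for example skipping DELETED entries) is left out. It assumes files that already carry a `first_row_id` keep it, and that files without one take the running counter, which starts at the manifest's `first_row_id` and advances by each assigned file's record count.

```rust
/// Hypothetical, simplified stand-in for the data file of a manifest entry.
/// The real iceberg-rust types carry many more fields.
#[derive(Debug, Clone, PartialEq)]
struct FileEntry {
    /// Number of rows in the data file.
    record_count: u64,
    /// `None` means the row id has not been assigned yet.
    first_row_id: Option<u64>,
}

/// Assigns a `first_row_id` to every entry that lacks one, starting from the
/// manifest-level `first_row_id`, and returns the next unused row id.
///
/// Assumption: entries that already have a `first_row_id` are left untouched
/// and do not advance the counter.
fn assign_first_row_ids(manifest_first_row_id: u64, entries: &mut [FileEntry]) -> u64 {
    let mut next_row_id = manifest_first_row_id;
    for entry in entries.iter_mut() {
        if entry.first_row_id.is_none() {
            entry.first_row_id = Some(next_row_id);
            // Reserve one row id per record in this file.
            next_row_id += entry.record_count;
        }
    }
    next_row_id
}
```

In this sketch the return value plays the role of the counter a writer would carry forward when seeding the next manifest.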

Are these changes tested?

Unit tests

Development

Successfully merging this pull request may close these issues.

iceberg v3 has to set first-row-id
