
DAOS-18615 bio: never rollback si_unused_id (#17601) #17640

Open
NiuYawei wants to merge 2 commits into release/2.6 from niu/release-26/DAOS-18615


@NiuYawei
Contributor

@NiuYawei NiuYawei commented Mar 4, 2026

The initial WAL implementation allowed the upper layer to handle WAL commit failures via UNDO operations. This included rolling back 'si_unused_id' to prevent gaps in the WAL. However, the current architecture no longer supports UNDO and instead excludes targets upon WAL commit failure. Consequently, the legacy si_unused_id rollback now violates the core assumption: "New transaction ID must be greater than the last checkpointed ID". This change therefore never rolls back si_unused_id.
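
For reviewers who want to see the failure mode in isolation, here is a minimal sketch in C. It is a toy model and not the bio code: the struct, the `toy_*` helpers, the plain integer IDs, and the interleaving are all hypothetical (the real code compares WAL IDs through `so_wal_id_cmp`). It only shows one way a rolled-back counter could hand out an ID that is not greater than the last checkpointed ID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: plain counters stand in for the real WAL IDs. */
struct toy_wal {
	uint64_t unused_id;       /* next ID to hand out (stands in for si_unused_id) */
	uint64_t last_checkpoint; /* last checkpointed ID, 0 means none yet */
};

/* Reserve the next transaction ID. */
static uint64_t toy_reserve(struct toy_wal *w)
{
	return w->unused_id++;
}

/* Legacy behaviour (hypothetical form): give the failed ID back on commit failure. */
static void toy_rollback(struct toy_wal *w, uint64_t failed_id)
{
	if (failed_id < w->unused_id)
		w->unused_id = failed_id;
}

/* Record that everything up to and including 'id' has been checkpointed. */
static void toy_checkpoint(struct toy_wal *w, uint64_t id)
{
	w->last_checkpoint = id;
}

/* The assumption quoted above: a new ID must be greater than the last checkpoint. */
static bool toy_id_is_valid(const struct toy_wal *w, uint64_t id)
{
	return w->last_checkpoint == 0 || id > w->last_checkpoint;
}

/*
 * Assumed interleaving: two IDs are reserved, the commit of the first fails,
 * the second is checkpointed, then a new ID is reserved.
 * Returns true if that new ID still satisfies the assumption.
 */
static bool toy_scenario(bool rollback_on_failure)
{
	struct toy_wal w = { .unused_id = 1, .last_checkpoint = 0 };
	uint64_t a = toy_reserve(&w); /* a == 1 */
	uint64_t b = toy_reserve(&w); /* b == 2 */

	if (rollback_on_failure)
		toy_rollback(&w, a);  /* unused_id goes back to 1 */
	toy_checkpoint(&w, b);        /* last_checkpoint == 2 */

	return toy_id_is_valid(&w, toy_reserve(&w));
}
```

In this model `toy_scenario(true)` returns false, because the counter re-issues ID 1 after ID 2 was checkpointed, while `toy_scenario(false)` returns true, because the next ID is 3. Leaving a gap in the ID sequence is harmless here, whereas reusing an ID breaks the ordering assumption.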

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

The initial WAL implementation allowed the upper layer to handle WAL
commit failures via UNDO operations. This included rolling back
'si_unused_id' to prevent gaps in the WAL. However, the current
architecture no longer supports UNDO and instead excludes targets upon
WAL commit failure. Consequently, the legacy si_unused_id rollback now
violates the core assumption: "New transaction ID must be greater than
the last checkpointed ID".

Signed-off-by: Niu Yawei <yawei.niu@hpe.com>
@NiuYawei NiuYawei requested review from a team as code owners March 4, 2026 02:02
@NiuYawei NiuYawei added the clean-cherry-pick label (Cherry-pick from another branch that did not require additional edits) Mar 4, 2026
@github-actions

github-actions bot commented Mar 4, 2026

Ticket title is 'Aurora: Node seg faults with Assertion 'pinfo->pi_last_checkpoint == 0 || store->stor_ops->so_wal_id_cmp(store, wr_tx, pinfo->pi_last_checkpoint) > 0' '
Status is 'Awaiting backport'
Labels: 'request_for_2.6.5,test_2.8'
https://daosio.atlassian.net/browse/DAOS-18615

@daosbuild3
Collaborator

Test stage Functional Hardware Medium Verbs Provider completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-17640/2/testReport/


Labels

clean-cherry-pick Cherry-pick from another branch that did not require additional edits
