DAOS-18487 rebuild: asynchronous discard handler (#17621)#17646

Open
gnailzenh wants to merge 1 commit intorelease/2.6from
liang/rebuild/b2_6_discard_serial

@gnailzenh
Contributor

  • pool_discard no longer waits for the discard to complete; instead it creates a discard ULT and returns immediately.
  • Fix a race and make sure there are no concurrent discards, so the rebuild system doesn't start multiple discard ULTs even if a discard request is resent.
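
The two points above can be illustrated with a minimal, self-contained sketch. It is not the DAOS code: it uses POSIX threads instead of Argobots ULTs, and `struct pool_state`, `discard_async()`, `discard_wait()` and `discard_worker()` are hypothetical names chosen for illustration. It only shows the general pattern of starting a background worker under a lock, returning immediately, and ignoring a resent request while a worker is still in flight.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for per-pool state; not the DAOS pool structure. */
struct pool_state {
	pthread_mutex_t lock;
	pthread_cond_t  done;
	bool            discarding;      /* true while a discard worker is in flight */
	int             discard_status;  /* result of the last finished discard */
	int             workers_started; /* bookkeeping for the example only */
};

static void pool_state_init(struct pool_state *p)
{
	assert(p != NULL);
	pthread_mutex_init(&p->lock, NULL);
	pthread_cond_init(&p->done, NULL);
	p->discarding      = false;
	p->discard_status  = 0;
	p->workers_started = 0;
}

/* Background worker: pretends to discard, then publishes status and clears the flag. */
static void *discard_worker(void *arg)
{
	struct pool_state *p  = arg;
	struct timespec    ts = {.tv_sec = 0, .tv_nsec = 50 * 1000 * 1000};

	nanosleep(&ts, NULL); /* simulated work, 50 ms */

	pthread_mutex_lock(&p->lock);
	p->discard_status = 0;
	p->discarding     = false;
	pthread_cond_broadcast(&p->done);
	pthread_mutex_unlock(&p->lock);
	return NULL;
}

/*
 * Start a discard without waiting for it.
 * Returns 1 if a new worker was started, 0 if one is already in flight
 * (for example on a resent request), -1 if the worker could not be created.
 */
static int discard_async(struct pool_state *p)
{
	pthread_t tid;
	int       started = 0;

	pthread_mutex_lock(&p->lock);
	if (!p->discarding) {
		if (pthread_create(&tid, NULL, discard_worker, p) != 0) {
			pthread_mutex_unlock(&p->lock);
			return -1;
		}
		pthread_detach(tid);
		/* The worker cannot clear the flag before we set it: it needs the lock. */
		p->discarding = true;
		p->workers_started++;
		started = 1;
	}
	pthread_mutex_unlock(&p->lock);
	return started;
}

/* Block until no discard is in flight, then return the last status. */
static int discard_wait(struct pool_state *p)
{
	int rc;

	pthread_mutex_lock(&p->lock);
	while (p->discarding)
		pthread_cond_wait(&p->done, &p->lock);
	rc = p->discard_status;
	pthread_mutex_unlock(&p->lock);
	return rc;
}
```

In this sketch a caller that needs the result calls `discard_wait()` later, while the request handler itself only calls `discard_async()` and replies right away. Checking and setting `discarding` under the same lock is what keeps a resent request from starting a second worker.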

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

- pool_discard no longer waits for the discard to complete;
  instead it creates a discard ULT and returns immediately.
- Fix a race and make sure there are no concurrent discards, so
  the rebuild system doesn't start multiple discard ULTs even if
  a discard request is resent.

Signed-off-by: Liang Zhen <gnailzenh@gmail.com>
@gnailzenh gnailzenh requested review from a team as code owners March 5, 2026 04:23
@github-actions

github-actions bot commented Mar 5, 2026

Errors are Unable to load ticket data
https://daosio.atlassian.net/browse/DAOS-18487

@daosbuild3
Collaborator

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17646/1/execution/node/404/log

rc = ds_iv_ns_reint_prep(pool->sp_iv_ns); /* cleanup IV cache */
ABT_mutex_unlock(pool->sp_mutex);

D_INFO(DF_UUID " discard is scheduled\n", DP_UUID(arg->pool_uuid));
Contributor

Might want to save that D_INFO for the rc == 0 case?

if (tls->mpt_fini)
D_GOTO(free_notls, rc);

ABT_mutex_lock(pool->sp_mutex);
Contributor

Why check pool->sp_discard_status inside the loop? From what it looks like in ds_pool_tgt_discard_ult(), it only ever becomes nonzero at the same time that pool->sp_discarding is set to 0.
